[Mesa-dev] [PATCH] anv/pipeline: Enable only one dispatch width in case of per sample shading

2016-07-26 Thread Anuj Phogat
Fixes ~45 DEQP sample shading tests:
./deqp-vk --deqp-case=dEQP-VK.pipeline.multisample.min_sample_shading*

Many tests exited with VK_ERROR_OUT_OF_DEVICE_MEMORY without this patch.

Cc: Jason Ekstrand 
Signed-off-by: Anuj Phogat 

---
Another patch enabling the sample shading is required to test this patch.
I'll send out the enabling patch once we pass all the sample shading tests.
Use https://github.com/aphogat/mesa, branch: review to test the patch.
---
 src/intel/vulkan/gen7_pipeline.c |  9 ++++++++-
 src/intel/vulkan/gen8_pipeline.c | 12 ++++++++----
 2 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/src/intel/vulkan/gen7_pipeline.c b/src/intel/vulkan/gen7_pipeline.c
index 8ce50be..23535f5 100644
--- a/src/intel/vulkan/gen7_pipeline.c
+++ b/src/intel/vulkan/gen7_pipeline.c
@@ -249,6 +249,8 @@ genX(graphics_pipeline_create)(
  anv_finishme("primitive_id needs sbe swizzling setup");
 
   emit_3dstate_sbe(pipeline);
+  bool per_sample_ps = pCreateInfo->pMultisampleState &&
+   pCreateInfo->pMultisampleState->sampleShadingEnable;
 
   anv_batch_emit(&pipeline->batch, GENX(3DSTATE_PS), ps) {
  ps.KernelStartPointer0   = pipeline->ps_ksp0;
@@ -274,7 +276,12 @@ genX(graphics_pipeline_create)(
 
  ps._32PixelDispatchEnable= false;
  ps._16PixelDispatchEnable= wm_prog_data->dispatch_16;
- ps._8PixelDispatchEnable = wm_prog_data->dispatch_8;
+ /* On all hardware generations, the only configurations supporting
+  * persample dispatch are in which only one dispatch width is enabled.
+  */
+ ps._8PixelDispatchEnable = wm_prog_data->dispatch_8 &&
+(!per_sample_ps ||
+ !wm_prog_data->dispatch_16);
 
  ps.DispatchGRFStartRegisterforConstantSetupData0 =
 wm_prog_data->base.dispatch_grf_start_reg,
diff --git a/src/intel/vulkan/gen8_pipeline.c b/src/intel/vulkan/gen8_pipeline.c
index cc10d3a..bde7660 100644
--- a/src/intel/vulkan/gen8_pipeline.c
+++ b/src/intel/vulkan/gen8_pipeline.c
@@ -333,12 +333,19 @@ genX(graphics_pipeline_create)(
   }
} else {
   emit_3dstate_sbe(pipeline);
+  bool per_sample_ps = pCreateInfo->pMultisampleState &&
+   pCreateInfo->pMultisampleState->sampleShadingEnable;
 
   anv_batch_emit(&pipeline->batch, GENX(3DSTATE_PS), ps) {
  ps.KernelStartPointer0 = pipeline->ps_ksp0;
  ps.KernelStartPointer1 = 0;
  ps.KernelStartPointer2 = pipeline->ps_ksp0 + 
wm_prog_data->prog_offset_2;
- ps._8PixelDispatchEnable   = wm_prog_data->dispatch_8;
+ /* On all hardware generations, the only configurations supporting
+  * persample dispatch are in which only one dispatch width is enabled.
+  */
+ ps._8PixelDispatchEnable   = wm_prog_data->dispatch_8 &&
+  (!per_sample_ps ||
+   !wm_prog_data->dispatch_16);
  ps._16PixelDispatchEnable  = wm_prog_data->dispatch_16;
  ps._32PixelDispatchEnable  = false;
  ps.SingleProgramFlow   = false;
@@ -365,9 +372,6 @@ genX(graphics_pipeline_create)(
 wm_prog_data->dispatch_grf_start_reg_2;
   }
 
-  bool per_sample_ps = pCreateInfo->pMultisampleState &&
-   pCreateInfo->pMultisampleState->sampleShadingEnable;
-
   anv_batch_emit(&pipeline->batch, GENX(3DSTATE_PS_EXTRA), ps) {
  ps.PixelShaderValid  = true;
  ps.PixelShaderKillsPixel = wm_prog_data->uses_kill;
-- 
2.5.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
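
Note on the patch above: the rule it applies to the 8-wide dispatch
enable can be sketched in isolation as follows. This is only a minimal
sketch; the struct and function names are hypothetical and are not part
of anv. It assumes, as the patch does, that 32-wide dispatch is always
disabled.

```c
#include <stdbool.h>

/* Hypothetical container for the two dispatch enables the patch touches. */
struct dispatch_enables {
   bool simd8;
   bool simd16;
};

/*
 * With per-sample shading, only one dispatch width may be enabled.
 * When both kernels exist, keep SIMD16 and drop SIMD8, mirroring the
 * expression added to gen7_pipeline.c and gen8_pipeline.c.
 */
static struct dispatch_enables
select_dispatch(bool has_simd8, bool has_simd16, bool per_sample_ps)
{
   struct dispatch_enables e;

   e.simd16 = has_simd16;
   e.simd8 = has_simd8 && (!per_sample_ps || !has_simd16);
   return e;
}
```

Without per-sample shading both widths stay enabled as before; with it,
SIMD8 survives only when no SIMD16 kernel was compiled.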


[Mesa-dev] [PATCH] i965: Fix move_interpolation_to_top() pass.

2016-07-26 Thread Kenneth Graunke
The pass I introduced in commit a2dc11a7818c04d8dc0324e8fcba98d60bae
was entirely broken.  A missing "break" made the load_interpolated_input
case always fall through to "default" and hit a "continue", making it
not actually move any load_interpolated_input intrinsics at all.
It would only move the simple load_barycentric_* intrinsics, which
don't emit any code anyway, making it basically useless.

The initial version I sent of the pass worked, but I apparently
failed to verify that the simplified version in v2 actually worked.

With the obvious fix applied (so we actually tried to move LIIs),
I discovered a second bug: we weren't moving the offset SSA def
to the top, breaking SSA validation.

The new version of the pass actually moves load_interpolated_input
intrinsics and all their dependencies, as intended.

Papers over GPU hangs on Ivybridge and Baytrail caused by the
recent NIR FS input rework by restoring the old behavior.
(I'm not honestly sure why they hang with PLN not at the top.)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97083
Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 49 
 1 file changed, 28 insertions(+), 21 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index f9af525..bcd08ac 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -6353,38 +6353,45 @@ move_interpolation_to_top(nir_shader *nir)
  continue;
 
   nir_block *top = nir_start_block(f->impl);
+  exec_node *cursor_node = NULL;
 
   nir_foreach_block(block, f->impl) {
  if (block == top)
 continue;
 
- nir_foreach_instr_reverse_safe(instr, block) {
+ nir_foreach_instr_safe(instr, block) {
 if (instr->type != nir_instr_type_intrinsic)
continue;
 
 nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr);
-switch (intrin->intrinsic) {
-case nir_intrinsic_load_barycentric_pixel:
-case nir_intrinsic_load_barycentric_centroid:
-case nir_intrinsic_load_barycentric_sample:
-   break;
-case nir_intrinsic_load_interpolated_input: {
-   nir_intrinsic_instr *bary_intrinsic =
-  nir_instr_as_intrinsic(intrin->src[0].ssa->parent_instr);
-   nir_intrinsic_op op = bary_intrinsic->intrinsic;
-
-   /* Leave interpolateAtSample/Offset() where it is. */
-   if (op == nir_intrinsic_load_barycentric_at_sample ||
-   op == nir_intrinsic_load_barycentric_at_offset)
-  continue;
-}
-default:
+if (intrin->intrinsic != nir_intrinsic_load_interpolated_input)
+   continue;
+nir_intrinsic_instr *bary_intrinsic =
+   nir_instr_as_intrinsic(intrin->src[0].ssa->parent_instr);
+nir_intrinsic_op op = bary_intrinsic->intrinsic;
+
+/* Leave interpolateAtSample/Offset() where they are. */
+if (op == nir_intrinsic_load_barycentric_at_sample ||
+op == nir_intrinsic_load_barycentric_at_offset)
continue;
-}
 
-exec_node_remove(&instr->node);
-exec_list_push_head(&top->instr_list, &instr->node);
-instr->block = top;
+nir_instr *move[3] = {
+   &bary_intrinsic->instr,
+   intrin->src[1].ssa->parent_instr,
+   instr
+};
+
+for (int i = 0; i < 3; i++) {
+   if (move[i]->block != top) {
+  move[i]->block = top;
+  exec_node_remove(&move[i]->node);
+  if (cursor_node)
+ exec_node_insert_after(cursor_node, &move[i]->node);
+  else
+ exec_list_push_head(&top->instr_list, &move[i]->node);
+  cursor_node = &move[i]->node;
+   }
+}
  }
   }
   nir_metadata_preserve(f->impl, (nir_metadata)
-- 
2.9.0
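
Note on the patch above: the cursor logic (push the first moved
instruction to the head of the top block, then insert each later one
after the previous one) keeps the moved instructions in their original
relative order, so a dependency always ends up before its user. A
minimal sketch with a toy doubly linked list follows. The list type and
helpers are hypothetical stand-ins, not the exec_list API.

```c
#include <stddef.h>

/* Toy intrusive list node; 'id' stands in for an instruction. */
struct node {
   struct node *prev, *next;
   int id;
};

/* List with a sentinel head, so insertion never needs a NULL check. */
struct list {
   struct node head;
};

static void list_init(struct list *l)
{
   l->head.prev = l->head.next = &l->head;
   l->head.id = -1;
}

static void node_insert_after(struct node *pos, struct node *n)
{
   n->prev = pos;
   n->next = pos->next;
   pos->next->prev = n;
   pos->next = n;
}

static void node_remove(struct node *n)
{
   n->prev->next = n->next;
   n->next->prev = n->prev;
   n->prev = n->next = NULL;
}

/*
 * Move 'n' into 'top'. The first moved node goes to the head of the
 * list; every later one goes right after the previously moved node.
 */
static void move_to_top(struct list *top, struct node **cursor,
                        struct node *n)
{
   node_remove(n);
   node_insert_after(*cursor ? *cursor : &top->head, n);
   *cursor = n;
}

/* Copy the ids in list order into 'out'; returns how many were written. */
static int list_ids(const struct list *l, int *out, int max)
{
   int count = 0;

   for (const struct node *it = l->head.next;
        it != &l->head && count < max; it = it->next)
      out[count++] = it->id;
   return count;
}
```

Pushing every moved node to the head instead would reverse their order,
which is presumably why the old code walked the block in reverse.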



Re: [Mesa-dev] [PATCH 01/21] i965/fs: Get rid of fs_visitor::do_dual_src.

2016-07-26 Thread Francisco Jerez
Anuj Phogat  writes:

> On Fri, Jul 22, 2016 at 8:58 PM, Francisco Jerez  
> wrote:
>> This boolean flag was being used for two different things:
>>
>>  - To set the brw_wm_prog_data::dual_src_blend flag.  Instead we can
>>just set it based on whether the dual_src_output register is valid,
>>which will be the case if the shader writes the secondary blending
>>color.
>>
>>  - To decide whether to call emit_single_fb_write() once, or in a loop
>>that would iterate only once, which seems pretty useless.
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs.h   |  1 -
>>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp |  2 --
>>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 37 
>> +++-
>>  3 files changed, 14 insertions(+), 26 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
>> b/src/mesa/drivers/dri/i965/brw_fs.h
>> index fc1e1c4..46b15b4 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs.h
>> +++ b/src/mesa/drivers/dri/i965/brw_fs.h
>> @@ -318,7 +318,6 @@ public:
>> fs_reg sample_mask;
>> fs_reg outputs[VARYING_SLOT_MAX];
>> fs_reg dual_src_output;
>> -   bool do_dual_src;
>> int first_non_payload_grf;
>> /** Either BRW_MAX_GRF or GEN7_MRF_HACK_START */
>> unsigned max_grf;
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
>> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> index 50d73eb..2872b2d 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> @@ -103,12 +103,10 @@ fs_visitor::nir_setup_outputs()
>>   if (key->force_dual_color_blend &&
>>   var->data.location == FRAG_RESULT_DATA1) {
>>  this->dual_src_output = reg;
>> -this->do_dual_src = true;
>>   } else if (var->data.index > 0) {
>>  assert(var->data.location == FRAG_RESULT_DATA0);
>>  assert(var->data.index == 1);
>>  this->dual_src_output = reg;
>> -this->do_dual_src = true;
>>   } else if (var->data.location == FRAG_RESULT_COLOR) {
>>  /* Writing gl_FragColor outputs to all color regions. */
>>  for (unsigned int i = 0; i < MAX2(key->nr_color_regions, 1); 
>> i++) {
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
>> b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>> index 6d84374..808d8af 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>> @@ -437,33 +437,25 @@ fs_visitor::emit_fb_writes()
>> "in SIMD16+ mode.\n");
>> }
>>
>> -   if (do_dual_src) {
>> -  const fs_builder abld = bld.annotate("FB dual-source write");
>> +   for (int target = 0; target < key->nr_color_regions; target++) {
>> +  /* Skip over outputs that weren't written. */
>> +  if (this->outputs[target].file == BAD_FILE)
>> + continue;
>>
>> -  inst = emit_single_fb_write(abld, this->outputs[0],
>> -  this->dual_src_output, reg_undef, 4);
>> -  inst->target = 0;
>> -
>> -  prog_data->dual_src_blend = true;
>> -   } else {
>> -  for (int target = 0; target < key->nr_color_regions; target++) {
>> - /* Skip over outputs that weren't written. */
>> - if (this->outputs[target].file == BAD_FILE)
>> -continue;
>> +  const fs_builder abld = bld.annotate(
>> + ralloc_asprintf(this->mem_ctx, "FB write target %d", target));
>>
>> - const fs_builder abld = bld.annotate(
>> -ralloc_asprintf(this->mem_ctx, "FB write target %d", target));
>> +  fs_reg src0_alpha;
>> +  if (devinfo->gen >= 6 && key->replicate_alpha && target != 0)
>> + src0_alpha = offset(outputs[0], bld, 3);
>>
>> - fs_reg src0_alpha;
>> - if (devinfo->gen >= 6 && key->replicate_alpha && target != 0)
>> -src0_alpha = offset(outputs[0], bld, 3);
>> -
>> - inst = emit_single_fb_write(abld, this->outputs[target], reg_undef,
>> - src0_alpha, 4);
>> - inst->target = target;
>> -  }
>> +  inst = emit_single_fb_write(abld, this->outputs[target],
>> +  this->dual_src_output, src0_alpha, 4);
>> +  inst->target = target;
>> }
>>
>> +   prog_data->dual_src_blend = (this->dual_src_output.file != BAD_FILE);
>> +
> It'll be nice to add this assert here:
> assert(!prog_data->dual_src_blend ||  key->nr_color_regions == 1);
>
Heh, part of my purpose with this was to make the code above less wrong
for dual source blending in combination with multiple render targets --
Though it could be argued that the code is still kind of broken for that
case because the hardware doesn't support sending a src0 alpha payload
in the dual source RT write message, and because there is still a single
dual_src_output register instead of a per-target array of registers, so
I've 

Re: [Mesa-dev] [PATCH] main/shaderimage: image unit invalid if texture is incomplete, independently of the level

2016-07-26 Thread Alejandro Piñeiro
On 23/07/16 00:31, Francisco Jerez wrote:
> Alejandro Piñeiro  writes:
>
>> Hi,
>>
>> On 15/07/16 22:46, Francisco Jerez wrote:
>>> Alejandro Piñeiro  writes:
>>>
 On 14/07/16 21:24, Francisco Jerez wrote:
> Alejandro Piñeiro  writes:
>
>> Without this commit, a image is considered valid if the level of the
>> texture bound to the image is complete, something we can check as mesa
>> save independently if it is "base incomplete" of "mipmap incomplete".
>>
>> But, from the OpenGL 4.3 Core Specification, section 8.25 ("Texture
>> Image Loads and Stores"):
>>
>>   "An access is considered invalid if:
>> the texture bound to the selected image unit is incomplete;"
>>
>> This implies that the access to the image unit is invalid if the
>> texture is incomplete, no mattering details about the specific texture
>> level bound to the image.
>>
>> This fixes:
>> GL44-CTS.shader_image_load_store.incomplete_textures
>> ---
>>
>> Current piglit test is not testing what this commit tries to fix. I
>> will send a patch to piglit in short.
>>
>>  src/mesa/main/shaderimage.c | 14 +++---
>>  1 file changed, 11 insertions(+), 3 deletions(-)
>>
>> diff --git a/src/mesa/main/shaderimage.c b/src/mesa/main/shaderimage.c
>> index 90643c4..d20cd90 100644
>> --- a/src/mesa/main/shaderimage.c
>> +++ b/src/mesa/main/shaderimage.c
>> @@ -469,10 +469,18 @@ _mesa_is_image_unit_valid(struct gl_context *ctx, 
>> struct gl_image_unit *u)
>> if (!t->_BaseComplete && !t->_MipmapComplete)
>> _mesa_test_texobj_completeness(ctx, t);
>>  
>> +   /* From the OpenGL 4.3 Core Specification, Chapter 8.25, Texture 
>> Image
>> +* Loads and Stores:
>> +*
>> +*  "An access is considered invalid if:
>> +*the texture bound to the selected image unit is incomplete;"
>> +*/
>> +   if (!t->_BaseComplete ||
>> +   !t->_MipmapComplete)
>> +  return GL_FALSE;
> I don't think this is correct, AFAIUI a texture having _MipmapComplete
> equal to false doesn't imply that the texture as a whole would be
> considered incomplete according to the GL's definition of completeness.
> Whether or not it's considered complete usually depends on the sampler
> state while you're doing regular texture sampling: If the sampler a
> texture object is used with has any of the mipmap filtering modes
> enabled you need to check _MipmapComplete, otherwise you need to check
> _BaseComplete.  The problem when you attempt to carry over this
> definition to shader images (as the spec implies) is that image units
> have no sampler state as such, and that they can only ever access one
> specified level of the texture at a time (potentially a texture level
> other than the base).  This patch makes image units behave like a
> sampler unit with mipmap filtering enabled for the purpose of texture
> completeness validation, which is almost definitely too strong.
 Yes, I didn't realize that _BaseComplete and _MipmapComplete were not
 checking the state at all. Thanks for pointing it.

> An alternative would be to do something along the lines of:
>
> | if (!_mesa_is_texture_complete(t, &t->Sampler))
> |return GL_FALSE;
 Yes, that is what I wanted, to return false if the texture is incomplete.

> The problem is that you would then run into problems when some of the
> non-base mipmap levels are missing but the sampler state baked into the
> gl_texture_object says that you aren't mipmapping, so the GL spec would
> normally consider the texture to be complete and
> _mesa_is_texture_complete would return true accordingly, but still you
> wouldn't be able to use any of the missing texture levels as shader
> image if the application tried to bind them to an image unit (that's the
> reason for the u->Level vs t->BaseLevel checks below you're removing).
 Ok, then if I understand correctly, the solution is not about replacing
 the level checks for _mesa_is_texture_complete, but keeping current
 checks, and add a _mesa_is_texture_complete check. Just checked and
 everything seems to work fine (except that now the behaviour is more
 strict, see below). I will send a patch in short.

>>> Yeah, that would likely work and get the CTS test to pass, but it would
>>> still be more strict than the spec says and consider cases that are OK
>>> according to the spec to be incomplete, so I was reluctant to call it a
>>> solution.
>>>
>>> I think the ideal solution would be for the state of an image unit to be
>>> independent from the filtering and sampling state, and depend on the
>>> completeness of the bound level *only*.  Any idea if this CTS (or your
>>> equivalent 

Re: [Mesa-dev] [PATCH] mesa: Make MESA_SHADER_CAPTURE_PATH skip shaders with Name == -1.

2016-07-26 Thread Matt Turner
On Tue, Jul 26, 2016 at 10:03 AM, Kenneth Graunke  wrote:
> Shaders with shProg->Name == ~0 (aka 4294967295) are internal meta
> shaders that we don't really want to capture.
>
> Signed-off-by: Kenneth Graunke 

Reviewed-by: Matt Turner 

There are lots of meta shaders in the internal shader-db that we
should remove. grepping for GL_AMD_vertex_shader_layer will find them.
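
Note on the patch above: the skip condition amounts to comparing the
program name against the all-ones 32-bit value. A minimal sketch, with
a hypothetical helper name that is not part of Mesa:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Internal meta shaders use the name ~0 (4294967295 as a 32-bit
 * unsigned value). Anything else is treated as an application shader
 * worth capturing.
 */
static bool should_capture_shader(uint32_t name)
{
   return name != UINT32_MAX;
}
```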


[Mesa-dev] [PATCH] configure: fix LLVM 4.0.0svn compilation, add libs for LLVM static linking

2016-07-26 Thread Jan Ziak
Signed-off-by: Jan Ziak (atom-symbol.net) <0xe2.0x9a.0...@gmail.com>
---
 configure.ac | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/configure.ac b/configure.ac
index 5c196a9..58c2db4 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2187,6 +2187,7 @@ if test "x$enable_gallium_llvm" = xyes; then
 
 LLVM_COMPONENTS="${LLVM_COMPONENTS} all-targets ipo linker 
instrumentation"
 LLVM_COMPONENTS="${LLVM_COMPONENTS} irreader option objcarcopts 
profiledata"
+LLVM_COMPONENTS="${LLVM_COMPONENTS} coverage"
 fi
 DEFINES="${DEFINES} -DHAVE_LLVM=0x0$LLVM_VERSION_INT 
-DMESA_LLVM_VERSION_PATCH=$LLVM_VERSION_PATCH"
 MESA_LLVM=1
@@ -2534,8 +2535,8 @@ if test "x$MESA_LLVM" != x0; then
 AC_MSG_WARN([Building mesa with statically linked LLVM may cause 
compilation issues])
 dnl We need to link to llvm system libs when using static libs
 dnl However, only llvm 3.5+ provides --system-libs
-if test $LLVM_VERSION_MAJOR -eq 3 -a $LLVM_VERSION_MINOR -ge 5; then
-LLVM_LIBS="$LLVM_LIBS `$LLVM_CONFIG --system-libs`"
+if test $LLVM_VERSION_MAJOR -ge 4 -o $LLVM_VERSION_MAJOR -eq 3 -a 
$LLVM_VERSION_MINOR -ge 5; then
+LLVM_LIBS="$LLVM_LIBS `$LLVM_CONFIG --system-libs` $(pkg-config 
--libs ncurses zlib)"
 fi
 fi
 fi
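
Note on the patch above: in test(1), -a binds tighter than -o, so the
new condition reads as "major >= 4, or (major == 3 and minor >= 5)".
The same predicate written in C, as a sketch with a hypothetical
function name:

```c
#include <stdbool.h>

/*
 * True for LLVM versions whose llvm-config is expected to provide
 * --system-libs: 3.5 and later, including every 4.x release.
 */
static bool llvm_has_system_libs(int major, int minor)
{
   return major >= 4 || (major == 3 && minor >= 5);
}
```

The old condition only covered the 3.x case, so a 4.0.0svn build never
picked up the system libraries.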


Re: [Mesa-dev] [PATCH] main/shaderimage: image unit invalid if texture is incomplete, independently of the level

2016-07-26 Thread Francisco Jerez
Alejandro Piñeiro  writes:

> On 23/07/16 00:31, Francisco Jerez wrote:
>> Alejandro Piñeiro  writes:
>>
>>> Hi,
>>>
>>> On 15/07/16 22:46, Francisco Jerez wrote:
 Alejandro Piñeiro  writes:

> On 14/07/16 21:24, Francisco Jerez wrote:
>> Alejandro Piñeiro  writes:
>>
>>> Without this commit, a image is considered valid if the level of the
>>> texture bound to the image is complete, something we can check as mesa
>>> save independently if it is "base incomplete" of "mipmap incomplete".
>>>
>>> But, from the OpenGL 4.3 Core Specification, section 8.25 ("Texture
>>> Image Loads and Stores"):
>>>
>>>   "An access is considered invalid if:
>>> the texture bound to the selected image unit is incomplete;"
>>>
>>> This implies that the access to the image unit is invalid if the
>>> texture is incomplete, no mattering details about the specific texture
>>> level bound to the image.
>>>
>>> This fixes:
>>> GL44-CTS.shader_image_load_store.incomplete_textures
>>> ---
>>>
>>> Current piglit test is not testing what this commit tries to fix. I
>>> will send a patch to piglit in short.
>>>
>>>  src/mesa/main/shaderimage.c | 14 +++---
>>>  1 file changed, 11 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/src/mesa/main/shaderimage.c b/src/mesa/main/shaderimage.c
>>> index 90643c4..d20cd90 100644
>>> --- a/src/mesa/main/shaderimage.c
>>> +++ b/src/mesa/main/shaderimage.c
>>> @@ -469,10 +469,18 @@ _mesa_is_image_unit_valid(struct gl_context *ctx, 
>>> struct gl_image_unit *u)
>>> if (!t->_BaseComplete && !t->_MipmapComplete)
>>> _mesa_test_texobj_completeness(ctx, t);
>>>  
>>> +   /* From the OpenGL 4.3 Core Specification, Chapter 8.25, Texture 
>>> Image
>>> +* Loads and Stores:
>>> +*
>>> +*  "An access is considered invalid if:
>>> +*the texture bound to the selected image unit is incomplete;"
>>> +*/
>>> +   if (!t->_BaseComplete ||
>>> +   !t->_MipmapComplete)
>>> +  return GL_FALSE;
>> I don't think this is correct, AFAIUI a texture having _MipmapComplete
>> equal to false doesn't imply that the texture as a whole would be
>> considered incomplete according to the GL's definition of completeness.
>> Whether or not it's considered complete usually depends on the sampler
>> state while you're doing regular texture sampling: If the sampler a
>> texture object is used with has any of the mipmap filtering modes
>> enabled you need to check _MipmapComplete, otherwise you need to check
>> _BaseComplete.  The problem when you attempt to carry over this
>> definition to shader images (as the spec implies) is that image units
>> have no sampler state as such, and that they can only ever access one
>> specified level of the texture at a time (potentially a texture level
>> other than the base).  This patch makes image units behave like a
>> sampler unit with mipmap filtering enabled for the purpose of texture
>> completeness validation, which is almost definitely too strong.
> Yes, I didn't realize that _BaseComplete and _MipmapComplete were not
> checking the state at all. Thanks for pointing it.
>
>> An alternative would be to do something along the lines of:
>>
>> | if (!_mesa_is_texture_complete(t, &t->Sampler))
>> |return GL_FALSE;
> Yes, that is what I wanted, to return false if the texture is incomplete.
>
>> The problem is that you would then run into problems when some of the
>> non-base mipmap levels are missing but the sampler state baked into the
>> gl_texture_object says that you aren't mipmapping, so the GL spec would
>> normally consider the texture to be complete and
>> _mesa_is_texture_complete would return true accordingly, but still you
>> wouldn't be able to use any of the missing texture levels as shader
>> image if the application tried to bind them to an image unit (that's the
>> reason for the u->Level vs t->BaseLevel checks below you're removing).
> Ok, then if I understand correctly, the solution is not about replacing
> the level checks for _mesa_is_texture_complete, but keeping current
> checks, and add a _mesa_is_texture_complete check. Just checked and
> everything seems to work fine (except that now the behaviour is more
> strict, see below). I will send a patch in short.
>
 Yeah, that would likely work and get the CTS test to pass, but it would
 still be more strict than the spec says and consider cases that are OK
 according to the spec to be incomplete, so I was reluctant to call it a
 solution.

 I think the ideal solution would be for the state of an image unit to be
 independent 

Re: [Mesa-dev] Mesa (master): Revert "radeon/llvm: Use alloca instructions for larger arrays"

2016-07-26 Thread Marek Olšák
On Sat, Jul 23, 2016 at 4:07 PM, Nicolai Hähnle  wrote:
> On 22.07.2016 12:08, Michel Dänzer wrote:
>>
>> On 21.07.2016 18:17, Matt Arsenault wrote:

 On Jul 21, 2016, at 01:03, Michel Dänzer > wrote:

 On 21.07.2016 00:04, Michel Dänzer wrote:
>
> On 15.07.2016 05:15, Marek Olšák wrote:
>>
>> Module: Mesa
>> Branch: master
>> Commit: f84e9d749fbb6da73a60fb70e6725db773c9b8f8
>> URL:
>>
>> http://cgit.freedesktop.org/mesa/mesa/commit/?id=f84e9d749fbb6da73a60fb70e6725db773c9b8f8
>>
>> Author: Marek Olšák >
>> Date:   Thu Jul 14 22:07:46 2016 +0200
>>
>> Revert "radeon/llvm: Use alloca instructions for larger arrays"
>>
>> This reverts commit 513fccdfb68e6a71180e21827f071617c93fd09b.
>>
>> Bioshock Infinite hangs with that.
>
>
> Unfortunately, this change caused the piglit test
> shaders@glsl-fs-vec4-indexing-temp-dst-in-loop (and possibly others) to
> hang my Kaveri. Any ideas for how we can get out of this conundrum?


 The hang was introduced by LLVM SVN r275934 ("AMDGPU: Expand register
 indexing pseudos in custom inserter"). The good/bad (without/with
 r275934) shader dumps and the GALLIUM_DDEBUG=800 dump corresponding to
 the hang are attached.


 BTW, even with Marek's change above reverted, I still see some piglit
 regressions compared to last week, but I'm not sure if those are all
 related to the same LLVM change.


 --
 Earthling Michel Dänzer   |
   http://www.amd.com 
 Libre software enthusiast | Mesa and X developer

 
>>>
>>>
>>> This fixes the verifier error in it: https://reviews.llvm.org/D22616
>>
>>
>> This seems to fix the hang, thanks!
>>
>>
>>> This fixes another issue which may be
>>> related: https://reviews.llvm.org/D22556
>>
>>
>> Even with that applied as well, there are still piglit regressions
>> compared to early last week, see the attached dumps (look for "LLVM
>> triggered Diagnostic Handler:").
>
>
> Looks like the "rewrite undef" part of the Two Address Instruction Pass also
> needs to be adjusted -- I've attached a bugpoint-reduced test case.
>
> Also, the hang that motivated the original revert in Mesa should be fixed
> with https://reviews.llvm.org/D22673 (and the related
> https://reviews.llvm.org/D22675 is also needed for correctness, though
> probably not for fixing the hang).

FYI, I've reverted the revert.

Marek


Re: [Mesa-dev] [PATCH] i965: Fix move_interpolation_to_top() pass.

2016-07-26 Thread Kenneth Graunke
On Tuesday, July 26, 2016 2:12:47 PM PDT Matt Turner wrote:
> On Tue, Jul 26, 2016 at 1:19 PM, Kenneth Graunke  
> wrote:
> > The pass I introduced in commit a2dc11a7818c04d8dc0324e8fcba98d60bae
> > was entirely broken.  A missing "break" made the load_interpolated_input
> > case always fall through to "default" and hit a "continue", making it
> > not actually move any load_interpolated_input intrinsics at all.
> 
> Let's make a rule that non-obvious fallthroughs *must* be marked with
> a /* fallthrough */ comment. That would have led reviewers to notice
> that something was strange. Coverity also makes noise about this, and
> that would be nice to avoid as well.

I think that's a good idea, but I doubt it would have helped in this
case.  (Given that we didn't notice a missing break, I don't think we
would've noticed a missing /* fallthrough */ either...)

Coverity likely caught this, but I haven't gotten a "New Defects" email
about it yet.  I'm guessing Jenkins just spotted the IVB hangs before
Coverity's periodic email got sent out...

--Ken
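
Note on the bug discussed above: the effect of the missing "break" can
be reproduced in a few lines. This is a simplified model, not the NIR
pass itself. The enum and function names are hypothetical, "return
false" plays the role of the "continue" that skipped the move, and
"return true" plays the role of reaching the code after the switch.

```c
#include <stdbool.h>

enum op_kind {
   OP_LOAD_BARYCENTRIC,
   OP_LOAD_INTERPOLATED_INPUT,
   OP_OTHER
};

/* Shape of the broken version: the second case never reaches the end. */
static bool should_move_buggy(enum op_kind op, bool at_sample_or_offset)
{
   switch (op) {
   case OP_LOAD_BARYCENTRIC:
      break;
   case OP_LOAD_INTERPOLATED_INPUT:
      if (at_sample_or_offset)
         return false;
      /* A "break" is missing here, so control falls through. */
   default:
      return false;
   }
   return true;
}

/* Shape of the fixed version: an early-out instead of a switch. */
static bool should_move_fixed(enum op_kind op, bool at_sample_or_offset)
{
   if (op != OP_LOAD_INTERPOLATED_INPUT)
      return false;
   return !at_sample_or_offset;
}
```

In the buggy shape only the barycentric loads are ever moved, which
matches the behaviour described in the commit message.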




[Mesa-dev] [PATCH v2 04/35] i965/blorp/clear: Initialize surface info after allocating an MCS

2016-07-26 Thread Jason Ekstrand
Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_blorp_clear.cpp | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp 
b/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp
index 1bc0dbb..1e00719 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp
@@ -135,12 +135,6 @@ do_single_blorp_clear(struct brw_context *brw, struct 
gl_framebuffer *fb,
if (!encode_srgb && _mesa_get_format_color_encoding(format) == GL_SRGB)
   format = _mesa_get_srgb_format_linear(format);
 
-   brw_blorp_surface_info_init(brw, , irb->mt, irb->mt_level,
-   layer, format, true);
-
-   /* Override the surface format according to the context's sRGB rules. */
-   params.dst.brw_surfaceformat = brw->render_target_format[format];
-
params.x0 = fb->_Xmin;
params.x1 = fb->_Xmax;
if (rb->Name != 0) {
@@ -218,6 +212,12 @@ do_single_blorp_clear(struct brw_context *brw, struct 
gl_framebuffer *fb,
   }
}
 
+   brw_blorp_surface_info_init(brw, , irb->mt, irb->mt_level,
+   layer, format, true);
+
+   /* Override the surface format according to the context's sRGB rules. */
+   params.dst.brw_surfaceformat = brw->render_target_format[format];
+
const char *clear_type;
if (is_fast_clear)
   clear_type = "fast";
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH v2 15/35] i965/blorp: Add an isl_view to blorp_surface_info

2016-07-26 Thread Jason Ekstrand
Eventually, this will be the actual view that gets passed into isl to
create the surface state.  For now, we just use it for the format and the
swizzle.
---
 src/mesa/drivers/dri/i965/brw_blorp.c | 38 +++
 src/mesa/drivers/dri/i965/brw_blorp.h | 16 ++-
 src/mesa/drivers/dri/i965/brw_blorp_blit.cpp  | 34 
 src/mesa/drivers/dri/i965/brw_blorp_clear.cpp |  2 +-
 src/mesa/drivers/dri/i965/gen8_blorp.c| 29 
 5 files changed, 64 insertions(+), 55 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c 
b/src/mesa/drivers/dri/i965/brw_blorp.c
index 8f7690c..ef256a7 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -43,9 +43,11 @@ brw_blorp_surface_info_init(struct brw_context *brw,
 * using INTEL_MSAA_LAYOUT_UMS or INTEL_MSAA_LAYOUT_CMS, then it had better
 * be a multiple of num_samples.
 */
+   unsigned layer_multiplier = 1;
if (mt->msaa_layout == INTEL_MSAA_LAYOUT_UMS ||
mt->msaa_layout == INTEL_MSAA_LAYOUT_CMS) {
   assert(mt->num_samples <= 1 || layer % mt->num_samples == 0);
+  layer_multiplier = MAX2(mt->num_samples, 1);
}
 
intel_miptree_check_level_layer(mt, level, layer);
@@ -61,13 +63,27 @@ brw_blorp_surface_info_init(struct brw_context *brw,
   info->aux_usage = ISL_AUX_USAGE_NONE;
}
 
+   info->view = (struct isl_view) {
+  .usage = is_render_target ? ISL_SURF_USAGE_RENDER_TARGET_BIT :
+  ISL_SURF_USAGE_TEXTURE_BIT,
+  .format = ISL_FORMAT_UNSUPPORTED, /* Set later */
+  .base_level = level,
+  .levels = 1,
+  .base_array_layer = layer / layer_multiplier,
+  .array_len = 1,
+  .channel_select = {
+ ISL_CHANNEL_SELECT_RED,
+ ISL_CHANNEL_SELECT_GREEN,
+ ISL_CHANNEL_SELECT_BLUE,
+ ISL_CHANNEL_SELECT_ALPHA,
+  },
+   };
+
info->level = level;
info->layer = layer;
info->width = minify(mt->physical_width0, level - mt->first_level);
info->height = minify(mt->physical_height0, level - mt->first_level);
 
-   info->swizzle = SWIZZLE_XYZW;
-
if (format == MESA_FORMAT_NONE)
   format = mt->format;
 
@@ -75,8 +91,8 @@ brw_blorp_surface_info_init(struct brw_context *brw,
case MESA_FORMAT_S_UINT8:
   assert(info->surf.tiling == ISL_TILING_W);
   /* Prior to Broadwell, we can't render to R8_UINT */
-  info->brw_surfaceformat = brw->gen >= 8 ? BRW_SURFACEFORMAT_R8_UINT :
-BRW_SURFACEFORMAT_R8_UNORM;
+  info->view.format = brw->gen >= 8 ? BRW_SURFACEFORMAT_R8_UINT :
+  BRW_SURFACEFORMAT_R8_UNORM;
   break;
case MESA_FORMAT_Z24_UNORM_X8_UINT:
   /* It would make sense to use BRW_SURFACEFORMAT_R24_UNORM_X8_TYPELESS
@@ -89,20 +105,20 @@ brw_blorp_surface_info_init(struct brw_context *brw,
* pattern as long as we copy the right amount of data, so just map it
* as 8-bit BGRA.
*/
-  info->brw_surfaceformat = BRW_SURFACEFORMAT_B8G8R8A8_UNORM;
+  info->view.format = BRW_SURFACEFORMAT_B8G8R8A8_UNORM;
   break;
case MESA_FORMAT_Z_FLOAT32:
-  info->brw_surfaceformat = BRW_SURFACEFORMAT_R32_FLOAT;
+  info->view.format = BRW_SURFACEFORMAT_R32_FLOAT;
   break;
case MESA_FORMAT_Z_UNORM16:
-  info->brw_surfaceformat = BRW_SURFACEFORMAT_R16_UNORM;
+  info->view.format = BRW_SURFACEFORMAT_R16_UNORM;
   break;
default: {
   if (is_render_target) {
  assert(brw->format_supported_as_render_target[format]);
- info->brw_surfaceformat = brw->render_target_format[format];
+ info->view.format = brw->render_target_format[format];
   } else {
- info->brw_surfaceformat = brw_format_for_mesa_format(format);
+ info->view.format = brw_format_for_mesa_format(format);
   }
   break;
}
@@ -111,7 +127,7 @@ brw_blorp_surface_info_init(struct brw_context *brw,
uint32_t x_offset, y_offset;
intel_miptree_get_image_offset(mt, level, layer, &x_offset, &y_offset);
 
-   uint8_t bs = isl_format_get_layout(info->brw_surfaceformat)->bpb / 8;
+   uint8_t bs = isl_format_get_layout(info->view.format)->bpb / 8;
isl_tiling_get_intratile_offset_el(>isl_dev, info->surf.tiling, bs,
   info->surf.row_pitch, x_offset, y_offset,
   >bo_offset,
@@ -287,7 +303,7 @@ brw_blorp_emit_surface_state(struct brw_context *brw,
}
 
struct isl_view view = {
-  .format = surface->brw_surfaceformat,
+  .format = surface->view.format,
   .base_level = 0,
   .levels = 1,
   .base_array_layer = 0,
diff --git a/src/mesa/drivers/dri/i965/brw_blorp.h 
b/src/mesa/drivers/dri/i965/brw_blorp.h
index e591f41..185406e 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.h
+++ b/src/mesa/drivers/dri/i965/brw_blorp.h
@@ -76,6 

[Mesa-dev] [PATCH v2 34/35] isl: Add a #define for DEV_IS_BAYTRAIL

2016-07-26 Thread Jason Ekstrand
---
 src/intel/isl/isl.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/intel/isl/isl.h b/src/intel/isl/isl.h
index 68ad8a4..b8b48f0 100644
--- a/src/intel/isl/isl.h
+++ b/src/intel/isl/isl.h
@@ -79,6 +79,10 @@ struct brw_image_param;
 #define ISL_DEV_IS_HASWELL(__dev) ((__dev)->info->is_haswell)
 #endif
 
+#ifndef ISL_DEV_IS_BAYTRAIL
+#define ISL_DEV_IS_BAYTRAIL(__dev) ((__dev)->info->is_baytrail)
+#endif
+
 #ifndef ISL_DEV_USE_SEPARATE_STENCIL
 /**
  * You can define this as a compile-time constant in the CFLAGS. For example,
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH v2 16/35] isl: Fix get_image_offset_sa_gen4_2d for multisample surfaces

2016-07-26 Thread Jason Ekstrand
The function takes a logical array layer but was assuming it was a physical
array layer.  While we're here, we also make it not assert-fail on gen9 3-D
surfaces.
---
 src/intel/isl/isl.c | 13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
index 92658ec..a713eeb 100644
--- a/src/intel/isl/isl.c
+++ b/src/intel/isl/isl.c
@@ -1345,13 +1345,15 @@ isl_buffer_fill_state_s(const struct isl_device *dev, 
void *state,
  */
 static void
 get_image_offset_sa_gen4_2d(const struct isl_surf *surf,
-uint32_t level, uint32_t layer,
+uint32_t level, uint32_t logical_array_layer,
 uint32_t *x_offset_sa,
 uint32_t *y_offset_sa)
 {
assert(level < surf->levels);
-   assert(layer < surf->phys_level0_sa.array_len);
-   assert(surf->phys_level0_sa.depth == 1);
+   if (surf->dim == ISL_SURF_DIM_3D)
+  assert(logical_array_layer < surf->logical_level0_px.depth);
+   else
+  assert(logical_array_layer < surf->logical_level0_px.array_len);
 
const struct isl_extent3d image_align_sa =
   isl_surf_get_image_alignment_sa(surf);
@@ -1359,8 +1361,11 @@ get_image_offset_sa_gen4_2d(const struct isl_surf *surf,
const uint32_t W0 = surf->phys_level0_sa.width;
const uint32_t H0 = surf->phys_level0_sa.height;
 
+   const uint32_t phys_layer = logical_array_layer *
+  (surf->msaa_layout == ISL_MSAA_LAYOUT_ARRAY ? surf->samples : 1);
+
uint32_t x = 0;
-   uint32_t y = layer * isl_surf_get_array_pitch_sa_rows(surf);
+   uint32_t y = phys_layer * isl_surf_get_array_pitch_sa_rows(surf);
 
for (uint32_t l = 0; l < level; ++l) {
   if (l == 1) {
-- 
2.5.0.400.gff86faf
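
The logical-to-physical layer conversion described in this patch can be
sketched in isolation as follows.  This is only an illustration: the
struct and enum below (surf_sketch, msaa_layout_sketch) are hypothetical
stand-ins for the isl types, and array_pitch_rows stands in for the value
the patch gets from isl_surf_get_array_pitch_sa_rows().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the isl types; not the real definitions. */
enum msaa_layout_sketch {
   MSAA_LAYOUT_NONE,
   MSAA_LAYOUT_INTERLEAVED,
   MSAA_LAYOUT_ARRAY,
};

struct surf_sketch {
   enum msaa_layout_sketch msaa_layout;
   uint32_t samples;            /* sample count, 1 for single-sampled */
   uint32_t logical_array_len;  /* number of logical array layers */
   uint32_t array_pitch_rows;   /* rows between two physical layers */
};

/* Row offset of a logical array layer.  With the array MSAA layout each
 * logical layer occupies "samples" physical layers, so the logical index
 * has to be scaled before it is multiplied by the array pitch.
 */
static uint32_t
layer_y_offset_rows(const struct surf_sketch *surf,
                    uint32_t logical_array_layer)
{
   assert(logical_array_layer < surf->logical_array_len);

   const uint32_t phys_layer = logical_array_layer *
      (surf->msaa_layout == MSAA_LAYOUT_ARRAY ? surf->samples : 1);

   return phys_layer * surf->array_pitch_rows;
}
```

With a 4x array-layout surface, logical layer 3 maps to physical layer 12;
with a single-sampled surface the two indices are the same.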



[Mesa-dev] Interest in GL_ARB_gl_spirv support?

2016-07-26 Thread oscar bg
Hi,
It seems this year's (2016) OpenGL ARB update brings only a small number of
extensions.  The most important seems to be GL_ARB_gl_spirv, which looks like
SPIR-V as a binary format for OpenGL.  Mesa doesn't have any binary format,
even while supporting the ARB_program_binary ext.  An Nvidia driver already
provides support on Linux from day 1, as always.

I'm just asking how difficult it would be to bring support to the Mesa
drivers, and whether there is any interest among Mesa devs in starting work on
it soon.

It seems we already have SPIR-V support in Mesa in the Vulkan drivers: Anvil,
the Intel Vulkan driver, and RADV, an open source Vulkan driver for AMD GPUs
that was announced some days ago.  As these drivers already eat SPIR-V code,
it seems this extension would take less work to port to these two vendors'
GPUs?

would like to hear feedback,
thanks..
Oscar.


Re: [Mesa-dev] [PATCH] i965: Fix move_interpolation_to_top() pass.

2016-07-26 Thread Matt Turner
On Tue, Jul 26, 2016 at 1:19 PM, Kenneth Graunke  wrote:
> The pass I introduced in commit a2dc11a7818c04d8dc0324e8fcba98d60bae
> was entirely broken.  A missing "break" made the load_interpolated_input
> case always fall through to "default" and hit a "continue", making it
> not actually move any load_interpolated_input intrinsics at all.

Let's make a rule that non-obvious fallthroughs *must* be marked with
a /* fallthrough */ comment. That would have led reviewers to notice
that something was strange. Coverity also makes noise about this, and
that would be nice to avoid as well.
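
For readers following the thread, the shape of the bug being discussed can
be sketched roughly like this.  This is not the actual pass: the enum and
both functions are hypothetical, and they only model "a case with a
missing break falls through to a default that does continue".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical instruction kinds; the real pass inspects NIR intrinsics. */
enum instr_kind {
   KIND_LOAD_INTERPOLATED_INPUT,
   KIND_OTHER,
};

/* Buggy shape: with no "break", the interesting case falls through to
 * "default" and hits "continue", so nothing is ever counted as moved.
 */
static size_t
count_moved_buggy(const enum instr_kind *kinds, size_t n)
{
   size_t moved = 0;
   assert(kinds != NULL || n == 0);

   for (size_t i = 0; i < n; i++) {
      switch (kinds[i]) {
      case KIND_LOAD_INTERPOLATED_INPUT:
         /* missing break: control falls into "default" */
      default:
         continue;
      }
      moved++; /* never reached in this version */
   }
   return moved;
}

/* Fixed shape: "break" leaves the switch, so the move below is reached. */
static size_t
count_moved_fixed(const enum instr_kind *kinds, size_t n)
{
   size_t moved = 0;
   assert(kinds != NULL || n == 0);

   for (size_t i = 0; i < n; i++) {
      switch (kinds[i]) {
      case KIND_LOAD_INTERPOLATED_INPUT:
         break;
      default:
         continue;
      }
      moved++;
   }
   return moved;
}
```

Because an empty case that runs straight into the next label is legal C and
is also the usual way to group cases, nothing in the buggy version looks
wrong at a glance, which is the point both replies are making.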


[Mesa-dev] [PATCH v2 32/35] i965/blorp: Simplify depth buffer state setup a bit

2016-07-26 Thread Jason Ekstrand
The data comes in via ISL in a format that's almost directly usable by the
hardware so we can avoid some of the conversion headache.
---
 src/mesa/drivers/dri/i965/gen6_blorp.c | 34 --
 src/mesa/drivers/dri/i965/gen7_blorp.c | 38 +++---
 2 files changed, 17 insertions(+), 55 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen6_blorp.c 
b/src/mesa/drivers/dri/i965/gen6_blorp.c
index 402c219..9e08374 100644
--- a/src/mesa/drivers/dri/i965/gen6_blorp.c
+++ b/src/mesa/drivers/dri/i965/gen6_blorp.c
@@ -699,11 +699,8 @@ static void
 gen6_blorp_emit_depth_stencil_config(struct brw_context *brw,
  const struct brw_blorp_params *params)
 {
-   uint32_t surfwidth, surfheight;
uint32_t surftype;
-   unsigned int depth = MAX2(params->depth.mt->logical_depth0, 1);
GLenum gl_target = params->depth.mt->target;
-   unsigned int lod;
 
switch (gl_target) {
case GL_TEXTURE_CUBE_MAP_ARRAY:
@@ -714,39 +711,25 @@ gen6_blorp_emit_depth_stencil_config(struct brw_context 
*brw,
* equivalent.
*/
   surftype = BRW_SURFACE_2D;
-  depth *= 6;
   break;
default:
   surftype = translate_tex_target(gl_target);
   break;
}
 
-   const unsigned min_array_element = params->depth.layer;
-
-   lod = params->depth.level - params->depth.mt->first_level;
-
-   if (params->hiz_op != GEN6_HIZ_OP_NONE && lod == 0) {
-  /* HIZ ops for lod 0 may set the width & height a little
-   * larger to allow the fast depth clear to fit the hardware
-   * alignment requirements. (8x4)
-   */
-  surfwidth = params->depth.surf.logical_level0_px.width;
-  surfheight = params->depth.surf.logical_level0_px.height;
-   } else {
-  surfwidth = params->depth.mt->logical_width0;
-  surfheight = params->depth.mt->logical_height0;
-   }
-
/* 3DSTATE_DEPTH_BUFFER */
{
   brw_emit_depth_stall_flushes(brw);
 
+  unsigned depth = MAX2(params->depth.surf.logical_level0_px.depth,
+params->depth.surf.logical_level0_px.array_len);
+
   BEGIN_BATCH(7);
   /* 3DSTATE_DEPTH_BUFFER dw0 */
   OUT_BATCH(_3DSTATE_DEPTH_BUFFER << 16 | (7 - 2));
 
   /* 3DSTATE_DEPTH_BUFFER dw1 */
-  OUT_BATCH((params->depth.mt->pitch - 1) |
+  OUT_BATCH((params->depth.surf.row_pitch - 1) |
 params->depth_format << 18 |
 1 << 21 | /* separate stencil enable */
 1 << 22 | /* hiz enable */
@@ -761,13 +744,13 @@ gen6_blorp_emit_depth_stencil_config(struct brw_context 
*brw,
 
   /* 3DSTATE_DEPTH_BUFFER dw3 */
   OUT_BATCH(BRW_SURFACE_MIPMAPLAYOUT_BELOW << 1 |
-(surfwidth - 1) << 6 |
-(surfheight - 1) << 19 |
-lod << 2);
+(params->depth.surf.logical_level0_px.width - 1) << 6 |
+(params->depth.surf.logical_level0_px.height - 1) << 19 |
+params->depth.view.base_level << 2);
 
   /* 3DSTATE_DEPTH_BUFFER dw4 */
   OUT_BATCH((depth - 1) << 21 |
-min_array_element << 10 |
+params->depth.view.base_array_layer << 10 |
 (depth - 1) << 1);
 
   /* 3DSTATE_DEPTH_BUFFER dw5 */
@@ -784,6 +767,7 @@ gen6_blorp_emit_depth_stencil_config(struct brw_context 
*brw,
   uint32_t offset = 0;
 
   if (hiz_mt->array_layout == ALL_SLICES_AT_EACH_LOD) {
+ const unsigned lod = params->depth.view.base_level;
  offset = intel_miptree_get_aligned_offset(hiz_mt,
hiz_mt->level[lod].level_x,
hiz_mt->level[lod].level_y,
diff --git a/src/mesa/drivers/dri/i965/gen7_blorp.c 
b/src/mesa/drivers/dri/i965/gen7_blorp.c
index ac7cf38..420a285 100644
--- a/src/mesa/drivers/dri/i965/gen7_blorp.c
+++ b/src/mesa/drivers/dri/i965/gen7_blorp.c
@@ -485,12 +485,8 @@ gen7_blorp_emit_depth_stencil_config(struct brw_context 
*brw,
  const struct brw_blorp_params *params)
 {
const uint8_t mocs = GEN7_MOCS_L3;
-   uint32_t surfwidth, surfheight;
uint32_t surftype;
-   unsigned int depth = MAX2(params->depth.mt->logical_depth0, 1);
-   unsigned int min_array_element;
GLenum gl_target = params->depth.mt->target;
-   unsigned int lod;
 
switch (gl_target) {
case GL_TEXTURE_CUBE_MAP_ARRAY:
@@ -501,40 +497,22 @@ gen7_blorp_emit_depth_stencil_config(struct brw_context 
*brw,
* equivalent.
*/
   surftype = BRW_SURFACE_2D;
-  depth *= 6;
   break;
default:
   surftype = translate_tex_target(gl_target);
   break;
}
 
-   min_array_element = params->depth.layer;
-   if (params->depth.mt->num_samples > 1) {
-  /* Convert physical layer to logical layer. */
-  min_array_element /= params->depth.mt->num_samples;
-   }
-
-   lod = params->depth.level - params->depth.mt->first_level;

[Mesa-dev] [PATCH v2 33/35] i965/blorp: Remove unused fields from blorp_surface_info

2016-07-26 Thread Jason Ekstrand
The only reason why we need layer or level is that we need the z-offset for
3-D surfaces.  Let's just have the one field for that.
---
 src/mesa/drivers/dri/i965/brw_blorp.c |  3 ---
 src/mesa/drivers/dri/i965/brw_blorp.h | 16 
 2 files changed, 19 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c 
b/src/mesa/drivers/dri/i965/brw_blorp.c
index 215f765..87d8929 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -168,9 +168,6 @@ brw_blorp_surface_info_init(struct brw_context *brw,
   info->z_offset = 0;
}
 
-   info->level = level;
-   info->layer = layer;
-
if (format == MESA_FORMAT_NONE)
   format = mt->format;
 
diff --git a/src/mesa/drivers/dri/i965/brw_blorp.h 
b/src/mesa/drivers/dri/i965/brw_blorp.h
index 706d53e..076d26d 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.h
+++ b/src/mesa/drivers/dri/i965/brw_blorp.h
@@ -81,22 +81,6 @@ struct brw_blorp_surface_info
/* Z offset into a 3-D texture or slice of a 2-D array texture. */
uint32_t z_offset;
 
-   /**
-* The miplevel to use.
-*/
-   uint32_t level;
-
-   /**
-* The 2D layer within the miplevel. Combined, level and layer define the
-* 2D miptree slice to use.
-*
-* Note: if mt is a 2D multisample array texture on Gen7+ using
-* INTEL_MSAA_LAYOUT_UMS or INTEL_MSAA_LAYOUT_CMS, layer is the physical
-* layer holding sample 0.  So, for example, if mt->num_samples == 4, then
-* logical layer n corresponds to layer == 4*n.
-*/
-   uint32_t layer;
-
uint32_t bo_offset;
uint32_t tile_x_sa, tile_y_sa;
 };
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH v2 15/27] i965/meta_util: Convert get_resolve_rect to use ISL

2016-07-26 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_blorp_clear.cpp |  5 ++--
 src/mesa/drivers/dri/i965/brw_meta_util.c | 43 +--
 src/mesa/drivers/dri/i965/brw_meta_util.h |  8 ++---
 3 files changed, 28 insertions(+), 28 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp 
b/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp
index e2b1d5a..d242f24 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp
@@ -326,8 +326,9 @@ brw_blorp_resolve_color(struct brw_context *brw, struct 
intel_mipmap_tree *mt)
brw_blorp_to_isl_format(brw, format, true),
true);
 
-   brw_get_resolve_rect(brw, mt, , ,
-, );
+   brw_get_ccs_resolve_rect(>isl_dev, _surf,
+, ,
+, );
 
if (intel_miptree_is_lossless_compressed(brw, mt))
   params.resolve_type = GEN9_PS_RENDER_TARGET_RESOLVE_FULL;
diff --git a/src/mesa/drivers/dri/i965/brw_meta_util.c 
b/src/mesa/drivers/dri/i965/brw_meta_util.c
index 77c6b83..a81190d 100644
--- a/src/mesa/drivers/dri/i965/brw_meta_util.c
+++ b/src/mesa/drivers/dri/i965/brw_meta_util.c
@@ -585,12 +585,11 @@ brw_meta_get_buffer_rect(const struct gl_framebuffer *fb,
 }
 
 void
-brw_get_resolve_rect(const struct brw_context *brw,
- const struct intel_mipmap_tree *mt,
- unsigned *x0, unsigned *y0,
- unsigned *x1, unsigned *y1)
+brw_get_ccs_resolve_rect(const struct isl_device *dev,
+ const struct isl_surf *ccs_surf,
+ unsigned *x0, unsigned *y0,
+ unsigned *x1, unsigned *y1)
 {
-   unsigned x_align, y_align;
unsigned x_scaledown, y_scaledown;
 
/* From the Ivy Bridge PRM, Vol2 Part1 11.9 "Render Target Resolve":
@@ -598,25 +597,25 @@ brw_get_resolve_rect(const struct brw_context *brw,
 * A rectangle primitive must be scaled down by the following factors
 * with respect to render target being resolved.
 *
-* The scaledown factors in the table that follows are related to the
-* alignment size returned by intel_get_non_msrt_mcs_alignment() by a
-* multiplier. For IVB and HSW, we divide by two, for BDW we multiply
-* by 8 and 16. Similar to the fast clear, SKL eases the BDW vertical 
scaling
-* by a factor of 2.
+* The scaledown factors in the table that follows are related to the block
+* size of the CCS format.  For IVB and HSW, we divide by two, for BDW we
+* multiply by 8 and 16. On Sky Lake, we multiply by 8.
 */
-
-   intel_get_non_msrt_mcs_alignment(mt, _align, _align);
-   if (brw->gen >= 9) {
-  x_scaledown = x_align * 8;
-  y_scaledown = y_align * 8;
-   } else if (brw->gen >= 8) {
-  x_scaledown = x_align * 8;
-  y_scaledown = y_align * 16;
+   const struct isl_format_layout *fmtl =
+  isl_format_get_layout(ccs_surf->format);
+   assert(fmtl->txc == ISL_TXC_CCS);
+
+   if (ISL_DEV_GEN(dev) >= 9) {
+  x_scaledown = fmtl->bw * 8;
+  y_scaledown = fmtl->bh * 8;
+   } else if (ISL_DEV_GEN(dev) >= 8) {
+  x_scaledown = fmtl->bw * 8;
+  y_scaledown = fmtl->bh * 16;
} else {
-  x_scaledown = x_align / 2;
-  y_scaledown = y_align / 2;
+  x_scaledown = fmtl->bw / 2;
+  y_scaledown = fmtl->bh / 2;
}
*x0 = *y0 = 0;
-   *x1 = ALIGN(mt->logical_width0, x_scaledown) / x_scaledown;
-   *y1 = ALIGN(mt->logical_height0, y_scaledown) / y_scaledown;
+   *x1 = ALIGN(ccs_surf->logical_level0_px.width, x_scaledown) / x_scaledown;
+   *y1 = ALIGN(ccs_surf->logical_level0_px.height, y_scaledown) / y_scaledown;
 }
diff --git a/src/mesa/drivers/dri/i965/brw_meta_util.h 
b/src/mesa/drivers/dri/i965/brw_meta_util.h
index 0929497..7d4e5f6 100644
--- a/src/mesa/drivers/dri/i965/brw_meta_util.h
+++ b/src/mesa/drivers/dri/i965/brw_meta_util.h
@@ -50,10 +50,10 @@ brw_get_fast_clear_rect(const struct brw_context *brw,
 unsigned *x1, unsigned *y1);
 
 void
-brw_get_resolve_rect(const struct brw_context *brw,
- const struct intel_mipmap_tree *mt,
- unsigned *x0, unsigned *y0,
- unsigned *x1, unsigned *y1);
+brw_get_ccs_resolve_rect(const struct isl_device *dev,
+ const struct isl_surf *ccs_surf,
+ unsigned *x0, unsigned *y0,
+ unsigned *x1, unsigned *y1);
 
 void
 brw_meta_get_buffer_rect(const struct gl_framebuffer *fb, 
-- 
2.5.0.400.gff86faf
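
As a rough illustration of the scaledown arithmetic in the new
brw_get_ccs_resolve_rect(), here is a standalone sketch.  The function
name, the rect struct and the plain "gen", "bw" and "bh" parameters are
assumptions made for the example; in the patch they come from
ISL_DEV_GEN(dev) and the CCS format layout.

```c
#include <assert.h>

struct rect_sketch {
   unsigned x0, y0, x1, y1;
};

/* Round v up to the next multiple of a (a must be non-zero). */
static unsigned
align_up(unsigned v, unsigned a)
{
   return (v + a - 1) / a * a;
}

/* gen: hardware generation; bw/bh: block width and height of the CCS
 * format; width/height: logical size of level 0 in pixels.
 */
static struct rect_sketch
ccs_resolve_rect_sketch(int gen, unsigned bw, unsigned bh,
                        unsigned width, unsigned height)
{
   unsigned x_scaledown, y_scaledown;

   if (gen >= 9) {
      x_scaledown = bw * 8;
      y_scaledown = bh * 8;
   } else if (gen >= 8) {
      x_scaledown = bw * 8;
      y_scaledown = bh * 16;
   } else {
      x_scaledown = bw / 2;
      y_scaledown = bh / 2;
   }
   assert(x_scaledown > 0 && y_scaledown > 0);

   struct rect_sketch r = {
      .x0 = 0,
      .y0 = 0,
      .x1 = align_up(width, x_scaledown) / x_scaledown,
      .y1 = align_up(height, y_scaledown) / y_scaledown,
   };
   return r;
}
```

The block sizes passed in are whatever the format layout reports; the
sketch does not assume particular values for any generation.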



[Mesa-dev] [PATCH v2 12/27] i965/blorp: Use the isl_surf for more params setup

2016-07-26 Thread Jason Ekstrand
The isl_surf munging doesn't happen until fairly late in the blorp_blit
function.  We can use the isl_surf for the vast majority if not all of our
params setup.
---
 src/mesa/drivers/dri/i965/brw_blorp_blit.cpp | 79 
 1 file changed, 21 insertions(+), 58 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp 
b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
index af75cfa..3ce64e4 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
@@ -1387,7 +1387,7 @@ brw_blorp_build_nir_shader(struct brw_context *brw,
/* If the source image is not multisampled, then we want to fetch sample
 * number 0, because that's the only sample there is.
 */
-   if (key->src_samples == 0)
+   if (key->src_samples == 1)
   src_pos = nir_channels(, src_pos, 0x3);
 
/* X, Y, and S are now the coordinates of the pixel in the source image
@@ -1464,7 +1464,7 @@ brw_blorp_build_nir_shader(struct brw_context *brw,
   * the texturing unit, will cause data to be read from the correct
   * memory location.  So we can fetch the texel now.
   */
- if (key->src_samples == 0) {
+ if (key->src_samples == 1) {
 color = blorp_nir_txf(, , src_pos, key->texture_data_type);
  } else {
 nir_ssa_def *mcs = NULL;
@@ -1547,26 +1547,6 @@ brw_blorp_setup_coord_transform(struct 
brw_blorp_coord_transform *xform,
}
 }
 
-static enum isl_msaa_layout
-get_isl_msaa_layout(unsigned samples, enum intel_msaa_layout layout)
-{
-   if (samples > 1) {
-  switch (layout) {
-  case INTEL_MSAA_LAYOUT_NONE:
- return ISL_MSAA_LAYOUT_NONE;
-  case INTEL_MSAA_LAYOUT_IMS:
- return ISL_MSAA_LAYOUT_INTERLEAVED;
-  case INTEL_MSAA_LAYOUT_UMS:
-  case INTEL_MSAA_LAYOUT_CMS:
- return ISL_MSAA_LAYOUT_ARRAY;
-  default:
- unreachable("Invalid MSAA layout");
-  }
-   } else {
-  return ISL_MSAA_LAYOUT_NONE;
-   }
-}
-
 /**
  * Convert an swizzle enumeration (i.e. SWIZZLE_X) to one of the Gen7.5+
  * "Shader Channel Select" enumerations (i.e. HSW_SCS_RED).  The mappings are
@@ -1797,28 +1777,12 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
struct brw_blorp_blit_prog_key wm_prog_key;
memset(&wm_prog_key, 0, sizeof(wm_prog_key));
 
-   /* texture_data_type indicates the register type that should be used to
-* manipulate texture data.
-*/
-   switch (_mesa_get_format_datatype(src_mt->format)) {
-   case GL_UNSIGNED_NORMALIZED:
-   case GL_SIGNED_NORMALIZED:
-   case GL_FLOAT:
-  wm_prog_key.texture_data_type = BRW_REGISTER_TYPE_F;
-  break;
-   case GL_UNSIGNED_INT:
-  if (src_mt->format == MESA_FORMAT_S_UINT8) {
- /* We process stencil as though it's an unsigned normalized color */
- wm_prog_key.texture_data_type = BRW_REGISTER_TYPE_F;
-  } else {
- wm_prog_key.texture_data_type = BRW_REGISTER_TYPE_UD;
-  }
-  break;
-   case GL_INT:
+   if (isl_format_has_sint_channel(params.src.view.format)) {
   wm_prog_key.texture_data_type = BRW_REGISTER_TYPE_D;
-  break;
-   default:
-  unreachable("Unrecognized blorp format");
+   } else if (isl_format_has_uint_channel(params.src.view.format)) {
+  wm_prog_key.texture_data_type = BRW_REGISTER_TYPE_UD;
+   } else {
+  wm_prog_key.texture_data_type = BRW_REGISTER_TYPE_F;
}
 
/* Scaled blitting or not. */
@@ -1829,21 +1793,20 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
/* Scaling factors used for bilinear filtering in multisample scaled
 * blits.
 */
-   if (src_mt->num_samples == 16)
+   if (params.src.surf.samples == 16)
   wm_prog_key.x_scale = 4.0f;
else
   wm_prog_key.x_scale = 2.0f;
-   wm_prog_key.y_scale = src_mt->num_samples / wm_prog_key.x_scale;
+   wm_prog_key.y_scale = params.src.surf.samples / wm_prog_key.x_scale;
 
if (filter == GL_LINEAR &&
params.src.surf.samples <= 1 && params.dst.surf.samples <= 1)
   wm_prog_key.bilinear_filter = true;
 
-   GLenum base_format = _mesa_get_format_base_format(src_mt->format);
-   if (base_format != GL_DEPTH_COMPONENT && /* TODO: what about depth/stencil? 
*/
-   base_format != GL_STENCIL_INDEX &&
-   !_mesa_is_format_integer(src_mt->format) &&
-   src_mt->num_samples > 1 && dst_mt->num_samples <= 1) {
+   if ((params.src.surf.usage & ISL_SURF_USAGE_DEPTH_BIT) == 0 &&
+   (params.src.surf.usage & ISL_SURF_USAGE_STENCIL_BIT) == 0 &&
+   !isl_format_has_int_channel(params.src.surf.format) &&
+   params.src.surf.samples > 1 && params.dst.surf.samples <= 1) {
   /* We are downsampling a non-integer color buffer, so blend.
*
* Regarding integer color buffers, the OpenGL ES 3.2 spec says:
@@ -1857,18 +1820,16 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
}
 
/* src_samples and dst_samples are the true sample counts */
-   wm_prog_key.src_samples = 

[Mesa-dev] [PATCH v2 07/27] i965/blorp/blit: Move format work-arounds before surface_info_init

2016-07-26 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_blorp_blit.cpp | 30 +---
 1 file changed, 14 insertions(+), 16 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp 
b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
index 007c061..ed68734 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
@@ -1744,14 +1744,6 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
if (!encode_srgb && _mesa_get_format_color_encoding(dst_format) == GL_SRGB)
   dst_format = _mesa_get_srgb_format_linear(dst_format);
 
-   struct brw_blorp_params params;
-   brw_blorp_params_init();
-
-   brw_blorp_surface_info_init(brw, , src_mt, src_level,
-   src_layer, src_format, false);
-   brw_blorp_surface_info_init(brw, , dst_mt, dst_level,
-   dst_layer, dst_format, true);
-
/* Even though we do multisample resolves at the time of the blit, OpenGL
 * specification defines them as if they happen at the time of rendering,
 * which means that the type of averaging we do during the resolve should
@@ -1767,15 +1759,12 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
 * (aside from the color space), we choose to blit in sRGB space to get
 * this higher quality image.
 */
-   if (params.src.surf.samples > 1 &&
+   if (src_mt->num_samples > 1 &&
_mesa_get_format_color_encoding(dst_mt->format) == GL_SRGB &&
_mesa_get_srgb_format_linear(src_mt->format) ==
_mesa_get_srgb_format_linear(dst_mt->format)) {
   assert(brw->format_supported_as_render_target[dst_mt->format]);
-  params.dst.view.format =
- (enum isl_format)brw->render_target_format[dst_mt->format];
-  params.src.view.format =
- (enum isl_format)brw_format_for_mesa_format(dst_mt->format);
+  src_format = dst_format = dst_mt->format;
}
 
/* When doing a multisample resolve of a GL_LUMINANCE32F or GL_INTENSITY32F
@@ -1788,12 +1777,21 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
 * R32_FLOAT, so only the contents of the red channel matters.
 */
if (brw->gen == 6 &&
-   params.src.surf.samples > 1 && params.dst.surf.samples <= 1 &&
+   src_mt->num_samples > 1 && dst_mt->num_samples <= 1 &&
src_mt->format == dst_mt->format &&
-   params.dst.view.format == ISL_FORMAT_R32_FLOAT) {
-  params.src.view.format = params.dst.view.format;
+   (dst_format == MESA_FORMAT_L_FLOAT32 ||
+dst_format == MESA_FORMAT_I_FLOAT32)) {
+  src_format = dst_format = MESA_FORMAT_R_FLOAT32;
}
 
+   struct brw_blorp_params params;
+   brw_blorp_params_init();
+
+   brw_blorp_surface_info_init(brw, , src_mt, src_level,
+   src_layer, src_format, false);
+   brw_blorp_surface_info_init(brw, , dst_mt, dst_level,
+   dst_layer, dst_format, true);
+
struct brw_blorp_blit_prog_key wm_prog_key;
memset(&wm_prog_key, 0, sizeof(wm_prog_key));
 
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH v2 30/35] isl: Add asserts for gen8+ X/YOffset rules

2016-07-26 Thread Jason Ekstrand
---
 src/intel/isl/isl_surface_state.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/src/intel/isl/isl_surface_state.c 
b/src/intel/isl/isl_surface_state.c
index 6febcbf..fb23414 100644
--- a/src/intel/isl/isl_surface_state.c
+++ b/src/intel/isl/isl_surface_state.c
@@ -414,6 +414,16 @@ isl_genX(surf_fill_state_s)(const struct isl_device *dev, 
void *state,
   assert(info->surf->levels == 1);
   assert(info->surf->logical_level0_px.array_len == 1);
   assert(info->aux_usage == ISL_AUX_USAGE_NONE);
+
+  if (GEN_GEN >= 8) {
+ /* Broadwell added more rules. */
+ assert(info->surf->samples == 1);
+ if (isl_format_get_layout(info->view->format)->bpb == 8)
+assert(info->x_offset_sa % 16 == 0);
+ if (isl_format_get_layout(info->view->format)->bpb == 16)
+assert(info->x_offset_sa % 8 == 0);
+  }
+
 #if GEN_GEN >= 7
   s.SurfaceArray = false;
 #endif
-- 
2.5.0.400.gff86faf
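
The rules these asserts encode can also be written as a predicate, which
may be easier to read than the nested asserts.  This is a sketch only:
gen8_x_offset_ok() and check_gen8_x_offset() are hypothetical helpers, not
part of isl, and they cover just the conditions visible in the hunk above
(single-sampled, and X offset alignment for 8 and 16 bpb formats).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true when a surface that uses a non-zero X/Y offset satisfies
 * the gen8+ rules from the patch:
 *   - the surface must be single-sampled;
 *   - for 8 bpb formats the X offset must be a multiple of 16;
 *   - for 16 bpb formats the X offset must be a multiple of 8.
 * Other block sizes get no extra X offset restriction here.
 */
static bool
gen8_x_offset_ok(uint32_t samples, uint32_t bpb, uint32_t x_offset_sa)
{
   if (samples != 1)
      return false;
   if (bpb == 8)
      return x_offset_sa % 16 == 0;
   if (bpb == 16)
      return x_offset_sa % 8 == 0;
   return true;
}

/* Assert-style wrapper, closer to how the patch expresses the rules. */
static void
check_gen8_x_offset(uint32_t samples, uint32_t bpb, uint32_t x_offset_sa)
{
   assert(gen8_x_offset_ok(samples, bpb, x_offset_sa));
}
```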



[Mesa-dev] [PATCH v2 02/27] i965/miptree: Allow get_aux_isl_surf when there is no aux surface

2016-07-26 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 8c63aa6..1911eef 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -3204,7 +3204,8 @@ intel_miptree_get_aux_isl_surf(struct brw_context *brw,
} else if (mt->fast_clear_state != INTEL_FAST_CLEAR_STATE_NO_MCS) {
   *usage = ISL_AUX_USAGE_CCS_D;
} else {
-  unreachable("Invalid MCS miptree");
+  *usage = ISL_AUX_USAGE_NONE;
+  return;
}
 
/* Figure out the format and tiling of the auxiliary surface */
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH v2 09/27] i965/blorp: Set up most aux surfaces up-front

2016-07-26 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_blorp.c | 43 ---
 src/mesa/drivers/dri/i965/brw_blorp.h |  4 
 2 files changed, 29 insertions(+), 18 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c 
b/src/mesa/drivers/dri/i965/brw_blorp.c
index cf1615f..97eddf9 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -136,8 +136,17 @@ brw_blorp_surface_info_init(struct brw_context *brw,
if (mt->mcs_mt) {
   intel_miptree_get_aux_isl_surf(brw, mt, >aux_surf,
  >aux_usage);
+  info->aux_bo = mt->mcs_mt->bo;
+  info->aux_offset = mt->mcs_mt->offset;
+
+  /* We only really need a clear color if we also have an auxiliary
+   * surface.  Without one, it does nothing.
+   */
+  info->clear_color = intel_miptree_get_isl_clear_color(brw, mt);
} else {
   info->aux_usage = ISL_AUX_USAGE_NONE;
+  info->aux_bo = NULL;
+  info->aux_offset = 0;
}
 
info->view = (struct isl_view) {
@@ -341,20 +350,17 @@ brw_blorp_emit_surface_state(struct brw_context *brw,
   surf.dim = ISL_SURF_DIM_2D;
}
 
-   union isl_color_value clear_color = { .u32 = { 0, 0, 0, 0 } };
-
-   const struct isl_surf *aux_surf = NULL;
-   uint64_t aux_offset = 0;
-   if (surface->mt->mcs_mt) {
-  aux_surf = >aux_surf;
-  assert(surface->mt->mcs_mt->offset == 0);
-  aux_offset = surface->mt->mcs_mt->bo->offset64;
+   /* Blorp doesn't support HiZ in any of the blit or slow-clear paths */
+   enum isl_aux_usage aux_usage = surface->aux_usage;
+   if (aux_usage == ISL_AUX_USAGE_HIZ)
+  aux_usage = ISL_AUX_USAGE_NONE;
 
-  /* We only really need a clear color if we also have an auxiliary
-   * surface.  Without one, it does nothing.
-   */
-  clear_color = intel_miptree_get_isl_clear_color(brw, surface->mt);
-   }
+   /* If we don't have an aux surface, the clear color is meaningless.  Don't
+* bother to set it up in the surface state.
+*/
+   union isl_color_value clear_color = surface->clear_color;
+   if (aux_usage == ISL_AUX_USAGE_NONE)
+  memset(&clear_color, 0, sizeof(clear_color));
 
uint32_t surf_offset;
uint32_t *dw = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE,
@@ -362,11 +368,12 @@ brw_blorp_emit_surface_state(struct brw_context *brw,
   &surf_offset);
 
const uint32_t mocs = is_render_target ? ss_info.rb_mocs : ss_info.tex_mocs;
+   uint64_t aux_bo_offset = surface->aux_bo ? surface->aux_bo->offset64 : 0;
 
isl_surf_fill_state(&brw->isl_dev, dw, .surf = &surf, .view = 
&surface->view,
.address = surface->bo->offset64 + surface->offset,
-   .aux_surf = aux_surf, .aux_usage = surface->aux_usage,
-   .aux_address = aux_offset,
+   .aux_surf = &surface->aux_surf, .aux_usage = aux_usage,
+   .aux_address = aux_bo_offset + surface->aux_offset,
.mocs = mocs, .clear_color = clear_color,
.x_offset_sa = surface->tile_x_sa,
.y_offset_sa = surface->tile_y_sa);
@@ -378,15 +385,15 @@ brw_blorp_emit_surface_state(struct brw_context *brw,
dw[ss_info.reloc_dw] - surface->bo->offset64,
read_domains, write_domain);
 
-   if (aux_surf) {
+   if (aux_usage != ISL_AUX_USAGE_NONE) {
   /* On gen7 and prior, the bottom 12 bits of the MCS base address are
* used to store other information.  This should be ok, however, because
* surface buffer addresses are always 4K page alinged.
*/
-  assert((aux_offset & 0xfff) == 0);
+  assert((surface->aux_offset & 0xfff) == 0);
   drm_intel_bo_emit_reloc(brw->batch.bo,
   surf_offset + ss_info.aux_reloc_dw * 4,
-  surface->mt->mcs_mt->bo,
+  surface->aux_bo,
   dw[ss_info.aux_reloc_dw] & 0xfff,
   read_domains, write_domain);
}
diff --git a/src/mesa/drivers/dri/i965/brw_blorp.h 
b/src/mesa/drivers/dri/i965/brw_blorp.h
index 98a9436..d747880 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.h
+++ b/src/mesa/drivers/dri/i965/brw_blorp.h
@@ -76,8 +76,12 @@ struct brw_blorp_surface_info
uint32_t offset;
 
struct isl_surf aux_surf;
+   drm_intel_bo *aux_bo;
+   uint32_t aux_offset;
enum isl_aux_usage aux_usage;
 
+   union isl_color_value clear_color;
+
struct isl_view view;
 
/* Z offset into a 3-D texture or slice of a 2-D array texture. */
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH v2 14/27] i965/blorp: Make the guts of brw_blorp_blit_miptrees miptree-unaware

2016-07-26 Thread Jason Ekstrand
Now that we have the brw_blorp_surf struct, we can start to make bits of
blorp completely miptree-unaware.  To start things off, we split the guts
of brw_blorp_blit_miptrees into a brw_blorp_blit function which knows
nothing about miptrees.
---
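For anyone who wants the shape of this split in isolation, here is a
minimal sketch. It is not the i965 code: every toy_* name is a hypothetical
stand-in. toy_blit plays the part of the miptree-unaware brw_blorp_blit,
and toy_blit_miptrees plays the part of the wrapper that converts miptrees
to surfaces and keeps the miptree bookkeeping to itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for intel_mipmap_tree and brw_blorp_surf. */
struct toy_miptree {
   unsigned width0, height0;   /* level-0 size in pixels */
   bool needs_resolve;         /* bookkeeping owned by the miptree layer */
};

struct toy_surf {
   unsigned width, height;     /* size of the selected level */
};

/* Size of one dimension at a given mip level, never below 1. */
unsigned
toy_minify(unsigned value, unsigned level)
{
   unsigned v = value >> level;
   return v ? v : 1;
}

/* Conversion step: the only place that reads the miptree layout. */
void
toy_surf_for_miptree(struct toy_surf *surf, const struct toy_miptree *mt,
                     unsigned level)
{
   surf->width = toy_minify(mt->width0, level);
   surf->height = toy_minify(mt->height0, level);
}

/* Core: sees only toy_surf.  Returns the number of pixels it would copy,
 * which is the overlap of the two surfaces.
 */
unsigned
toy_blit(const struct toy_surf *src, const struct toy_surf *dst)
{
   unsigned w = src->width < dst->width ? src->width : dst->width;
   unsigned h = src->height < dst->height ? src->height : dst->height;
   return w * h;
}

/* Wrapper: converts miptrees to surfaces, calls the core, and then does
 * the miptree-only bookkeeping that the core must not know about.
 */
unsigned
toy_blit_miptrees(const struct toy_miptree *src_mt, unsigned src_level,
                  struct toy_miptree *dst_mt, unsigned dst_level)
{
   assert(src_mt && dst_mt);

   struct toy_surf src_surf, dst_surf;
   toy_surf_for_miptree(&src_surf, src_mt, src_level);
   toy_surf_for_miptree(&dst_surf, dst_mt, dst_level);

   unsigned copied = toy_blit(&src_surf, &dst_surf);

   dst_mt->needs_resolve = true;
   return copied;
}
```

The point of the arrangement is that the resolve flag is set in the wrapper
after the core returns, which mirrors how this patch moves the HiZ-resolve
and fast-clear-state updates out of the blit guts.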
 src/mesa/drivers/dri/i965/brw_blorp.h| 14 
 src/mesa/drivers/dri/i965/brw_blorp_blit.cpp | 54 +++-
 2 files changed, 51 insertions(+), 17 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.h 
b/src/mesa/drivers/dri/i965/brw_blorp.h
index 6ae82c7..576a078 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.h
+++ b/src/mesa/drivers/dri/i965/brw_blorp.h
@@ -61,6 +61,20 @@ brw_blorp_to_isl_format(struct brw_context *brw, mesa_format 
format,
 bool is_render_target);
 
 void
+brw_blorp_blit(struct brw_context *brw,
+   const struct brw_blorp_surf *src_surf,
+   unsigned src_level, unsigned src_layer,
+   enum isl_format src_format, int src_swizzle,
+   const struct brw_blorp_surf *dst_surf,
+   unsigned dst_level, unsigned dst_layer,
+   enum isl_format dst_format,
+   float src_x0, float src_y0,
+   float src_x1, float src_y1,
+   float dst_x0, float dst_y0,
+   float dst_x1, float dst_y1,
+   GLenum filter, bool mirror_x, bool mirror_y);
+
+void
 brw_blorp_blit_miptrees(struct brw_context *brw,
 struct intel_mipmap_tree *src_mt,
 unsigned src_level, unsigned src_layer,
diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp 
b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
index 40751c8..e328739 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
@@ -1770,21 +1770,46 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
intel_miptree_check_level_layer(dst_mt, dst_level, dst_layer);
intel_miptree_used_for_rendering(dst_mt);
 
+   struct isl_surf tmp_surfs[4];
+   struct brw_blorp_surf src_surf, dst_surf;
+   brw_blorp_surf_for_miptree(brw, &src_surf, src_mt, &src_level, &tmp_surfs[0]);
+   brw_blorp_surf_for_miptree(brw, &dst_surf, dst_mt, &dst_level, &tmp_surfs[2]);
+
+   brw_blorp_blit(brw, &src_surf, src_level, src_layer,
+  brw_blorp_to_isl_format(brw, src_format, false), src_swizzle,
+  &dst_surf, dst_level, dst_layer,
+  brw_blorp_to_isl_format(brw, dst_format, true),
+  src_x0, src_y0, src_x1, src_y1,
+  dst_x0, dst_y0, dst_x1, dst_y1,
+  filter, mirror_x, mirror_y);
+
+   intel_miptree_slice_set_needs_hiz_resolve(dst_mt, dst_level, dst_layer);
+
+   if (intel_miptree_is_lossless_compressed(brw, dst_mt))
+  dst_mt->fast_clear_state = INTEL_FAST_CLEAR_STATE_UNRESOLVED;
+}
+
+void
+brw_blorp_blit(struct brw_context *brw,
+   const struct brw_blorp_surf *src_surf,
+   unsigned src_level, unsigned src_layer,
+   enum isl_format src_format, int src_swizzle,
+   const struct brw_blorp_surf *dst_surf,
+   unsigned dst_level, unsigned dst_layer,
+   enum isl_format dst_format,
+   float src_x0, float src_y0,
+   float src_x1, float src_y1,
+   float dst_x0, float dst_y0,
+   float dst_x1, float dst_y1,
+   GLenum filter, bool mirror_x, bool mirror_y)
+{
struct brw_blorp_params params;
brw_blorp_params_init(&params);
 
-   struct isl_surf isl_tmp[4];
-   struct brw_blorp_surf src_surf, dst_surf;
-   brw_blorp_surf_for_miptree(brw, &src_surf, src_mt, &src_level, &isl_tmp[0]);
-   brw_blorp_surface_info_init(brw, &params.src,
-   &src_surf, src_level, src_layer,
-   brw_blorp_to_isl_format(brw, src_format, false),
-   false);
-   brw_blorp_surf_for_miptree(brw, &dst_surf, dst_mt, &dst_level, &isl_tmp[2]);
-   brw_blorp_surface_info_init(brw, &params.dst,
-   &dst_surf, dst_level, dst_layer,
-   brw_blorp_to_isl_format(brw, dst_format, true),
-   true);
+   brw_blorp_surface_info_init(brw, &params.src, src_surf, src_level,
+   src_layer, src_format, false);
+   brw_blorp_surface_info_init(brw, &params.dst, dst_surf, dst_level,
+   dst_layer, dst_format, true);
 
struct brw_blorp_blit_prog_key wm_prog_key;
memset(&wm_prog_key, 0, sizeof(wm_prog_key));
@@ -2029,9 +2054,4 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
}
 
brw_blorp_exec(brw, &params);
-
-   intel_miptree_slice_set_needs_hiz_resolve(dst_mt, dst_level, dst_layer);
-
-   if (intel_miptree_is_lossless_compressed(brw, dst_mt))
-  dst_mt->fast_clear_state = INTEL_FAST_CLEAR_STATE_UNRESOLVED;
 }
-- 
2.5.0.400.gff86faf


[Mesa-dev] [PATCH v2 19/27] i965/blorp/clear: Stop stomping the destination format

2016-07-26 Thread Jason Ekstrand
The blorp_surface_info_init call above should set the format for us, and
stomping it later does nothing whatsoever.
---
 src/mesa/drivers/dri/i965/brw_blorp_clear.cpp | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp 
b/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp
index daab5db..4d3fe58 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp
@@ -132,6 +132,7 @@ do_single_blorp_clear(struct brw_context *brw, struct 
gl_framebuffer *fb,
struct brw_blorp_params params;
brw_blorp_params_init(&params);
 
+   /* Override the surface format according to the context's sRGB rules. */
if (!encode_srgb && _mesa_get_format_color_encoding(format) == GL_SRGB)
   format = _mesa_get_srgb_format_linear(format);
 
@@ -220,9 +221,6 @@ do_single_blorp_clear(struct brw_context *brw, struct 
gl_framebuffer *fb,
brw_blorp_to_isl_format(brw, format, true),
true);
 
-   /* Override the surface format according to the context's sRGB rules. */
-   params.dst.view.format = (enum isl_format)brw->render_target_format[format];
-
const char *clear_type;
if (is_fast_clear)
   clear_type = "fast";
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH v2 03/27] i965/miptree: Use mcs_mt->qpitch for aux surfaces

2016-07-26 Thread Jason Ekstrand
At one point, we were doing this correctly.  It must have gotten lost in
one of the many rebases.
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 1911eef..330291c 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -3208,6 +3208,9 @@ intel_miptree_get_aux_isl_surf(struct brw_context *brw,
   return;
}
 
+   /* Start with a copy of the original surface. */
+   intel_miptree_get_isl_surf(brw, mt, surf);
+
/* Figure out the format and tiling of the auxiliary surface */
switch (*usage) {
case ISL_AUX_USAGE_NONE:
@@ -3294,7 +3297,8 @@ intel_miptree_get_aux_isl_surf(struct brw_context *brw,
 * in elements of the primary color surface so we have to divide by the
 * compression block height.
 */
-   surf->array_pitch_el_rows = mt->qpitch / 
isl_format_get_layout(surf->format)->bh;
+   surf->array_pitch_el_rows =
+  mt->mcs_mt->qpitch / isl_format_get_layout(surf->format)->bh;
 }
 
 union isl_color_value
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH v2 00/27] i965: Rework the blorp API to use ISL

2016-07-26 Thread Jason Ekstrand
This patch series builds on the previous one I just sent and reworks the
blorp API to be entirely ISL.  The last bits of intel_mipmap_tree are
removed from the blorp internals and shoved into brw_blorp.c/h which simply
serves as a wrapper around the ISL-centric blorp.h file.  Eventually,
the plan is to completely separate the internals of blorp from the i965
driver and share it with the Vulkan driver.  This is just one more step on
the very long road to getting there.  This series can be found here:

https://cgit.freedesktop.org/~jekstrand/mesa/log/?h=review/blorp-isl-pt2

The best place to start reviewing is by looking at patch 25/27 where we
make the final API changes.  That shows off where things are going.  That
commit can be found on cgit here:

https://cgit.freedesktop.org/~jekstrand/mesa/commit/?h=review/blorp-isl-pt2&id=b9a55af924d9cab317224ccb9b507b9f87b44c5d

Happy Reviewing!

Cc: Topi Pohjolainen 

Jason Ekstrand (27):
  i965/miptree: Support depth in get_isl_clear_color
  i965/miptree: Allow get_aux_isl_surf when there is no aux surface
  i965/miptree: Use mcs_mt->qpitch for aux surfaces
  isl: Add helpers for creating different types of aux surfaces
  i965/miptree: Use the isl helpers for creating aux surfaces
  i965/miptree: Add real support for HiZ
  i965/blorp/blit: Move format work-arounds before surface_info_init
  i965/blorp: Stop using the miptree in state setup for tex/rt surfaces
  i965/blorp: Set up most aux surfaces up-front
  i965/blorp: Set up HiZ surfaces up-front
  i965/blorp: Do gen6 stencil offsets up-front
  i965/blorp: Use the isl_surf for more params setup
  i965/blorp: Add a new brw_blorp_surf intermediate struct
  i965/blorp: Make the guts of brw_blorp_blit_miptrees miptree-unaware
  i965/meta_util: Convert get_resolve_rect to use ISL
  i965/blorp: Pull the guts of resolve_color into a miptree-agnostic
helper
  i965/blorp: Stop calling brw_meta_get_buffer_rect
  i965/meta_util: Only modify the input parameters in
get_fast_clear_rect
  i965/blorp/clear: Stop stomping the destination format
  i965/blorp: Refactor fast-clear logic a bit
  i965/blorp/clear: Move isl_surf setup higher in the function
  i965/meta_util: Convert get_fast_clear_rect to take an isl_surf
  i965/blorp: Break the guts of do_single_blorp_clear into two helpers
  i965/blorp: Factor the guts of blorp_hiz_exec into a helper
  i965: Split brw_blorp.c/h into multiple files
  i965/blorp: brw_blorp_blit.cpp -> blorp_blit.c
  i965/blorp: brw_blorp_clear.cpp -> blorp_clear.c

 src/intel/isl/isl.c   |  121 ++
 src/intel/isl/isl.h   |   15 +
 src/mesa/drivers/dri/i965/Makefile.sources|7 +-
 src/mesa/drivers/dri/i965/blorp.c |  452 ++
 src/mesa/drivers/dri/i965/blorp.h |   92 ++
 src/mesa/drivers/dri/i965/blorp_blit.c| 1662 
 src/mesa/drivers/dri/i965/blorp_clear.c   |  190 +++
 src/mesa/drivers/dri/i965/blorp_priv.h|  393 +
 src/mesa/drivers/dri/i965/brw_blorp.c | 1121 --
 src/mesa/drivers/dri/i965/brw_blorp.h |  376 +
 src/mesa/drivers/dri/i965/brw_blorp_blit.cpp  | 2062 -
 src/mesa/drivers/dri/i965/brw_blorp_clear.cpp |  332 
 src/mesa/drivers/dri/i965/brw_meta_util.c |   96 +-
 src/mesa/drivers/dri/i965/brw_meta_util.h |   12 +-
 src/mesa/drivers/dri/i965/gen6_blorp.c|   54 +-
 src/mesa/drivers/dri/i965/gen7_blorp.c|   47 +-
 src/mesa/drivers/dri/i965/gen8_blorp.c|   12 +-
 src/mesa/drivers/dri/i965/intel_copy_image.c  |1 +
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c |  104 +-
 19 files changed, 3733 insertions(+), 3416 deletions(-)
 create mode 100644 src/mesa/drivers/dri/i965/blorp.c
 create mode 100644 src/mesa/drivers/dri/i965/blorp.h
 create mode 100644 src/mesa/drivers/dri/i965/blorp_blit.c
 create mode 100644 src/mesa/drivers/dri/i965/blorp_clear.c
 create mode 100644 src/mesa/drivers/dri/i965/blorp_priv.h
 delete mode 100644 src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
 delete mode 100644 src/mesa/drivers/dri/i965/brw_blorp_clear.cpp

-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH v2 01/27] i965/miptree: Support depth in get_isl_clear_color

2016-07-26 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index ba06ac9..8c63aa6 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -3302,7 +3302,12 @@ intel_miptree_get_isl_clear_color(struct brw_context 
*brw,
 {
union isl_color_value clear_color;
 
-   if (brw->gen >= 9) {
+   if (_mesa_get_format_base_format(mt->format) == GL_DEPTH_COMPONENT) {
+  clear_color.i32[0] = mt->depth_clear_value;
+  clear_color.i32[1] = 0;
+  clear_color.i32[2] = 0;
+  clear_color.i32[3] = 0;
+   } else if (brw->gen >= 9) {
   clear_color.i32[0] = mt->gen9_fast_clear_color.i[0];
   clear_color.i32[1] = mt->gen9_fast_clear_color.i[1];
   clear_color.i32[2] = mt->gen9_fast_clear_color.i[2];
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH v2 04/27] isl: Add helpers for creating different types of aux surfaces

2016-07-26 Thread Jason Ekstrand
---
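Since this patch has no description, here is a rough sketch of the decision
that isl_surf_get_ccs_surf makes when it picks a CCS format. The toy_*
enums and functions are hypothetical stand-ins, not ISL API, and the sketch
only models plain Y tiling where the real helper accepts any Y-family
tiling on gen9+. Unlike the old miptree code, an unsupported combination is
reported by returning false rather than by hitting unreachable().

```c
#include <assert.h>
#include <stdbool.h>

enum toy_tiling { TOY_TILING_LINEAR, TOY_TILING_X, TOY_TILING_Y0 };

/* Three groups of three, ordered 32/64/128 bpb within each group. */
enum toy_ccs_format {
   TOY_GEN7_CCS_32BPP_X, TOY_GEN7_CCS_64BPP_X, TOY_GEN7_CCS_128BPP_X,
   TOY_GEN7_CCS_32BPP_Y, TOY_GEN7_CCS_64BPP_Y, TOY_GEN7_CCS_128BPP_Y,
   TOY_GEN9_CCS_32BPP,   TOY_GEN9_CCS_64BPP,   TOY_GEN9_CCS_128BPP,
};

/* Map bits-per-block to an index within a group, or -1 if unsupported. */
int
toy_bpb_index(unsigned bpb)
{
   switch (bpb) {
   case 32:  return 0;
   case 64:  return 1;
   case 128: return 2;
   default:  return -1;
   }
}

/* Returns false when no CCS format exists for the combination; *format is
 * only written on success.
 */
bool
toy_pick_ccs_format(int gen, enum toy_tiling tiling, unsigned bpb,
                    enum toy_ccs_format *format)
{
   assert(gen >= 7);

   int idx = toy_bpb_index(bpb);
   if (idx < 0)
      return false;

   enum toy_ccs_format base;
   if (gen >= 9) {
      /* gen9+ has a single set of formats and requires Y tiling. */
      if (tiling != TOY_TILING_Y0)
         return false;
      base = TOY_GEN9_CCS_32BPP;
   } else if (tiling == TOY_TILING_Y0) {
      base = TOY_GEN7_CCS_32BPP_Y;
   } else if (tiling == TOY_TILING_X) {
      base = TOY_GEN7_CCS_32BPP_X;
   } else {
      return false;
   }

   *format = (enum toy_ccs_format)(base + idx);
   return true;
}
```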
 src/intel/isl/isl.c | 121 
 src/intel/isl/isl.h |  15 +++
 2 files changed, 136 insertions(+)

diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
index 500eb2d..18e95e2 100644
--- a/src/intel/isl/isl.c
+++ b/src/intel/isl/isl.c
@@ -1241,6 +1241,127 @@ isl_surf_get_tile_info(const struct isl_device *dev,
 }
 
 void
+isl_surf_get_hiz_surf(const struct isl_device *dev,
+  const struct isl_surf *surf,
+  struct isl_surf *hiz_surf)
+{
+   assert(ISL_DEV_GEN(dev) >= 5 && ISL_DEV_USE_SEPARATE_STENCIL(dev));
+
+   /* Multisampled depth is always interleaved */
+   assert(surf->msaa_layout == ISL_MSAA_LAYOUT_NONE ||
+  surf->msaa_layout == ISL_MSAA_LAYOUT_INTERLEAVED);
+
+   isl_surf_init(dev, hiz_surf,
+ .dim = ISL_SURF_DIM_2D,
+ .format = ISL_FORMAT_HIZ,
+ .width = surf->logical_level0_px.width,
+ .height = surf->logical_level0_px.height,
+ .depth = 1,
+ .levels = surf->levels,
+ .array_len = surf->logical_level0_px.array_len,
+ /* On SKL+, HiZ is always single-sampled */
+ .samples = ISL_DEV_GEN(dev) >= 9 ? 1 : surf->samples,
+ .usage = ISL_SURF_USAGE_HIZ_BIT,
+ .tiling_flags = ISL_TILING_HIZ_BIT);
+}
+
+void
+isl_surf_get_mcs_surf(const struct isl_device *dev,
+  const struct isl_surf *surf,
+  struct isl_surf *mcs_surf)
+{
+   /* It must be multisampled with an array layout */
+   assert(surf->samples > 1 && surf->msaa_layout == ISL_MSAA_LAYOUT_ARRAY);
+
+   /* The following are true of all multisampled surfaces */
+   assert(surf->dim == ISL_SURF_DIM_2D);
+   assert(surf->levels == 1);
+   assert(surf->logical_level0_px.depth == 1);
+
+   enum isl_format mcs_format;
+   switch (surf->samples) {
+   case 2:  mcs_format = ISL_FORMAT_MCS_2X;  break;
+   case 4:  mcs_format = ISL_FORMAT_MCS_4X;  break;
+   case 8:  mcs_format = ISL_FORMAT_MCS_8X;  break;
+   case 16: mcs_format = ISL_FORMAT_MCS_16X; break;
+   default:
+  unreachable("Invalid sample count");
+   }
+
+   isl_surf_init(dev, mcs_surf,
+ .dim = ISL_SURF_DIM_2D,
+ .format = mcs_format,
+ .width = surf->logical_level0_px.width,
+ .height = surf->logical_level0_px.height,
+ .depth = 1,
+ .levels = 1,
+ .array_len = surf->logical_level0_px.array_len,
+ .samples = 1, /* MCS surfaces are really single-sampled */
+ .usage = ISL_SURF_USAGE_MCS_BIT,
+ .tiling_flags = ISL_TILING_Y0_BIT);
+}
+
+bool
+isl_surf_get_ccs_surf(const struct isl_device *dev,
+  const struct isl_surf *surf,
+  struct isl_surf *ccs_surf)
+{
+   assert(surf->samples == 1 && surf->msaa_layout == ISL_MSAA_LAYOUT_NONE);
+   assert(ISL_DEV_GEN(dev) >= 7);
+
+   assert(surf->dim == ISL_SURF_DIM_2D);
+   assert(surf->logical_level0_px.depth == 1);
+
+   /* TODO: More conditions where it can fail. */
+
+   enum isl_format ccs_format;
+   if (ISL_DEV_GEN(dev) >= 9) {
+  if (!isl_tiling_is_any_y(surf->tiling))
+ return false;
+
+  switch (isl_format_get_layout(surf->format)->bpb) {
+  case 32:ccs_format = ISL_FORMAT_GEN9_CCS_32BPP;   break;
+  case 64:ccs_format = ISL_FORMAT_GEN9_CCS_64BPP;   break;
+  case 128:   ccs_format = ISL_FORMAT_GEN9_CCS_128BPP;  break;
+  default:
+ return false;
+  }
+   } else if (surf->tiling == ISL_TILING_Y0) {
+  switch (isl_format_get_layout(surf->format)->bpb) {
+  case 32:ccs_format = ISL_FORMAT_GEN7_CCS_32BPP_Y;break;
+  case 64:ccs_format = ISL_FORMAT_GEN7_CCS_64BPP_Y;break;
+  case 128:   ccs_format = ISL_FORMAT_GEN7_CCS_128BPP_Y;   break;
+  default:
+ return false;
+  }
+   } else if (surf->tiling == ISL_TILING_X) {
+  switch (isl_format_get_layout(surf->format)->bpb) {
+  case 32:ccs_format = ISL_FORMAT_GEN7_CCS_32BPP_X;break;
+  case 64:ccs_format = ISL_FORMAT_GEN7_CCS_64BPP_X;break;
+  case 128:   ccs_format = ISL_FORMAT_GEN7_CCS_128BPP_X;   break;
+  default:
+ return false;
+  }
+   } else {
+  return false;
+   }
+
+   isl_surf_init(dev, ccs_surf,
+ .dim = ISL_SURF_DIM_2D,
+ .format = ccs_format,
+ .width = surf->logical_level0_px.width,
+ .height = surf->logical_level0_px.height,
+ .depth = 1,
+ .levels = surf->levels,
+ .array_len = surf->logical_level0_px.array_len,
+ .samples = 1,
+ .usage = ISL_SURF_USAGE_CCS_BIT,
+ .tiling_flags = ISL_TILING_CCS_BIT);
+
+   return true;
+}
+
+void
 isl_surf_fill_state_s(const struct 

[Mesa-dev] [PATCH v2 05/27] i965/miptree: Use the isl helpers for creating aux surfaces

2016-07-26 Thread Jason Ekstrand
In order for the calculations of things such as fast clear rectangles to
work, we need more details of the auxiliary surface to be correct.  In
particular, we need to be able to trust the width and height fields.
(These are not necessarily what you want coming out of the miptree.)  The
only values state setup really cares about are the row and array pitch and
those we can safely stomp from the miptree.
---
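To illustrate the "trust the helper for width/height, stomp the pitches"
idea, here is a minimal sketch. The toy_* types are hypothetical and the
layout the toy helper computes is made up; only the two-step pattern is
meant to match the patch. The division by the block height mirrors the
existing array_pitch_el_rows computation in intel_miptree_get_aux_isl_surf.

```c
#include <assert.h>

/* Hypothetical stand-in for the fields of isl_surf that matter here. */
struct toy_surf {
   unsigned width, height;          /* logical size, trusted by callers */
   unsigned block_h;                /* compression block height */
   unsigned row_pitch;              /* bytes between rows */
   unsigned array_pitch_el_rows;    /* element rows between array slices */
};

/* Pitches of the aux buffer as it was actually allocated (hypothetical). */
struct toy_aux_buffer {
   unsigned pitch;    /* bytes */
   unsigned qpitch;   /* rows, not yet divided by the block height */
};

/* Step 1: the helper fills in everything from the logical dimensions.
 * The pitches it computes here are invented for the sketch.
 */
void
toy_aux_surf_init(struct toy_surf *aux, unsigned width, unsigned height,
                  unsigned block_h)
{
   assert(block_h > 0);
   aux->width = width;
   aux->height = height;
   aux->block_h = block_h;
   aux->row_pitch = width;
   aux->array_pitch_el_rows = (height + block_h - 1) / block_h;
}

/* Step 2: overwrite only the two pitch fields with what the allocation
 * really uses.  Width and height are left alone.
 */
void
toy_aux_surf_stomp_pitches(struct toy_surf *aux,
                           const struct toy_aux_buffer *buf)
{
   aux->row_pitch = buf->pitch;
   aux->array_pitch_el_rows = buf->qpitch / aux->block_h;
}
```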
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 52 ---
 1 file changed, 6 insertions(+), 46 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 330291c..f762106 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -3189,9 +3189,6 @@ intel_miptree_get_aux_isl_surf(struct brw_context *brw,
struct isl_surf *surf,
enum isl_aux_usage *usage)
 {
-   /* Much is the same as the regular surface */
-   intel_miptree_get_isl_surf(brw, mt->mcs_mt, surf);
-
/* Figure out the layout */
if (_mesa_get_format_base_format(mt->format) == GL_DEPTH_COMPONENT) {
   *usage = ISL_AUX_USAGE_HIZ;
@@ -3217,9 +3214,7 @@ intel_miptree_get_aux_isl_surf(struct brw_context *brw,
   unreachable("Invalid MCS miptree");
 
case ISL_AUX_USAGE_HIZ:
-  surf->format = ISL_FORMAT_HIZ;
-  surf->tiling = ISL_TILING_HIZ;
-  surf->usage = ISL_SURF_USAGE_HIZ_BIT;
+  isl_surf_get_hiz_surf(&brw->isl_dev, surf, surf);
   break;
 
case ISL_AUX_USAGE_MCS:
@@ -3231,16 +3226,7 @@ intel_miptree_get_aux_isl_surf(struct brw_context *brw,
   if (brw->gen >= 9)
  assert(mt->halign == 16);
 
-  surf->usage = ISL_SURF_USAGE_MCS_BIT;
-
-  switch (mt->num_samples) {
-  case 2:  surf->format = ISL_FORMAT_MCS_2X;   break;
-  case 4:  surf->format = ISL_FORMAT_MCS_4X;   break;
-  case 8:  surf->format = ISL_FORMAT_MCS_8X;   break;
-  case 16: surf->format = ISL_FORMAT_MCS_16X;  break;
-  default:
- unreachable("Invalid number of samples");
-  }
+  isl_surf_get_mcs_surf(&brw->isl_dev, surf, surf);
   break;
 
case ISL_AUX_USAGE_CCS_D:
@@ -3259,39 +3245,13 @@ intel_miptree_get_aux_isl_surf(struct brw_context *brw,
   if (brw->gen >= 8)
  assert(mt->halign == 16);
 
-  surf->tiling = ISL_TILING_CCS;
-  surf->usage = ISL_SURF_USAGE_CCS_BIT;
-
-  if (brw->gen >= 9) {
- assert(mt->tiling == I915_TILING_Y);
- switch (_mesa_get_format_bytes(mt->format)) {
- case 4:  surf->format = ISL_FORMAT_GEN9_CCS_32BPP;   break;
- case 8:  surf->format = ISL_FORMAT_GEN9_CCS_64BPP;   break;
- case 16: surf->format = ISL_FORMAT_GEN9_CCS_128BPP;  break;
- default:
-unreachable("Invalid format size for color compression");
- }
-  } else if (mt->tiling == I915_TILING_Y) {
- switch (_mesa_get_format_bytes(mt->format)) {
- case 4:  surf->format = ISL_FORMAT_GEN7_CCS_32BPP_Y;break;
- case 8:  surf->format = ISL_FORMAT_GEN7_CCS_64BPP_Y;break;
- case 16: surf->format = ISL_FORMAT_GEN7_CCS_128BPP_Y;   break;
- default:
-unreachable("Invalid format size for color compression");
- }
-  } else {
- assert(mt->tiling == I915_TILING_X);
- switch (_mesa_get_format_bytes(mt->format)) {
- case 4:  surf->format = ISL_FORMAT_GEN7_CCS_32BPP_X;break;
- case 8:  surf->format = ISL_FORMAT_GEN7_CCS_64BPP_X;break;
- case 16: surf->format = ISL_FORMAT_GEN7_CCS_128BPP_X;   break;
- default:
-unreachable("Invalid format size for color compression");
- }
-  }
+  isl_surf_get_ccs_surf(&brw->isl_dev, surf, surf);
   break;
}
 
+   /* We want the pitch of the actual aux buffer. */
+   surf->row_pitch = mt->mcs_mt->pitch;
+
/* Auxiliary surfaces in ISL have compressed formats and array_pitch_el_rows
 * is in elements.  This doesn't match intel_mipmap_tree::qpitch which is
 * in elements of the primary color surface so we have to divide by the
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH v2 06/27] i965/miptree: Add real support for HiZ

2016-07-26 Thread Jason Ekstrand
The previous HiZ support was bogus because all of get_aux_isl_surf looked
at mt->mcs_mt directly.  For HiZ buffers, you need to look at either
mt->hiz_buf or mt->hiz_buf->mt.
---
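The lookup order described above can be sketched on its own as follows.
This is an illustration under assumptions, not the driver code: the toy_*
structs are hypothetical and carry only the fields needed to show where the
aux pitch and qpitch come from in each case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_miptree;

/* Hypothetical HiZ buffer: either backed by its own miptree or described
 * directly by pitch/qpitch.
 */
struct toy_hiz_buf {
   struct toy_miptree *mt;   /* may be NULL */
   unsigned pitch, qpitch;
};

struct toy_miptree {
   unsigned pitch, qpitch;
   struct toy_miptree *mcs_mt;    /* MCS/CCS aux miptree, or NULL */
   struct toy_hiz_buf *hiz_buf;   /* HiZ aux buffer, or NULL */
};

/* Returns false when there is no auxiliary surface at all; the outputs are
 * only written on success.
 */
bool
toy_get_aux_pitches(const struct toy_miptree *mt,
                    unsigned *aux_pitch, unsigned *aux_qpitch)
{
   if (mt->mcs_mt) {
      *aux_pitch = mt->mcs_mt->pitch;
      *aux_qpitch = mt->mcs_mt->qpitch;
   } else if (mt->hiz_buf) {
      if (mt->hiz_buf->mt) {
         /* HiZ backed by a miptree: use that miptree's layout. */
         *aux_pitch = mt->hiz_buf->mt->pitch;
         *aux_qpitch = mt->hiz_buf->mt->qpitch;
      } else {
         /* HiZ described directly on the buffer. */
         *aux_pitch = mt->hiz_buf->pitch;
         *aux_qpitch = mt->hiz_buf->qpitch;
      }
   } else {
      return false;
   }
   return true;
}
```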
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 41 ++-
 1 file changed, 28 insertions(+), 13 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index f762106..40a561f 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -3189,17 +3189,32 @@ intel_miptree_get_aux_isl_surf(struct brw_context *brw,
struct isl_surf *surf,
enum isl_aux_usage *usage)
 {
-   /* Figure out the layout */
-   if (_mesa_get_format_base_format(mt->format) == GL_DEPTH_COMPONENT) {
+   uint32_t aux_pitch, aux_qpitch;
+   if (mt->mcs_mt) {
+  aux_pitch = mt->mcs_mt->pitch;
+  aux_qpitch = mt->mcs_mt->qpitch;
+
+  if (mt->num_samples > 1) {
+ assert(mt->msaa_layout == INTEL_MSAA_LAYOUT_CMS);
+ *usage = ISL_AUX_USAGE_MCS;
+  } else if (intel_miptree_is_lossless_compressed(brw, mt)) {
+ assert(brw->gen >= 9);
+ *usage = ISL_AUX_USAGE_CCS_E;
+  } else if (mt->fast_clear_state != INTEL_FAST_CLEAR_STATE_NO_MCS) {
+ *usage = ISL_AUX_USAGE_CCS_D;
+  } else {
+ unreachable("Invalid MCS miptree");
+  }
+   } else if (mt->hiz_buf) {
+  if (mt->hiz_buf->mt) {
+ aux_pitch = mt->hiz_buf->mt->pitch;
+ aux_qpitch = mt->hiz_buf->mt->qpitch;
+  } else {
+ aux_pitch = mt->hiz_buf->pitch;
+ aux_qpitch = mt->hiz_buf->qpitch;
+  }
+
   *usage = ISL_AUX_USAGE_HIZ;
-   } else if (mt->num_samples > 1) {
-  assert(mt->msaa_layout == INTEL_MSAA_LAYOUT_CMS);
-  *usage = ISL_AUX_USAGE_MCS;
-   } else if (intel_miptree_is_lossless_compressed(brw, mt)) {
-  assert(brw->gen >= 9);
-  *usage = ISL_AUX_USAGE_CCS_E;
-   } else if (mt->fast_clear_state != INTEL_FAST_CLEAR_STATE_NO_MCS) {
-  *usage = ISL_AUX_USAGE_CCS_D;
} else {
   *usage = ISL_AUX_USAGE_NONE;
   return;
@@ -3211,7 +3226,7 @@ intel_miptree_get_aux_isl_surf(struct brw_context *brw,
/* Figure out the format and tiling of the auxiliary surface */
switch (*usage) {
case ISL_AUX_USAGE_NONE:
-  unreachable("Invalid MCS miptree");
+  unreachable("Invalid auxiliary usage");
 
case ISL_AUX_USAGE_HIZ:
   isl_surf_get_hiz_surf(&brw->isl_dev, surf, surf);
@@ -3250,7 +3265,7 @@ intel_miptree_get_aux_isl_surf(struct brw_context *brw,
}
 
/* We want the pitch of the actual aux buffer. */
-   surf->row_pitch = mt->mcs_mt->pitch;
+   surf->row_pitch = aux_pitch;
 
/* Auxiliary surfaces in ISL have compressed formats and array_pitch_el_rows
 * is in elements.  This doesn't match intel_mipmap_tree::qpitch which is
@@ -3258,7 +3273,7 @@ intel_miptree_get_aux_isl_surf(struct brw_context *brw,
 * compression block height.
 */
surf->array_pitch_el_rows =
-  mt->mcs_mt->qpitch / isl_format_get_layout(surf->format)->bh;
+  aux_qpitch / isl_format_get_layout(surf->format)->bh;
 }
 
 union isl_color_value
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH v2 07/35] i965/blorp: Get rid of brw_blorp_surface_info::map_stencil_as_y_tiled

2016-07-26 Thread Jason Ekstrand
Now that we're carrying around the isl_surf, we can just modify it
directly instead of passing an extra bit around.

Reviewed-by: Topi Pohjolainen 
---
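As a reminder of what "faking W-tiling with Y-tiling" does to the surface
dimensions, here is a small sketch based on the comment this patch removes
from brw_blorp.h: W tiles are 64x64 pixels, Y tiles of 8-bit data are
128x32, so the size is rounded up to whole W tiles, then the width is
doubled and the height halved. The toy_* names are hypothetical and the
real code also has to adjust the pitch and the blit rectangle.

```c
#include <assert.h>

struct toy_extent {
   unsigned width, height;
};

/* Round value up to the next multiple of alignment (alignment > 0). */
unsigned
toy_align(unsigned value, unsigned alignment)
{
   assert(alignment > 0);
   return (value + alignment - 1) / alignment * alignment;
}

/* Dimensions to program when a W-tiled stencil surface is described to the
 * hardware as a Y-tiled 8-bit surface.
 */
struct toy_extent
toy_w_tiled_as_y(unsigned width_px, unsigned height_px)
{
   struct toy_extent y;
   y.width = toy_align(width_px, 64) * 2;
   y.height = toy_align(height_px, 64) / 2;
   return y;
}
```

For a tile-aligned surface the pixel count is unchanged (64x64 becomes
128x32); only the shape differs, which is why the WM program has to swizzle
coordinates.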
 src/mesa/drivers/dri/i965/brw_blorp.c| 12 ++---
 src/mesa/drivers/dri/i965/brw_blorp.h| 15 ---
 src/mesa/drivers/dri/i965/brw_blorp_blit.cpp | 38 ++--
 3 files changed, 26 insertions(+), 39 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c 
b/src/mesa/drivers/dri/i965/brw_blorp.c
index 220be83..7a4b94b 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -71,7 +71,6 @@ brw_blorp_surface_info_init(struct brw_context *brw,
 
info->num_samples = mt->num_samples;
info->array_layout = mt->array_layout;
-   info->map_stencil_as_y_tiled = false;
info->msaa_layout = mt->msaa_layout;
info->swizzle = SWIZZLE_XYZW;
 
@@ -80,11 +79,8 @@ brw_blorp_surface_info_init(struct brw_context *brw,
 
switch (format) {
case MESA_FORMAT_S_UINT8:
-  /* The miptree is a W-tiled stencil buffer.  Surface states can't be set
-   * up for W tiling, so we'll need to use Y tiling and have the WM
-   * program swizzle the coordinates.
-   */
-  info->map_stencil_as_y_tiled = true;
+  assert(info->surf.tiling == ISL_TILING_W);
+  /* Prior to Broadwell, we can't render to R8_UINT */
   info->brw_surfaceformat = brw->gen >= 8 ? BRW_SURFACEFORMAT_R8_UINT :
 BRW_SURFACEFORMAT_R8_UNORM;
   break;
@@ -290,10 +286,6 @@ brw_blorp_emit_surface_state(struct brw_context *brw,
   surf.image_alignment_el = isl_extent3d(4, 2, 1);
}
 
-   /* We need to fake W-tiling with Y-tiling */
-   if (surface->map_stencil_as_y_tiled)
-  surf.tiling = ISL_TILING_Y0;
-
union isl_color_value clear_color = { .u32 = { 0, 0, 0, 0 } };
 
const struct isl_surf *aux_surf = NULL;
diff --git a/src/mesa/drivers/dri/i965/brw_blorp.h 
b/src/mesa/drivers/dri/i965/brw_blorp.h
index 0694181..010b760 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.h
+++ b/src/mesa/drivers/dri/i965/brw_blorp.h
@@ -118,21 +118,6 @@ struct brw_blorp_surface_info
 */
uint32_t y_offset;
 
-   /* Setting this flag indicates that the buffer's contents are W-tiled
-* stencil data, but the surface state should be set up for Y tiled
-* MESA_FORMAT_R_UNORM8 data (this is necessary because surface states don't
-* support W tiling).
-*
-* Since W tiles are 64 pixels wide by 64 pixels high, whereas Y tiles of
-* MESA_FORMAT_R_UNORM8 data are 128 pixels wide by 32 pixels high, the 
width and
-* pitch stored in the surface state will be multiplied by 2, and the
-* height will be halved.  Also, since W and Y tiles store their data in a
-* different order, the width and height will be rounded up to a multiple
-* of the tile size, to ensure that the WM program can access the full
-* width and height of the buffer.
-*/
-   bool map_stencil_as_y_tiled;
-
unsigned num_samples;
 
/**
diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp 
b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
index a54680e..a68e406 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
@@ -1737,16 +1737,6 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
  params.dst.num_samples = 0;
}
 
-   if (params.dst.map_stencil_as_y_tiled && params.dst.num_samples > 1) {
-  /* If the destination surface is a W-tiled multisampled stencil buffer
-   * that we're mapping as Y tiled, then we need to arrange for the WM
-   * program to run once per sample rather than once per pixel, because
-   * the memory layout of related samples doesn't match between W and Y
-   * tiling.
-   */
-  wm_prog_key.persample_msaa_dispatch = true;
-   }
-
if (params.src.num_samples > 0 && params.dst.num_samples > 1) {
   /* We are blitting from a multisample buffer to a multisample buffer, so
* we must preserve samples within a pixel.  This means we have to
@@ -1830,8 +1820,6 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
dst_mt->msaa_layout == INTEL_MSAA_LAYOUT_CMS)
   wm_prog_key.dst_layout = INTEL_MSAA_LAYOUT_NONE;
 
-   wm_prog_key.src_tiled_w = params.src.map_stencil_as_y_tiled;
-   wm_prog_key.dst_tiled_w = params.dst.map_stencil_as_y_tiled;
/* Round floating point values to nearest integer to avoid "off by one 
texel"
 * kind of errors when blitting.
 */
@@ -1905,7 +1893,22 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
   wm_prog_key.use_kill = true;
}
 
-   if (params.dst.map_stencil_as_y_tiled) {
+   if (params.dst.surf.tiling == ISL_TILING_W) {
+  /* We need to fake W-tiling with Y-tiling */
+  params.dst.surf.tiling = ISL_TILING_Y0;
+
+  wm_prog_key.dst_tiled_w = true;
+
+  if (params.dst.num_samples > 1) {
+ /* 

[Mesa-dev] [PATCH v2 01/35] isl: Fix the parameter names for get_intratile_offset

2016-07-26 Thread Jason Ekstrand
The offsets have been in elements for a while but, for whatever reason, the
parameter names in the header file never got updated.
---
 src/intel/isl/isl.h | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/intel/isl/isl.h b/src/intel/isl/isl.h
index 19673f8..d0bac5d 100644
--- a/src/intel/isl/isl.h
+++ b/src/intel/isl/isl.h
@@ -1353,11 +1353,11 @@ isl_tiling_get_intratile_offset_el(const struct 
isl_device *dev,
enum isl_tiling tiling,
uint8_t bs,
uint32_t row_pitch,
-   uint32_t total_x_offset_B,
-   uint32_t total_y_offset_rows,
+   uint32_t total_x_offset_el,
+   uint32_t total_y_offset_el,
uint32_t *base_address_offset,
-   uint32_t *x_offset_B,
-   uint32_t *y_offset_rows);
+   uint32_t *x_offset_el,
+   uint32_t *y_offset_el);
 
 /**
  * @brief Get value of 3DSTATE_DEPTH_BUFFER.SurfaceFormat
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 14/35] i965/blorp: Move intratile offset calculations out of surface state setup

2016-07-26 Thread Jason Ekstrand
Previously we multiplied the full x/y offsets and then resolved the
tile-aligned buffer offset and intratile offset from the result.  Now we let
ISL take the MSAA setting into account and we only multiply the resolved
intratile offsets.
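
For readers following along, the split between a tile-aligned base offset and
an intratile x/y offset can be sketched roughly as below.  This is a
simplified illustration and not ISL's implementation: sketch_tile and
sketch_intratile_offset_el are hypothetical names, and the tile dimensions are
passed in by the caller instead of being looked up from the tiling mode.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tile description: width in bytes, height in rows. */
struct sketch_tile {
   uint32_t width_B;
   uint32_t height_rows;
};

/* Split a total offset (in elements of bs bytes each) into the byte offset
 * of the containing tile and the remaining x/y offset inside that tile.
 * Assumes tiles are stored contiguously and that row_pitch_B is a whole
 * number of tiles wide.
 */
static void
sketch_intratile_offset_el(struct sketch_tile tile, uint32_t bs,
                           uint32_t row_pitch_B,
                           uint32_t total_x_el, uint32_t total_y_el,
                           uint32_t *base_offset_B,
                           uint32_t *tile_x_el, uint32_t *tile_y_el)
{
   assert(bs > 0 && tile.width_B % bs == 0);
   assert(tile.height_rows > 0 && row_pitch_B % tile.width_B == 0);

   const uint32_t tile_width_el = tile.width_B / bs;
   const uint32_t tile_size_B = tile.width_B * tile.height_rows;

   /* The remainder stays inside the tile... */
   *tile_x_el = total_x_el % tile_width_el;
   *tile_y_el = total_y_el % tile.height_rows;

   /* ...and the whole tiles skipped become the base address offset. */
   const uint32_t tiles_x = total_x_el / tile_width_el;
   const uint32_t tiles_y = total_y_el / tile.height_rows;
   *base_offset_B = tiles_y * tile.height_rows * row_pitch_B +
                    tiles_x * tile_size_B;
}
```

With a 128-byte by 32-row tile, 4-byte elements and a 512-byte row pitch, an
offset of (40, 70) elements lands 36864 bytes into the buffer at intratile
position (8, 6).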

Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_blorp.c| 24 
 src/mesa/drivers/dri/i965/brw_blorp.h| 15 ++-
 src/mesa/drivers/dri/i965/brw_blorp_blit.cpp |  8 
 3 files changed, 18 insertions(+), 29 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c b/src/mesa/drivers/dri/i965/brw_blorp.c
index 48755fc..8f7690c 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -66,9 +66,6 @@ brw_blorp_surface_info_init(struct brw_context *brw,
info->width = minify(mt->physical_width0, level - mt->first_level);
info->height = minify(mt->physical_height0, level - mt->first_level);
 
-   intel_miptree_get_image_offset(mt, level, layer,
-  &info->x_offset, &info->y_offset);
-
info->swizzle = SWIZZLE_XYZW;
 
if (format == MESA_FORMAT_NONE)
@@ -110,6 +107,15 @@ brw_blorp_surface_info_init(struct brw_context *brw,
   break;
}
}
+
+   uint32_t x_offset, y_offset;
+   intel_miptree_get_image_offset(mt, level, layer, &x_offset, &y_offset);
+
+   uint8_t bs = isl_format_get_layout(info->brw_surfaceformat)->bpb / 8;
+   isl_tiling_get_intratile_offset_el(&brw->isl_dev, info->surf.tiling, bs,
+  info->surf.row_pitch, x_offset, y_offset,
+  &info->bo_offset,
+  &info->tile_x_sa, &info->tile_y_sa);
 }
 
 
@@ -296,13 +302,6 @@ brw_blorp_emit_surface_state(struct brw_context *brw,
   ISL_SURF_USAGE_TEXTURE_BIT,
};
 
-   uint32_t offset, tile_x, tile_y;
-   isl_tiling_get_intratile_offset_el(&brw->isl_dev, surf.tiling,
-  isl_format_get_layout(view.format)->bpb / 8,
-  surf.row_pitch,
-  surface->x_offset, surface->y_offset,
-  &offset, &tile_x, &tile_y);
-
uint32_t surf_offset;
uint32_t *dw = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE,
   ss_info.num_dwords * 4, ss_info.ss_align,
@@ -311,11 +310,12 @@ brw_blorp_emit_surface_state(struct brw_context *brw,
const uint32_t mocs = is_render_target ? ss_info.rb_mocs : ss_info.tex_mocs;
 
isl_surf_fill_state(&brw->isl_dev, dw, .surf = &surf, .view = &view,
-   .address = surface->mt->bo->offset64 + offset,
+   .address = surface->mt->bo->offset64 + surface->bo_offset,
.aux_surf = aux_surf, .aux_usage = surface->aux_usage,
.aux_address = aux_offset,
.mocs = mocs, .clear_color = clear_color,
-   .x_offset_sa = tile_x, .y_offset_sa = tile_y);
+   .x_offset_sa = surface->tile_x_sa,
+   .y_offset_sa = surface->tile_y_sa);
 
/* Emit relocation to surface contents */
drm_intel_bo_emit_reloc(brw->batch.bo,
diff --git a/src/mesa/drivers/dri/i965/brw_blorp.h b/src/mesa/drivers/dri/i965/brw_blorp.h
index 7aa67be..e591f41 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.h
+++ b/src/mesa/drivers/dri/i965/brw_blorp.h
@@ -104,19 +104,8 @@ struct brw_blorp_surface_info
 */
uint32_t height;
 
-   /**
-* X offset within the surface to texture from (or render to).  For
-* surfaces using INTEL_MSAA_LAYOUT_IMS, this is measured in samples, not
-* pixels.
-*/
-   uint32_t x_offset;
-
-   /**
-* Y offset within the surface to texture from (or render to).  For
-* surfaces using INTEL_MSAA_LAYOUT_IMS, this is measured in samples, not
-* pixels.
-*/
-   uint32_t y_offset;
+   uint32_t bo_offset;
+   uint32_t tile_x_sa, tile_y_sa;
 
/**
 * Format that should be used when setting up the surface state for this
diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
index 03e4984..fc0aada 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
@@ -1899,8 +1899,8 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
   params.y1 = ALIGN(params.y1, y_align) / 2;
   params.dst.width = ALIGN(params.dst.width, x_align) * 2;
   params.dst.height = ALIGN(params.dst.height, y_align) / 2;
-  params.dst.x_offset *= 2;
-  params.dst.y_offset /= 2;
+  params.dst.tile_x_sa *= 2;
+  params.dst.tile_y_sa /= 2;
   wm_prog_key.use_kill = true;
}
 
@@ -1924,8 +1924,8 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
   const unsigned x_align = 8, y_align = params.src.surf.samples != 0 ? 8 : 4;
   params.src.width = ALIGN(params.src.width, x_align) * 2;
   

[Mesa-dev] [PATCH v2 23/35] isl: Take the slice0_extent shortcut for interleaved MSAA

2016-07-26 Thread Jason Ekstrand
The shortcut works just fine for MSAA and the comment even says so.

Reviewed-by: Nanley Chery 
---
 src/intel/isl/isl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
index a9208f6..500eb2d 100644
--- a/src/intel/isl/isl.c
+++ b/src/intel/isl/isl.c
@@ -623,7 +623,7 @@ isl_calc_phys_slice0_extent_sa_gen4_2d(
 
assert(phys_level0_sa->depth == 1);
 
-   if (info->levels == 1 && msaa_layout != ISL_MSAA_LAYOUT_INTERLEAVED) {
+   if (info->levels == 1) {
   /* Do not pad the surface to the image alignment. Instead, pad it only
* to the pixel format's block alignment.
*
-- 
2.5.0.400.gff86faf


[Mesa-dev] [PATCH v2 21/35] i965/blorp: Use the isl_view from the blorp_surface_info

2016-07-26 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_blorp.c | 18 +-
 1 file changed, 1 insertion(+), 17 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c b/src/mesa/drivers/dri/i965/brw_blorp.c
index 78707ca..d9b5554 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -386,22 +386,6 @@ brw_blorp_emit_surface_state(struct brw_context *brw,
   clear_color = intel_miptree_get_isl_clear_color(brw, surface->mt);
}
 
-   struct isl_view view = {
-  .format = surface->view.format,
-  .base_level = 0,
-  .levels = 1,
-  .base_array_layer = 0,
-  .array_len = 1,
-  .channel_select = {
- ISL_CHANNEL_SELECT_RED,
- ISL_CHANNEL_SELECT_GREEN,
- ISL_CHANNEL_SELECT_BLUE,
- ISL_CHANNEL_SELECT_ALPHA,
-  },
-  .usage = is_render_target ? ISL_SURF_USAGE_RENDER_TARGET_BIT :
-  ISL_SURF_USAGE_TEXTURE_BIT,
-   };
-
uint32_t surf_offset;
uint32_t *dw = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE,
   ss_info.num_dwords * 4, ss_info.ss_align,
@@ -409,7 +393,7 @@ brw_blorp_emit_surface_state(struct brw_context *brw,
 
const uint32_t mocs = is_render_target ? ss_info.rb_mocs : ss_info.tex_mocs;
 
-   isl_surf_fill_state(&brw->isl_dev, dw, .surf = &surf, .view = &view,
+   isl_surf_fill_state(&brw->isl_dev, dw, .surf = &surf, .view = &surface->view,
.address = surface->mt->bo->offset64 + surface->bo_offset,
.aux_surf = aux_surf, .aux_usage = surface->aux_usage,
.aux_address = aux_offset,
-- 
2.5.0.400.gff86faf


Re: [Mesa-dev] [RFC] i965: Delete brw_do_channel_expressions().

2016-07-26 Thread Kenneth Graunke
I re-ran these numbers with my SSO patches for shader-db and my
move_interpolation_to_top() pass fixed.  They're pretty similar:

(didn't run Haswell)

On Broadwell:

total instructions in shared programs: 11632138 -> 11641224 (0.08%)
instructions in affected programs: 1525250 -> 1534336 (0.60%)
helped: 1775
HURT: 5621

total cycles in shared programs: 144771824 -> 144924416 (0.11%)
cycles in affected programs: 115845654 -> 115998246 (0.13%)
helped: 20545
HURT: 36635

total loops in shared programs: 3345 -> 3345 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0

total spills in shared programs: 2924 -> 3129 (7.01%)
spills in affected programs: 1002 -> 1207 (20.46%)
helped: 0
HURT: 7

total fills in shared programs: 4394 -> 4563 (3.85%)
fills in affected programs: 851 -> 1020 (19.86%)
helped: 0
HURT: 7

LOST:   18
GAINED: 31

On Skylake:

total instructions in shared programs: 11947696 -> 11956226 (0.07%)
instructions in affected programs: 1549219 -> 1557749 (0.55%)
helped: 1757
HURT: 5579

total cycles in shared programs: 134233408 -> 134336690 (0.08%)
cycles in affected programs: 105950250 -> 106053532 (0.10%)
helped: 20278
HURT: 36719

total loops in shared programs: 3209 -> 3209 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0

total spills in shared programs: 3805 -> 3898 (2.44%)
spills in affected programs: 705 -> 798 (13.19%)
helped: 2
HURT: 61

total fills in shared programs: 5318 -> 5409 (1.71%)
fills in affected programs: 659 -> 750 (13.81%)
helped: 2
HURT: 61

LOST:   30
GAINED: 15



[Mesa-dev] [PATCH shader-db v2 2/2] run: Mark shaders with only one stage as separable.

2016-07-26 Thread Kenneth Graunke
There are a few cases where a .shader_test file might contain a single shader:

- compute shaders
  (only one stage, no inputs and outputs; separable shouldn't matter)
- vertex shaders with transform feedback
  (we want to retain outputs, but transform feedback varyings are
   specified via the API, not the shader - setting SSO fixes this)
- old shader_test files captured before we started adding "SSO ENABLED".

In any case, it seems harmless or beneficial to enable SSO for all
.shader_test files containing a single shader.

Based on a patch by Marek.

v2: Ignore VP/FP shaders.
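
As a rough sketch of the rule described above (not the code in run.c itself),
the decision reduces to a small predicate.  The enum and function names here
are hypothetical stand-ins for the TYPE_* values and the
use_separate_shader_objects flag that the patch touches.

```c
#include <stdbool.h>

/* Hypothetical mirror of the shader_test kinds that run.c distinguishes. */
enum sketch_type {
   SKETCH_TYPE_CORE,
   SKETCH_TYPE_COMPAT,
   SKETCH_TYPE_VP,   /* ARB vertex program */
   SKETCH_TYPE_FP,   /* ARB fragment program */
};

/* Decide whether to compile through separate shader objects.  A file that
 * already asked for SSO keeps it; otherwise a lone GLSL shader is marked
 * separable so its inputs and outputs are not eliminated.  VP/FP programs
 * are not GLSL and are left alone.
 */
static bool
sketch_want_sso(bool sso_requested, unsigned num_shaders,
                enum sketch_type type)
{
   if (sso_requested)
      return true;

   return num_shaders == 1 &&
          type != SKETCH_TYPE_VP && type != SKETCH_TYPE_FP;
}
```

A lone compute or vertex shader therefore takes the SSO path, while a
two-stage program without "SSO ENABLED" still links as before.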
---
 run.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/run.c b/run.c
index 2fed284..e8bc2c1 100644
--- a/run.c
+++ b/run.c
@@ -633,10 +633,15 @@ main(int argc, char **argv)
 }
 ctx_is_core = type == TYPE_CORE;
 
+/* If there's only one GLSL shader, mark it separable so
+ * inputs and outputs aren't eliminated.
+ */
+if (num_shaders == 1 && type != TYPE_VP && type != TYPE_FP)
+use_separate_shader_objects = true;
+
 if (use_separate_shader_objects) {
 for (unsigned i = 0; i < num_shaders; i++) {
-GLuint prog = glCreateShaderProgramv(shader[i].type, 1,
- &shader[i].text);
+glCreateShaderProgramv(shader[i].type, 1, &shader[i].text);
 }
 } else if (type == TYPE_CORE || type == TYPE_COMPAT) {
 GLuint prog = glCreateProgram();
-- 
2.9.0


[Mesa-dev] [PATCH v2 26/35] i965/blorp: Rework hiz rect alignment calculations

2016-07-26 Thread Jason Ekstrand
At the moment, the minify operation does nothing because
params.depth.view.base_level is always zero.  However, as soon as we start
using actual base miplevels and array slices, we are going to need the
minification.  Also, we only need to align the surface dimensions in the
case where we are operating on miplevel 0.  Previously, it didn't matter
because it aligned on miplevel 0 and, for all other miplevels, the miptree
code guaranteed that the level was already aligned.
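
To make the arithmetic concrete, here is a minimal standalone sketch of the
minify-then-align computation.  The sketch_* helpers are hypothetical
stand-ins for the driver's minify() and ALIGN(); the 8x4 alignment matches the
hunk below.

```c
#include <assert.h>
#include <stdint.h>

/* Halve a level-0 dimension once per miplevel, never going below 1. */
static uint32_t
sketch_minify(uint32_t size_level0, uint32_t level)
{
   const uint32_t size = level < 32 ? size_level0 >> level : 0;
   return size > 1 ? size : 1;
}

/* Round value up to a power-of-two alignment. */
static uint32_t
sketch_align(uint32_t value, uint32_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Compute the lower-right corner of the HiZ rectangle for a given base
 * miplevel: minify the level-0 size first, then align to 8x4.
 */
static void
sketch_hiz_rect(uint32_t width0, uint32_t height0, uint32_t base_level,
                uint32_t *x1, uint32_t *y1)
{
   *x1 = sketch_align(sketch_minify(width0, base_level), 8);
   *y1 = sketch_align(sketch_minify(height0, base_level), 4);
}
```

For a 100x50 surface this gives 104x52 at level 0 and 32x12 at level 2, which
shows why aligning the level-0 size alone is not enough once base_level can
be non-zero.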
---
 src/mesa/drivers/dri/i965/brw_blorp.c | 23 +++
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c b/src/mesa/drivers/dri/i965/brw_blorp.c
index 2cf0f99..bc26e41 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -593,14 +593,21 @@ gen6_blorp_hiz_exec(struct brw_context *brw, struct intel_mipmap_tree *mt,
 * not 8. But commit 1f112cc increased the alignment from 4 to 8, which
 * prevents the clobbering.
 */
-   params.dst.surf.samples = MAX2(mt->num_samples, 1);
-   params.depth.surf.logical_level0_px.width =
-  ALIGN(params.depth.surf.logical_level0_px.width, 8);
-   params.depth.surf.logical_level0_px.height =
-  ALIGN(params.depth.surf.logical_level0_px.height, 4);
-
-   params.x1 = params.depth.surf.logical_level0_px.width;
-   params.y1 = params.depth.surf.logical_level0_px.height;
+   params.x1 = minify(params.depth.surf.logical_level0_px.width,
+  params.depth.view.base_level);
+   params.y1 = minify(params.depth.surf.logical_level0_px.height,
+  params.depth.view.base_level);
+   params.x1 = ALIGN(params.x1, 8);
+   params.y1 = ALIGN(params.y1, 4);
+
+   if (params.depth.view.base_level == 0) {
+  /* TODO: What about MSAA? */
+  params.depth.surf.logical_level0_px.width = params.x1;
+  params.depth.surf.logical_level0_px.height = params.y1;
+   }
+
+   params.dst.surf.samples = params.depth.surf.samples;
+   params.dst.surf.logical_level0_px = params.depth.surf.logical_level0_px;
 
assert(intel_miptree_level_has_hiz(mt, level));
 
-- 
2.5.0.400.gff86faf


[Mesa-dev] [PATCH v2 18/35] i965/blorp: Use ISL to compute image offsets

2016-07-26 Thread Jason Ekstrand
For the moment, we still call the old miptree function; we just assert that
the two are equal.
---
 src/mesa/drivers/dri/i965/brw_blorp.c | 94 +--
 1 file changed, 91 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c b/src/mesa/drivers/dri/i965/brw_blorp.c
index ef256a7..c8cb41a 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -32,6 +32,88 @@
 
 #define FILE_DEBUG_FLAG DEBUG_BLORP
 
+/**
+ * A variant of isl_surf_get_image_offset_sa() specific to gen6 stencil and
+ * HiZ surfaces.
+ */
+static void
+get_image_offset_sa_gen6_stencil(const struct isl_surf *surf,
+ uint32_t level, uint32_t logical_array_layer,
+ uint32_t *x_offset_sa,
+ uint32_t *y_offset_sa)
+{
+   assert(surf->tiling == ISL_TILING_W || surf->format == ISL_FORMAT_HIZ);
+   assert(level < surf->levels);
+   assert(logical_array_layer < surf->logical_level0_px.array_len);
+
+   const struct isl_extent3d image_align_sa =
+  isl_surf_get_image_alignment_sa(surf);
+
+   const uint32_t W0 = surf->phys_level0_sa.width;
+   const uint32_t H0 = surf->phys_level0_sa.height;
+
+   uint32_t x = 0, y = 0;
+   for (uint32_t l = 0; l < level; ++l) {
+  if (l == 1) {
+ uint32_t W = minify(W0, l);
+
+ if (surf->samples > 1) {
+assert(surf->msaa_layout == ISL_MSAA_LAYOUT_INTERLEAVED);
+assert(surf->samples == 4);
+W = ALIGN(W, 2) * 2;
+ }
+
+ x += ALIGN(W, image_align_sa.w);
+  } else {
+ uint32_t H = minify(H0, l);
+
+ if (surf->samples > 1) {
+assert(surf->msaa_layout == ISL_MSAA_LAYOUT_INTERLEAVED);
+assert(surf->samples == 4);
+H = ALIGN(H, 2) * 2;
+ }
+
+ y += ALIGN(H, image_align_sa.h) * surf->logical_level0_px.array_len;
+  }
+   }
+
+   /* Now account for our location within the given LOD */
+   uint32_t Hl = minify(H0, level);
+   if (surf->samples > 1) {
+  assert(surf->msaa_layout == ISL_MSAA_LAYOUT_INTERLEAVED);
+  assert(surf->samples == 4);
+  Hl = ALIGN(Hl, 2) * 2;
+   }
+   y += ALIGN(Hl, image_align_sa.h) * logical_array_layer;
+
+   *x_offset_sa = x;
+   *y_offset_sa = y;
+}
+
+static void
+blorp_get_image_offset_sa(struct isl_device *dev, const struct isl_surf *surf,
+  uint32_t level, uint32_t layer,
+  uint32_t *x_offset_sa,
+  uint32_t *y_offset_sa)
+{
+   if (ISL_DEV_GEN(dev) == 6 && surf->tiling == ISL_TILING_W) {
+  get_image_offset_sa_gen6_stencil(surf, level, layer,
+   x_offset_sa, y_offset_sa);
+   } else {
+  /* Using base_array_layer for Z in 3-D surfaces is a bit abusive, but it
+   * will go away soon enough.
+   */
+  uint32_t z = 0;
+  if (surf->dim == ISL_SURF_DIM_3D) {
+ z = layer;
+ layer = 0;
+  }
+
+  isl_surf_get_image_offset_sa(surf, level, layer, z,
+   x_offset_sa, y_offset_sa);
+   }
+}
+
 void
 brw_blorp_surface_info_init(struct brw_context *brw,
 struct brw_blorp_surface_info *info,
@@ -125,10 +207,16 @@ brw_blorp_surface_info_init(struct brw_context *brw,
}
 
uint32_t x_offset, y_offset;
-   intel_miptree_get_image_offset(mt, level, layer, &x_offset, &y_offset);
+   blorp_get_image_offset_sa(&brw->isl_dev, &info->surf,
+ level, layer / layer_multiplier,
+ &x_offset, &y_offset);
+
+   uint32_t mt_x, mt_y;
+   intel_miptree_get_image_offset(mt, level, layer, &mt_x, &mt_y);
+   assert(mt_x == x_offset && mt_y == y_offset);
 
-   uint8_t bs = isl_format_get_layout(info->view.format)->bpb / 8;
-   isl_tiling_get_intratile_offset_el(&brw->isl_dev, info->surf.tiling, bs,
+   isl_tiling_get_intratile_offset_sa(&brw->isl_dev, info->surf.tiling,
+  info->view.format,
   info->surf.row_pitch, x_offset, y_offset,
   &info->bo_offset,
   &info->tile_x_sa, &info->tile_y_sa);
-- 
2.5.0.400.gff86faf


[Mesa-dev] [PATCH v2 23/27] i965/blorp: Break the guts of do_single_blorp_clear into two helpers

2016-07-26 Thread Jason Ekstrand
The helpers are completely miptree-unaware and each does a single thing
fairly cleanly.  The downside is that we no longer do proper debug reporting
on whether or not we're doing replicated clears.
---
 src/mesa/drivers/dri/i965/brw_blorp_clear.cpp | 175 --
 1 file changed, 111 insertions(+), 64 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp b/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp
index 6cb28d0..4b4b8af 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp
@@ -120,33 +120,51 @@ set_write_disables(const struct intel_renderbuffer *irb,
return disables;
 }
 
-static bool
-do_single_blorp_clear(struct brw_context *brw, struct gl_framebuffer *fb,
-  struct gl_renderbuffer *rb, unsigned buf,
-  bool partial_clear, bool encode_srgb, unsigned layer)
-{
-   struct gl_context *ctx = &brw->ctx;
-   struct intel_renderbuffer *irb = intel_renderbuffer(rb);
-   mesa_format format = irb->mt->format;
 
+static void
+blorp_fast_clear(struct brw_context *brw, const struct brw_blorp_surf *surf,
+ uint32_t level, uint32_t layer,
+ uint32_t x0, uint32_t y0, uint32_t x1, uint32_t y1)
+{
struct brw_blorp_params params;
brw_blorp_params_init(&params);
 
-   /* Override the surface format according to the context's sRGB rules. */
-   if (!encode_srgb && _mesa_get_format_color_encoding(format) == GL_SRGB)
-  format = _mesa_get_srgb_format_linear(format);
+   params.x0 = x0;
+   params.y0 = y0;
+   params.x1 = x1;
+   params.y1 = y1;
 
-   params.x0 = fb->_Xmin;
-   params.x1 = fb->_Xmax;
-   if (rb->Name != 0) {
-  params.y0 = fb->_Ymin;
-  params.y1 = fb->_Ymax;
-   } else {
-  params.y0 = rb->Height - fb->_Ymax;
-  params.y1 = rb->Height - fb->_Ymin;
-   }
+   memset(&params.wm_inputs, 0xff, 4*sizeof(float));
+   params.fast_clear_op = GEN7_PS_RENDER_TARGET_FAST_CLEAR_ENABLE;
+
+   brw_get_fast_clear_rect(brw, surf->aux_surf, &params.x0, &params.y0,
+   &params.x1, &params.y1);
+
+   brw_blorp_params_get_clear_kernel(brw, &params, true);
+
+   brw_blorp_surface_info_init(brw, &params.dst, surf, level, layer,
+   surf->surf->format, true);
+
+   brw_blorp_exec(brw, &params);
+}
 
-   memcpy(&params.wm_inputs, ctx->Color.ClearColor.f, sizeof(float) * 4);
+
+static void
+blorp_clear(struct brw_context *brw, const struct brw_blorp_surf *surf,
+uint32_t level, uint32_t layer,
+uint32_t x0, uint32_t y0, uint32_t x1, uint32_t y1,
+enum isl_format format, union isl_color_value clear_color,
+bool color_write_disable[4])
+{
+   struct brw_blorp_params params;
+   brw_blorp_params_init(&params);
+
+   params.x0 = x0;
+   params.y0 = y0;
+   params.x1 = x1;
+   params.y1 = y1;
+
+   memcpy(&params.wm_inputs, clear_color.f32, sizeof(float) * 4);
 
bool use_simd16_replicated_data = true;
 
@@ -156,21 +174,60 @@ do_single_blorp_clear(struct brw_context *brw, struct gl_framebuffer *fb,
 *  accessing tiled memory.  Using this Message Type to access linear
 *  (untiled) memory is UNDEFINED."
 */
-   if (irb->mt->tiling == I915_TILING_NONE)
+   if (surf->surf->tiling == ISL_TILING_LINEAR)
   use_simd16_replicated_data = false;
 
/* Constant color writes ignore everyting in blend and color calculator
 * state.  This is not documented.
 */
-   if (set_write_disables(irb, ctx->Color.ColorMask[buf],
-  params.color_write_disable))
-  use_simd16_replicated_data = false;
+   for (unsigned i = 0; i < 4; i++) {
+  params.color_write_disable[i] = color_write_disable[i];
+  if (color_write_disable[i])
+ use_simd16_replicated_data = false;
+   }
+
+   brw_blorp_params_get_clear_kernel(brw, &params, use_simd16_replicated_data);
+
+   brw_blorp_surface_info_init(brw, &params.dst, surf, level, layer,
+   format, true);
 
-   bool is_fast_clear = false;
-   if (irb->mt->fast_clear_state != INTEL_FAST_CLEAR_STATE_NO_MCS &&
-   !partial_clear && use_simd16_replicated_data &&
-   brw_is_color_fast_clear_compatible(brw, irb->mt,
-  &ctx->Color.ClearColor)) {
+   brw_blorp_exec(brw, &params);
+}
+
+static bool
+do_single_blorp_clear(struct brw_context *brw, struct gl_framebuffer *fb,
+  struct gl_renderbuffer *rb, unsigned buf,
+  bool partial_clear, bool encode_srgb, unsigned layer)
+{
+   struct gl_context *ctx = &brw->ctx;
+   struct intel_renderbuffer *irb = intel_renderbuffer(rb);
+   mesa_format format = irb->mt->format;
+   uint32_t x0, x1, y0, y1;
+
+   if (!encode_srgb && _mesa_get_format_color_encoding(format) == GL_SRGB)
+  format = _mesa_get_srgb_format_linear(format);
+
+   x0 = fb->_Xmin;
+   x1 = fb->_Xmax;
+   if (rb->Name != 0) {
+  y0 = fb->_Ymin;
+  y1 = fb->_Ymax;
+   } else {
+  y0 = rb->Height - fb->_Ymax;
+  y1 = rb->Height - fb->_Ymin;
+   }
+
+   bool 

[Mesa-dev] [PATCH v2 06/35] i965/blorp: Remove compute_tile_offsets

2016-07-26 Thread Jason Ekstrand
We have a handy little function in ISL that does exactly the same thing.

Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_blorp.c | 34 +-
 src/mesa/drivers/dri/i965/brw_blorp.h |  5 -
 2 files changed, 5 insertions(+), 34 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c b/src/mesa/drivers/dri/i965/brw_blorp.c
index 5889e95..220be83 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -120,34 +120,6 @@ brw_blorp_surface_info_init(struct brw_context *brw,
 }
 
 
-/**
- * Split x_offset and y_offset into a base offset (in bytes) and a remaining
- * x/y offset (in pixels).  Note: we can't do this by calling
- * intel_renderbuffer_tile_offsets(), because the offsets may have been
- * adjusted to account for Y vs. W tiling differences.  So we compute it
- * directly from the adjusted offsets.
- */
-uint32_t
-brw_blorp_compute_tile_offsets(const struct brw_blorp_surface_info *info,
-   uint32_t *tile_x, uint32_t *tile_y)
-{
-   uint32_t mask_x, mask_y;
-   uint32_t tiling = info->mt->tiling;
-   if (info->map_stencil_as_y_tiled)
-  tiling = I915_TILING_Y;
-
-   intel_get_tile_masks(tiling, info->mt->tr_mode, info->mt->cpp,
-&mask_x, &mask_y);
-
-   *tile_x = info->x_offset & mask_x;
-   *tile_y = info->y_offset & mask_y;
-
-   return intel_miptree_get_aligned_offset(info->mt, info->x_offset & ~mask_x,
-   info->y_offset & ~mask_y,
-   info->map_stencil_as_y_tiled);
-}
-
-
 void
 brw_blorp_params_init(struct brw_blorp_params *params)
 {
@@ -354,7 +326,11 @@ brw_blorp_emit_surface_state(struct brw_context *brw,
};
 
uint32_t offset, tile_x, tile_y;
-   offset = brw_blorp_compute_tile_offsets(surface, &tile_x, &tile_y);
+   isl_tiling_get_intratile_offset_el(&brw->isl_dev, surf.tiling,
+  isl_format_get_layout(view.format)->bpb / 8,
+  surf.row_pitch,
+  surface->x_offset, surface->y_offset,
+  &offset, &tile_x, &tile_y);
 
uint32_t surf_offset;
uint32_t *dw = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE,
diff --git a/src/mesa/drivers/dri/i965/brw_blorp.h b/src/mesa/drivers/dri/i965/brw_blorp.h
index e23f48b..0694181 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.h
+++ b/src/mesa/drivers/dri/i965/brw_blorp.h
@@ -172,11 +172,6 @@ brw_blorp_surface_info_init(struct brw_context *brw,
 unsigned int level, unsigned int layer,
 mesa_format format, bool is_render_target);
 
-uint32_t
-brw_blorp_compute_tile_offsets(const struct brw_blorp_surface_info *info,
-   uint32_t *tile_x, uint32_t *tile_y);
-
-
 
 struct brw_blorp_coord_transform
 {
-- 
2.5.0.400.gff86faf


[Mesa-dev] [PATCH v2 20/27] i965/blorp: Refactor fast-clear logic a bit

2016-07-26 Thread Jason Ekstrand
This pulls the MCS allocation into the if statement where we initially
determine that we are doing a fast clear and moves the programming of
wm_inputs and figuring out the fast clear rect into its own if statement.
The next commit will put code in between the two.
---
 src/mesa/drivers/dri/i965/brw_blorp_clear.cpp | 25 +
 1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp b/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp
index 4d3fe58..a66e955 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp
@@ -166,22 +166,11 @@ do_single_blorp_clear(struct brw_context *brw, struct gl_framebuffer *fb,
   params.color_write_disable))
   use_simd16_replicated_data = false;
 
+   bool is_fast_clear = false;
if (irb->mt->fast_clear_state != INTEL_FAST_CLEAR_STATE_NO_MCS &&
!partial_clear && use_simd16_replicated_data &&
brw_is_color_fast_clear_compatible(brw, irb->mt,
   &ctx->Color.ClearColor)) {
-  memset(&params.wm_inputs, 0xff, 4*sizeof(float));
-  params.fast_clear_op = GEN7_PS_RENDER_TARGET_FAST_CLEAR_ENABLE;
-
-  brw_get_fast_clear_rect(brw, irb->mt, &params.x0, &params.y0,
-  &params.x1, &params.y1);
-   }
-
-   brw_blorp_params_get_clear_kernel(brw, &params, use_simd16_replicated_data);
-
-   const bool is_fast_clear =
-  params.fast_clear_op == GEN7_PS_RENDER_TARGET_FAST_CLEAR_ENABLE;
-   if (is_fast_clear) {
   /* Record the clear color in the miptree so that it will be
* programmed in SURFACE_STATE by later rendering and resolve
* operations.
@@ -208,8 +197,20 @@ do_single_blorp_clear(struct brw_context *brw, struct gl_framebuffer *fb,
 return false;
  }
   }
+
+  is_fast_clear = true;
}
 
+   if (is_fast_clear) {
+  memset(&params.wm_inputs, 0xff, 4*sizeof(float));
+  params.fast_clear_op = GEN7_PS_RENDER_TARGET_FAST_CLEAR_ENABLE;
+
+  brw_get_fast_clear_rect(brw, irb->mt, &params.x0, &params.y0,
+  &params.x1, &params.y1);
+   }
+
+   brw_blorp_params_get_clear_kernel(brw, &params, use_simd16_replicated_data);
+
intel_miptree_check_level_layer(irb->mt, irb->mt_level, layer);
intel_miptree_used_for_rendering(irb->mt);
 
-- 
2.5.0.400.gff86faf


[Mesa-dev] [PATCH v2 35/35] isl/state: Add an assertion for IVB multisample array textures

2016-07-26 Thread Jason Ekstrand
---
 src/intel/isl/isl_surface_state.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/src/intel/isl/isl_surface_state.c b/src/intel/isl/isl_surface_state.c
index fb23414..990b763 100644
--- a/src/intel/isl/isl_surface_state.c
+++ b/src/intel/isl/isl_surface_state.c
@@ -239,6 +239,19 @@ isl_genX(surf_fill_state_s)(const struct isl_device *dev, void *state,
switch (s.SurfaceType) {
case SURFTYPE_1D:
case SURFTYPE_2D:
+  /* From the Ivy Bridge PRM >> RENDER_SURFACE_STATE::MinimumArrayElement:
+   *
+   *"If Number of Multisamples is not MULTISAMPLECOUNT_1, this field
+   *must be set to zero if this surface is used with sampling engine
+   *messages."
+   *
+   * This restriction appears to exist only on Ivy Bridge.
+   */
+  if (GEN_GEN == 7 && !GEN_IS_HASWELL && !ISL_DEV_IS_BAYTRAIL(dev) &&
+  (info->view->usage & ISL_SURF_USAGE_TEXTURE_BIT) &&
+  info->surf->samples > 1)
+ assert(info->view->base_array_layer == 0);
+
   s.MinimumArrayElement = info->view->base_array_layer;
 
   /* From the Broadwell PRM >> RENDER_SURFACE_STATE::Depth:
-- 
2.5.0.400.gff86faf


[Mesa-dev] [PATCH v2 17/35] isl: Add functions for computing surface offsets in samples

2016-07-26 Thread Jason Ekstrand
---
 src/intel/isl/isl.c | 24 
 src/intel/isl/isl.h | 48 
 2 files changed, 60 insertions(+), 12 deletions(-)

diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
index a713eeb..f65f9c8 100644
--- a/src/intel/isl/isl.c
+++ b/src/intel/isl/isl.c
@@ -1475,13 +1475,13 @@ get_image_offset_sa_gen9_1d(const struct isl_surf *surf,
  * @invariant logical_array_layer < logical array length of surface
  * @invariant logical_z_offset_px < logical depth of surface at level
  */
-static void
-get_image_offset_sa(const struct isl_surf *surf,
-uint32_t level,
-uint32_t logical_array_layer,
-uint32_t logical_z_offset_px,
-uint32_t *x_offset_sa,
-uint32_t *y_offset_sa)
+void
+isl_surf_get_image_offset_sa(const struct isl_surf *surf,
+ uint32_t level,
+ uint32_t logical_array_layer,
+ uint32_t logical_z_offset_px,
+ uint32_t *x_offset_sa,
+ uint32_t *y_offset_sa)
 {
assert(level < surf->levels);
assert(logical_array_layer < surf->logical_level0_px.array_len);
@@ -1524,11 +1524,11 @@ isl_surf_get_image_offset_el(const struct isl_surf *surf,
   < isl_minify(surf->logical_level0_px.depth, level));
 
uint32_t x_offset_sa, y_offset_sa;
-   get_image_offset_sa(surf, level,
-   logical_array_layer,
-   logical_z_offset_px,
-   &x_offset_sa,
-   &y_offset_sa);
+   isl_surf_get_image_offset_sa(surf, level,
+logical_array_layer,
+logical_z_offset_px,
+&x_offset_sa,
+&y_offset_sa);
 
*x_offset_el = x_offset_sa / fmtl->bw;
*y_offset_el = y_offset_sa / fmtl->bh;
diff --git a/src/intel/isl/isl.h b/src/intel/isl/isl.h
index d0bac5d..68ad8a4 100644
--- a/src/intel/isl/isl.h
+++ b/src/intel/isl/isl.h
@@ -1323,6 +1323,22 @@ isl_surf_get_array_pitch(const struct isl_surf *surf)
 }
 
 /**
+ * Calculate the offset, in units of surface samples, to a subimage in the
+ * surface.
+ *
+ * @invariant level < surface levels
+ * @invariant logical_array_layer < logical array length of surface
+ * @invariant logical_z_offset_px < logical depth of surface at level
+ */
+void
+isl_surf_get_image_offset_sa(const struct isl_surf *surf,
+ uint32_t level,
+ uint32_t logical_array_layer,
+ uint32_t logical_z_offset_px,
+ uint32_t *x_offset_sa,
+ uint32_t *y_offset_sa);
+
+/**
  * Calculate the offset, in units of surface elements, to a subimage in the
  * surface.
  *
@@ -1359,6 +1375,38 @@ isl_tiling_get_intratile_offset_el(const struct isl_device *dev,
uint32_t *x_offset_el,
uint32_t *y_offset_el);
 
+static inline void
+isl_tiling_get_intratile_offset_sa(const struct isl_device *dev,
+   enum isl_tiling tiling,
+   enum isl_format format,
+   uint32_t row_pitch,
+   uint32_t total_x_offset_sa,
+   uint32_t total_y_offset_sa,
+   uint32_t *base_address_offset,
+   uint32_t *x_offset_sa,
+   uint32_t *y_offset_sa)
+{
+   const struct isl_format_layout *fmtl = isl_format_get_layout(format);
+
+   assert(fmtl->bpb % 8 == 0);
+
+   /* For computing the intratile offsets, we actually want a strange unit
+* which is samples for multisampled surfaces but elements for compressed
+* surfaces.
+*/
+   assert(total_x_offset_sa % fmtl->bw == 0);
+   assert(total_y_offset_sa % fmtl->bh == 0);
+   const uint32_t total_x_offset = total_x_offset_sa / fmtl->bw;
+   const uint32_t total_y_offset = total_y_offset_sa / fmtl->bh;
+
+   isl_tiling_get_intratile_offset_el(dev, tiling, fmtl->bpb / 8, row_pitch,
+  total_x_offset, total_y_offset,
+  base_address_offset,
+  x_offset_sa, y_offset_sa);
+   *x_offset_sa *= fmtl->bw;
+   *y_offset_sa *= fmtl->bh;
+}
+
 /**
  * @brief Get value of 3DSTATE_DEPTH_BUFFER.SurfaceFormat
  *
-- 
2.5.0.400.gff86faf


[Mesa-dev] [PATCH v2 28/35] i965/blorp: Add a z_offset field to blorp_surface_info

2016-07-26 Thread Jason Ekstrand
The layer field is in terms of physical layers which isn't quite what the
sampler will want for 2-D MS array textures.
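
As a standalone illustration of the condition this patch moves into
brw_blorp_surface_info_init, the z offset choice can be sketched as follows.
sketch_z_offset is a hypothetical helper; its parameters stand in for
brw->gen, is_render_target, the surface dimensionality, and the physical
layer and layer_multiplier values used in the hunk below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Pick the z offset handed to the sampler.  Only gen8+ sampling from a real
 * 3-D surface passes the layer through; every other case leaves it at zero
 * because the layer is already folded into the surface offset.
 */
static uint32_t
sketch_z_offset(unsigned gen, bool is_render_target, bool is_3d,
                uint32_t layer, uint32_t layer_multiplier)
{
   assert(layer_multiplier > 0);

   if (gen >= 8 && !is_render_target && is_3d)
      return layer / layer_multiplier;

   return 0;
}
```

The blit code can then read the stored value unconditionally instead of
repeating the generation check.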
---
 src/mesa/drivers/dri/i965/brw_blorp.c|  9 +
 src/mesa/drivers/dri/i965/brw_blorp.h|  3 +++
 src/mesa/drivers/dri/i965/brw_blorp_blit.cpp | 11 ++-
 3 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c b/src/mesa/drivers/dri/i965/brw_blorp.c
index bc26e41..64e507a 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -201,6 +201,15 @@ brw_blorp_surface_info_init(struct brw_context *brw,
   },
};
 
+   if (brw->gen >= 8 && !is_render_target && info->surf.dim == ISL_SURF_DIM_3D) {
+  /* On gen8+ we use actual 3-D textures so we need to pass the layer
+   * through to the sampler.
+   */
+  info->z_offset = layer / layer_multiplier;
+   } else {
+  info->z_offset = 0;
+   }
+
info->level = level;
info->layer = layer;
 
diff --git a/src/mesa/drivers/dri/i965/brw_blorp.h 
b/src/mesa/drivers/dri/i965/brw_blorp.h
index 282235d..ec12dfe 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.h
+++ b/src/mesa/drivers/dri/i965/brw_blorp.h
@@ -78,6 +78,9 @@ struct brw_blorp_surface_info
 
struct isl_view view;
 
+   /* Z offset into a 3-D texture or slice of a 2-D array texture. */
+   uint32_t z_offset;
+
/**
 * The miplevel to use.
 */
diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp 
b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
index a76d130..a35cdb3 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
@@ -1779,15 +1779,8 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
brw_blorp_setup_coord_transform(&params.wm_inputs.coord_transform[1],
src_y0, src_y1, dst_y0, dst_y1, mirror_y);
 
-   if (brw->gen >= 8 && params.src.mt->target == GL_TEXTURE_3D) {
-  /* On gen8+ we use actual 3-D textures so we need to pass the layer
-   * through to the sampler.
-   */
-  params.wm_inputs.src_z = params.src.layer;
-   } else {
-  /* On gen7 and earlier, we fake everything with 2-D textures */
-  params.wm_inputs.src_z = 0;
-   }
+   /* For some texture types, we need to pass the layer through the sampler. */
+   params.wm_inputs.src_z = params.src.z_offset;
 
if (brw->gen > 6 && dst_mt->msaa_layout == INTEL_MSAA_LAYOUT_IMS) {
   /* We must expand the rectangle we send through the rendering pipeline,
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH v2 13/35] i965/blorp: Refactor interleaved multisample destination handling

2016-07-26 Thread Jason Ekstrand
We put all of the code for fake IMS together.  This requires moving a bit
of the program key setup code further down so that it gets the right values
out of the final surface.

Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_blorp_blit.cpp | 71 +---
 1 file changed, 34 insertions(+), 37 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp 
b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
index c337a86..03e4984 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
@@ -1698,28 +1698,6 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
   unreachable("Unrecognized blorp format");
}
 
-   if (brw->gen > 6) {
-  /* Gen7's rendering hardware only supports the IMS layout for depth and
-   * stencil render targets.  Blorp always maps its destination surface as
-   * a color render target (even if it's actually a depth or stencil
-   * buffer).  So if the destination is IMS, we'll have to map it as a
-   * single-sampled texture and interleave the samples ourselves.
-   */
-  if (dst_mt->msaa_layout == INTEL_MSAA_LAYOUT_IMS) {
- params.dst.surf.samples = 1;
- params.dst.surf.msaa_layout = ISL_MSAA_LAYOUT_NONE;
-  }
-   }
-
-   if (params.src.surf.samples > 0 && params.dst.surf.samples > 1) {
-  /* We are blitting from a multisample buffer to a multisample buffer, so
-   * we must preserve samples within a pixel.  This means we have to
-   * arrange for the WM program to run once per sample rather than once
-   * per pixel.
-   */
-  wm_prog_key.persample_msaa_dispatch = true;
-   }
-
/* Scaled blitting or not. */
wm_prog_key.blit_scaled =
   ((dst_x1 - dst_x0) == (src_x1 - src_x0) &&
@@ -1759,20 +1737,8 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
wm_prog_key.src_samples = src_mt->num_samples;
wm_prog_key.dst_samples = dst_mt->num_samples;
 
-   /* tex_samples and rt_samples are the sample counts that are set up in
-* SURFACE_STATE.
-*/
-   wm_prog_key.tex_samples = params.src.surf.samples;
-   wm_prog_key.rt_samples  = params.dst.surf.samples;
-
wm_prog_key.tex_aux_usage = params.src.aux_usage;
 
-   /* tex_layout and rt_layout indicate the MSAA layout the GPU pipeline will
-* use to access the source and destination surfaces.
-*/
-   wm_prog_key.tex_layout = params.src.surf.msaa_layout;
-   wm_prog_key.rt_layout = params.dst.surf.msaa_layout;
-
/* src_layout and dst_layout indicate the true MSAA layout used by src and
 * dst.
 */
@@ -1809,7 +1775,7 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
   params.wm_inputs.src_z = 0;
}
 
-   if (params.dst.surf.samples <= 1 && dst_mt->num_samples > 1) {
+   if (brw->gen > 6 && dst_mt->msaa_layout == INTEL_MSAA_LAYOUT_IMS) {
   /* We must expand the rectangle we send through the rendering pipeline,
* to account for the fact that we are mapping the destination region as
* single-sampled when it is in fact multisampled.  We must also align
@@ -1822,8 +1788,8 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
* If it's UMS, then we have no choice but to set up the rendering
* pipeline as multisampled.
*/
-  assert(dst_mt->msaa_layout == INTEL_MSAA_LAYOUT_IMS);
-  switch (dst_mt->num_samples) {
+  assert(params.dst.surf.msaa_layout == ISL_MSAA_LAYOUT_INTERLEAVED);
+  switch (params.dst.surf.samples) {
   case 2:
  params.x0 = ROUND_DOWN_TO(params.x0 * 2, 4);
  params.y0 = ROUND_DOWN_TO(params.y0, 4);
@@ -1851,6 +1817,16 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
   default:
  unreachable("Unrecognized sample count in brw_blorp_blit_params ctor");
   }
+
+  /* Gen7's rendering hardware only supports the IMS layout for depth and
+   * stencil render targets.  Blorp always maps its destination surface as
+   * a color render target (even if it's actually a depth or stencil
+   * buffer).  So if the destination is IMS, we'll have to map it as a
+   * single-sampled texture and interleave the samples ourselves.
+   */
+  params.dst.surf.samples = 1;
+  params.dst.surf.msaa_layout = ISL_MSAA_LAYOUT_NONE;
+
   wm_prog_key.use_kill = true;
}
 
@@ -1952,6 +1928,27 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
   params.src.y_offset /= 2;
}
 
+   /* tex_samples and rt_samples are the sample counts that are set up in
+* SURFACE_STATE.
+*/
+   wm_prog_key.tex_samples = params.src.surf.samples;
+   wm_prog_key.rt_samples  = params.dst.surf.samples;
+
+   /* tex_layout and rt_layout indicate the MSAA layout the GPU pipeline will
+* use to access the source and destination surfaces.
+*/
+   wm_prog_key.tex_layout = params.src.surf.msaa_layout;
+   wm_prog_key.rt_layout = params.dst.surf.msaa_layout;
+
+   if 

Re: [Mesa-dev] [PATCH] glsl: fix optimization of discard nested multiple levels

2016-07-26 Thread Kenneth Graunke
On Tuesday, July 26, 2016 10:14:12 AM PDT Nicolai Hähnle wrote:
> From: Nicolai Hähnle 
> 
> The order of optimizations can lead to the conditional discard optimization
> being applied twice to the same discard statement. In this case, we must
> ensure that both conditions are applied.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96762
> Cc: mesa-sta...@lists.freedesktop.org
> ---
>  src/compiler/glsl/opt_conditional_discard.cpp | 9 -
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/src/compiler/glsl/opt_conditional_discard.cpp 
> b/src/compiler/glsl/opt_conditional_discard.cpp
> index 1ca8803..a27bead 100644
> --- a/src/compiler/glsl/opt_conditional_discard.cpp
> +++ b/src/compiler/glsl/opt_conditional_discard.cpp
> @@ -72,7 +72,14 @@ opt_conditional_discard_visitor::visit_leave(ir_if *ir)
>  
> /* Move the condition and replace the ir_if with the ir_discard. */
> ir_discard *discard = (ir_discard *) ir->then_instructions.head;
> -   discard->condition = ir->condition;
> +   if (!discard->condition)
> +  discard->condition = ir->condition;
> +   else {
> +  void *ctx = ralloc_parent(ir);
> +  discard->condition = new(ctx) ir_expression(ir_binop_logic_and,
> +  ir->condition,
> +  discard->condition);
> +   }
> ir->replace_with(discard);
>  
> progress = true;
> 

Whoops, thanks for fixing this!

Reviewed-by: Kenneth Graunke 
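
A small sketch of why the existing condition has to be kept may help here. It models conditions as already-evaluated booleans instead of IR expression trees, and the struct and function names are hypothetical, so it only illustrates the logic of the fix, not the GLSL IR API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-in for a discard node.
 * has_condition == false means "discard unconditionally".
 */
struct discard {
   bool has_condition;
   bool condition;
};

/* Fold the condition of an enclosing if into the discard.  If the
 * discard already carries a condition from an earlier fold, both must
 * hold, so they are combined with a logical AND rather than overwritten.
 */
static void
fold_if_into_discard(struct discard *d, bool if_condition)
{
   if (!d->has_condition) {
      d->has_condition = true;
      d->condition = if_condition;
   } else {
      d->condition = if_condition && d->condition;
   }
}

/* Whether the discard would actually fire. */
static bool
discard_fires(const struct discard *d)
{
   return !d->has_condition || d->condition;
}
```

For "if (a) { if (b) discard; }" with a true and b false, folding b and then a leaves the discard inactive. Overwriting the condition on the second fold would have made it fire.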




[Mesa-dev] [PATCH v2 22/35] isl: Remove duplicate px->sa conversions

2016-07-26 Thread Jason Ekstrand
In all three cases, we start with width and height taken from
isl_surf::phys_slice0_extent_sa which is already in samples.  There is no
need to do the conversion and doing so gives us an incorrect value.

Reviewed-by: Nanley Chery 
---
 src/intel/isl/isl.c | 20 
 1 file changed, 20 deletions(-)

diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
index f65f9c8..a9208f6 100644
--- a/src/intel/isl/isl.c
+++ b/src/intel/isl/isl.c
@@ -658,18 +658,6 @@ isl_calc_phys_slice0_extent_sa_gen4_2d(
   uint32_t W = isl_minify(W0, l);
   uint32_t H = isl_minify(H0, l);
 
-  if (msaa_layout == ISL_MSAA_LAYOUT_INTERLEAVED) {
- /* From the Broadwell PRM >> Volume 5: Memory Views >> Computing Mip Level
-  * Sizes (p133):
-  *
-  *If the surface is multisampled and it is a depth or stencil
-  *surface or Multisampled Surface StorageFormat in
-  *SURFACE_STATE is MSFMT_DEPTH_STENCIL, W_L and H_L must be
-  *adjusted as follows before proceeding: [...]
-  */
- isl_msaa_interleaved_scale_px_to_sa(info->samples, &W, &H);
-  }
-
   uint32_t w = isl_align_npot(W, image_align_sa->w);
   uint32_t h = isl_align_npot(H, image_align_sa->h);
 
@@ -1370,17 +1358,9 @@ get_image_offset_sa_gen4_2d(const struct isl_surf *surf,
for (uint32_t l = 0; l < level; ++l) {
   if (l == 1) {
  uint32_t W = isl_minify(W0, l);
-
- if (surf->msaa_layout == ISL_MSAA_LAYOUT_INTERLEAVED)
-isl_msaa_interleaved_scale_px_to_sa(surf->samples, &W, NULL);
-
  x += isl_align_npot(W, image_align_sa.w);
   } else {
  uint32_t H = isl_minify(H0, l);
-
- if (surf->msaa_layout == ISL_MSAA_LAYOUT_INTERLEAVED)
-isl_msaa_interleaved_scale_px_to_sa(surf->samples, NULL, &H);
-
  y += isl_align_npot(H, image_align_sa.h);
   }
}
-- 
2.5.0.400.gff86faf
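
To illustrate why the extra conversion gives a wrong value, here is a rough sketch with a simplified, hypothetical scaling helper. The per-sample-count factors mirror the ones used for interleaved destinations in the blorp blit patches of this series; alignment is ignored, so this is not the real isl_msaa_interleaved_scale_px_to_sa.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified px->sa scaling for an interleaved MSAA layout.  Either
 * pointer may be NULL if only one dimension is of interest.
 */
static void
scale_px_to_sa(uint32_t samples, uint32_t *width, uint32_t *height)
{
   uint32_t sx = 1, sy = 1;

   switch (samples) {
   case 2:  sx = 2; sy = 1; break;
   case 4:  sx = 2; sy = 2; break;
   case 8:  sx = 4; sy = 2; break;
   case 16: sx = 4; sy = 4; break;
   default: break; /* single-sampled: nothing to scale */
   }

   if (width != NULL)
      *width *= sx;
   if (height != NULL)
      *height *= sy;
}
```

A 64x32 px slice at 4x is 128x64 sa. Running the same conversion again on a value that is already in samples yields 256x128, which is the kind of over-sized extent the patch avoids.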



Re: [Mesa-dev] Mesa (master): Revert "radeon/llvm: Use alloca instructions for larger arrays"

2016-07-26 Thread Matt Arsenault

> On Jul 26, 2016, at 14:37, Marek Olšák  wrote:
> 
> On Sat, Jul 23, 2016 at 4:07 PM, Nicolai Hähnle wrote:
>> On 22.07.2016 12:08, Michel Dänzer wrote:
>>> 
>>> On 21.07.2016 18:17, Matt Arsenault wrote:
> 
> On Jul 21, 2016, at 01:03, Michel Dänzer wrote:
> 
> On 21.07.2016 00:04, Michel Dänzer wrote:
>> 
>> On 15.07.2016 05:15, Marek Olšák wrote:
>>> 
>>> Module: Mesa
>>> Branch: master
>>> Commit: f84e9d749fbb6da73a60fb70e6725db773c9b8f8
>>> URL:
>>> 
>>> http://cgit.freedesktop.org/mesa/mesa/commit/?id=f84e9d749fbb6da73a60fb70e6725db773c9b8f8
>>> 
>>> Author: Marek Olšák
>>> Date:   Thu Jul 14 22:07:46 2016 +0200
>>> 
>>> Revert "radeon/llvm: Use alloca instructions for larger arrays"
>>> 
>>> This reverts commit 513fccdfb68e6a71180e21827f071617c93fd09b.
>>> 
>>> Bioshock Infinite hangs with that.
>> 
>> 
>> Unfortunately, this change caused the piglit test
>> shaders@glsl-fs-vec4-indexing-temp-dst-in-loop (and possibly others) to
>> hang my Kaveri. Any ideas for how we can get out of this conundrum?
> 
> 
> The hang was introduced by LLVM SVN r275934 ("AMDGPU: Expand register
> indexing pseudos in custom inserter"). The good/bad (without/with
> r275934) shader dumps and the GALLIUM_DDEBUG=800 dump corresponding to
> the hang are attached.
> 
> 
> BTW, even with Marek's change above reverted, I still see some piglit
> regressions compared to last week, but I'm not sure if those are all
> related to the same LLVM change.
> 
> 
> --
> Earthling Michel Dänzer   |
>  http://www.amd.com 
> Libre software enthusiast | Mesa and X developer
> 
> 
 
 
 This fixes the verifier error in it: https://reviews.llvm.org/D22616
>>> 
>>> 
>>> This seems to fix the hang, thanks!
>>> 
>>> 
 This fixes another issue which may be
 related: https://reviews.llvm.org/D22556
>>> 
>>> 
>>> Even with that applied as well, there are still piglit regressions
>>> compared to early last week, see the attached dumps (look for "LLVM
>>> triggered Diagnostic Handler:").
>> 
>> 
>> Looks like the "rewrite undef" part of the Two Address Instruction Pass also
>> needs to be adjusted -- I've attached a bugpoint-reduced test case.
>> 
>> Also, the hang that motivated the original revert in Mesa should be fixed
>> with https://reviews.llvm.org/D22673 (and the related
>> https://reviews.llvm.org/D22675 is also needed for correctness, though
>> probably not for fixing the hang).
> 
> FYI, I've reverted the revert.
> 
> Marek


It might be nice if this could be an option, since this was probably the main
stressor of the register indexing code.

-Matt



[Mesa-dev] [PATCH] glsl: free hash tables earlier

2016-07-26 Thread Timothy Arceri
These are only used by get_matching_input(), which has already been called
at this point, so free the hash tables.
---
 src/compiler/glsl/link_varyings.cpp | 10 +++---
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/src/compiler/glsl/link_varyings.cpp 
b/src/compiler/glsl/link_varyings.cpp
index d48c680..91d8974 100644
--- a/src/compiler/glsl/link_varyings.cpp
+++ b/src/compiler/glsl/link_varyings.cpp
@@ -2156,6 +2156,9 @@ assign_varying_locations(struct gl_context *ctx,
   }
}
 
+   hash_table_dtor(consumer_inputs);
+   hash_table_dtor(consumer_interface_inputs);
+
for (unsigned i = 0; i < num_tfeedback_decls; ++i) {
   if (!tfeedback_decls[i].is_varying())
  continue;
@@ -2165,8 +2168,6 @@ assign_varying_locations(struct gl_context *ctx,
 
   if (matched_candidate == NULL) {
  hash_table_dtor(tfeedback_candidates);
- hash_table_dtor(consumer_inputs);
- hash_table_dtor(consumer_interface_inputs);
  return false;
   }
 
@@ -2185,15 +2186,10 @@ assign_varying_locations(struct gl_context *ctx,
 
   if (!tfeedback_decls[i].assign_location(ctx, prog)) {
  hash_table_dtor(tfeedback_candidates);
- hash_table_dtor(consumer_inputs);
- hash_table_dtor(consumer_interface_inputs);
  return false;
   }
}
-
hash_table_dtor(tfeedback_candidates);
-   hash_table_dtor(consumer_inputs);
-   hash_table_dtor(consumer_interface_inputs);
 
if (consumer && producer) {
   foreach_in_list(ir_instruction, node, consumer->ir) {
-- 
2.7.4



Re: [Mesa-dev] [PATCH] st_glsl_to_tgsi: only skip over slots of an input array that are present

2016-07-26 Thread Marek Olšák
Reviewed-by: Marek Olšák 

Marek

On Mon, Jul 25, 2016 at 6:08 PM, Nicolai Hähnle  wrote:
> From: Nicolai Hähnle 
>
> When an application declares varying arrays but does not actually do any
> indirect indexing, some array indices may end up unused in the consuming
> shader, so the number of input slots that correspond to the array ends
> up less than the array_size.
>
> Cc: mesa-sta...@lists.freedesktop.org
> ---
> See also the shader_runner Piglit test that I sent out a moment ago.
>
>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
> b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> index 7564119..38e2c4a 100644
> --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> @@ -6058,7 +6058,11 @@ st_translate_program(
>inputSemanticName[i], inputSemanticIndex[i],
>interpMode[i], 0, interpLocation[i],
>array_id, array_size);
> -i += array_size - 1;
> +
> +GLuint base_attr = inputSlotToAttr[i];
> +while (i + 1 < numInputs &&
> +   inputSlotToAttr[i + 1] < base_attr + array_size)
> +   ++i;
>   }
>   else {
>  t->inputs[i] = ureg_DECL_fs_input_cyl_centroid(ureg,
> --
> 2.7.4
>
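
As a standalone illustration of the slot-skipping loop in the patch quoted above, the sketch below uses a hypothetical helper name and assumes the slot-to-attribute table is sorted in ascending order, as the loop in the patch appears to assume as well.

```c
#include <assert.h>

/* Return the index of the last input slot that belongs to the array
 * starting at slot i.  Only slots that are actually present are walked,
 * so unused array elements do not make the index overshoot.
 */
static unsigned
last_slot_of_array(const unsigned *slot_to_attr, unsigned num_inputs,
                   unsigned i, unsigned array_size)
{
   unsigned base_attr = slot_to_attr[i];

   while (i + 1 < num_inputs &&
          slot_to_attr[i + 1] < base_attr + array_size)
      ++i;

   return i;
}
```

With attributes { 10, 12, 14, 20 } and a 5-element array based at attribute 10, only three slots are present, so the last slot is index 2. Advancing by array_size - 1 would have landed on index 4, past the end of the table.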


Re: [Mesa-dev] [PATCH 1/3] nvc0: fix up TCP header on GM107+

2016-07-26 Thread Ilia Mirkin
On Tue, Jul 26, 2016 at 6:53 PM, Samuel Pitoiset
 wrote:
> The number of outputs patch (limited to 255) has moved in the TCP
> header, but blob seems to also set the old position. Also, the high
> 8-bits are now located in between the min/max parallel output read
> address at position 20.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/gallium/drivers/nouveau/nvc0/nvc0_program.c | 9 +
>  1 file changed, 9 insertions(+)
>
> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c 
> b/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
> index 5fc2753..8ed3e10 100644
> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
> @@ -346,6 +346,15 @@ nvc0_tcp_gen_header(struct nvc0_program *tcp, struct nv50_ir_prog_info *info)
>
> nvc0_vtgp_gen_header(tcp, info);
>
> +   if (info->target >= NVISA_GM107_CHIPSET) {
> +  /* On GM107+, the number of output patch components has moved in the TCP
> +   * header, but it seems like blob still also uses the old position.
> +   * Also, the high 8-bits are located inbetween the min/max parallel
> +   * field and has to be set after updating the outputs. */
> +  tcp->hdr[3] = opcs << 28;

Semantically identical, but I think it'll help out the poor casual reader:

tcp->hdr[3] = (opcs & 0x0f) << 28;

Otherwise this is

Acked-by: Ilia Mirkin 

[to reflect the fact that I've done absolutely no verification of your
claims about how the hw works]

> +  tcp->hdr[4] |= (opcs & 0xf0) << 16;
> +   }
> +
> nvc0_tp_get_tess_mode(tcp, info);
>
> return 0;
> --
> 2.9.0
>
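
To make the "semantically identical" remark above concrete, here is a small sketch of the two bit placements. The function names are hypothetical and opcs is assumed to be a 32-bit unsigned value, so this only demonstrates the shift arithmetic and says nothing about what the hardware expects.

```c
#include <assert.h>
#include <stdint.h>

/* Low nibble of opcs placed in bits 28..31, relying on the shift to
 * push the upper bits out of the 32-bit word.
 */
static uint32_t
tcp_hdr3_bits(uint32_t opcs)
{
   return opcs << 28;
}

/* Same value with the mask spelled out for the reader. */
static uint32_t
tcp_hdr3_bits_masked(uint32_t opcs)
{
   return (opcs & 0x0f) << 28;
}

/* High nibble of an 8-bit opcs placed in bits 20..23. */
static uint32_t
tcp_hdr4_bits(uint32_t opcs)
{
   return (opcs & 0xf0) << 16;
}
```

For opcs = 0xa5 both hdr[3] variants give 0x50000000, and the hdr[4] contribution is 0x00a00000.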


Re: [Mesa-dev] Interest in GL_ARB_gl_spirv support?

2016-07-26 Thread Marek Olšák
On Wed, Jul 27, 2016 at 12:29 AM, oscar bg  wrote:
> Hi,
> seems this year 2016 OpenGL ARB update brings a small number of extensions..
> seems the most important is GL_ARB_gl_spirv.. seems like SPIRV as a binary
> format for OpenGL and Mesa doesn't have any binary format even supporting
> ARB_program_binary ext.. a Nvidia driver is already providing support from
> day 1 for Linux as always..
>
> just asking how difficult would be to bring support to Mesa drivers.. and if
> there is any interest by Mesa devs start working on it soon..
>
> seems already we have SPIRV support in Mesa in Vulkan drivers: Anvil Vulkan
> Intel driver and some days ago RADV a open source Vulkan driver for AMD GPUs
> has been announced.. as these drivers already eat SPIRV code seems this
> extension would take less work to port to this two vendor GPUs?

The showstopper for sharing Vulkan code is that OpenGL doesn't have
pipeline state objects. Because of that, you can ignore RADV. I think
nobody has enough resources to replicate what the radeonsi TGSI path
does, so you are pretty much stuck with TGSI.

SPIRV -> NIR -> TGSI should work.

Marek


Re: [Mesa-dev] Interest in GL_ARB_gl_spirv support?

2016-07-26 Thread Ilia Mirkin
On Tue, Jul 26, 2016 at 7:44 PM, Marek Olšák  wrote:
> On Wed, Jul 27, 2016 at 12:29 AM, oscar bg  wrote:
>> Hi,
>> seems this year 2016 OpenGL ARB update brings a small number of extensions..
>> seems the most important is GL_ARB_gl_spirv.. seems like SPIRV as a binary
>> format for OpenGL and Mesa doesn't have any binary format even supporting
>> ARB_program_binary ext.. a Nvidia driver is already providing support from
>> day 1 for Linux as always..
>>
>> just asking how difficult would be to bring support to Mesa drivers.. and if
>> there is any interest by Mesa devs start working on it soon..
>>
>> seems already we have SPIRV support in Mesa in Vulkan drivers: Anvil Vulkan
>> Intel driver and some days ago RADV a open source Vulkan driver for AMD GPUs
>> has been announced.. as these drivers already eat SPIRV code seems this
>> extension would take less work to port to this two vendor GPUs?
>
> The showstopper for sharing Vulkan code is that OpenGL doesn't have
> pipeline state objects. Because of that, you can ignore RADV. I think
> nobody has enough resources to replicate what the radeonsi TGSI path
> does, so you are pretty much stuck with TGSI.
>
> SPIRV -> NIR -> TGSI should work.

FWIW my plans for nouveau definitely involve direct SPIR-V input. This
will also be useful for an independent Vulkan driver, as well as
OpenCL SPIR-V. However I'm not sure when such plans will materialize.

  -ilia


[Mesa-dev] [PATCH v2 19/35] i965/blorp: Move surface offset calculations into a helper

2016-07-26 Thread Jason Ekstrand
The helper does a full transformation on the surface to turn it into a new
2-D single-layer single-level surface representing the original layer and
level in memory.
---
 src/mesa/drivers/dri/i965/brw_blorp.c | 84 ++-
 1 file changed, 43 insertions(+), 41 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c 
b/src/mesa/drivers/dri/i965/brw_blorp.c
index c8cb41a..8ccb8da 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -114,6 +114,46 @@ blorp_get_image_offset_sa(struct isl_device *dev, const struct isl_surf *surf,
}
 }
 
+static void
+surf_apply_level_layer_offsets(struct isl_device *dev, struct isl_surf *surf,
+   struct isl_view *view, uint32_t *byte_offset,
+   uint32_t *tile_x_sa, uint32_t *tile_y_sa)
+{
+   /* This only makes sense for a single level and array slice */
+   assert(view->levels == 1 && view->array_len == 1);
+
+   uint32_t x_offset_sa, y_offset_sa;
+   blorp_get_image_offset_sa(dev, surf, view->base_level,
+ view->base_array_layer,
+ &x_offset_sa, &y_offset_sa);
+
+   isl_tiling_get_intratile_offset_sa(dev, surf->tiling, view->format,
+  surf->row_pitch, x_offset_sa, y_offset_sa,
+  byte_offset, tile_x_sa, tile_y_sa);
+
+   /* Now that that's done, we have a very bare 2-D surface */
+   surf->dim = ISL_SURF_DIM_2D;
+   surf->dim_layout = ISL_DIM_LAYOUT_GEN4_2D;
+
+   surf->logical_level0_px.width =
+  minify(surf->logical_level0_px.width, view->base_level);
+   surf->logical_level0_px.height =
+  minify(surf->logical_level0_px.height, view->base_level);
+   surf->logical_level0_px.depth = 1;
+   surf->logical_level0_px.array_len = 1;
+   surf->levels = 1;
+
+   /* Alignment doesn't matter since we have 1 miplevel and 1 array slice so
+* just pick something that works for everybody.
+*/
+   surf->image_alignment_el = isl_extent3d(4, 4, 1);
+
+   /* TODO: surf->physcal_level0_extent_sa? */
+
+   view->base_level = 0;
+   view->base_array_layer = 0;
+}
+
 void
 brw_blorp_surface_info_init(struct brw_context *brw,
 struct brw_blorp_surface_info *info,
@@ -206,20 +246,9 @@ brw_blorp_surface_info_init(struct brw_context *brw,
}
}
 
-   uint32_t x_offset, y_offset;
-   blorp_get_image_offset_sa(&brw->isl_dev, &info->surf,
- level, layer / layer_multiplier,
- &x_offset, &y_offset);
-
-   uint32_t mt_x, mt_y;
-   intel_miptree_get_image_offset(mt, level, layer, &mt_x, &mt_y);
-   assert(mt_x == x_offset && mt_y == y_offset);
-
-   isl_tiling_get_intratile_offset_sa(&brw->isl_dev, info->surf.tiling,
-  info->view.format,
-  info->surf.row_pitch, x_offset, y_offset,
-  &info->bo_offset,
-  &info->tile_x_sa, &info->tile_y_sa);
+   surf_apply_level_layer_offsets(&brw->isl_dev, &info->surf, &info->view,
+  &info->bo_offset,
+  &info->tile_x_sa, &info->tile_y_sa);
 }
 
 
@@ -345,35 +374,8 @@ brw_blorp_emit_surface_state(struct brw_context *brw,
struct isl_surf surf = surface->surf;
 
/* Stomp surface dimensions and tiling (if needed) with info from blorp */
-   surf.dim = ISL_SURF_DIM_2D;
-   surf.dim_layout = ISL_DIM_LAYOUT_GEN4_2D;
surf.logical_level0_px.width = surface->width;
surf.logical_level0_px.height = surface->height;
-   surf.logical_level0_px.depth = 1;
-   surf.logical_level0_px.array_len = 1;
-   surf.levels = 1;
-
-   /* Alignment doesn't matter since we have 1 miplevel and 1 array slice so
-* just pick something that works for everybody.
-*/
-   surf.image_alignment_el = isl_extent3d(4, 4, 1);
-
-   if (brw->gen == 6 && surf.samples > 1) {
-  /* Since gen6 uses INTEL_MSAA_LAYOUT_IMS, width and height are measured
-   * in samples.  But SURFACE_STATE wants them in pixels, so we need to
-   * divide them each by 2.
-   */
-  surf.logical_level0_px.width /= 2;
-  surf.logical_level0_px.height /= 2;
-   }
-
-   if (brw->gen == 6 && surf.image_alignment_el.height > 4) {
-  /* This can happen on stencil buffers on Sandy Bridge due to the
-   * single-LOD work-around.  It's fairly harmless as long as we don't
-   * pass a bogus value into isl_surf_fill_state().
-   */
-  surf.image_alignment_el = isl_extent3d(4, 2, 1);
-   }
 
union isl_color_value clear_color = { .u32 = { 0, 0, 0, 0 } };
 
-- 
2.5.0.400.gff86faf
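
The size part of the transformation described above can be sketched as follows. The struct and function names are hypothetical; the minify behaviour (halve per level, never below 1) is assumed to match Mesa's helper of the same name.

```c
#include <assert.h>
#include <stdint.h>

/* Size of a dimension at a given miplevel: halved per level, with a
 * floor of 1.
 */
static uint32_t
minify_sketch(uint32_t size, uint32_t level)
{
   uint32_t s = size >> level;
   return s > 0 ? s : 1;
}

struct extent2d {
   uint32_t width;
   uint32_t height;
};

/* Level-0 extent of the new single-level surface that stands in for
 * miplevel base_level of the original surface.
 */
static struct extent2d
single_level_extent(struct extent2d level0, uint32_t base_level)
{
   struct extent2d e = {
      minify_sketch(level0.width, base_level),
      minify_sketch(level0.height, base_level),
   };
   return e;
}
```

Once the extent is rewritten this way, the view can point at level 0 and layer 0 of the new surface, which is what the helper does with base_level and base_array_layer.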



[Mesa-dev] [PATCH v2 12/35] i965/blorp: Get rid of brw_blorp_surface_info::array_layout

2016-07-26 Thread Jason Ekstrand
Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_blorp.c | 1 -
 src/mesa/drivers/dri/i965/brw_blorp.h | 9 -
 2 files changed, 10 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c 
b/src/mesa/drivers/dri/i965/brw_blorp.c
index 96201e4..48755fc 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -69,7 +69,6 @@ brw_blorp_surface_info_init(struct brw_context *brw,
intel_miptree_get_image_offset(mt, level, layer,
   &info->x_offset, &info->y_offset);
 
-   info->array_layout = mt->array_layout;
info->swizzle = SWIZZLE_XYZW;
 
if (format == MESA_FORMAT_NONE)
diff --git a/src/mesa/drivers/dri/i965/brw_blorp.h 
b/src/mesa/drivers/dri/i965/brw_blorp.h
index d60b988..7aa67be 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.h
+++ b/src/mesa/drivers/dri/i965/brw_blorp.h
@@ -119,15 +119,6 @@ struct brw_blorp_surface_info
uint32_t y_offset;
 
/**
-* Indicates if we use the standard miptree layout (ALL_LOD_IN_EACH_SLICE),
-* or if we tightly pack array slices at each LOD (ALL_SLICES_AT_EACH_LOD).
-*
-* If ALL_SLICES_AT_EACH_LOD is set, then ARYSPC_LOD0 can be used. Ignored
-* prior to Gen7.
-*/
-   enum miptree_array_layout array_layout;
-
-   /**
 * Format that should be used when setting up the surface state for this
 * surface.  Should correspond to one of the BRW_SURFACEFORMAT_* enums.
 */
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH v2 20/35] i965/blorp: Get rid of brw_blorp_surface_info::width/height

2016-07-26 Thread Jason Ekstrand
Instead, we manually mutate the surface size as needed.
---
 src/mesa/drivers/dri/i965/brw_blorp.c| 21 ++---
 src/mesa/drivers/dri/i965/brw_blorp.h| 12 
 src/mesa/drivers/dri/i965/brw_blorp_blit.cpp | 19 +++
 src/mesa/drivers/dri/i965/gen6_blorp.c   |  4 ++--
 src/mesa/drivers/dri/i965/gen7_blorp.c   |  4 ++--
 5 files changed, 25 insertions(+), 35 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c 
b/src/mesa/drivers/dri/i965/brw_blorp.c
index 8ccb8da..78707ca 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -203,8 +203,6 @@ brw_blorp_surface_info_init(struct brw_context *brw,
 
info->level = level;
info->layer = layer;
-   info->width = minify(mt->physical_width0, level - mt->first_level);
-   info->height = minify(mt->physical_height0, level - mt->first_level);
 
if (format == MESA_FORMAT_NONE)
   format = mt->format;
@@ -373,10 +371,6 @@ brw_blorp_emit_surface_state(struct brw_context *brw,
 
struct isl_surf surf = surface->surf;
 
-   /* Stomp surface dimensions and tiling (if needed) with info from blorp */
-   surf.logical_level0_px.width = surface->width;
-   surf.logical_level0_px.height = surface->height;
-
union isl_color_value clear_color = { .u32 = { 0, 0, 0, 0 } };
 
const struct isl_surf *aux_surf = NULL;
@@ -610,16 +604,13 @@ gen6_blorp_hiz_exec(struct brw_context *brw, struct intel_mipmap_tree *mt,
 * prevents the clobbering.
 */
params.dst.surf.samples = MAX2(mt->num_samples, 1);
-   if (params.depth.surf.samples > 1) {
-  params.depth.width = ALIGN(mt->logical_width0, 8);
-  params.depth.height = ALIGN(mt->logical_height0, 4);
-   } else {
-  params.depth.width = ALIGN(params.depth.width, 8);
-  params.depth.height = ALIGN(params.depth.height, 4);
-   }
+   params.depth.surf.logical_level0_px.width =
+  ALIGN(params.depth.surf.logical_level0_px.width, 8);
+   params.depth.surf.logical_level0_px.height =
+  ALIGN(params.depth.surf.logical_level0_px.height, 4);
 
-   params.x1 = params.depth.width;
-   params.y1 = params.depth.height;
+   params.x1 = params.depth.surf.logical_level0_px.width;
+   params.y1 = params.depth.surf.logical_level0_px.height;
 
assert(intel_miptree_level_has_hiz(mt, level));
 
diff --git a/src/mesa/drivers/dri/i965/brw_blorp.h 
b/src/mesa/drivers/dri/i965/brw_blorp.h
index 185406e..282235d 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.h
+++ b/src/mesa/drivers/dri/i965/brw_blorp.h
@@ -94,18 +94,6 @@ struct brw_blorp_surface_info
 */
uint32_t layer;
 
-   /**
-* Width of the miplevel to be used.  For surfaces using
-* INTEL_MSAA_LAYOUT_IMS, this is measured in samples, not pixels.
-*/
-   uint32_t width;
-
-   /**
-* Height of the miplevel to be used.  For surfaces using
-* INTEL_MSAA_LAYOUT_IMS, this is measured in samples, not pixels.
-*/
-   uint32_t height;
-
uint32_t bo_offset;
uint32_t tile_x_sa, tile_y_sa;
 };
diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp 
b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
index 32450aa..fb81a22 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
@@ -1816,24 +1816,31 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
  params.y0 = ROUND_DOWN_TO(params.y0, 4);
  params.x1 = ALIGN(params.x1 * 2, 4);
  params.y1 = ALIGN(params.y1, 4);
+ params.dst.surf.logical_level0_px.width *= 2;
  break;
   case 4:
  params.x0 = ROUND_DOWN_TO(params.x0 * 2, 4);
  params.y0 = ROUND_DOWN_TO(params.y0 * 2, 4);
  params.x1 = ALIGN(params.x1 * 2, 4);
  params.y1 = ALIGN(params.y1 * 2, 4);
+ params.dst.surf.logical_level0_px.width *= 2;
+ params.dst.surf.logical_level0_px.height *= 2;
  break;
   case 8:
  params.x0 = ROUND_DOWN_TO(params.x0 * 4, 8);
  params.y0 = ROUND_DOWN_TO(params.y0 * 2, 4);
  params.x1 = ALIGN(params.x1 * 4, 8);
  params.y1 = ALIGN(params.y1 * 2, 4);
+ params.dst.surf.logical_level0_px.width *= 4;
+ params.dst.surf.logical_level0_px.height *= 2;
  break;
   case 16:
  params.x0 = ROUND_DOWN_TO(params.x0 * 4, 8);
  params.y0 = ROUND_DOWN_TO(params.y0 * 4, 8);
  params.x1 = ALIGN(params.x1 * 4, 8);
  params.y1 = ALIGN(params.y1 * 4, 8);
+ params.dst.surf.logical_level0_px.width *= 4;
+ params.dst.surf.logical_level0_px.height *= 4;
  break;
   default:
  unreachable("Unrecognized sample count in brw_blorp_blit_params ctor");
@@ -1918,8 +1925,10 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
   params.y0 = ROUND_DOWN_TO(params.y0, y_align) / 2;
   params.x1 = ALIGN(params.x1, x_align) * 2;
   params.y1 = ALIGN(params.y1, y_align) / 2;
-  params.dst.width = 

[Mesa-dev] [PATCH v2 03/35] isl/state: Use a valid alignment for 1-D textures

2016-07-26 Thread Jason Ekstrand
The alignment we use doesn't matter (see the comment) but it should at
least be an alignment we can represent with the enums.

Reviewed-by: Topi Pohjolainen 
---
 src/intel/isl/isl_surface_state.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/isl/isl_surface_state.c b/src/intel/isl/isl_surface_state.c
index d1c8f17..6febcbf 100644
--- a/src/intel/isl/isl_surface_state.c
+++ b/src/intel/isl/isl_surface_state.c
@@ -142,7 +142,7 @@ get_image_alignment(const struct isl_surf *surf)
   * true alignment is likely outside the enum range of HALIGN* and
   * VALIGN*.
   */
- return isl_extent3d(0, 0, 0);
+ return isl_extent3d(4, 4, 1);
   } else {
  /* In Skylake, RENDER_SUFFACE_STATE.SurfaceVerticalAlignment is in units
   * of surface elements (not pixels nor samples). For compressed formats,
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 25/35] i965/blorp: Map 1-D render targets with DIM_LAYOUT_GEN4_2D as 2D on gen9

2016-07-26 Thread Jason Ekstrand
The sampling hardware can handle them ok.  It just looks at the tiling to
determine whether it's the new gen9 1-D layout or the old one.  The render
hardware isn't so smart.
---
 src/mesa/drivers/dri/i965/brw_blorp.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c b/src/mesa/drivers/dri/i965/brw_blorp.c
index d9b5554..2cf0f99 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -371,6 +371,12 @@ brw_blorp_emit_surface_state(struct brw_context *brw,
 
struct isl_surf surf = surface->surf;
 
+   if (surf.dim == ISL_SURF_DIM_1D &&
+   surf.dim_layout == ISL_DIM_LAYOUT_GEN4_2D) {
+  assert(surf.logical_level0_px.height == 1);
+  surf.dim = ISL_SURF_DIM_2D;
+   }
+
union isl_color_value clear_color = { .u32 = { 0, 0, 0, 0 } };
 
const struct isl_surf *aux_surf = NULL;
-- 
2.5.0.400.gff86faf
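
A standalone sketch of the remapping in this patch, for readers without the tree at hand. The enum and struct names below are simplified local stand-ins chosen for illustration; the real isl_surf has many more members, and surf_for_render_target() is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-ins for the isl enums named in the patch. */
enum surf_dim { SURF_DIM_1D, SURF_DIM_2D, SURF_DIM_3D };
enum dim_layout { DIM_LAYOUT_GEN4_2D, DIM_LAYOUT_GEN4_3D, DIM_LAYOUT_GEN9_1D };

struct surf {
   enum surf_dim dim;
   enum dim_layout dim_layout;
   uint32_t height_px;
};

/* Works on a copy, as the patch does: a 1-D surface that is stored with the
 * old 2-D layout is described to the render hardware as a 2-D surface of
 * height 1.  Anything else is returned unchanged. */
static struct surf
surf_for_render_target(struct surf surf)
{
   if (surf.dim == SURF_DIM_1D && surf.dim_layout == DIM_LAYOUT_GEN4_2D) {
      assert(surf.height_px == 1);
      surf.dim = SURF_DIM_2D;
   }
   return surf;
}
```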



[Mesa-dev] [PATCH v2 08/35] i965/blorp: Make sample count asserts a bit more lazy

2016-07-26 Thread Jason Ekstrand
Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_blorp_blit.cpp | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
index a68e406..e0a6d7c 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
@@ -1302,7 +1302,7 @@ brw_blorp_build_nir_shader(struct brw_context *brw,
nir_ssa_def *src_pos, *dst_pos, *color;
 
/* Sanity checks */
-   if (key->dst_tiled_w && key->rt_samples > 0) {
+   if (key->dst_tiled_w && key->rt_samples > 1) {
   /* If the destination image is W tiled and multisampled, then the thread
* must be dispatched once per sample, not once per pixel.  This is
* necessary because after conversion between W and Y tiling, there's no
@@ -1333,13 +1333,13 @@ brw_blorp_build_nir_shader(struct brw_context *brw,
 
/* Make sure layout is consistent with sample count */
assert((key->tex_layout == INTEL_MSAA_LAYOUT_NONE) ==
-  (key->tex_samples == 0));
+  (key->tex_samples <= 1));
assert((key->rt_layout == INTEL_MSAA_LAYOUT_NONE) ==
-  (key->rt_samples == 0));
+  (key->rt_samples <= 1));
assert((key->src_layout == INTEL_MSAA_LAYOUT_NONE) ==
-  (key->src_samples == 0));
+  (key->src_samples <= 1));
assert((key->dst_layout == INTEL_MSAA_LAYOUT_NONE) ==
-  (key->dst_samples == 0));
+  (key->dst_samples <= 1));
 
nir_builder b;
nir_builder_init_simple_shader(&b, NULL, MESA_SHADER_FRAGMENT, NULL);
-- 
2.5.0.400.gff86faf



Re: [Mesa-dev] Interest in GL_ARB_gl_spirv support?

2016-07-26 Thread Matt Turner
On Tue, Jul 26, 2016 at 9:16 PM, Jason Ekstrand  wrote:
> On Tue, Jul 26, 2016 at 4:50 PM, Ilia Mirkin  wrote:
>>
>> On Tue, Jul 26, 2016 at 7:44 PM, Marek Olšák  wrote:
>> > On Wed, Jul 27, 2016 at 12:29 AM, oscar bg  wrote:
>> >> Hi,
>> >> It seems this year's (2016) OpenGL ARB update brings a small number of
>> >> extensions. The most important seems to be GL_ARB_gl_spirv, which looks
>> >> like SPIR-V as a binary format for OpenGL. Mesa doesn't have any binary
>> >> format, even though it supports the ARB_program_binary extension. An
>> >> Nvidia driver already provides support from day 1 for Linux, as always.
>> >>
>> >> I'm just asking how difficult it would be to bring support to Mesa
>> >> drivers, and whether there is any interest among Mesa devs in starting
>> >> work on it soon.
>> >>
>> >> It seems we already have SPIR-V support in Mesa in the Vulkan drivers:
>> >> Anvil, the Intel Vulkan driver, and some days ago RADV, an open source
>> >> Vulkan driver for AMD GPUs, was announced. As these drivers already
>> >> consume SPIR-V code, would this extension take less work to port to
>> >> these two vendors' GPUs?
>
>
> We're thinking about it but, unfortunately, I haven't had much time to look
> into the extension.  One thing that I can say right now, is that the
> spirv_to_nir pass isn't what you'd call OpenGL-friendly.  If you pass it
> invalid SPIR-V, it will crash.  What little error checking it does do is
> done via asserts.  That's perfectly acceptable in the Vulkan world but
> unless they went out of their way to ok it in the extension, crashing
> generally isn't acceptable in OpenGL.  Also, it's not entirely clear what
> all hoops we would have to go through to tie a NIR shader into the OpenGL
> state upload paths for things like uniforms, inputs/outputs, etc.  This
> isn't to say that it can't be done but it's substantially more than zero
> work.

The spec says

The OpenGL API expects the SPIR-V module to have already been validated,
and can return an error if it discovers anything invalid in
the module.  An invalid SPIR-V module is allowed to result in undefined
behavior.


Re: [Mesa-dev] [PATCH v2 08/35] i965/blorp: Make sample count asserts a bit more lazy

2016-07-26 Thread Pohjolainen, Topi

We could have a small rationale here:

Until now, blorp internally used a sample count of zero to represent
single-sampled surfaces. However, incoming single-sampled surfaces may
have the sample count set to zero or one, and once the stomping to
zero is dropped these asserts would fire.
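
To make the rationale concrete, here is a minimal sketch of the relaxed check. The enum is a local stand-in for INTEL_MSAA_LAYOUT_*, and is_single_sampled() and layout_matches_samples() are hypothetical helper names; the patch itself open-codes the comparison in each assert.

```c
#include <stdbool.h>

/* Local stand-in for the INTEL_MSAA_LAYOUT_* values. */
enum msaa_layout {
   MSAA_LAYOUT_NONE,
   MSAA_LAYOUT_IMS,
   MSAA_LAYOUT_UMS,
   MSAA_LAYOUT_CMS,
};

/* Both 0 and 1 mean "single sampled" once the stomping to zero is gone. */
static bool
is_single_sampled(unsigned samples)
{
   return samples <= 1;
}

/* The condition each assert in the patch checks: a surface has no MSAA
 * layout exactly when it is single sampled. */
static bool
layout_matches_samples(enum msaa_layout layout, unsigned samples)
{
   return (layout == MSAA_LAYOUT_NONE) == is_single_sampled(samples);
}
```

With the old "== 0" form, a layout of NONE combined with a sample count of 1 would have tripped the assert.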

On Tue, Jul 26, 2016 at 03:01:59PM -0700, Jason Ekstrand wrote:
> Reviewed-by: Topi Pohjolainen 
> ---
>  src/mesa/drivers/dri/i965/brw_blorp_blit.cpp | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> index a68e406..e0a6d7c 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> @@ -1302,7 +1302,7 @@ brw_blorp_build_nir_shader(struct brw_context *brw,
> nir_ssa_def *src_pos, *dst_pos, *color;
>  
> /* Sanity checks */
> -   if (key->dst_tiled_w && key->rt_samples > 0) {
> +   if (key->dst_tiled_w && key->rt_samples > 1) {
>/* If the destination image is W tiled and multisampled, then the thread
> * must be dispatched once per sample, not once per pixel.  This is
> * necessary because after conversion between W and Y tiling, there's no
> @@ -1333,13 +1333,13 @@ brw_blorp_build_nir_shader(struct brw_context *brw,
>  
> /* Make sure layout is consistent with sample count */
> assert((key->tex_layout == INTEL_MSAA_LAYOUT_NONE) ==
> -  (key->tex_samples == 0));
> +  (key->tex_samples <= 1));
> assert((key->rt_layout == INTEL_MSAA_LAYOUT_NONE) ==
> -  (key->rt_samples == 0));
> +  (key->rt_samples <= 1));
> assert((key->src_layout == INTEL_MSAA_LAYOUT_NONE) ==
> -  (key->src_samples == 0));
> +  (key->src_samples <= 1));
> assert((key->dst_layout == INTEL_MSAA_LAYOUT_NONE) ==
> -  (key->dst_samples == 0));
> +  (key->dst_samples <= 1));
>  
> nir_builder b;
> nir_builder_init_simple_shader(&b, NULL, MESA_SHADER_FRAGMENT, NULL);
> -- 
> 2.5.0.400.gff86faf
> 


[Mesa-dev] [Bug 96950] Another regression from bc4e0c486: vbo: Use a bitmask to track the active arrays in vbo_exec*.

2016-07-26 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=96950

--- Comment #9 from Mathias Fröhlich  ---
I have verified 0ad with the provided trace here, updated as requested and
pushed.
Thanks for testing and review on your side.

I assume the originator verifies and closes the bug?

Mathias

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] Call i965 GLSL IR backend optimisation from the common linker

2016-07-26 Thread Timothy Arceri
The ultimate goal is to be able to convert to NIR and make use of its
optimisations before assigning varying and uniform locations. This
should allow us to start removing some of the GLSL IR optimisation
passes.

This series falls short of making use of NIR because lower_packed_varyings()
modifies the IR after we assign varying locations. I can see two ways
around this; in increasing order of difficulty they are:

- replace the current packing pass with one that follows the packing
rules of ARB_enhanced_layouts. This would mean we can no longer pack
across slots, and matrix and array packing effectiveness would be slightly
decreased.
- write a NIR packing pass.

Even without converting to NIR this series solves a number of the other
problems with converting to NIR earlier and provides a nice shader-db
improvement on its own.

Broadwell shader-db results:

total instructions in shared programs: 8651650 -> 8644415 (-0.08%)
instructions in affected programs:     38754 -> 31519 (-18.67%)
total loops in shared programs:        2085 -> 2085 (0.00%)
helped:                                320
HURT:                                  0
GAINED:                                0

Ivybridge reported no difference.



Re: [Mesa-dev] [PATCH 03/11] mesa: Implement _mesa_all_varyings_in_vbos.

2016-07-26 Thread Mathias Fröhlich
Hi,

On Thursday, June 23, 2016 16:53:59 Fredrik Höglund wrote:
> On Friday 17 June 2016, mathias.froehl...@gmx.net wrote:
> > From: Mathias Fröhlich 
> > 
> > Implement the equivalent of vbo_all_varyings_in_vbos for
> > vertex array objects.
> > 
> > Signed-off-by: Mathias Fröhlich 
> > ---
> >  src/mesa/main/arrayobj.c | 35 +++
> >  src/mesa/main/arrayobj.h |  4 
> >  2 files changed, 39 insertions(+)
> > 
> > diff --git a/src/mesa/main/arrayobj.c b/src/mesa/main/arrayobj.c
> > index 9c3451e..041ee63 100644
> > --- a/src/mesa/main/arrayobj.c
> > +++ b/src/mesa/main/arrayobj.c
> > @@ -359,6 +359,41 @@ _mesa_update_vao_client_arrays(struct gl_context *ctx,
> >  }
> >  
> >  
> > +bool
> > +_mesa_all_varyings_in_vbos(const struct gl_vertex_array_object *vao)
> > +{
> > +   /* Walk those enabled arrays that have the default vbo attached */
> > +   GLbitfield64 mask = vao->_Enabled & ~vao->VertexAttribBufferMask;
> > +
> > +   while (mask) {
> > +  /** We do not use u_bit_scan64 as we can here walk
> > +   *  multiple attrib arrays at once
> > +   */
> > +  const int i = ffsll(mask) - 1;
> > +  const struct gl_vertex_attrib_array *attrib_array =
> > + &vao->VertexAttrib[i];
> > +  const struct gl_vertex_buffer_binding *buffer_binding =
> > + &vao->VertexBinding[attrib_array->VertexBinding];
> > +
> > +  /* Only enabled arrays shall appear in the _Enabled bitmask */
> > +  assert(attrib_array->Enabled);
> > +  /* We have already masked out vao->VertexAttribBufferMask  */
> > +  assert(!_mesa_is_bufferobj(buffer_binding->BufferObj));
> > +
> > +  /* Bail out once we find the first non vbo with a non zero stride */
> > +  if (buffer_binding->Stride != 0)
> > + return false;
> 
> I'm not sure if this is correct.  The default value for Stride is 16,
> not 0.  The only way Stride can be zero in a binding point that doesn't
> have a buffer object bound is if the user has explicitly called
> glBindVertexBuffer() with both the buffer and stride parameters set
> to zero.
> 
> StrideB in gl_client_array on the other hand is always zero when the array
> is one of the currval arrays managed by the VBO context.  It is never zero
> when the array is a user array that has been specified with gl*Pointer().
> 
> I think the point of vbo_all_varyings_in_vbos() is to return false if any
> enabled array doesn't have a VBO bound, and is not one of the currval
> arrays.

Additionally, vbo_all_varyings_in_vbos() treats zero-stride user
arrays like current vertex attribute values, simply because it
cannot distinguish between a zero-stride user array and a current
attribute value, which is presented the same way.
Also, I believe it's legal to call glBindVertexBuffer() with zero buffer
and stride, or am I wrong here?
So IMO what you describe would result in a change of behavior.

Mathias
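
For readers following the stride discussion, here is a self-contained sketch of the bitmask walk under review. The struct layout and field names are simplified stand-ins for gl_vertex_array_object and gl_vertex_buffer_binding, MAX_ATTRIBS is an arbitrary illustrative size, and a portable loop replaces ffsll(). Unlike the patch, the sketch also clears bit i explicitly so that it terminates even if the bound-arrays mask is inconsistent.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_ATTRIBS 32   /* illustrative size, not Mesa's */

struct binding {
   uint32_t stride;
   uint64_t bound_arrays;   /* attributes that reference this binding */
};

struct vao {
   uint64_t enabled;             /* bit i set: attribute i is enabled */
   uint64_t attrib_buffer_mask;  /* bit i set: attribute i reads from a vbo */
   uint8_t attrib_binding[MAX_ATTRIBS];
   struct binding bindings[MAX_ATTRIBS];
};

/* Index of the lowest set bit; mask must be non-zero. */
static int
lowest_set_bit(uint64_t mask)
{
   int i = 0;
   while (!(mask & 1)) {
      mask >>= 1;
      i++;
   }
   return i;
}

static bool
all_varyings_in_vbos(const struct vao *vao)
{
   /* Enabled arrays that do not have a buffer object attached. */
   uint64_t mask = vao->enabled & ~vao->attrib_buffer_mask;

   while (mask) {
      const int i = lowest_set_bit(mask);
      const struct binding *b = &vao->bindings[vao->attrib_binding[i]];

      /* A non-vbo array with a non-zero stride varies per vertex. */
      if (b->stride != 0)
         return false;

      /* Skip every attribute sharing this binding in one step. */
      mask &= ~(b->bound_arrays | (UINT64_C(1) << i));
   }

   return true;
}
```

Under this model a zero-stride user array and a current attribute value are indistinguishable, which is the behavior described for vbo_all_varyings_in_vbos() above.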


Re: [Mesa-dev] [PATCH 00/11] Make more use of state already tracked in the VAO.

2016-07-26 Thread Mathias Fröhlich
Hi,

This was meant as preparatory cleanup for some patches that do less work in
the fast draw path.

I have updated the comment as requested and resent the patch with the new
comment. I believe that _mesa_all_varyings_in_vbos is equivalent
to vbo_all_varyings_in_vbos, but working on a VAO. Otherwise we would get a
change in behavior.

So: Ping.

Thanks

Mathias

On Friday, June 17, 2016 20:03:52 mathias.froehl...@gmx.net wrote:
> From: Mathias Fröhlich 
> 
> Hi,
> 
> The first two patches fix a bug in tracking the VAO internal
> state. The majority of the changeset makes more use of the
> state currently tracked in the VAO and transitions to use
> more of the first order information found in the VAO instead
> of relying on the gl_client_array members that mirror the
> VAO fields. The last two patches rip out members from
> gl_client_array that are set but no longer used.
> 
> Please review,
> 
> Thanks
> 
> Mathias
> 
> 
> Mathias Fröhlich (11):
>   mesa: Add flush_vertices argument to _mesa_bind_vertex_buffer.
>   mesa: Unbind deleted vbo using _mesa_bind_vertex_buffer.
>   mesa: Implement _mesa_all_varyings_in_vbos.
>   vbo: Walk the VAO to see if all varyings are in vbos.
>   vbo: Walk the VAO to check for mapped buffers.
>   mesa: Walk the VAO in _mesa_print_arrays.
>   vbo: Walk the VAO in print_draw_arrays.
>   vbo: Walk the VAO in check_array_data.
>   vbo: Use the VAO array enabled flags in vbo_exec_array.
>   mesa: Remove set but not used gl_client_array::Enabled.
>   mesa: Remove set but not used gl_client_array::Stride.
> 
>  src/mesa/drivers/common/meta.c   |  16 ++--
>  src/mesa/main/arrayobj.c |  35 
>  src/mesa/main/arrayobj.h |   4 +
>  src/mesa/main/bufferobj.c|  11 ++-
>  src/mesa/main/mtypes.h   |   2 -
>  src/mesa/main/varray.c   |  70 +++
>  src/mesa/main/varray.h   |   4 +-
>  src/mesa/state_tracker/st_cb_rasterpos.c |   2 -
>  src/mesa/vbo/vbo_context.c   |   2 -
>  src/mesa/vbo/vbo_exec_array.c| 141 ++-
>  src/mesa/vbo/vbo_exec_draw.c |   2 -
>  src/mesa/vbo/vbo_save_draw.c |   2 -
>  src/mesa/vbo/vbo_split_copy.c|   8 +-
>  13 files changed, 171 insertions(+), 128 deletions(-)
> 
> 



Re: [Mesa-dev] [PATCH 1/5] i965/surface_formats: Don't advertise 8 or 16-bit RGB formats

2016-07-26 Thread Ilia Mirkin
On Wed, Jul 27, 2016 at 1:04 AM, Jason Ekstrand  wrote:
> We have implicitly been not advertising these formats since we had them
> turned off in the format capabilities table.  We are about to update that
> table and this prevents a change in behavior.  The only change in behavior
> created by this patch is that we no longer advertise support for
> R16G16B16_FLOAT which means that it's now renderable which seems like a
> bonus.  Maybe someday we'll want to change things to start supporting
> 16-bit RGB formats natively but, at the moment, there's no need.
> ---
>  src/mesa/drivers/dri/i965/brw_surface_formats.c | 10 ++
>  1 file changed, 10 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_surface_formats.c b/src/mesa/drivers/dri/i965/brw_surface_formats.c
> index 2543f4b..69d3bd4 100644
> --- a/src/mesa/drivers/dri/i965/brw_surface_formats.c
> +++ b/src/mesa/drivers/dri/i965/brw_surface_formats.c
> @@ -311,6 +311,16 @@ brw_init_surface_formats(struct brw_context *brw)
>if (texture == 0 && format != MESA_FORMAT_RGBA_FLOAT32)
>  continue;
>
> +  /* Don't advertisel 8 and 16-bit RGB formats to core mesa.  This ensures

advertise

You might also want to exclude RGB32. It's required for texture
buffers (GL 4.0+), but not for renderable surfaces.

> +   * that they are renderable from an API perspective since core mesa will
> +   * fall back to RGBA or RGBX (we can't render to non-power-of-two
> +   * formats).  For 8-bit, formats, this also keeps us from hitting some
> +   * nasty corners in intel_miptree_map_blit if you ever try to map one.
> +   */
> +  int format_size = _mesa_get_format_bytes(format);
> +  if (format_size == 3 || format_size == 6)
> + continue;
> +
>if (isl_format_supports_sampling(devinfo, texture) &&
>(isl_format_supports_filtering(devinfo, texture) || is_integer))
>  ctx->TextureFormatSupported[format] = true;
> --
> 2.5.0.400.gff86faf
>
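
As an aside on the size test in the quoted hunk: three channels of 8, 16 or 32 bits occupy 3, 6 or 12 bytes, so the check could be extended as sketched below. is_three_channel_size() and its include_rgb32 flag are hypothetical; the sketch only shows the arithmetic and does not settle whether RGB32 should be skipped at this point in the loop, given that it is still needed for texture buffers.

```c
#include <stdbool.h>

/* Hypothetical helper: true for the byte sizes of packed three-channel
 * formats.  Like the patch, it uses the byte size as a proxy and does not
 * inspect the format's channel layout. */
static bool
is_three_channel_size(int format_bytes, bool include_rgb32)
{
   /* 3 x 8 bits = 3 bytes, 3 x 16 bits = 6 bytes */
   if (format_bytes == 3 || format_bytes == 6)
      return true;

   /* 3 x 32 bits = 12 bytes */
   return include_rgb32 && format_bytes == 12;
}
```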


Re: [Mesa-dev] [PATCH v2 15/35] i965/blorp: Add an isl_view to blorp_surface_info

2016-07-26 Thread Pohjolainen, Topi
On Tue, Jul 26, 2016 at 03:02:06PM -0700, Jason Ekstrand wrote:
> Eventually, this will be the actual view that gets passed into isl to
> create the surface state.  For now, we just use it for the format and the
> swizzle.
> ---
>  src/mesa/drivers/dri/i965/brw_blorp.c | 38 +++
>  src/mesa/drivers/dri/i965/brw_blorp.h | 16 ++-
>  src/mesa/drivers/dri/i965/brw_blorp_blit.cpp  | 34 
>  src/mesa/drivers/dri/i965/brw_blorp_clear.cpp |  2 +-
>  src/mesa/drivers/dri/i965/gen8_blorp.c| 29 
>  5 files changed, 64 insertions(+), 55 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c b/src/mesa/drivers/dri/i965/brw_blorp.c
> index 8f7690c..ef256a7 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp.c
> +++ b/src/mesa/drivers/dri/i965/brw_blorp.c
> @@ -43,9 +43,11 @@ brw_blorp_surface_info_init(struct brw_context *brw,
>  * using INTEL_MSAA_LAYOUT_UMS or INTEL_MSAA_LAYOUT_CMS, then it had better
>  * be a multiple of num_samples.
>  */
> +   unsigned layer_multiplier = 1;
> if (mt->msaa_layout == INTEL_MSAA_LAYOUT_UMS ||
> mt->msaa_layout == INTEL_MSAA_LAYOUT_CMS) {
>assert(mt->num_samples <= 1 || layer % mt->num_samples == 0);
> +  layer_multiplier = MAX2(mt->num_samples, 1);
> }
>  
> intel_miptree_check_level_layer(mt, level, layer);
> @@ -61,13 +63,27 @@ brw_blorp_surface_info_init(struct brw_context *brw,
>info->aux_usage = ISL_AUX_USAGE_NONE;
> }
>  
> +   info->view = (struct isl_view) {
> +  .usage = is_render_target ? ISL_SURF_USAGE_RENDER_TARGET_BIT :
> +  ISL_SURF_USAGE_TEXTURE_BIT,
> +  .format = ISL_FORMAT_UNSUPPORTED, /* Set later */
> +  .base_level = level,
> +  .levels = 1,
> +  .base_array_layer = layer / layer_multiplier,
> +  .array_len = 1,
> +  .channel_select = {
> + ISL_CHANNEL_SELECT_RED,
> + ISL_CHANNEL_SELECT_GREEN,
> + ISL_CHANNEL_SELECT_BLUE,
> + ISL_CHANNEL_SELECT_ALPHA,
> +  },
> +   };
> +
> info->level = level;
> info->layer = layer;
> info->width = minify(mt->physical_width0, level - mt->first_level);
> info->height = minify(mt->physical_height0, level - mt->first_level);
>  
> -   info->swizzle = SWIZZLE_XYZW;
> -
> if (format == MESA_FORMAT_NONE)
>format = mt->format;
>  
> @@ -75,8 +91,8 @@ brw_blorp_surface_info_init(struct brw_context *brw,
> case MESA_FORMAT_S_UINT8:
>assert(info->surf.tiling == ISL_TILING_W);
>/* Prior to Broadwell, we can't render to R8_UINT */
> -  info->brw_surfaceformat = brw->gen >= 8 ? BRW_SURFACEFORMAT_R8_UINT :
> -BRW_SURFACEFORMAT_R8_UNORM;
> +  info->view.format = brw->gen >= 8 ? BRW_SURFACEFORMAT_R8_UINT :
> +  BRW_SURFACEFORMAT_R8_UNORM;

Should we use ISL_FORMAT_ instead? Or at least the cast. Assigning an enum
with another always looks bad unless it is clear they happen to have exact
same values.

>break;
> case MESA_FORMAT_Z24_UNORM_X8_UINT:
>/* It would make sense to use BRW_SURFACEFORMAT_R24_UNORM_X8_TYPELESS
> @@ -89,20 +105,20 @@ brw_blorp_surface_info_init(struct brw_context *brw,
> * pattern as long as we copy the right amount of data, so just map it
> * as 8-bit BGRA.
> */
> -  info->brw_surfaceformat = BRW_SURFACEFORMAT_B8G8R8A8_UNORM;
> +  info->view.format = BRW_SURFACEFORMAT_B8G8R8A8_UNORM;
>break;
> case MESA_FORMAT_Z_FLOAT32:
> -  info->brw_surfaceformat = BRW_SURFACEFORMAT_R32_FLOAT;
> +  info->view.format = BRW_SURFACEFORMAT_R32_FLOAT;
>break;
> case MESA_FORMAT_Z_UNORM16:
> -  info->brw_surfaceformat = BRW_SURFACEFORMAT_R16_UNORM;
> +  info->view.format = BRW_SURFACEFORMAT_R16_UNORM;
>break;
> default: {
>if (is_render_target) {
>   assert(brw->format_supported_as_render_target[format]);
> - info->brw_surfaceformat = brw->render_target_format[format];
> + info->view.format = brw->render_target_format[format];

Perhaps use the cast such as you do further down in the patch:

info->view.format =
   (enum isl_format)brw->render_target_format[format];

>} else {
> - info->brw_surfaceformat = brw_format_for_mesa_format(format);
> + info->view.format = brw_format_for_mesa_format(format);
>}
>break;
> }
> @@ -111,7 +127,7 @@ brw_blorp_surface_info_init(struct brw_context *brw,
> uint32_t x_offset, y_offset;
> intel_miptree_get_image_offset(mt, level, layer, &x_offset, &y_offset);
>  
> -   uint8_t bs = isl_format_get_layout(info->brw_surfaceformat)->bpb / 8;
> +   uint8_t bs = isl_format_get_layout(info->view.format)->bpb / 8;
> isl_tiling_get_intratile_offset_el(&brw->isl_dev, info->surf.tiling, bs,
>

Re: [Mesa-dev] [PATCH v2 15/35] i965/blorp: Add an isl_view to blorp_surface_info

2016-07-26 Thread Pohjolainen, Topi
On Tue, Jul 26, 2016 at 03:02:06PM -0700, Jason Ekstrand wrote:
> Eventually, this will be the actual view that gets passed into isl to
> create the surface state.  For now, we just use it for the format and the
> swizzle.
> ---
>  src/mesa/drivers/dri/i965/brw_blorp.c | 38 +++
>  src/mesa/drivers/dri/i965/brw_blorp.h | 16 ++-
>  src/mesa/drivers/dri/i965/brw_blorp_blit.cpp  | 34 
>  src/mesa/drivers/dri/i965/brw_blorp_clear.cpp |  2 +-
>  src/mesa/drivers/dri/i965/gen8_blorp.c| 29 
>  5 files changed, 64 insertions(+), 55 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c b/src/mesa/drivers/dri/i965/brw_blorp.c
> index 8f7690c..ef256a7 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp.c
> +++ b/src/mesa/drivers/dri/i965/brw_blorp.c
> @@ -43,9 +43,11 @@ brw_blorp_surface_info_init(struct brw_context *brw,
>  * using INTEL_MSAA_LAYOUT_UMS or INTEL_MSAA_LAYOUT_CMS, then it had better
>  * be a multiple of num_samples.
>  */
> +   unsigned layer_multiplier = 1;

In principle we could just:

  const unsigned layer_multiplier = MAX2(mt->num_samples, 1);

> if (mt->msaa_layout == INTEL_MSAA_LAYOUT_UMS ||
> mt->msaa_layout == INTEL_MSAA_LAYOUT_CMS) {
>assert(mt->num_samples <= 1 || layer % mt->num_samples == 0);
> +  layer_multiplier = MAX2(mt->num_samples, 1);
> }
>  
> intel_miptree_check_level_layer(mt, level, layer);
> @@ -61,13 +63,27 @@ brw_blorp_surface_info_init(struct brw_context *brw,
>info->aux_usage = ISL_AUX_USAGE_NONE;
> }
>  
> +   info->view = (struct isl_view) {
> +  .usage = is_render_target ? ISL_SURF_USAGE_RENDER_TARGET_BIT :
> +  ISL_SURF_USAGE_TEXTURE_BIT,
> +  .format = ISL_FORMAT_UNSUPPORTED, /* Set later */
> +  .base_level = level,
> +  .levels = 1,
> +  .base_array_layer = layer / layer_multiplier,
> +  .array_len = 1,
> +  .channel_select = {
> + ISL_CHANNEL_SELECT_RED,
> + ISL_CHANNEL_SELECT_GREEN,
> + ISL_CHANNEL_SELECT_BLUE,
> + ISL_CHANNEL_SELECT_ALPHA,
> +  },
> +   };
> +
> info->level = level;
> info->layer = layer;
> info->width = minify(mt->physical_width0, level - mt->first_level);
> info->height = minify(mt->physical_height0, level - mt->first_level);
>  
> -   info->swizzle = SWIZZLE_XYZW;
> -
> if (format == MESA_FORMAT_NONE)
>format = mt->format;
>  
> @@ -75,8 +91,8 @@ brw_blorp_surface_info_init(struct brw_context *brw,
> case MESA_FORMAT_S_UINT8:
>assert(info->surf.tiling == ISL_TILING_W);
>/* Prior to Broadwell, we can't render to R8_UINT */
> -  info->brw_surfaceformat = brw->gen >= 8 ? BRW_SURFACEFORMAT_R8_UINT :
> -BRW_SURFACEFORMAT_R8_UNORM;
> +  info->view.format = brw->gen >= 8 ? BRW_SURFACEFORMAT_R8_UINT :
> +  BRW_SURFACEFORMAT_R8_UNORM;
>break;
> case MESA_FORMAT_Z24_UNORM_X8_UINT:
>/* It would make sense to use BRW_SURFACEFORMAT_R24_UNORM_X8_TYPELESS
> @@ -89,20 +105,20 @@ brw_blorp_surface_info_init(struct brw_context *brw,
> * pattern as long as we copy the right amount of data, so just map it
> * as 8-bit BGRA.
> */
> -  info->brw_surfaceformat = BRW_SURFACEFORMAT_B8G8R8A8_UNORM;
> +  info->view.format = BRW_SURFACEFORMAT_B8G8R8A8_UNORM;
>break;
> case MESA_FORMAT_Z_FLOAT32:
> -  info->brw_surfaceformat = BRW_SURFACEFORMAT_R32_FLOAT;
> +  info->view.format = BRW_SURFACEFORMAT_R32_FLOAT;
>break;
> case MESA_FORMAT_Z_UNORM16:
> -  info->brw_surfaceformat = BRW_SURFACEFORMAT_R16_UNORM;
> +  info->view.format = BRW_SURFACEFORMAT_R16_UNORM;
>break;
> default: {
>if (is_render_target) {
>   assert(brw->format_supported_as_render_target[format]);
> - info->brw_surfaceformat = brw->render_target_format[format];
> + info->view.format = brw->render_target_format[format];
>} else {
> - info->brw_surfaceformat = brw_format_for_mesa_format(format);
> + info->view.format = brw_format_for_mesa_format(format);
>}
>break;
> }
> @@ -111,7 +127,7 @@ brw_blorp_surface_info_init(struct brw_context *brw,
> uint32_t x_offset, y_offset;
> intel_miptree_get_image_offset(mt, level, layer, &x_offset, &y_offset);
>  
> -   uint8_t bs = isl_format_get_layout(info->brw_surfaceformat)->bpb / 8;
> +   uint8_t bs = isl_format_get_layout(info->view.format)->bpb / 8;
> isl_tiling_get_intratile_offset_el(&brw->isl_dev, info->surf.tiling, bs,
>info->surf.row_pitch, x_offset, y_offset,
>&info->bo_offset,
> @@ -287,7 +303,7 @@ brw_blorp_emit_surface_state(struct brw_context *brw,
> }
>  
> struct isl_view view 

[Mesa-dev] [PATCH 03/11] mesa: Implement _mesa_all_varyings_in_vbos.

2016-07-26 Thread Mathias . Froehlich
From: Mathias Fröhlich 

Implement the equivalent of vbo_all_varyings_in_vbos for
vertex array objects.

v2: Update comment.

Signed-off-by: Mathias Fröhlich 
---
 src/mesa/main/arrayobj.c | 35 +++
 src/mesa/main/arrayobj.h |  4 
 2 files changed, 39 insertions(+)

diff --git a/src/mesa/main/arrayobj.c b/src/mesa/main/arrayobj.c
index 9c3451e..becf32f 100644
--- a/src/mesa/main/arrayobj.c
+++ b/src/mesa/main/arrayobj.c
@@ -359,6 +359,41 @@ _mesa_update_vao_client_arrays(struct gl_context *ctx,
 }
 
 
+bool
+_mesa_all_varyings_in_vbos(const struct gl_vertex_array_object *vao)
+{
+   /* Walk those enabled arrays that have the default vbo attached */
+   GLbitfield64 mask = vao->_Enabled & ~vao->VertexAttribBufferMask;
+
+   while (mask) {
+  /* Do not use u_bit_scan64 as we can walk multiple
+   * attrib arrays at once
+   */
+  const int i = ffsll(mask) - 1;
+  const struct gl_vertex_attrib_array *attrib_array =
+ &vao->VertexAttrib[i];
+  const struct gl_vertex_buffer_binding *buffer_binding =
+ &vao->VertexBinding[attrib_array->VertexBinding];
+
+  /* Only enabled arrays shall appear in the _Enabled bitmask */
+  assert(attrib_array->Enabled);
+  /* We have already masked out vao->VertexAttribBufferMask  */
+  assert(!_mesa_is_bufferobj(buffer_binding->BufferObj));
+
+  /* Bail out once we find the first non vbo with a non zero stride */
+  if (buffer_binding->Stride != 0)
+ return false;
+
+  /* Note that we cannot use the xor variant since the _BoundArray mask
+   * may contain array attributes that are bound but not enabled.
+   */
+  mask &= ~buffer_binding->_BoundArrays;
+   }
+
+   return true;
+}
+
+
 /**/
 /* API Functions  */
 /**/
diff --git a/src/mesa/main/arrayobj.h b/src/mesa/main/arrayobj.h
index 6a4247f..d30c85c 100644
--- a/src/mesa/main/arrayobj.h
+++ b/src/mesa/main/arrayobj.h
@@ -81,6 +81,10 @@ extern void
 _mesa_update_vao_client_arrays(struct gl_context *ctx,
struct gl_vertex_array_object *vao);
 
+/* Returns true if all varying arrays reside in vbos */
+extern bool
+_mesa_all_varyings_in_vbos(const struct gl_vertex_array_object *vao);
+
 /*
  * API functions
  */
-- 
2.5.5



Re: [Mesa-dev] [PATCH v2 26/27] i965/blorp: brw_blorp_blit.cpp -> blorp_blit.c

2016-07-26 Thread Matt Turner
On Tue, Jul 26, 2016 at 3:11 PM, Jason Ekstrand  wrote:
> ---
>  src/mesa/drivers/dri/i965/Makefile.sources   |2 +-
>  src/mesa/drivers/dri/i965/blorp_blit.c   | 1662 ++
>  src/mesa/drivers/dri/i965/brw_blorp_blit.cpp | 1662 --
>  3 files changed, 1663 insertions(+), 1663 deletions(-)
>  create mode 100644 src/mesa/drivers/dri/i965/blorp_blit.c
>  delete mode 100644 src/mesa/drivers/dri/i965/brw_blorp_blit.cpp

Same comment about the other rename patches: what's the point of
dropping the "brw_"?
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 13/13] glsl: free hash tables earlier

2016-07-26 Thread Timothy Arceri
These are only used by get_matching_input(), which has already been
called by this point, so free the hash tables here.
---
 src/compiler/glsl/link_varyings.cpp | 10 +++---
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/src/compiler/glsl/link_varyings.cpp b/src/compiler/glsl/link_varyings.cpp
index d48c680..91d8974 100644
--- a/src/compiler/glsl/link_varyings.cpp
+++ b/src/compiler/glsl/link_varyings.cpp
@@ -2156,6 +2156,9 @@ assign_varying_locations(struct gl_context *ctx,
   }
}
 
+   hash_table_dtor(consumer_inputs);
+   hash_table_dtor(consumer_interface_inputs);
+
for (unsigned i = 0; i < num_tfeedback_decls; ++i) {
   if (!tfeedback_decls[i].is_varying())
  continue;
@@ -2165,8 +2168,6 @@ assign_varying_locations(struct gl_context *ctx,
 
   if (matched_candidate == NULL) {
  hash_table_dtor(tfeedback_candidates);
- hash_table_dtor(consumer_inputs);
- hash_table_dtor(consumer_interface_inputs);
  return false;
   }
 
@@ -2185,15 +2186,10 @@ assign_varying_locations(struct gl_context *ctx,
 
   if (!tfeedback_decls[i].assign_location(ctx, prog)) {
  hash_table_dtor(tfeedback_candidates);
- hash_table_dtor(consumer_inputs);
- hash_table_dtor(consumer_interface_inputs);
  return false;
   }
}
-
hash_table_dtor(tfeedback_candidates);
-   hash_table_dtor(consumer_inputs);
-   hash_table_dtor(consumer_interface_inputs);
 
if (consumer && producer) {
   foreach_in_list(ir_instruction, node, consumer->ir) {
-- 
2.7.4



[Mesa-dev] [PATCH 07/13] glsl: disable dead code removal of lowered ubos

2016-07-26 Thread Timothy Arceri
This lets us assign uniform storage for packed UBOs after
they have been lowered; otherwise the var is removed too early.
---
 src/compiler/glsl/glsl_parser_extras.cpp   | 5 +++--
 src/compiler/glsl/ir_optimization.h| 4 +++-
 src/compiler/glsl/link_varyings.cpp| 2 +-
 src/compiler/glsl/linker.cpp   | 1 +
 src/compiler/glsl/opt_dead_code.cpp| 8 +---
 src/compiler/glsl/test_optpass.cpp | 5 +++--
 src/mesa/drivers/dri/i965/brw_link.cpp | 2 +-
 src/mesa/main/ff_fragment_shader.cpp   | 2 +-
 src/mesa/program/ir_to_mesa.cpp| 2 +-
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 2 +-
 10 files changed, 20 insertions(+), 13 deletions(-)

diff --git a/src/compiler/glsl/glsl_parser_extras.cpp 
b/src/compiler/glsl/glsl_parser_extras.cpp
index e702291..7842020 100644
--- a/src/compiler/glsl/glsl_parser_extras.cpp
+++ b/src/compiler/glsl/glsl_parser_extras.cpp
@@ -1879,7 +1879,7 @@ _mesa_glsl_compile_shader(struct gl_context *ctx, struct 
gl_shader *shader,
   /* Do some optimization at compile time to reduce shader IR size
* and reduce later work if the same shader is linked multiple times
*/
-  while (do_common_optimization(shader->ir, false, false, options,
+  while (do_common_optimization(shader->ir, false, false, false, options,
 ctx->Const.NativeIntegers))
  ;
 
@@ -1977,6 +1977,7 @@ _mesa_glsl_compile_shader(struct gl_context *ctx, struct 
gl_shader *shader,
 bool
 do_common_optimization(exec_list *ir, bool linked,
   bool uniform_locations_assigned,
+   bool ubos_lowered,
const struct gl_shader_compiler_options *options,
bool native_integers)
 {
@@ -2019,7 +2020,7 @@ do_common_optimization(exec_list *ir, bool linked,
}
 
if (linked)
-  OPT(do_dead_code, ir, uniform_locations_assigned);
+  OPT(do_dead_code, ir, uniform_locations_assigned, ubos_lowered);
else
   OPT(do_dead_code_unlinked, ir);
OPT(do_dead_code_local, ir);
diff --git a/src/compiler/glsl/ir_optimization.h 
b/src/compiler/glsl/ir_optimization.h
index c29260a..d129210 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -77,6 +77,7 @@ enum lower_packing_builtins_op {
 
 bool do_common_optimization(exec_list *ir, bool linked,
bool uniform_locations_assigned,
+bool ubos_lowered,
 const struct gl_shader_compiler_options *options,
 bool native_integers);
 
@@ -97,7 +98,8 @@ void do_dead_builtin_varyings(struct gl_context *ctx,
   gl_linked_shader *consumer,
   unsigned num_tfeedback_decls,
   class tfeedback_decl *tfeedback_decls);
-bool do_dead_code(exec_list *instructions, bool uniform_locations_assigned);
+bool do_dead_code(exec_list *instructions, bool uniform_locations_assigned,
+  bool ubos_lowered);
 bool do_dead_code_local(exec_list *instructions);
 bool do_dead_code_unlinked(exec_list *instructions);
 bool do_dead_functions(exec_list *instructions);
diff --git a/src/compiler/glsl/link_varyings.cpp 
b/src/compiler/glsl/link_varyings.cpp
index f6778b6..d48c680 100644
--- a/src/compiler/glsl/link_varyings.cpp
+++ b/src/compiler/glsl/link_varyings.cpp
@@ -582,7 +582,7 @@ remove_unused_shader_inputs_and_outputs(bool 
is_separate_shader_object,
/* Eliminate code that is now dead due to unused inputs/outputs being
 * demoted.
 */
-   while (do_dead_code(sh->ir, false))
+   while (do_dead_code(sh->ir, false, true))
   ;
 
 }
diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index 61f6c42..ba61d39 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -4992,6 +4992,7 @@ link_shaders(struct gl_context *ctx, struct 
gl_shader_program *prog)
   }
 
   while (do_common_optimization(prog->_LinkedShaders[i]->ir, true, false,
+false,
 &ctx->Const.ShaderCompilerOptions[i],
 ctx->Const.NativeIntegers))
 ;
diff --git a/src/compiler/glsl/opt_dead_code.cpp 
b/src/compiler/glsl/opt_dead_code.cpp
index 75e668a..980660e 100644
--- a/src/compiler/glsl/opt_dead_code.cpp
+++ b/src/compiler/glsl/opt_dead_code.cpp
@@ -43,7 +43,8 @@ static bool debug = false;
  * for usage on an unlinked instruction stream.
  */
 bool
-do_dead_code(exec_list *instructions, bool uniform_locations_assigned)
+do_dead_code(exec_list *instructions, bool uniform_locations_assigned,
+ bool ubos_lowered)
 {
ir_variable_refcount_visitor v;
bool progress = false;
@@ -144,7 +145,8 @@ do_dead_code(exec_list *instructions, bool 
uniform_locations_assigned)
  * layouts, do not eliminate it.

[Mesa-dev] [PATCH 10/13] i965: move common optimisation loop to a helper

2016-07-26 Thread Timothy Arceri
---
 src/mesa/drivers/dri/i965/brw_link.cpp | 50 --
 1 file changed, 30 insertions(+), 20 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_link.cpp 
b/src/mesa/drivers/dri/i965/brw_link.cpp
index efd67e7..e56df93 100644
--- a/src/mesa/drivers/dri/i965/brw_link.cpp
+++ b/src/mesa/drivers/dri/i965/brw_link.cpp
@@ -86,6 +86,35 @@ brw_lower_packing_builtins(struct brw_context *brw,
 }
 
 static void
+brw_common_opts(struct gl_linked_shader *shader, struct gl_context *ctx,
+bool uniform_locs_assigned,
+const struct brw_compiler *compiler,
+const struct gl_shader_compiler_options *options)
+{
+   bool progress;
+   do {
+  progress = false;
+
+  if (compiler->scalar_stage[shader->Stage]) {
+ if (shader->Stage == MESA_SHADER_VERTEX ||
+ shader->Stage == MESA_SHADER_FRAGMENT)
+brw_do_channel_expressions(shader->ir);
+ brw_do_vector_splitting(shader->ir);
+  }
+
+  progress = do_lower_jumps(shader->ir, true, true,
+true, /* main return */
+false, /* continue */
+false /* loops */
+) || progress;
+
+  progress = do_common_optimization(shader->ir, true,
+uniform_locs_assigned, true, options,
+ctx->Const.NativeIntegers) || progress;
+   } while (progress);
+}
+
+static void
 process_glsl_ir(struct brw_context *brw,
 struct gl_shader_program *shader_prog,
 struct gl_linked_shader *shader)
@@ -149,26 +178,7 @@ process_glsl_ir(struct brw_context *brw,
  _mesa_shader_stage_to_abbrev(shader->Stage));
}
 
-   bool progress;
-   do {
-  progress = false;
-
-  if (compiler->scalar_stage[shader->Stage]) {
- if (shader->Stage == MESA_SHADER_VERTEX ||
- shader->Stage == MESA_SHADER_FRAGMENT)
-brw_do_channel_expressions(shader->ir);
- brw_do_vector_splitting(shader->ir);
-  }
-
-  progress = do_lower_jumps(shader->ir, true, true,
-true, /* main return */
-false, /* continue */
-false /* loops */
-) || progress;
-
-  progress = do_common_optimization(shader->ir, true, true, true,
-options, ctx->Const.NativeIntegers) || 
progress;
-   } while (progress);
+   brw_common_opts(shader, ctx, false, compiler, options);
 
validate_ir_tree(shader->ir);
 
-- 
2.7.4
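
The loop that this patch moves into brw_common_opts() is a run-until-no-progress
iteration. A minimal sketch of that pattern follows, with made-up "passes" that
operate on a plain integer rather than on GLSL IR; strip_factor() and
optimize_to_fixed_point() are illustrative names only.

```c
#include <stdbool.h>

/* Hypothetical "pass": simplifies the value a little and reports
 * whether it changed anything. */
static bool strip_factor(int *value, int factor)
{
   if (*value != 0 && *value % factor == 0) {
      *value /= factor;
      return true;
   }
   return false;
}

/* Run every pass repeatedly until a full round makes no progress.
 * Returns the number of rounds executed. */
int optimize_to_fixed_point(int *value)
{
   int rounds = 0;
   bool progress;
   do {
      progress = false;

      /* The pass call comes first in each expression so that `||`
       * can never short-circuit it away. */
      progress = strip_factor(value, 2) || progress;
      progress = strip_factor(value, 3) || progress;

      rounds++;
   } while (progress);
   return rounds;
}
```

Writing `progress = pass() || progress` rather than `progress || pass()` is what
guarantees that every pass runs in every round, which matches the operand order
used in the patch.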



[Mesa-dev] [PATCH 04/13] glsl: remove remaining tabs in link_uniform_initializers.cpp

2016-07-26 Thread Timothy Arceri
---
 src/compiler/glsl/link_uniform_initializers.cpp | 78 -
 1 file changed, 39 insertions(+), 39 deletions(-)

diff --git a/src/compiler/glsl/link_uniform_initializers.cpp 
b/src/compiler/glsl/link_uniform_initializers.cpp
index 3750021..021e950 100644
--- a/src/compiler/glsl/link_uniform_initializers.cpp
+++ b/src/compiler/glsl/link_uniform_initializers.cpp
@@ -46,30 +46,30 @@ get_storage(struct gl_shader_program *prog, const char 
*name)
 
 void
 copy_constant_to_storage(union gl_constant_value *storage,
-const ir_constant *val,
-const enum glsl_base_type base_type,
+ const ir_constant *val,
+ const enum glsl_base_type base_type,
  const unsigned int elements,
  unsigned int boolean_true)
 {
for (unsigned int i = 0; i < elements; i++) {
   switch (base_type) {
   case GLSL_TYPE_UINT:
-storage[i].u = val->value.u[i];
-break;
+ storage[i].u = val->value.u[i];
+ break;
   case GLSL_TYPE_INT:
   case GLSL_TYPE_SAMPLER:
-storage[i].i = val->value.i[i];
-break;
+ storage[i].i = val->value.i[i];
+ break;
   case GLSL_TYPE_FLOAT:
-storage[i].f = val->value.f[i];
-break;
+ storage[i].f = val->value.f[i];
+ break;
   case GLSL_TYPE_DOUBLE:
  /* XXX need to check on big-endian */
  memcpy(&storage[i * 2].u, &val->value.d[i], sizeof(double));
  break;
   case GLSL_TYPE_BOOL:
-storage[i].b = val->value.b[i] ? boolean_true : 0;
-break;
+ storage[i].b = val->value.b[i] ? boolean_true : 0;
+ break;
   case GLSL_TYPE_ARRAY:
   case GLSL_TYPE_STRUCT:
   case GLSL_TYPE_IMAGE:
@@ -79,11 +79,11 @@ copy_constant_to_storage(union gl_constant_value *storage,
   case GLSL_TYPE_SUBROUTINE:
   case GLSL_TYPE_FUNCTION:
   case GLSL_TYPE_ERROR:
-/* All other types should have already been filtered by other
- * paths in the caller.
- */
-assert(!"Should not get here.");
-break;
+ /* All other types should have already been filtered by other
+  * paths in the caller.
+  */
+ assert(!"Should not get here.");
+ break;
   }
}
 }
@@ -102,9 +102,9 @@ set_opaque_binding(void *mem_ctx, gl_shader_program *prog,
   const glsl_type *const element_type = type->fields.array;
 
   for (unsigned int i = 0; i < type->length; i++) {
-const char *element_name = ralloc_asprintf(mem_ctx, "%s[%d]", name, i);
+ const char *element_name = ralloc_asprintf(mem_ctx, "%s[%d]", name, 
i);
 
-set_opaque_binding(mem_ctx, prog, element_type,
+ set_opaque_binding(mem_ctx, prog, element_type,
 element_name, binding);
   }
} else {
@@ -172,7 +172,7 @@ set_block_binding(gl_shader_program *prog, const char 
*block_name,
 
 void
 set_uniform_initializer(void *mem_ctx, gl_shader_program *prog,
-   const char *name, const glsl_type *type,
+const char *name, const glsl_type *type,
 ir_constant *val, unsigned int boolean_true)
 {
const glsl_type *t_without_array = type->without_array();
@@ -182,12 +182,12 @@ set_uniform_initializer(void *mem_ctx, gl_shader_program 
*prog,
   field_constant = (ir_constant *)val->components.get_head();
 
   for (unsigned int i = 0; i < type->length; i++) {
-const glsl_type *field_type = type->fields.structure[i].type;
-const char *field_name = ralloc_asprintf(mem_ctx, "%s.%s", name,
-   type->fields.structure[i].name);
-set_uniform_initializer(mem_ctx, prog, field_name,
+ const glsl_type *field_type = type->fields.structure[i].type;
+ const char *field_name = ralloc_asprintf(mem_ctx, "%s.%s", name,
+type->fields.structure[i].name);
+ set_uniform_initializer(mem_ctx, prog, field_name,
  field_type, field_constant, boolean_true);
-field_constant = (ir_constant *)field_constant->next;
+ field_constant = (ir_constant *)field_constant->next;
   }
   return;
} else if (t_without_array->is_record() ||
@@ -195,9 +195,9 @@ set_uniform_initializer(void *mem_ctx, gl_shader_program 
*prog,
   const glsl_type *const element_type = type->fields.array;
 
   for (unsigned int i = 0; i < type->length; i++) {
-const char *element_name = ralloc_asprintf(mem_ctx, "%s[%d]", name, i);
+ const char *element_name = ralloc_asprintf(mem_ctx, "%s[%d]", name, 
i);
 
-set_uniform_initializer(mem_ctx, prog, element_name,
+ set_uniform_initializer(mem_ctx, prog, element_name,
  element_type, val->array_elements[i],
  

[Mesa-dev] [PATCH 11/13] mesa/i965: create Driver.ProcessGLSLIR()

2016-07-26 Thread Timothy Arceri
This allows us to do backend specific processing on GLSL IR from
the shared linker.
---
 src/mesa/drivers/dri/i965/brw_link.cpp  | 12 ++--
 src/mesa/drivers/dri/i965/brw_program.c |  1 +
 src/mesa/drivers/dri/i965/brw_shader.h  |  4 
 src/mesa/main/dd.h  |  3 +++
 4 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_link.cpp 
b/src/mesa/drivers/dri/i965/brw_link.cpp
index e56df93..244c8f0 100644
--- a/src/mesa/drivers/dri/i965/brw_link.cpp
+++ b/src/mesa/drivers/dri/i965/brw_link.cpp
@@ -114,12 +114,12 @@ brw_common_opts(struct gl_linked_shader *shader, struct 
gl_context *ctx,
} while (progress);
 }
 
-static void
-process_glsl_ir(struct brw_context *brw,
-struct gl_shader_program *shader_prog,
-struct gl_linked_shader *shader)
+extern "C" void
+brw_process_glsl_ir(struct gl_context *ctx,
+struct gl_shader_program *shader_prog,
+struct gl_linked_shader *shader)
 {
-   struct gl_context *ctx = &brw->ctx;
+   struct brw_context *brw = brw_context(ctx);
const struct brw_compiler *compiler = brw->intelScreen->compiler;
const struct gl_shader_compiler_options *options =
   &ctx->Const.ShaderCompilerOptions[shader->Stage];
@@ -233,7 +233,7 @@ brw_link_shader(struct gl_context *ctx, struct 
gl_shader_program *shProg)
 
   _mesa_copy_linked_program_data((gl_shader_stage) stage, shProg, prog);
 
-  process_glsl_ir(brw, shProg, shader);
+  brw_process_glsl_ir(ctx, shProg, shader);
 
   /* Make a pass over the IR to add state references for any built-in
* uniforms that are used.  This has to be done now (during linking).
diff --git a/src/mesa/drivers/dri/i965/brw_program.c 
b/src/mesa/drivers/dri/i965/brw_program.c
index 7785490..559bb4d 100644
--- a/src/mesa/drivers/dri/i965/brw_program.c
+++ b/src/mesa/drivers/dri/i965/brw_program.c
@@ -377,6 +377,7 @@ void brwInitFragProgFuncs( struct dd_function_table 
*functions )
 
functions->NewShader = brw_new_shader;
functions->LinkShader = brw_link_shader;
+   functions->ProcessGLSLIR = brw_process_glsl_ir;
 
functions->MemoryBarrier = brw_memory_barrier;
 }
diff --git a/src/mesa/drivers/dri/i965/brw_shader.h 
b/src/mesa/drivers/dri/i965/brw_shader.h
index e61c080..65acc30 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.h
+++ b/src/mesa/drivers/dri/i965/brw_shader.h
@@ -290,6 +290,10 @@ bool brw_cs_precompile(struct gl_context *ctx,
 
 GLboolean brw_link_shader(struct gl_context *ctx, struct gl_shader_program 
*prog);
 struct gl_linked_shader *brw_new_shader(gl_shader_stage stage);
+void
+brw_process_glsl_ir(struct gl_context *ctx,
+struct gl_shader_program *shader_prog,
+struct gl_linked_shader *shader);
 
 int type_size_scalar(const struct glsl_type *type);
 int type_size_vec4(const struct glsl_type *type);
diff --git a/src/mesa/main/dd.h b/src/mesa/main/dd.h
index 114cbd2..3f9ebdf 100644
--- a/src/mesa/main/dd.h
+++ b/src/mesa/main/dd.h
@@ -786,6 +786,9 @@ struct dd_function_table {
/*@{*/
struct gl_linked_shader *(*NewShader)(gl_shader_stage stage);
void (*UseProgram)(struct gl_context *ctx, struct gl_shader_program 
*shProg);
+   void (*ProcessGLSLIR)(struct gl_context *ctx,
+ struct gl_shader_program *shader_prog,
+ struct gl_linked_shader *shader);
/*@}*/
 
/**
-- 
2.7.4
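
A driver hook of this kind is a function pointer in a table that shared code
calls through. The sketch below uses a cut-down, hypothetical table
(driver_functions, ProcessIR) rather than the real dd_function_table, and the
NULL check in shared_link() is an assumption: the call site is not part of this
patch.

```c
#include <stddef.h>

struct shader { int passes_run; };

/* Hypothetical, cut-down driver table; the real dd_function_table
 * has many more entries. */
struct driver_functions {
   void (*ProcessIR)(struct shader *sh);   /* optional backend hook */
};

static void backend_process_ir(struct shader *sh)
{
   sh->passes_run++;
}

/* Driver side: install the backend's implementation. */
void init_backend(struct driver_functions *functions)
{
   functions->ProcessIR = backend_process_ir;
}

/* Shared side: call the hook only if the driver installed one. */
void shared_link(const struct driver_functions *functions, struct shader *sh)
{
   if (functions->ProcessIR != NULL)
      functions->ProcessIR(sh);
}
```

Drivers that never set the pointer keep their current behaviour, which is why
the hook can be added to the table without touching the other backends.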



Re: [Mesa-dev] [PATCH v2 13/35] i965/blorp: Refactor interleaved multisample destination handling

2016-07-26 Thread Pohjolainen, Topi
On Tue, Jul 26, 2016 at 03:02:04PM -0700, Jason Ekstrand wrote:
> We put all of the code for fake IMS together.  This requires moving a bit
> of the program key setup code further down so that it gets the right values
> out of the final surface.
> 
> Reviewed-by: Topi Pohjolainen 
> ---
>  src/mesa/drivers/dri/i965/brw_blorp_blit.cpp | 71 
> +---
>  1 file changed, 34 insertions(+), 37 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp 
> b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> index c337a86..03e4984 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> @@ -1698,28 +1698,6 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
>unreachable("Unrecognized blorp format");
> }
>  
> -   if (brw->gen > 6) {
> -  /* Gen7's rendering hardware only supports the IMS layout for depth and
> -   * stencil render targets.  Blorp always maps its destination surface 
> as
> -   * a color render target (even if it's actually a depth or stencil
> -   * buffer).  So if the destination is IMS, we'll have to map it as a
> -   * single-sampled texture and interleave the samples ourselves.
> -   */
> -  if (dst_mt->msaa_layout == INTEL_MSAA_LAYOUT_IMS) {
> - params.dst.surf.samples = 1;
> - params.dst.surf.msaa_layout = ISL_MSAA_LAYOUT_NONE;
> -  }
> -   }
> -
> -   if (params.src.surf.samples > 0 && params.dst.surf.samples > 1) {
> -  /* We are blitting from a multisample buffer to a multisample buffer, 
> so
> -   * we must preserve samples within a pixel.  This means we have to
> -   * arrange for the WM program to run once per sample rather than once
> -   * per pixel.
> -   */
> -  wm_prog_key.persample_msaa_dispatch = true;
> -   }
> -
> /* Scaled blitting or not. */
> wm_prog_key.blit_scaled =
>((dst_x1 - dst_x0) == (src_x1 - src_x0) &&
> @@ -1759,20 +1737,8 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
> wm_prog_key.src_samples = src_mt->num_samples;
> wm_prog_key.dst_samples = dst_mt->num_samples;
>  
> -   /* tex_samples and rt_samples are the sample counts that are set up in
> -* SURFACE_STATE.
> -*/
> -   wm_prog_key.tex_samples = params.src.surf.samples;
> -   wm_prog_key.rt_samples  = params.dst.surf.samples;
> -
> wm_prog_key.tex_aux_usage = params.src.aux_usage;
>  
> -   /* tex_layout and rt_layout indicate the MSAA layout the GPU pipeline will
> -* use to access the source and destination surfaces.
> -*/
> -   wm_prog_key.tex_layout = params.src.surf.msaa_layout;
> -   wm_prog_key.rt_layout = params.dst.surf.msaa_layout;
> -
> /* src_layout and dst_layout indicate the true MSAA layout used by src and
>  * dst.
>  */
> @@ -1809,7 +1775,7 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
>params.wm_inputs.src_z = 0;
> }
>  
> -   if (params.dst.surf.samples <= 1 && dst_mt->num_samples > 1) {
> +   if (brw->gen > 6 && dst_mt->msaa_layout == INTEL_MSAA_LAYOUT_IMS) {
>/* We must expand the rectangle we send through the rendering pipeline,
> * to account for the fact that we are mapping the destination region 
> as
> * single-sampled when it is in fact multisampled.  We must also align
> @@ -1822,8 +1788,8 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
> * If it's UMS, then we have no choice but to set up the rendering
> * pipeline as multisampled.
> */
> -  assert(dst_mt->msaa_layout == INTEL_MSAA_LAYOUT_IMS);
> -  switch (dst_mt->num_samples) {
> +  assert(params.dst.surf.msaa_layout = ISL_MSAA_LAYOUT_INTERLEAVED);

==

> +  switch (params.dst.surf.samples) {
>case 2:
>   params.x0 = ROUND_DOWN_TO(params.x0 * 2, 4);
>   params.y0 = ROUND_DOWN_TO(params.y0, 4);
> @@ -1851,6 +1817,16 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
>default:
>   unreachable("Unrecognized sample count in brw_blorp_blit_params 
> ctor");
>}
> +
> +  /* Gen7's rendering hardware only supports the IMS layout for depth and
> +   * stencil render targets.  Blorp always maps its destination surface 
> as
> +   * a color render target (even if it's actually a depth or stencil
> +   * buffer).  So if the destination is IMS, we'll have to map it as a
> +   * single-sampled texture and interleave the samples ourselves.
> +   */
> +  params.dst.surf.samples = 1;
> +  params.dst.surf.msaa_layout = ISL_MSAA_LAYOUT_NONE;
> +
>wm_prog_key.use_kill = true;
> }
>  
> @@ -1952,6 +1928,27 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
>params.src.y_offset /= 2;
> }
>  
> +   /* tex_samples and rt_samples are the sample counts that are set up in
> +* SURFACE_STATE.
> +*/
> +   wm_prog_key.tex_samples = 

Re: [Mesa-dev] Call i965 GLSL IR backend optimisation from the common linker

2016-07-26 Thread Matt Turner
On Tue, Jul 26, 2016 at 10:20 PM, Timothy Arceri
 wrote:
> The ultimate goal is to be able to convert to NIR and make use of its
> optimisations before assigning varying and uniform locations. This
> should allow us to start removing some of the GLSL IR optimisation
> passes.

I'm very excited about this!

> This series falls short of making use of NIR because lower_packed_varyings()
> modifies the IR after we assign varying locations. I can see two ways
> around this; listed in increasing order of difficulty, they would be:
>
> - replacing the current packing pass with one that follows the packing
> rules of ARB_enhanced_layouts. This would mean we can no longer pack
> across slots, and matrix and array packing effectiveness would be slightly
> decreased.
> - write a NIR packing pass.

Specifically a NIR implementation of lower_packed_varyings(), right?

>
> Even without converting to NIR this series solves a number of the other
> problems with converting to NIR earlier and provides a nice shader-db
> improvement on its own.
>
> Broadwell shader-db results:
>
> total instructions in shared programs: 8651650 -> 8644415 (-0.08%)
> instructions in affected programs: 38754 -> 31519 (-18.67%)
> total loops in shared programs:2085 -> 2085 (0.00%)
> helped:320
> HURT:  0
> GAINED:0

Impressive.

> Ivybridge reported no difference.

I suspect that's because Ivybridge's vertex shader is vec4, and we
don't dead code eliminate individual *components* of varyings, whereas
on Broadwell with scalar vertex shaders we're able to eliminate those
dead components.

Thanks so much for working on this!
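
For reference, the percentages in the quoted shader-db output are plain
relative changes. A small sketch of the arithmetic follows; percent_change() is
an illustrative helper, not part of shader-db.

```c
/* Relative change from `before` to `after`, in percent. */
double percent_change(long before, long after)
{
   return 100.0 * (double)(after - before) / (double)before;
}
```

Both quoted lines describe the same absolute saving of 7235 instructions
(8651650 - 8644415 and 38754 - 31519); only the denominator differs, which is
why the total moves by about 0.08% while the affected programs shrink by about
18.67%.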


Re: [Mesa-dev] [PATCH] anv/pipeline: Enable only one dispatch width in case of per sample shading

2016-07-26 Thread Jason Ekstrand
On Jul 26, 2016 12:54 PM, "Anuj Phogat"  wrote:
>
> Fixes ~45 DEQP sample shading tests:
> ./deqp-vk --deqp-case=dEQP-VK.pipeline.multisample.min_sample_shading*
>
> Many tests exited with VK_ERROR_OUT_OF_DEVICE_MEMORY without this patch.
>
> Cc: Jason Ekstrand 
> Signed-off-by: Anuj Phogat 
>
> ---
> Another patch enabling the sample shading is required to test this patch.
> I'll send out the enabling patch once we pass all the sample shading
tests.
> Use https://github.com/aphogat/mesa, branch: review to test the patch.
> ---
>  src/intel/vulkan/gen7_pipeline.c |  9 -
>  src/intel/vulkan/gen8_pipeline.c | 12 
>  2 files changed, 16 insertions(+), 5 deletions(-)
>
> diff --git a/src/intel/vulkan/gen7_pipeline.c
b/src/intel/vulkan/gen7_pipeline.c
> index 8ce50be..23535f5 100644
> --- a/src/intel/vulkan/gen7_pipeline.c
> +++ b/src/intel/vulkan/gen7_pipeline.c
> @@ -249,6 +249,8 @@ genX(graphics_pipeline_create)(
>   anv_finishme("primitive_id needs sbe swizzling setup");
>
>emit_3dstate_sbe(pipeline);
> +  bool per_sample_ps = pCreateInfo->pMultisampleState &&
> +
 pCreateInfo->pMultisampleState->sampleShadingEnable;
>
>anv_batch_emit(&pipeline->batch, GENX(3DSTATE_PS), ps) {
>   ps.KernelStartPointer0   = pipeline->ps_ksp0;
> @@ -274,7 +276,12 @@ genX(graphics_pipeline_create)(
>
>   ps._32PixelDispatchEnable= false;
>   ps._16PixelDispatchEnable= wm_prog_data->dispatch_16;
> - ps._8PixelDispatchEnable = wm_prog_data->dispatch_8;
> + /* On all hardware generations, the only configurations
supporting
> +  * persample dispatch are in which only one dispatch width is
enabled.
> +  */
> + ps._8PixelDispatchEnable = wm_prog_data->dispatch_8 &&
> +(!per_sample_ps ||
> + !wm_prog_data->dispatch_16);

I don't think we need to do this.  brw_compile_fs in brw_fs.cpp should
handle this for us based on the shader key.  We should be able to just set
the shader key bits correctly and then trust brw_compile_fs to give us only
one dispatch width.

--Jason

>
>   ps.DispatchGRFStartRegisterforConstantSetupData0 =
>  wm_prog_data->base.dispatch_grf_start_reg,
> diff --git a/src/intel/vulkan/gen8_pipeline.c
b/src/intel/vulkan/gen8_pipeline.c
> index cc10d3a..bde7660 100644
> --- a/src/intel/vulkan/gen8_pipeline.c
> +++ b/src/intel/vulkan/gen8_pipeline.c
> @@ -333,12 +333,19 @@ genX(graphics_pipeline_create)(
>}
> } else {
>emit_3dstate_sbe(pipeline);
> +  bool per_sample_ps = pCreateInfo->pMultisampleState &&
> +
 pCreateInfo->pMultisampleState->sampleShadingEnable;
>
>anv_batch_emit(&pipeline->batch, GENX(3DSTATE_PS), ps) {
>   ps.KernelStartPointer0 = pipeline->ps_ksp0;
>   ps.KernelStartPointer1 = 0;
>   ps.KernelStartPointer2 = pipeline->ps_ksp0 +
wm_prog_data->prog_offset_2;
> - ps._8PixelDispatchEnable   = wm_prog_data->dispatch_8;
> + /* On all hardware generations, the only configurations
supporting
> +  * persample dispatch are in which only one dispatch width is
enabled.
> +  */
> + ps._8PixelDispatchEnable   = wm_prog_data->dispatch_8 &&
> +  (!per_sample_ps ||
> +   !wm_prog_data->dispatch_16);
>   ps._16PixelDispatchEnable  = wm_prog_data->dispatch_16;
>   ps._32PixelDispatchEnable  = false;
>   ps.SingleProgramFlow   = false;
> @@ -365,9 +372,6 @@ genX(graphics_pipeline_create)(
>  wm_prog_data->dispatch_grf_start_reg_2;
>}
>
> -  bool per_sample_ps = pCreateInfo->pMultisampleState &&
> -
 pCreateInfo->pMultisampleState->sampleShadingEnable;
> -
>anv_batch_emit(&pipeline->batch, GENX(3DSTATE_PS_EXTRA), ps) {
>   ps.PixelShaderValid  = true;
>   ps.PixelShaderKillsPixel = wm_prog_data->uses_kill;
> --
> 2.5.5
>
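
The condition in the quoted patch can be read on its own. The sketch below
copies just that boolean expression into a standalone function; the struct and
the function name are hypothetical and nothing here touches real 3DSTATE_PS
packing.

```c
#include <stdbool.h>

struct dispatch_enables {
   bool simd8;
   bool simd16;
};

/* Mirrors the expression in the quoted patch: with per-sample shading,
 * SIMD8 is dropped whenever SIMD16 is also available. */
struct dispatch_enables
choose_dispatch(bool dispatch_8, bool dispatch_16, bool per_sample_ps)
{
   struct dispatch_enables e;
   e.simd16 = dispatch_16;
   e.simd8 = dispatch_8 && (!per_sample_ps || !dispatch_16);
   return e;
}
```

With per_sample_ps set, at most one of the two widths is ever enabled; without
it, both inputs pass through unchanged. The review comment above suggests the
compiler can already guarantee this from the shader key, in which case the
extra masking would be redundant.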


Re: [Mesa-dev] Interest in GL_ARB_gl_spirv support?

2016-07-26 Thread Jason Ekstrand
On Tue, Jul 26, 2016 at 4:50 PM, Ilia Mirkin  wrote:

> On Tue, Jul 26, 2016 at 7:44 PM, Marek Olšák  wrote:
> > On Wed, Jul 27, 2016 at 12:29 AM, oscar bg  wrote:
> >> Hi,
> >> seems this year 2016 OpenGL ARB update brings a small number of
> extensions..
> >> seems the most important is GL_ARB_gl_spirv.. seems like SPIRV as a
> binary
> >> format for OpenGL and Mesa doesn't have any binary format even
> supporting
> >> ARB_program_binary ext.. a Nvidia driver is already providing support
> from
> >> day 1 for Linux as always..
> >>
> >> just asking how difficult would be to bring support to Mesa drivers..
> and if
> >> there is any interest by Mesa devs start working on it soon..
> >>
> >> seems already we have SPIRV support in Mesa in Vulkan drivers: Anvil
> Vulkan
> >> Intel driver and some days ago RADV a open source Vulkan driver for AMD
> GPUs
> >> has been announced.. as these drivers already eat SPIRV code seems this
> >> extension would take less work to port to this two vendor GPUs?
>

We're thinking about it but, unfortunately, I haven't had much time to look
into the extension.  One thing that I can say right now, is that the
spirv_to_nir pass isn't what you'd call OpenGL-friendly.  If you pass it
invalid SPIR-V, it will crash.  What little error checking it does do is
done via asserts.  That's perfectly acceptable in the Vulkan world but
unless they went out of their way to ok it in the extension, crashing
generally isn't acceptable in OpenGL.  Also, it's not entirely clear what
all hoops we would have to go through to tie a NIR shader into the OpenGL
state upload paths for things like uniforms, inputs/outputs, etc.  This
isn't to say that it can't be done but it's substantially more than zero
work.


> > The showstopper for sharing Vulkan code is that OpenGL doesn't have
> > pipeline state objects. Because of that, you can ignore RADV. I think
> > nobody has enough resources to replicate what the radeonsi TGSI path
> > does, so you are pretty much stuck with TGSI.
> >
> > SPIRV -> NIR -> TGSI should work.
>

Eric's NIR -> TGSI pass is a bit rusty at this point, but it wouldn't be
hard to revive it.


> FWIW my plans for nouveau definitely involve direct SPIR-V input. This
> will also be useful for an independent Vulkan driver, as well as
> OpenCL SPIR-V. However I'm not sure when such plans will materialize.
>

I've just got to say it.  You're crazy...

--Jason
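
To make the assert-versus-error point above concrete, here is a minimal sketch
of a check that reports bad input instead of aborting. It is not taken from
spirv_to_nir; the function name is hypothetical, and it looks only at the
five-word module header and the SPIR-V magic number (0x07230203), ignoring
byte-swapped modules and everything after the header.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SPIRV_MAGIC 0x07230203u
#define SPIRV_HEADER_WORDS 5

/* Returns false instead of asserting, so that a GL entry point could
 * turn bad input into an error rather than a crash. Only the header is
 * checked; this is nowhere near full validation. */
bool spirv_header_looks_valid(const uint32_t *words, size_t word_count)
{
   if (words == NULL || word_count < SPIRV_HEADER_WORDS)
      return false;
   return words[0] == SPIRV_MAGIC;
}
```

An OpenGL path would need this style of reporting throughout the translator,
not just at the header, which is part of why the work is more than trivial.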


[Mesa-dev] [PATCH 03/13] glsl: use UniformHash to find storage location

2016-07-26 Thread Timothy Arceri
There is no need to loop over all the uniforms.
---
 src/compiler/glsl/link_uniform_initializers.cpp | 29 ++---
 1 file changed, 11 insertions(+), 18 deletions(-)

diff --git a/src/compiler/glsl/link_uniform_initializers.cpp 
b/src/compiler/glsl/link_uniform_initializers.cpp
index 17660a7..3750021 100644
--- a/src/compiler/glsl/link_uniform_initializers.cpp
+++ b/src/compiler/glsl/link_uniform_initializers.cpp
@@ -22,6 +22,7 @@
  */
 
 #include "main/core.h"
+#include "program/hash_table.h"
 #include "ir.h"
 #include "linker.h"
 #include "ir_uniform.h"
@@ -33,14 +34,13 @@
 namespace linker {
 
 gl_uniform_storage *
-get_storage(gl_uniform_storage *storage, unsigned num_storage,
-   const char *name)
+get_storage(struct gl_shader_program *prog, const char *name)
 {
-   for (unsigned int i = 0; i < num_storage; i++) {
-  if (strcmp(name, storage[i].name) == 0)
-return &storage[i];
-   }
+   unsigned id;
+   if (prog->UniformHash->get(id, name))
+  return &prog->UniformStorage[id];
 
+   assert(!"No uniform storage found!");
return NULL;
 }
 
@@ -108,13 +108,10 @@ set_opaque_binding(void *mem_ctx, gl_shader_program *prog,
 element_name, binding);
   }
} else {
-  struct gl_uniform_storage *const storage =
- get_storage(prog->UniformStorage, prog->NumUniformStorage, name);
+  struct gl_uniform_storage *const storage = get_storage(prog, name);
 
-  if (storage == NULL) {
- assert(storage != NULL);
+  if (!storage)
  return;
-  }
 
   const unsigned elements = MAX2(storage->array_elements, 1);
 
@@ -207,14 +204,10 @@ set_uniform_initializer(void *mem_ctx, gl_shader_program 
*prog,
   return;
}
 
-   struct gl_uniform_storage *const storage =
-  get_storage(prog->UniformStorage,
-  prog->NumUniformStorage,
- name);
-   if (storage == NULL) {
-  assert(storage != NULL);
+   struct gl_uniform_storage *const storage = get_storage(prog, name);
+
+   if (!storage)
   return;
-   }
 
if (val->type->is_array()) {
   const enum glsl_base_type base_type =
-- 
2.7.4



[Mesa-dev] [PATCH 08/13] glsl: move update_uniform_buffer_variables() to lower UBO

2016-07-26 Thread Timothy Arceri
This makes more sense here, as it's the lowering that uses the results of
this function.

This allows us to call lower_ubo_reference() before assigning uniform
locations which is useful for calling backend specific optimisations
on the IR before assigning uniform and varying locations.

While we are at it we also call the other lowering passes before
assigning locations.
---
 src/compiler/glsl/link_uniforms.cpp   | 71 --
 src/compiler/glsl/linker.cpp  | 38 
 src/compiler/glsl/lower_ubo_reference.cpp | 72 +++
 3 files changed, 91 insertions(+), 90 deletions(-)

diff --git a/src/compiler/glsl/link_uniforms.cpp 
b/src/compiler/glsl/link_uniforms.cpp
index 793f12c..bb8905b 100644
--- a/src/compiler/glsl/link_uniforms.cpp
+++ b/src/compiler/glsl/link_uniforms.cpp
@@ -882,75 +882,6 @@ public:
 };
 
 /**
- * Walks the IR and update the references to uniform blocks in the
- * ir_variables to point at linked shader's list (previously, they
- * would point at the uniform block list in one of the pre-linked
- * shaders).
- */
-static void
-link_update_uniform_buffer_variables(struct gl_linked_shader *shader)
-{
-   foreach_in_list(ir_instruction, node, shader->ir) {
-  ir_variable *const var = node->as_variable();
-
-  if ((var == NULL) || !var->is_in_buffer_block())
- continue;
-
-  assert(var->data.mode == ir_var_uniform ||
- var->data.mode == ir_var_shader_storage);
-
-  if (var->is_interface_instance()) {
- var->data.location = 0;
- continue;
-  }
-
-  bool found = false;
-  char sentinel = '\0';
-
-  if (var->type->is_record()) {
- sentinel = '.';
-  } else if (var->type->is_array() && (var->type->fields.array->is_array()
- || var->type->without_array()->is_record())) {
- sentinel = '[';
-  }
-
-  unsigned num_blocks = var->data.mode == ir_var_uniform ?
- shader->NumUniformBlocks : shader->NumShaderStorageBlocks;
-  struct gl_uniform_block **blks = var->data.mode == ir_var_uniform ?
- shader->UniformBlocks : shader->ShaderStorageBlocks;
-
-  const unsigned l = strlen(var->name);
-  for (unsigned i = 0; i < num_blocks; i++) {
- for (unsigned j = 0; j < blks[i]->NumUniforms; j++) {
-if (sentinel) {
-   const char *begin = blks[i]->Uniforms[j].Name;
-   const char *end = strchr(begin, sentinel);
-
-   if (end == NULL)
-  continue;
-
-   if ((ptrdiff_t) l != (end - begin))
-  continue;
-
-   if (strncmp(var->name, begin, l) == 0) {
-  found = true;
-  var->data.location = j;
-  break;
-   }
-} else if (!strcmp(var->name, blks[i]->Uniforms[j].Name)) {
-   found = true;
-   var->data.location = j;
-   break;
-}
- }
- if (found)
-break;
-  }
-  assert(found);
-   }
-}
-
-/**
  * Combine the hidden uniform hash map with the uniform hash map so that the
  * hidden uniforms will be given indicies at the end of the uniform storage
  * array.
@@ -1261,8 +1192,6 @@ link_assign_uniform_locations(struct gl_shader_program *prog,
   memset(sh->SamplerUnits, 0, sizeof(sh->SamplerUnits));
   memset(sh->ImageUnits, 0, sizeof(sh->ImageUnits));
 
-  link_update_uniform_buffer_variables(sh);
-
   /* Reset various per-shader target counts.
*/
   uniform_size.start_shader();
diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index ba61d39..a2b1ce2 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -4559,6 +4559,25 @@ link_varyings_and_uniforms(unsigned first, unsigned last,
  return false;
}
 
+   for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
+  struct gl_linked_shader *sh = prog->_LinkedShaders[i];
+  if (sh == NULL)
+ continue;
+
+  const struct gl_shader_compiler_options *options =
+ &ctx->Const.ShaderCompilerOptions[i];
+
+  if (options->LowerBufferInterfaceBlocks)
+ lower_ubo_reference(prog->_LinkedShaders[i],
+ options->ClampBlockIndicesToArrayBounds);
+
+  if (options->LowerShaderSharedVariables)
+ lower_shared_reference(sh, &prog->Comp.SharedSize);
+
+  lower_vector_derefs(sh);
+  do_vec_index_to_swizzle(sh->ir);
+   }
+
/* If there is no fragment shader we need to set transform feedback.
 *
 * For SSO we also need to assign output locations.  We assign them here
@@ -4671,25 +4690,6 @@ link_varyings_and_uniforms(unsigned first, unsigned last,
if (!prog->LinkStatus)
   return false;
 
-   for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
-  if (prog->_LinkedShaders[i] == NULL)
- continue;
-
-  const struct gl_shader_compiler_options 

[Mesa-dev] [PATCH 12/13] glsl/i965: call backend optimisations from glsl linker

2016-07-26 Thread Timothy Arceri
Here we get the backend to do its extra GLSL IR passes before
assigning varying and uniform locations.

We move the lower_variable_index_to_cond_assign() call to
brw_link_shader() as this must be called after we have done
varying packing to avoid regressions.
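
As a rough illustration of the hook pattern this patch adds to the linker: the backend callback is optional, so each call site checks it before calling. The sketch below is plain C with hypothetical stand-in types (`struct shader`, `struct driver_hooks`, `run_backend_passes`, `count_pass`); it does not reproduce Mesa's actual `ctx->Driver.ProcessGLSLIR` signature.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a linked shader stage. */
struct shader {
   int stage;
   int backend_passes_run;
};

/* Hypothetical stand-in for the driver function table. */
struct driver_hooks {
   void (*process_ir)(struct shader *sh);   /* may be NULL */
};

/* Example hook: only records that it was called. */
static void count_pass(struct shader *sh)
{
   sh->backend_passes_run++;
}

/* Run the backend's extra IR passes if the driver provides any.
 * Returns 1 when the hook ran, 0 when the driver has no hook.
 */
static int run_backend_passes(const struct driver_hooks *drv,
                              struct shader *sh)
{
   assert(drv != NULL && sh != NULL);

   if (drv->process_ir == NULL)
      return 0;

   drv->process_ir(sh);
   return 1;
}
```

A driver without backend passes leaves the pointer NULL and the linker simply skips the call, which is why the check is repeated at every call site.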

Broadwell shader-db results:

total instructions in shared programs: 8651650 -> 8644415 (-0.08%)
instructions in affected programs: 38754 -> 31519 (-18.67%)
helped:320
HURT:  0
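
For reference, the percentages above are plain relative changes in instruction count. A minimal sketch of the arithmetic, using a hypothetical helper `percent_change` (not part of shader-db):

```c
#include <assert.h>

/* Relative change from 'before' to 'after', in percent.
 * Negative values mean the count went down.
 */
static double percent_change(unsigned long before, unsigned long after)
{
   assert(before != 0);
   return ((double)after - (double)before) * 100.0 / (double)before;
}
```

Both rows drop by the same 7235 instructions (8651650 - 8644415 and 38754 - 31519); only the baseline differs, which is why one figure rounds to -0.08% and the other to -18.67%.
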
---
 src/compiler/glsl/linker.cpp   | 18 ++
 src/mesa/drivers/dri/i965/brw_link.cpp | 32 +---
 2 files changed, 35 insertions(+), 15 deletions(-)

diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index a2b1ce2..c5e75e3 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -4612,6 +4612,10 @@ link_varyings_and_uniforms(unsigned first, unsigned last,
  do_dead_builtin_varyings(ctx, sh, NULL, num_tfeedback_decls,
   tfeedback_decls);
 
+ if (ctx->Driver.ProcessGLSLIR) {
+ctx->Driver.ProcessGLSLIR(ctx, prog, sh);
+ }
+
  if (prog->SeparateShader) {
 const uint64_t reserved_slots =
reserved_varying_slot(sh, ir_var_shader_in);
@@ -4650,6 +4654,10 @@ link_varyings_and_uniforms(unsigned first, unsigned last,
   next == MESA_SHADER_FRAGMENT ? num_tfeedback_decls : 0,
   tfeedback_decls);
 
+if (ctx->Driver.ProcessGLSLIR) {
+   ctx->Driver.ProcessGLSLIR(ctx, prog, sh_next);
+}
+
 if (!assign_varying_locations(ctx, mem_ctx, prog, sh_i, sh_next,
   next == MESA_SHADER_FRAGMENT ? num_tfeedback_decls : 0,
   tfeedback_decls,
@@ -4670,6 +4678,10 @@ link_varyings_and_uniforms(unsigned first, unsigned last,
 
 next = i;
  }
+
+ if (ctx->Driver.ProcessGLSLIR) {
+ctx->Driver.ProcessGLSLIR(ctx, prog, prog->_LinkedShaders[first]);
+ }
   }
}
 
@@ -4677,6 +4689,12 @@ link_varyings_and_uniforms(unsigned first, unsigned last,
  has_xfb_qualifiers))
   return false;
 
+   if (last == MESA_SHADER_COMPUTE) {
+  if (ctx->Driver.ProcessGLSLIR) {
+ ctx->Driver.ProcessGLSLIR(ctx, prog, prog->_LinkedShaders[last]);
+  }
+   }
+
update_array_sizes(prog);
link_assign_uniform_locations(prog, ctx, num_explicit_uniform_locs);
link_assign_atomic_counter_resources(ctx, prog);
diff --git a/src/mesa/drivers/dri/i965/brw_link.cpp b/src/mesa/drivers/dri/i965/brw_link.cpp
index 244c8f0..4c3a508 100644
--- a/src/mesa/drivers/dri/i965/brw_link.cpp
+++ b/src/mesa/drivers/dri/i965/brw_link.cpp
@@ -165,19 +165,6 @@ brw_process_glsl_ir(struct gl_context *ctx,
 
do_copy_propagation(shader->ir);
 
-   bool lowered_variable_indexing =
-  lower_variable_index_to_cond_assign(shader->Stage, shader->ir,
-  options->EmitNoIndirectInput,
-  options->EmitNoIndirectOutput,
-  options->EmitNoIndirectTemp,
-  options->EmitNoIndirectUniform);
-
-   if (unlikely(brw->perf_debug && lowered_variable_indexing)) {
-  perf_debug("Unsupported form of variable indexing in %s; falling "
- "back to very inefficient code generation\n",
- _mesa_shader_stage_to_abbrev(shader->Stage));
-   }
-
brw_common_opts(shader, ctx, false, compiler, options);
 
validate_ir_tree(shader->ir);
@@ -231,9 +218,24 @@ brw_link_shader(struct gl_context *ctx, struct gl_shader_program *shProg)
 return false;
   prog->Parameters = _mesa_new_parameter_list();
 
-  _mesa_copy_linked_program_data((gl_shader_stage) stage, shProg, prog);
+  const struct gl_shader_compiler_options *options =
+ &ctx->Const.ShaderCompilerOptions[shader->Stage];
+  bool lowered_variable_indexing =
+ lower_variable_index_to_cond_assign(shader->Stage, shader->ir,
+ options->EmitNoIndirectInput,
+ options->EmitNoIndirectOutput,
+ options->EmitNoIndirectTemp,
+ options->EmitNoIndirectUniform);
+
+  if (unlikely(brw->perf_debug && lowered_variable_indexing)) {
+ perf_debug("Unsupported form of variable indexing in %s; falling "
+"back to very inefficient code generation\n",
+_mesa_shader_stage_to_abbrev(shader->Stage));
+  }
+
+  brw_common_opts(shader, ctx, true, compiler, options);
 
-  brw_process_glsl_ir(ctx, shProg, shader);
+  _mesa_copy_linked_program_data((gl_shader_stage) stage, shProg, prog);
 
   

[Mesa-dev] [PATCH 01/13] glsl: split out varying and uniform linking code

2016-07-26 Thread Timothy Arceri
Here a new function, link_varyings_and_uniforms(), is created. This
should make it easier to follow the code in link_shader(), which was
getting very large.

Note that the end of the new function contains a for loop with some
lowering calls that currently don't seem related to varyings or
uniforms, but they are a dependency for converting to NIR earlier,
so we move them here now to keep things easy to follow.
---
 src/compiler/glsl/linker.cpp | 429 ++-
 1 file changed, 222 insertions(+), 207 deletions(-)

diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index 6d45a02..02d16ec 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -4475,6 +4475,226 @@ disable_varying_optimizations_for_sso(struct gl_shader_program *prog)
}
 }
 
+static bool
+link_varyings_and_uniforms(unsigned first, unsigned last,
+   unsigned num_explicit_uniform_locs,
+   struct gl_context *ctx,
+   struct gl_shader_program *prog, void *mem_ctx)
+{
+   bool has_xfb_qualifiers = false;
+   unsigned num_tfeedback_decls = 0;
+   char **varying_names = NULL;
+   tfeedback_decl *tfeedback_decls = NULL;
+
+   /* Mark all generic shader inputs and outputs as unpaired. */
+   for (unsigned i = MESA_SHADER_VERTEX; i <= MESA_SHADER_FRAGMENT; i++) {
+  if (prog->_LinkedShaders[i] != NULL) {
+ link_invalidate_variable_locations(prog->_LinkedShaders[i]->ir);
+  }
+   }
+
+   unsigned prev = first;
+   for (unsigned i = prev + 1; i <= MESA_SHADER_FRAGMENT; i++) {
+  if (prog->_LinkedShaders[i] == NULL)
+ continue;
+
+  match_explicit_outputs_to_inputs(prog->_LinkedShaders[prev],
+   prog->_LinkedShaders[i]);
+  prev = i;
+   }
+
+   if (!assign_attribute_or_color_locations(prog, &ctx->Const,
+MESA_SHADER_VERTEX)) {
+  return false;
+   }
+
+   if (!assign_attribute_or_color_locations(prog, &ctx->Const,
+MESA_SHADER_FRAGMENT)) {
+  return false;
+   }
+
+   /* From the ARB_enhanced_layouts spec:
+*
+*"If the shader used to record output variables for transform feedback
+*varyings uses the "xfb_buffer", "xfb_offset", or "xfb_stride" layout
+*qualifiers, the values specified by TransformFeedbackVaryings are
+*ignored, and the set of variables captured for transform feedback is
+*instead derived from the specified layout qualifiers."
+*/
+   for (int i = MESA_SHADER_FRAGMENT - 1; i >= 0; i--) {
+  /* Find last stage before fragment shader */
+  if (prog->_LinkedShaders[i]) {
+ has_xfb_qualifiers =
+process_xfb_layout_qualifiers(mem_ctx, prog->_LinkedShaders[i],
+  &num_tfeedback_decls,
+  &varying_names);
+ break;
+  }
+   }
+
+   if (!has_xfb_qualifiers) {
+  num_tfeedback_decls = prog->TransformFeedback.NumVarying;
+  varying_names = prog->TransformFeedback.VaryingNames;
+   }
+
+   if (num_tfeedback_decls != 0) {
+  /* From GL_EXT_transform_feedback:
+   *   A program will fail to link if:
+   *
+   *   * the <count> specified by TransformFeedbackVaryingsEXT is
+   * non-zero, but the program object has no vertex or geometry
+   * shader;
+   */
+  if (first >= MESA_SHADER_FRAGMENT) {
+ linker_error(prog, "Transform feedback varyings specified, but "
+  "no vertex, tessellation, or geometry shader is "
+  "present.\n");
+ return false;
+  }
+
+  tfeedback_decls = ralloc_array(mem_ctx, tfeedback_decl,
+ num_tfeedback_decls);
+  if (!parse_tfeedback_decls(ctx, prog, mem_ctx, num_tfeedback_decls,
+ varying_names, tfeedback_decls))
+ return false;
+   }
+
+   /* If there is no fragment shader we need to set transform feedback.
+*
+* For SSO we also need to assign output locations.  We assign them here
+* because we need to do it for both single stage programs and multi stage
+* programs.
+*/
+   if (last < MESA_SHADER_FRAGMENT &&
+   (num_tfeedback_decls != 0 || prog->SeparateShader)) {
+  const uint64_t reserved_out_slots =
+ reserved_varying_slot(prog->_LinkedShaders[last], ir_var_shader_out);
+  if (!assign_varying_locations(ctx, mem_ctx, prog,
+prog->_LinkedShaders[last], NULL,
+num_tfeedback_decls, tfeedback_decls,
+reserved_out_slots))
+ return false;
+   }
+
+   if (last <= MESA_SHADER_FRAGMENT) {
+  /* Remove unused varyings from the first/last stage unless SSO */
+  remove_unused_shader_inputs_and_outputs(prog->SeparateShader,
+

[Mesa-dev] [PATCH 06/13] glsl: move uniform linking code to link_assign_uniform_storage()

2016-07-26 Thread Timothy Arceri
This makes link_assign_uniform_locations() easier to follow.
---
 src/compiler/glsl/link_uniforms.cpp | 132 +++-
 1 file changed, 69 insertions(+), 63 deletions(-)

diff --git a/src/compiler/glsl/link_uniforms.cpp b/src/compiler/glsl/link_uniforms.cpp
index 89196e6..793f12c 100644
--- a/src/compiler/glsl/link_uniforms.cpp
+++ b/src/compiler/glsl/link_uniforms.cpp
@@ -1153,13 +1153,77 @@ link_setup_uniform_remap_tables(struct gl_context *ctx,
}
 }
 
+static void
+link_assign_uniform_storage(struct gl_context *ctx,
+struct gl_shader_program *prog,
+const unsigned num_data_slots,
+unsigned num_explicit_uniform_locs)
+{
+   /* On the outside chance that there were no uniforms, bail out.
+*/
+   if (prog->NumUniformStorage == 0)
+  return;
+
+   unsigned int boolean_true = ctx->Const.UniformBooleanTrue;
+
+   prog->UniformStorage = rzalloc_array(prog, struct gl_uniform_storage,
+prog->NumUniformStorage);
+   union gl_constant_value *data = rzalloc_array(prog->UniformStorage,
+ union gl_constant_value,
+ num_data_slots);
+#ifndef NDEBUG
+   union gl_constant_value *data_end = &data[num_data_slots];
+#endif
+
+   parcel_out_uniform_storage parcel(prog, prog->UniformHash,
+ prog->UniformStorage, data);
+
+   for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
+  if (prog->_LinkedShaders[i] == NULL)
+ continue;
+
+  parcel.start_shader((gl_shader_stage)i);
+
+  foreach_in_list(ir_instruction, node, prog->_LinkedShaders[i]->ir) {
+ ir_variable *const var = node->as_variable();
+
+ if ((var == NULL) || (var->data.mode != ir_var_uniform &&
+   var->data.mode != ir_var_shader_storage))
+continue;
+
+ parcel.set_and_process(var);
+  }
+
+  prog->_LinkedShaders[i]->active_samplers = parcel.shader_samplers_used;
+  prog->_LinkedShaders[i]->shadow_samplers = parcel.shader_shadow_samplers;
+
+  STATIC_ASSERT(sizeof(prog->_LinkedShaders[i]->SamplerTargets) ==
+sizeof(parcel.targets));
+  memcpy(prog->_LinkedShaders[i]->SamplerTargets, parcel.targets,
+ sizeof(prog->_LinkedShaders[i]->SamplerTargets));
+   }
+
+#ifndef NDEBUG
+   for (unsigned i = 0; i < prog->NumUniformStorage; i++) {
+  assert(prog->UniformStorage[i].storage != NULL ||
+ prog->UniformStorage[i].builtin ||
+ prog->UniformStorage[i].is_shader_storage ||
+ prog->UniformStorage[i].block_index != -1);
+   }
+
+   assert(parcel.values == data_end);
+#endif
+
+   link_setup_uniform_remap_tables(ctx, prog, num_explicit_uniform_locs);
+
+   link_set_uniform_initializers(prog, boolean_true);
+}
+
 void
 link_assign_uniform_locations(struct gl_shader_program *prog,
   struct gl_context *ctx,
   unsigned int num_explicit_uniform_locs)
 {
-   unsigned int boolean_true = ctx->Const.UniformBooleanTrue;
-
ralloc_free(prog->UniformStorage);
prog->UniformStorage = NULL;
prog->NumUniformStorage = 0;
@@ -1225,70 +1289,12 @@ link_assign_uniform_locations(struct gl_shader_program *prog,
}
 
prog->NumUniformStorage = uniform_size.num_active_uniforms;
-   const unsigned num_data_slots = uniform_size.num_values;
-   const unsigned hidden_uniforms = uniform_size.num_hidden_uniforms;
+   prog->NumHiddenUniforms = uniform_size.num_hidden_uniforms;
 
/* assign hidden uniforms a slot id */
hiddenUniforms->iterate(assign_hidden_uniform_slot_id, &uniform_size);
delete hiddenUniforms;
 
-   /* On the outside chance that there were no uniforms, bail out.
-*/
-   if (prog->NumUniformStorage == 0)
-  return;
-
-   prog->UniformStorage = rzalloc_array(prog, struct gl_uniform_storage,
-prog->NumUniformStorage);
-   union gl_constant_value *data = rzalloc_array(prog->UniformStorage,
- union gl_constant_value,
- num_data_slots);
-#ifndef NDEBUG
-   union gl_constant_value *data_end = &data[num_data_slots];
-#endif
-
-   parcel_out_uniform_storage parcel(prog, prog->UniformHash,
- prog->UniformStorage, data);
-
-   for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
-  if (prog->_LinkedShaders[i] == NULL)
- continue;
-
-  parcel.start_shader((gl_shader_stage)i);
-
-  foreach_in_list(ir_instruction, node, prog->_LinkedShaders[i]->ir) {
- ir_variable *const var = node->as_variable();
-
- if ((var == NULL) || (var->data.mode != ir_var_uniform &&
-   var->data.mode != ir_var_shader_storage))
-continue;
-
-   

[Mesa-dev] [PATCH 02/13] glsl: remove dead builtins before assigning varying locations

2016-07-26 Thread Timothy Arceri
Builtins already have locations assigned so this shouldn't
change anything. We want to call it earlier so we can transform
GLSL IR to NIR earlier.
---
 src/compiler/glsl/linker.cpp | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index 02d16ec..2fefccf 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -4587,8 +4587,12 @@ link_varyings_and_uniforms(unsigned first, unsigned last,
 
   /* If the program is made up of only a single stage */
   if (first == last) {
-
  gl_linked_shader *const sh = prog->_LinkedShaders[last];
+
+ do_dead_builtin_varyings(ctx, NULL, sh, 0, NULL);
+ do_dead_builtin_varyings(ctx, sh, NULL, num_tfeedback_decls,
+  tfeedback_decls);
+
  if (prog->SeparateShader) {
 const uint64_t reserved_slots =
reserved_varying_slot(sh, ir_var_shader_in);
@@ -4604,10 +4608,6 @@ link_varyings_and_uniforms(unsigned first, unsigned last,
   reserved_slots))
return false;
  }
-
- do_dead_builtin_varyings(ctx, NULL, sh, 0, NULL);
- do_dead_builtin_varyings(ctx, sh, NULL, num_tfeedback_decls,
-  tfeedback_decls);
   } else {
  /* Linking the stages in the opposite order (from fragment to vertex)
   * ensures that inter-shader outputs written to in an earlier stage
@@ -4627,16 +4627,16 @@ link_varyings_and_uniforms(unsigned first, unsigned last,
 const uint64_t reserved_in_slots =
reserved_varying_slot(sh_next, ir_var_shader_in);
 
+do_dead_builtin_varyings(ctx, sh_i, sh_next,
+  next == MESA_SHADER_FRAGMENT ? num_tfeedback_decls : 0,
+  tfeedback_decls);
+
 if (!assign_varying_locations(ctx, mem_ctx, prog, sh_i, sh_next,
   next == MESA_SHADER_FRAGMENT ? num_tfeedback_decls : 0,
   tfeedback_decls,
   reserved_out_slots | reserved_in_slots))
return false;
 
-do_dead_builtin_varyings(ctx, sh_i, sh_next,
-  next == MESA_SHADER_FRAGMENT ? num_tfeedback_decls : 0,
-  tfeedback_decls);
-
 /* This must be done after all dead varyings are eliminated. */
 if (sh_i != NULL) {
unsigned slots_used = _mesa_bitcount_64(reserved_out_slots);
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 09/13] i965: stop passing stage as a function parameter

2016-07-26 Thread Timothy Arceri
We already pass the shader so we can just get the stage from this.
---
 src/mesa/drivers/dri/i965/brw_link.cpp | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_link.cpp b/src/mesa/drivers/dri/i965/brw_link.cpp
index 3b85f79..efd67e7 100644
--- a/src/mesa/drivers/dri/i965/brw_link.cpp
+++ b/src/mesa/drivers/dri/i965/brw_link.cpp
@@ -86,8 +86,7 @@ brw_lower_packing_builtins(struct brw_context *brw,
 }
 
 static void
-process_glsl_ir(gl_shader_stage stage,
-struct brw_context *brw,
+process_glsl_ir(struct brw_context *brw,
 struct gl_shader_program *shader_prog,
 struct gl_linked_shader *shader)
 {
@@ -138,8 +137,7 @@ process_glsl_ir(gl_shader_stage stage,
do_copy_propagation(shader->ir);
 
bool lowered_variable_indexing =
-  lower_variable_index_to_cond_assign((gl_shader_stage)stage,
-  shader->ir,
+  lower_variable_index_to_cond_assign(shader->Stage, shader->ir,
   options->EmitNoIndirectInput,
   options->EmitNoIndirectOutput,
   options->EmitNoIndirectTemp,
@@ -225,7 +223,7 @@ brw_link_shader(struct gl_context *ctx, struct gl_shader_program *shProg)
 
   _mesa_copy_linked_program_data((gl_shader_stage) stage, shProg, prog);
 
-  process_glsl_ir((gl_shader_stage) stage, brw, shProg, shader);
+  process_glsl_ir(brw, shProg, shader);
 
   /* Make a pass over the IR to add state references for any built-in
* uniforms that are used.  This has to be done now (during linking).
-- 
2.7.4



[Mesa-dev] [PATCH 05/13] glsl: move uniform linking code to new link_setup_uniform_remap_tables()

2016-07-26 Thread Timothy Arceri
This makes link_assign_uniform_locations() easier to follow.
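
The allocation policy of the code being moved (reuse a gap in the remap table when one is large enough, otherwise grow the table) can be sketched in a few lines of C. This is a simplified illustration under stated assumptions: a fixed-capacity table of uniform indices instead of Mesa's `reralloc`'d array of `gl_uniform_storage` pointers, and hypothetical names throughout (`struct remap_table`, `find_free_run`, `place_uniform`, `SLOT_FREE`).

```c
#include <assert.h>

#define TABLE_CAP 64      /* hypothetical fixed capacity for the sketch */
#define SLOT_FREE (-1)    /* marks a location not yet tied to a uniform */

struct remap_table {
   int slots[TABLE_CAP];  /* uniform index per location, or SLOT_FREE */
   unsigned size;         /* number of locations currently in the table */
};

/* Find the first run of 'entries' consecutive free locations.
 * Returns the start of the run, or -1 if no such run exists.
 */
static int find_free_run(const struct remap_table *t, unsigned entries)
{
   unsigned run = 0;

   for (unsigned i = 0; i < t->size; i++) {
      run = (t->slots[i] == SLOT_FREE) ? run + 1 : 0;
      if (run == entries)
         return (int)(i + 1 - entries);
   }
   return -1;
}

/* Place a uniform that needs 'entries' consecutive locations (one per
 * array element).  Returns the base location that was chosen.
 */
static int place_uniform(struct remap_table *t, int uniform, unsigned entries)
{
   assert(entries > 0);

   int base = find_free_run(t, entries);
   if (base < 0) {
      /* No gap is large enough: append at the end of the table. */
      assert(t->size + entries <= TABLE_CAP);
      base = (int)t->size;
      t->size += entries;
   }

   /* Point every location of the array at the same uniform. */
   for (unsigned j = 0; j < entries; j++)
      t->slots[base + j] = uniform;

   return base;
}
```

With explicit locations 0 and 3 already taken, a two-element array lands in the gap at location 1, while a three-element array has to be appended at location 4.
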
---
 src/compiler/glsl/link_uniforms.cpp | 330 +++-
 src/compiler/glsl/linker.cpp|   4 +-
 src/compiler/glsl/linker.h  |   5 +-
 3 files changed, 177 insertions(+), 162 deletions(-)

diff --git a/src/compiler/glsl/link_uniforms.cpp b/src/compiler/glsl/link_uniforms.cpp
index dbe808f..89196e6 100644
--- a/src/compiler/glsl/link_uniforms.cpp
+++ b/src/compiler/glsl/link_uniforms.cpp
@@ -998,12 +998,168 @@ find_empty_block(struct gl_shader_program *prog,
return -1;
 }
 
+static void
+link_setup_uniform_remap_tables(struct gl_context *ctx,
+struct gl_shader_program *prog,
+unsigned num_explicit_uniform_locs)
+{
+   unsigned total_entries = num_explicit_uniform_locs;
+   unsigned empty_locs =
+  prog->NumUniformRemapTable - num_explicit_uniform_locs;
+
+   /* Reserve all the explicit locations of the active uniforms. */
+   for (unsigned i = 0; i < prog->NumUniformStorage; i++) {
+  if (prog->UniformStorage[i].type->is_subroutine() ||
+  prog->UniformStorage[i].is_shader_storage)
+ continue;
+
+  if (prog->UniformStorage[i].remap_location != UNMAPPED_UNIFORM_LOC) {
+ /* How many new entries for this uniform? */
+ const unsigned entries =
+MAX2(1, prog->UniformStorage[i].array_elements);
+
+ /* Set remap table entries point to correct gl_uniform_storage. */
+ for (unsigned j = 0; j < entries; j++) {
+unsigned element_loc = prog->UniformStorage[i].remap_location + j;
+assert(prog->UniformRemapTable[element_loc] ==
+   INACTIVE_UNIFORM_EXPLICIT_LOCATION);
+prog->UniformRemapTable[element_loc] = &prog->UniformStorage[i];
+ }
+  }
+   }
+
+   /* Reserve locations for rest of the uniforms. */
+   for (unsigned i = 0; i < prog->NumUniformStorage; i++) {
+
+  if (prog->UniformStorage[i].type->is_subroutine() ||
+  prog->UniformStorage[i].is_shader_storage)
+ continue;
+
+  /* Built-in uniforms should not get any location. */
+  if (prog->UniformStorage[i].builtin)
+ continue;
+
+  /* Explicit ones have been set already. */
+  if (prog->UniformStorage[i].remap_location != UNMAPPED_UNIFORM_LOC)
+ continue;
+
+  /* how many new entries for this uniform? */
+  const unsigned entries = MAX2(1, prog->UniformStorage[i].array_elements);
+
+  /* Find UniformRemapTable for empty blocks where we can fit this uniform. */
+  int chosen_location = -1;
+
+  if (empty_locs)
+ chosen_location = find_empty_block(prog, &prog->UniformStorage[i]);
+
+  /* Add new entries to the total amount of entries. */
+  total_entries += entries;
+
+  if (chosen_location != -1) {
+ empty_locs -= entries;
+  } else {
+ chosen_location = prog->NumUniformRemapTable;
+
+ /* resize remap table to fit new entries */
+ prog->UniformRemapTable =
+reralloc(prog,
+ prog->UniformRemapTable,
+ gl_uniform_storage *,
+ prog->NumUniformRemapTable + entries);
+ prog->NumUniformRemapTable += entries;
+  }
+
+  /* set pointers for this uniform */
+  for (unsigned j = 0; j < entries; j++)
+ prog->UniformRemapTable[chosen_location + j] =
+&prog->UniformStorage[i];
+
+  /* set the base location in remap table for the uniform */
+  prog->UniformStorage[i].remap_location = chosen_location;
+   }
+
+   /* Verify that total amount of entries for explicit and implicit locations
+* is less than MAX_UNIFORM_LOCATIONS.
+*/
+
+   if (total_entries > ctx->Const.MaxUserAssignableUniformLocations) {
+  linker_error(prog, "count of uniform locations > MAX_UNIFORM_LOCATIONS"
+   "(%u > %u)", total_entries,
+   ctx->Const.MaxUserAssignableUniformLocations);
+   }
+
+   /* Reserve all the explicit locations of the active subroutine uniforms. */
+   for (unsigned i = 0; i < prog->NumUniformStorage; i++) {
+  if (!prog->UniformStorage[i].type->is_subroutine())
+ continue;
+
+  if (prog->UniformStorage[i].remap_location == UNMAPPED_UNIFORM_LOC)
+ continue;
+
+  for (unsigned j = 0; j < MESA_SHADER_STAGES; j++) {
+ struct gl_linked_shader *sh = prog->_LinkedShaders[j];
+ if (!sh)
+continue;
+
+ if (!prog->UniformStorage[i].opaque[j].active)
+continue;
+
+ /* How many new entries for this uniform? */
+ const unsigned entries =
+MAX2(1, prog->UniformStorage[i].array_elements);
+
+ /* Set remap table entries point to correct gl_uniform_storage. */
+ for (unsigned k = 0; k < entries; k++) {
+unsigned element_loc = prog->UniformStorage[i].remap_location + k;
+

Re: [Mesa-dev] [PATCH v2 26/27] i965/blorp: brw_blorp_blit.cpp -> blorp_blit.c

2016-07-26 Thread Jason Ekstrand
On Jul 26, 2016 6:49 PM, "Matt Turner" wrote:
>
> On Tue, Jul 26, 2016 at 3:11 PM, Jason Ekstrand wrote:
> > ---
> >  src/mesa/drivers/dri/i965/Makefile.sources   |2 +-
> >  src/mesa/drivers/dri/i965/blorp_blit.c   | 1662 ++
> >  src/mesa/drivers/dri/i965/brw_blorp_blit.cpp | 1662 --
> >  3 files changed, 1663 insertions(+), 1663 deletions(-)
> >  create mode 100644 src/mesa/drivers/dri/i965/blorp_blit.c
> >  delete mode 100644 src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
>
> Same comment about the other rename patches: what's the point of
> dropping the "brw_"?

Eventually, these are going to move to src/intel/blorp.  When they do, the
blorp prefix will be sufficient.  At the moment, it makes it clear that
there is little to nothing brw about them (hence the distinction between
brw_blorp.c and blorp.c in patch 25) and avoids the annoying dependency
issues of renaming a file from .c to .cpp.

--Jason


Re: [Mesa-dev] [PATCH v2 11/35] i965/blorp: Use isl_msaa_layout instead of intel_msaa_layout

2016-07-26 Thread Pohjolainen, Topi
On Tue, Jul 26, 2016 at 03:02:02PM -0700, Jason Ekstrand wrote:
> We also remove brw_blorp_surface_info::msaa_layout.
> ---
>  src/mesa/drivers/dri/i965/brw_blorp.c|  18 -
>  src/mesa/drivers/dri/i965/brw_blorp.h|  14 +---
>  src/mesa/drivers/dri/i965/brw_blorp_blit.cpp | 111 +--
>  3 files changed, 39 insertions(+), 104 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c b/src/mesa/drivers/dri/i965/brw_blorp.c
> index d38be8a..96201e4 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp.c
> +++ b/src/mesa/drivers/dri/i965/brw_blorp.c
> @@ -70,7 +70,6 @@ brw_blorp_surface_info_init(struct brw_context *brw,
>>x_offset, >y_offset);
>  
> info->array_layout = mt->array_layout;
> -   info->msaa_layout = mt->msaa_layout;
> info->swizzle = SWIZZLE_XYZW;
>  
> if (format == MESA_FORMAT_NONE)
> @@ -210,22 +209,6 @@ brw_blorp_compile_nir_shader(struct brw_context *brw, struct nir_shader *nir,
> return program;
>  }
>  
> -static enum isl_msaa_layout
> -get_isl_msaa_layout(enum intel_msaa_layout layout)
> -{
> -   switch (layout) {
> -   case INTEL_MSAA_LAYOUT_NONE:
> -  return ISL_MSAA_LAYOUT_NONE;
> -   case INTEL_MSAA_LAYOUT_IMS:
> -  return ISL_MSAA_LAYOUT_INTERLEAVED;
> -   case INTEL_MSAA_LAYOUT_UMS:
> -   case INTEL_MSAA_LAYOUT_CMS:
> -  return ISL_MSAA_LAYOUT_ARRAY;
> -   default:
> -  unreachable("Invalid MSAA layout");
> -   }
> -}
> -
>  struct surface_state_info {
> unsigned num_dwords;
> unsigned ss_align; /* Required alignment of RENDER_SURFACE_STATE in bytes */
> @@ -255,7 +238,6 @@ brw_blorp_emit_surface_state(struct brw_context *brw,
> /* Stomp surface dimensions and tiling (if needed) with info from blorp */
> surf.dim = ISL_SURF_DIM_2D;
> surf.dim_layout = ISL_DIM_LAYOUT_GEN4_2D;
> -   surf.msaa_layout = get_isl_msaa_layout(surface->msaa_layout);
> surf.logical_level0_px.width = surface->width;
> surf.logical_level0_px.height = surface->height;
> surf.logical_level0_px.depth = 1;
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp.h b/src/mesa/drivers/dri/i965/brw_blorp.h
> index 0f142b4..d60b988 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp.h
> +++ b/src/mesa/drivers/dri/i965/brw_blorp.h
> @@ -134,12 +134,6 @@ struct brw_blorp_surface_info
> uint32_t brw_surfaceformat;
>  
> /**
> -* For MSAA surfaces, MSAA layout that should be used when setting up the
> -* surface state for this surface.
> -*/
> -   enum intel_msaa_layout msaa_layout;
> -
> -   /**
>  * In order to support cases where RGBA format is backing client requested
>  * RGB, one needs to have means to force alpha channel to one when user
>  * requested RGB surface is used as blit source. This is possible by
> @@ -298,7 +292,7 @@ struct brw_blorp_blit_prog_key
> /* MSAA layout that has been configured in the surface state for texturing
>  * from.
>  */
> -   enum intel_msaa_layout tex_layout;
> +   enum isl_msaa_layout tex_layout;
>  
> enum isl_aux_usage tex_aux_usage;
>  
> @@ -306,7 +300,7 @@ struct brw_blorp_blit_prog_key
> unsigned src_samples;
>  
> /* Actual MSAA layout used by the source image. */
> -   enum intel_msaa_layout src_layout;
> +   enum isl_msaa_layout src_layout;
>  
> /* Number of samples per pixel that have been configured in the render
>  * target.
> @@ -314,13 +308,13 @@ struct brw_blorp_blit_prog_key
> unsigned rt_samples;
>  
> /* MSAA layout that has been configured in the render target. */
> -   enum intel_msaa_layout rt_layout;
> +   enum isl_msaa_layout rt_layout;
>  
> /* Actual number of samples per pixel in the destination image. */
> unsigned dst_samples;
>  
> /* Actual MSAA layout used by the destination image. */
> -   enum intel_msaa_layout dst_layout;
> +   enum isl_msaa_layout dst_layout;
>  
> /* Type of the data to be read from the texture (one of
>  * BRW_REGISTER_TYPE_{UD,D,F}).
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> index ce00bb7..c337a86 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> @@ -684,23 +684,18 @@ blorp_nir_retile_w_to_y(nir_builder *b, nir_ssa_def *pos)
>   */
>  static inline nir_ssa_def *
>  blorp_nir_encode_msaa(nir_builder *b, nir_ssa_def *pos,
> -  unsigned num_samples, enum intel_msaa_layout layout)
> +  unsigned num_samples, enum isl_msaa_layout layout)
>  {
> assert(pos->num_components == 2 || pos->num_components == 3);
>  
> switch (layout) {
> -   case INTEL_MSAA_LAYOUT_NONE:
> +   case ISL_MSAA_LAYOUT_NONE:
>assert(pos->num_components == 2);
>return pos;
> -   case INTEL_MSAA_LAYOUT_CMS:
> -  /* We can't compensate for compressed layout since at this point in the
> -   * program we haven't 

[Mesa-dev] [PATCH 4/5] anv/image: Don't create invalid render target surfaces

2016-07-26 Thread Jason Ekstrand
---
 src/intel/vulkan/anv_image.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
index dff51bc..ce08979 100644
--- a/src/intel/vulkan/anv_image.c
+++ b/src/intel/vulkan/anv_image.c
@@ -531,7 +531,18 @@ anv_image_view_init(struct anv_image_view *iview,
   iview->sampler_surface_state.alloc_size = 0;
}
 
-   if (image->usage & usage_mask & VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT) {
+   /* This is kind-of hackish.  It is possible, due to get_full_usage above,
+* to get a surface state with a non-renderable format but with
+* VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT.  This happens in particular for
+* formats which aren't renderable but where we want to use Vulkan copy
+* commands so VK_IMAGE_USAGE_TRANSFER_DST_BIT is set.  In the case of a
+* copy, meta will use a format that we can render to, but most of the rest
+* of the time, we don't want to create those surface states.  Once we
+* start using blorp for copies, this problem will go away and we can
+* remove a lot of hacks.
+*/
+   if ((image->usage & usage_mask & VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT) &&
+   isl_format_supports_rendering(&device->info, isl_view.format)) {
   iview->color_rt_surface_state = alloc_surface_state(device, cmd_buffer);
 
   isl_view.usage = cube_usage | ISL_SURF_USAGE_RENDER_TARGET_BIT;
-- 
2.5.0.400.gff86faf
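
The condition this patch tightens can be read as a small predicate: a render-target surface state is only worth creating when the color-attachment usage bit survives the mask and the view format is actually renderable. A minimal sketch in C, with hypothetical names (`USAGE_COLOR_ATTACHMENT`, `needs_render_target_state`) standing in for the Vulkan flag and the isl query:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT. */
#define USAGE_COLOR_ATTACHMENT 0x10u

/* Decide whether a render-target surface state should be created.
 * 'format_renderable' stands in for the result of the isl query
 * isl_format_supports_rendering().
 */
static bool needs_render_target_state(uint32_t image_usage,
                                      uint32_t usage_mask,
                                      bool format_renderable)
{
   bool wants_color_attachment =
      (image_usage & usage_mask & USAGE_COLOR_ATTACHMENT) != 0;

   return wants_color_attachment && format_renderable;
}
```

Before the patch only the first half of the condition was checked, so a non-renderable format with the usage bit set would still get a surface state.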



[Mesa-dev] [PATCH 2/5] isl/formats: Report ETC as being samplable on Bay Trail

2016-07-26 Thread Jason Ekstrand
---
 src/intel/isl/isl_format.c | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/src/intel/isl/isl_format.c b/src/intel/isl/isl_format.c
index e0b91bb..366d32e 100644
--- a/src/intel/isl/isl_format.c
+++ b/src/intel/isl/isl_format.c
@@ -372,6 +372,15 @@ isl_format_supports_sampling(const struct brw_device_info *devinfo,
if (!format_info[format].exists)
   return false;
 
+   if (devinfo->is_baytrail) {
+  const struct isl_format_layout *fmtl = isl_format_get_layout(format);
+  /* Support for ETC1 and ETC2 exists on Bay Trail even though big-core
+   * GPUs didn't get it until Broadwell.
+   */
+  if (fmtl->txc == ISL_TXC_ETC1 || fmtl->txc == ISL_TXC_ETC2)
+ return true;
+   }
+
return format_gen(devinfo) >= format_info[format].sampling;
 }
 
@@ -382,6 +391,15 @@ isl_format_supports_filtering(const struct brw_device_info *devinfo,
if (!format_info[format].exists)
   return false;
 
+   if (devinfo->is_baytrail) {
+  const struct isl_format_layout *fmtl = isl_format_get_layout(format);
+  /* Support for ETC1 and ETC2 exists on Bay Trail even though big-core
+   * GPUs didn't get it until Broadwell.
+   */
+  if (fmtl->txc == ISL_TXC_ETC1 || fmtl->txc == ISL_TXC_ETC2)
+ return true;
+   }
+
return format_gen(devinfo) >= format_info[format].filtering;
 }
 
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 0/5] isl: Update the format table and add asserts

2016-07-26 Thread Jason Ekstrand
The real objective of this series is patch 5 which prevents us from
accidentally creating a surface state with a format unsupported by the
hardware.  This turns some of the new Vulkan CTS tests from a hang into an
informative crash.  In order to get there, however, we needed to update the
format table in isl with some of the new formats added on Haswell and later
generations.  In order to do that, we had to fix up the dri driver, and down
the rabbit hole we go!

At the end of the series, the hangs in the latest CTS are gone (they came
from trying to clear an unsupported image format).

Jason Ekstrand (5):
  i965/surface_formats: Don't advertise 8 or 16-bit RGB formats
  isl/formats: Report ETC as being samplable on Bay Trail
  isl/formats: Update the table with more samplable formats
  anv/image: Don't create invalid render target surfaces
  isl/state: Add some asserts about format capabilities

 src/intel/isl/isl_format.c  | 48 +
 src/intel/isl/isl_surface_state.c   |  5 +++
 src/intel/vulkan/anv_image.c| 13 ++-
 src/mesa/drivers/dri/i965/brw_surface_formats.c | 10 ++
 4 files changed, 60 insertions(+), 16 deletions(-)

-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 3/5] isl/formats: Update the table with more samplable formats

2016-07-26 Thread Jason Ekstrand
There were a lot of formats where support was added on Haswell or later but
we never updated the format table.
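
For readers unfamiliar with the table, here is a minimal sketch of how an
entry such as 75 or 90 is consumed.  It assumes the table stores the first
supporting hardware generation times ten (75 for Haswell, 90 for Skylake)
and that "x" is a sentinel larger than any real generation; the names
sketch_format_caps, sketch_supports_sampling and NO_SUPPORT are hypothetical
and not part of isl.

```c
#include <stdbool.h>

/* Assumption: stand-in for the "x" entries in the table, i.e. a value
 * larger than any real hardware generation. */
#define NO_SUPPORT 999

/* Hypothetical, simplified version of one row of the format table.  Each
 * field holds the first generation (times ten) that has the capability. */
struct sketch_format_caps {
   int sampling;
   int filtering;
};

/* Same comparison as the one visible in isl_format_supports_sampling():
 * the device generation must be at least the table entry. */
static bool
sketch_supports_sampling(int device_gen_x10,
                         const struct sketch_format_caps *caps)
{
   return device_gen_x10 >= caps->sampling;
}

/* Filtering works the same way against its own column. */
static bool
sketch_supports_filtering(int device_gen_x10,
                          const struct sketch_format_caps *caps)
{
   return device_gen_x10 >= caps->filtering;
}
```

With a row of { 75, 75 }, a device generation of 70 fails the check while 75
and 80 pass, which is the effect of changing an "x" entry to 75 in this
patch.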
---
 src/intel/isl/isl_format.c | 30 +++---
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/src/intel/isl/isl_format.c b/src/intel/isl/isl_format.c
index 366d32e..73688a7 100644
--- a/src/intel/isl/isl_format.c
+++ b/src/intel/isl/isl_format.c
@@ -218,8 +218,8 @@ static const struct surface_format_info format_info[] = {
SF(50, 50,  x,  x,  x,  x,  x,  x,  x,x,   P8A8_UNORM_PALETTE1)
SF( x,  x,  x,  x,  x,  x,  x,  x,  x,x,   A1B5G5R5_UNORM)
SF(90, 90,  x,  x, 90,  x,  x,  x,  x,x,   A4B4G4R4_UNORM)
-   SF( x,  x,  x,  x,  x,  x,  x,  x,  x,x,   L8A8_UINT)
-   SF( x,  x,  x,  x,  x,  x,  x,  x,  x,x,   L8A8_SINT)
+   SF(90, 90,  x,  x,  x,  x,  x,  x,  x,x,   L8A8_UINT)
+   SF(90, 90,  x,  x,  x,  x,  x,  x,  x,x,   L8A8_SINT)
SF( Y,  Y,  x, 45,  Y,  Y,  Y,  x,  x,x,   R8_UNORM)
SF( Y,  Y,  x,  x,  Y, 60,  Y,  x,  x,x,   R8_SNORM)
SF( Y,  x,  x,  x,  Y,  x,  Y,  x,  x,x,   R8_SINT)
@@ -237,10 +237,10 @@ static const struct surface_format_info format_info[] = {
SF(45, 45,  x,  x,  x,  x,  x,  x,  x,x,   P4A4_UNORM_PALETTE1)
SF(45, 45,  x,  x,  x,  x,  x,  x,  x,x,   A4P4_UNORM_PALETTE1)
SF( x,  x,  x,  x,  x,  x,  x,  x,  x,x,   Y8_UNORM)
-   SF( x,  x,  x,  x,  x,  x,  x,  x,  x,x,   L8_UINT)
-   SF( x,  x,  x,  x,  x,  x,  x,  x,  x,x,   L8_SINT)
-   SF( x,  x,  x,  x,  x,  x,  x,  x,  x,x,   I8_UINT)
-   SF( x,  x,  x,  x,  x,  x,  x,  x,  x,x,   I8_SINT)
+   SF(90, 90,  x,  x,  x,  x,  x,  x,  x,x,   L8_UINT)
+   SF(90, 90,  x,  x,  x,  x,  x,  x,  x,x,   L8_SINT)
+   SF(90, 90,  x,  x,  x,  x,  x,  x,  x,x,   I8_UINT)
+   SF(90, 90,  x,  x,  x,  x,  x,  x,  x,x,   I8_SINT)
SF(45, 45,  x,  x,  x,  x,  x,  x,  x,x,   DXT1_RGB_SRGB)
SF( Y,  Y,  x,  x,  x,  x,  x,  x,  x,x,   R1_UNORM)
SF( Y,  Y,  x,  Y,  Y,  x,  x,  x, 60,x,   YCRCB_NORMAL)
@@ -261,8 +261,8 @@ static const struct surface_format_info format_info[] = {
SF( Y,  Y,  x,  x,  x,  x,  x,  x,  x,x,   DXT1_RGB)
 /* smpl filt shad CK  RT  AB  VB  SO  color ccs_e */
SF( Y,  Y,  x,  x,  x,  x,  x,  x,  x,x,   FXT1)
-   SF( x,  x,  x,  x,  x,  x,  Y,  x,  x,x,   R8G8B8_UNORM)
-   SF( x,  x,  x,  x,  x,  x,  Y,  x,  x,x,   R8G8B8_SNORM)
+   SF(75, 75,  x,  x,  x,  x,  Y,  x,  x,x,   R8G8B8_UNORM)
+   SF(75, 75,  x,  x,  x,  x,  Y,  x,  x,x,   R8G8B8_SNORM)
SF( x,  x,  x,  x,  x,  x,  Y,  x,  x,x,   R8G8B8_SSCALED)
SF( x,  x,  x,  x,  x,  x,  Y,  x,  x,x,   R8G8B8_USCALED)
SF( x,  x,  x,  x,  x,  x,  Y,  x,  x,x,   R64G64B64A64_FLOAT)
@@ -270,8 +270,8 @@ static const struct surface_format_info format_info[] = {
SF( Y,  Y,  x,  x,  x,  x,  x,  x,  x,x,   BC4_SNORM)
SF( Y,  Y,  x,  x,  x,  x,  x,  x,  x,x,   BC5_SNORM)
SF(50, 50,  x,  x,  x,  x, 60,  x,  x,x,   R16G16B16_FLOAT)
-   SF( x,  x,  x,  x,  x,  x,  Y,  x,  x,x,   R16G16B16_UNORM)
-   SF( x,  x,  x,  x,  x,  x,  Y,  x,  x,x,   R16G16B16_SNORM)
+   SF(75, 75,  x,  x,  x,  x,  Y,  x,  x,x,   R16G16B16_UNORM)
+   SF(75, 75,  x,  x,  x,  x,  Y,  x,  x,x,   R16G16B16_SNORM)
SF( x,  x,  x,  x,  x,  x,  Y,  x,  x,x,   R16G16B16_SSCALED)
SF( x,  x,  x,  x,  x,  x,  Y,  x,  x,x,   R16G16B16_USCALED)
SF(70, 70,  x,  x,  x,  x,  x,  x,  x,x,   BC6H_SF16)
@@ -279,7 +279,7 @@ static const struct surface_format_info format_info[] = {
SF(70, 70,  x,  x,  x,  x,  x,  x,  x,x,   BC7_UNORM_SRGB)
SF(70, 70,  x,  x,  x,  x,  x,  x,  x,x,   BC6H_UF16)
SF( x,  x,  x,  x,  x,  x,  x,  x,  x,x,   PLANAR_420_8)
-   SF( x,  x,  x,  x,  x,  x,  x,  x,  x,x,   R8G8B8_UNORM_SRGB)
+   SF(75, 75,  x,  x,  x,  x,  x,  x,  x,x,   R8G8B8_UNORM_SRGB)
SF(80, 80,  x,  x,  x,  x,  x,  x,  x,x,   ETC1_RGB8)
SF(80, 80,  x,  x,  x,  x,  x,  x,  x,x,   ETC2_RGB8)
SF(80, 80,  x,  x,  x,  x,  x,  x,  x,x,   EAC_R11)
@@ -287,8 +287,8 @@ static const struct surface_format_info format_info[] = {
SF(80, 80,  x,  x,  x,  x,  x,  x,  x,x,   EAC_SIGNED_R11)
SF(80, 80,  x,  x,  x,  x,  x,  x,  x,x,   EAC_SIGNED_RG11)
SF(80, 80,  x,  x,  x,  x,  x,  x,  x,x,   ETC2_SRGB8)
-   SF( x,  x,  x,  x,  x,  x, 75,  x,  x,x,   R16G16B16_UINT)
-   SF( x,  x,  x,  x,  x,  x, 75,  x,  x,x,   R16G16B16_SINT)
+   SF(90, 90,  x,  x,  x,  x, 75,  x,  x,x,   R16G16B16_UINT)
+   SF(90, 90,  x,  x,  x,  x, 75,  x,  x,x,   R16G16B16_SINT)
SF( x,  x,  x,  x,  x,  x, 75,  x,  x,x,   R32_SFIXED)
SF( x,  x,  x,  x,  x,  x, 75,  x,  x,x,   R10G10B10A2_SNORM)
SF( x,  x,  x,  x,  x,  x, 75,  x,  x,x,   R10G10B10A2_USCALED)
@@ -305,8 +305,8 @@ static const struct surface_format_info format_info[] = {
SF(80, 80,  x,  x,  x,  x,  x,  x,  x,x,  

[Mesa-dev] [PATCH 1/5] i965/surface_formats: Don't advertise 8 or 16-bit RGB formats

2016-07-26 Thread Jason Ekstrand
We have implicitly not been advertising these formats, since they were
turned off in the format capabilities table.  We are about to update that
table, and this patch prevents a change in behavior.  The only behavioral
change introduced by this patch is that we no longer advertise support for
R16G16B16_FLOAT, which means that it is now renderable, which seems like a
bonus.  Maybe someday we'll want to change things to start supporting
16-bit RGB formats natively but, at the moment, there's no need.
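
As an illustration of the size test used in the hunk below, here is a
minimal standalone sketch.  The helper name sketch_is_packed_rgb_size is
hypothetical; the patch itself calls _mesa_get_format_bytes() and compares
the result inline.

```c
#include <stdbool.h>

/* Hypothetical helper: true when a format's size in bytes matches a
 * three-channel format with 8-bit channels (3 bytes) or 16-bit channels
 * (6 bytes).  Four-channel and 32-bit-per-channel formats do not match. */
static bool
sketch_is_packed_rgb_size(int format_bytes)
{
   return format_bytes == 3 || format_bytes == 6;
}
```

A 4-byte RGBA8 format or a 12-byte RGB32 format is not caught by this test,
so only the non-power-of-two 8 and 16-bit RGB sizes are skipped.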
---
 src/mesa/drivers/dri/i965/brw_surface_formats.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_surface_formats.c b/src/mesa/drivers/dri/i965/brw_surface_formats.c
index 2543f4b..69d3bd4 100644
--- a/src/mesa/drivers/dri/i965/brw_surface_formats.c
+++ b/src/mesa/drivers/dri/i965/brw_surface_formats.c
@@ -311,6 +311,16 @@ brw_init_surface_formats(struct brw_context *brw)
   if (texture == 0 && format != MESA_FORMAT_RGBA_FLOAT32)
 continue;
 
+  /* Don't advertise 8 and 16-bit RGB formats to core mesa.  This ensures
+   * that they are renderable from an API perspective since core mesa will
+   * fall back to RGBA or RGBX (we can't render to non-power-of-two
+   * formats).  For 8-bit formats, this also keeps us from hitting some
+   * nasty corners in intel_miptree_map_blit if you ever try to map one.
+   */
+  int format_size = _mesa_get_format_bytes(format);
+  if (format_size == 3 || format_size == 6)
+ continue;
+
   if (isl_format_supports_sampling(devinfo, texture) &&
   (isl_format_supports_filtering(devinfo, texture) || is_integer))
 ctx->TextureFormatSupported[format] = true;
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 5/5] isl/state: Add some asserts about format capabilities

2016-07-26 Thread Jason Ekstrand
This keeps invalid surface states from leaking through and potentially
hanging the GPU.  We shouldn't actually be hitting this on a regular basis,
but a helpful assert is better than a hang.
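
A minimal sketch of the check, outside of isl, might look like the
following.  The usage bits, the sketch_caps struct and both function names
are hypothetical stand-ins; the real code queries
isl_format_supports_rendering() and isl_format_supports_sampling().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical usage bits standing in for ISL_SURF_USAGE_*_BIT. */
#define SKETCH_USAGE_RENDER_TARGET (1u << 0)
#define SKETCH_USAGE_TEXTURE       (1u << 1)

/* Hypothetical per-format capabilities for the current device. */
struct sketch_caps {
   bool rendering;
   bool sampling;
};

/* Returns true when the format is usable for the requested view usage.
 * Render-target usage takes priority over texture usage, matching the
 * if / else-if order in the patch. */
static bool
sketch_view_is_valid(unsigned usage, const struct sketch_caps *caps)
{
   if (usage & SKETCH_USAGE_RENDER_TARGET)
      return caps->rendering;
   else if (usage & SKETCH_USAGE_TEXTURE)
      return caps->sampling;
   return true;
}

/* Fails loudly in debug builds instead of emitting a bad surface state. */
static void
sketch_fill_state(unsigned usage, const struct sketch_caps *caps)
{
   assert(sketch_view_is_valid(usage, caps));
   /* ... fill in the surface state here ... */
}
```

Splitting the predicate from the assert is only a choice made for this
sketch so the condition can be exercised on its own; the patch asserts
inline.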
---
 src/intel/isl/isl_surface_state.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/intel/isl/isl_surface_state.c b/src/intel/isl/isl_surface_state.c
index d1c8f17..a30086d 100644
--- a/src/intel/isl/isl_surface_state.c
+++ b/src/intel/isl/isl_surface_state.c
@@ -210,6 +210,11 @@ isl_genX(surf_fill_state_s)(const struct isl_device *dev, void *state,
struct GENX(RENDER_SURFACE_STATE) s = { 0 };
 
s.SurfaceType = get_surftype(info->surf->dim, info->view->usage);
+
+   if (info->view->usage & ISL_SURF_USAGE_RENDER_TARGET_BIT)
+  assert(isl_format_supports_rendering(dev->info, info->view->format));
+   else if (info->view->usage & ISL_SURF_USAGE_TEXTURE_BIT)
+  assert(isl_format_supports_sampling(dev->info, info->view->format));
s.SurfaceFormat = info->view->format;
 
 #if GEN_IS_HASWELL
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH] glsl: fix optimization of discard nested multiple levels

2016-07-26 Thread Nicolai Hähnle
From: Nicolai Hähnle 

The order of optimizations can lead to the conditional discard optimization
being applied twice to the same discard statement. In this case, we must
ensure that both conditions are applied.
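
To make the failure mode concrete, here is a small model in plain C rather
than the GLSL IR classes.  All names (sketch_expr, sketch_discard,
sketch_fold_if) are hypothetical, and the caller supplies storage for the
AND node where the real code allocates an ir_expression with ralloc.

```c
#include <stdbool.h>
#include <stddef.h>

enum sketch_op { SKETCH_LEAF, SKETCH_AND };

/* Hypothetical, tiny boolean expression tree. */
struct sketch_expr {
   enum sketch_op op;
   bool value;                     /* used by SKETCH_LEAF */
   const struct sketch_expr *lhs;  /* used by SKETCH_AND */
   const struct sketch_expr *rhs;  /* used by SKETCH_AND */
};

/* A discard with an optional condition; NULL means unconditional. */
struct sketch_discard {
   const struct sketch_expr *condition;
};

static bool
sketch_eval(const struct sketch_expr *e)
{
   if (e->op == SKETCH_LEAF)
      return e->value;
   return sketch_eval(e->lhs) && sketch_eval(e->rhs);
}

/* Fold "if (if_cond) discard;" into the discard itself.  If the discard
 * already carries a condition from an earlier fold, combine the two with a
 * logical AND instead of overwriting the old one. */
static void
sketch_fold_if(struct sketch_discard *d, const struct sketch_expr *if_cond,
               struct sketch_expr *and_storage)
{
   if (d->condition == NULL) {
      d->condition = if_cond;
   } else {
      and_storage->op = SKETCH_AND;
      and_storage->value = false;
      and_storage->lhs = if_cond;
      and_storage->rhs = d->condition;
      d->condition = and_storage;
   }
}
```

With an inner condition that is false and an outer condition that is true,
folding twice yields a condition that evaluates to false.  Overwriting on
the second fold would leave only the outer condition and the discard would
fire.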

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96762
Cc: mesa-sta...@lists.freedesktop.org
---
 src/compiler/glsl/opt_conditional_discard.cpp | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/opt_conditional_discard.cpp b/src/compiler/glsl/opt_conditional_discard.cpp
index 1ca8803..a27bead 100644
--- a/src/compiler/glsl/opt_conditional_discard.cpp
+++ b/src/compiler/glsl/opt_conditional_discard.cpp
@@ -72,7 +72,14 @@ opt_conditional_discard_visitor::visit_leave(ir_if *ir)
 
/* Move the condition and replace the ir_if with the ir_discard. */
ir_discard *discard = (ir_discard *) ir->then_instructions.head;
-   discard->condition = ir->condition;
+   if (!discard->condition)
+  discard->condition = ir->condition;
+   else {
+  void *ctx = ralloc_parent(ir);
+  discard->condition = new(ctx) ir_expression(ir_binop_logic_and,
+  ir->condition,
+  discard->condition);
+   }
ir->replace_with(discard);
 
progress = true;
-- 
2.7.4



Re: [Mesa-dev] [PATCH v4 01/11] gallium: move pipe_screen destroy into pipe-loader

2016-07-26 Thread Emil Velikov
On 22 July 2016 at 17:22, Rob Herring  wrote:
> In preparation to add reference counting of pipe_screen in the pipe-loader,
> pipe_loader_release needs to destroy the pipe_screen instead of state
> trackers.
>
> Signed-off-by: Rob Herring 
> Cc: Emil Velikov 

> --- a/src/gallium/state_trackers/clover/core/device.cpp
> +++ b/src/gallium/state_trackers/clover/core/device.cpp
> @@ -45,14 +45,12 @@ device::device(clover::platform , 
> pipe_loader_device *ldev) :
> pipe = pipe_loader_create_screen(ldev);
> if (!pipe || !pipe->get_param(pipe, PIPE_CAP_COMPUTE)) {
>if (pipe)
> - pipe->destroy(pipe);
> + pipe_loader_release(, 1);
My C++ is a bit rusty - are we going to end up using device::ldev here
or the one provided by the user ?

-Emil

