date:20151022

Re: [Mesa-dev] [PATCH v2] i965/vec4: track and use independently each flag channel

2015-10-22 Thread Matt Turner

On Mon, Oct 19, 2015 at 10:38 AM, Alejandro Piñeiro
 wrote:
> vec4_live_variables tracks now each flag channel independently, so
> vec4_dead_code_eliminate can update the writemask of null registers,
> based on which component are alive at the moment. This would allow
> vec4_cmod_propagation to optimize out several movs involving null
> registers.
>
> v2: added support to track each flag channel independently at vec4
> live_variables, as v1 assumed that it was already doing it, as
> pointed by Francisco Jerez
> ---
>
> I was tempted to split this commit in two, one for tracking each
> channel independently at vec4_live_variables, and other for using it
> at vec4_dead_code_eliminate, but in the end I concluded that made more
> sense as one commit, as are changes tightly tied.
>
> FWIW, the shader-db numbers for the optimization patch ("i965/vec4:
> adding vec4_cmod_propagation optimization", 2/5) changed, being now
> the following:
>
> total instructions in shared programs: 6240413 -> 6235841 (-0.07%)
> instructions in affected programs: 401933 -> 397361 (-1.14%)
> total loops in shared programs:1979 -> 1979 (0.00%)
> helped:2265
> HURT:  0
>
> That are slightly worse that the numbers for the first version of the
> patch, before adding that extra check. I will use this numbers if this
> patch is approved.
>
> FWIW, without this change on live_variables and dead_code eliminate,
> the numbers on the optimization would be the following ones:
>
> total instructions in shared programs: 6240413 -> 6240253 (-0.00%)
> instructions in affected programs: 18965 -> 18805 (-0.84%)
> total loops in shared programs:1979 -> 1979 (0.00%)
> helped:160
> HURT:
>
>
>  src/mesa/drivers/dri/i965/brw_ir_vec4.h| 26 ++
>  .../dri/i965/brw_vec4_dead_code_eliminate.cpp  | 31 
> +++---
>  .../drivers/dri/i965/brw_vec4_live_variables.cpp   | 14 ++
>  3 files changed, 57 insertions(+), 14 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_ir_vec4.h 
> b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
> index 96dd633..934d7b1 100644
> --- a/src/mesa/drivers/dri/i965/brw_ir_vec4.h
> +++ b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
> @@ -185,6 +185,32 @@ public:
>return predicate || opcode == VS_OPCODE_UNPACK_FLAGS_SIMD4X2;
> }
>
> +   bool reads_flag(unsigned c)
> +   {
> +  if (!reads_flag())
> + return false;
> +
> +  switch(predicate) {

Space between switch and (

> +  case BRW_PREDICATE_NONE:
> + return false;
> + break;

No need for break since it returns early (same applies below).

> +  case BRW_PREDICATE_ALIGN16_REPLICATE_X:
> + return (c == 0 ? true : false);

Simply return c == 0. (same below)

> + break;
> +  case BRW_PREDICATE_ALIGN16_REPLICATE_Y:
> + return (c == 1 ? true : false);
> + break;
> +  case BRW_PREDICATE_ALIGN16_REPLICATE_Z:
> + return (c == 2 ? true : false);
> + break;
> +  case BRW_PREDICATE_ALIGN16_REPLICATE_W:
> + return (c == 3 ? true : false);
> + break;
> +  default:
> + return true;
> +  }
> +   }
> +
> bool writes_flag()
> {
>return (conditional_mod && (opcode != BRW_OPCODE_SEL &&
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp
> index 8fc7a36..d4216fd 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp
> @@ -78,13 +78,19 @@ vec4_visitor::dead_code_eliminate()
>   sizeof(BITSET_WORD));
>
>foreach_inst_in_block_reverse(vec4_instruction, inst, block) {
> - if (inst->dst.file == GRF && !inst->has_side_effects()) {
> + if ((inst->dst.file == GRF && !inst->has_side_effects()) ||
> + (inst->dst.is_null() && inst->writes_flag())){
>  bool result_live[4] = { false };
>
> -for (unsigned i = 0; i < inst->regs_written; i++) {
> -   for (int c = 0; c < 4; c++)
> -  result_live[c] |= BITSET_TEST(
> - live, var_from_reg(alloc, offset(inst->dst, i), c));
> +if (inst->dst.file == GRF) {
> +   for (unsigned i = 0; i < inst->regs_written; i++) {
> +  for (int c = 0; c < 4; c++)
> + result_live[c] |= BITSET_TEST(
> +live, var_from_reg(alloc, offset(inst->dst, i), c));
> +   }
> +} else {
> +   for (unsigned c = 0; c < 4; c++)
> +  result_live[c] = BITSET_TEST(flag_live, c);
>  }
>
>  /* If the instruction can't do writemasking, then it's all or

Below this hunk is the code that changes the opcode to NOP when all

Re: [Mesa-dev] [RFC 07/21] mesa/extensions: Use _mesa_extension_supported()

2015-10-22 Thread Matt Turner

On Thu, Oct 22, 2015 at 11:15 AM, Chad Versace  wrote:
> On Mon 19 Oct 2015, Nanley Chery wrote:
>> From: Nanley Chery 
>>
>> Replace open-coded checks for extension support with
>> _mesa_extension_supported().
>>
>> Signed-off-by: Nanley Chery 
>> ---
>>  src/mesa/main/extensions.c   | 54 
>> 
>>  src/mesa/main/extensions_table.h |  6 ++---
>>  2 files changed, 14 insertions(+), 46 deletions(-)
>>
>> diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c
>> index 7137bc9..5d2eb1d 100644
>> --- a/src/mesa/main/extensions.c
>> +++ b/src/mesa/main/extensions.c
>> @@ -49,8 +49,6 @@ enum {
>> GL  = (1 << API_OPENGL_COMPAT) | (1 << API_OPENGL_CORE),
>> ES1 = 1 << API_OPENGLES,
>> ES2 = 1 << API_OPENGLES2,
>> -   ES3 = 1 << (API_OPENGL_LAST + 1),
>> -   ES31 = 1 << (API_OPENGL_LAST + 2),
>
> I like the removal of the ES3 and ES31 tokens. There are only 4 GL APIs
> (compat, core, es1, and es2), and the enum now better reflects that.

To be clear, they were a kludge to work around Mesa's inability to
expose extensions based on API version. If we're adding that ability,
I agree it's awesome to get rid of these. :)
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 5/8] i965/fs: Avoid scalar destinations in emit_uniformize()

2015-10-22 Thread Kristian Høgsberg Kristensen

The scalar destination registers break copy propagation. Instead compute
the results to a regular register and then reference a component when we
later use the result as a source.

Signed-off-by: Kristian Høgsberg Kristensen 
---
 src/mesa/drivers/dri/i965/brw_fs_builder.h | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_builder.h 
b/src/mesa/drivers/dri/i965/brw_fs_builder.h
index df10a9d..4a2c6d0 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_builder.h
+++ b/src/mesa/drivers/dri/i965/brw_fs_builder.h
@@ -391,13 +391,13 @@ namespace brw {
   emit_uniformize(const src_reg ) const
   {
  const fs_builder ubld = exec_all();
- const dst_reg chan_index = component(vgrf(BRW_REGISTER_TYPE_UD), 0);
- const dst_reg dst = component(vgrf(src.type), 0);
+ const dst_reg chan_index = vgrf(BRW_REGISTER_TYPE_UD);
+ const dst_reg dst = vgrf(src.type);
 
  ubld.emit(SHADER_OPCODE_FIND_LIVE_CHANNEL, chan_index);
- ubld.emit(SHADER_OPCODE_BROADCAST, dst, src, chan_index);
+ ubld.emit(SHADER_OPCODE_BROADCAST, dst, src, component(chan_index, 
0));
 
- return src_reg(dst);
+ return src_reg(component(dst, 0));
   }
 
   /**
-- 
2.6.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 4/8] i965/fs: Don't uniformize surface index twice

2015-10-22 Thread Kristian Høgsberg Kristensen

The emit_untyped_read and emit_untyped_write helpers already uniformize
the surface index argument. No need to do it before calling them.

Signed-off-by: Kristian Høgsberg Kristensen 
---
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 2 --
 1 file changed, 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 5dc63c9..00f200a 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -1511,7 +1511,6 @@ fs_visitor::nir_emit_intrinsic(const fs_builder , 
nir_intrinsic_instr *instr
  surf_index = vgrf(glsl_type::uint_type);
  bld.ADD(surf_index, get_nir_src(instr->src[0]),
  fs_reg(stage_prog_data->binding_table.ssbo_start));
- surf_index = bld.emit_uniformize(surf_index);
 
  /* Assume this may touch any UBO. It would be nice to provide
   * a tighter bound, but the array information is already lowered away.
@@ -1756,7 +1755,6 @@ fs_visitor::nir_emit_intrinsic(const fs_builder , 
nir_intrinsic_instr *instr
  surf_index = vgrf(glsl_type::uint_type);
  bld.ADD(surf_index, get_nir_src(instr->src[1]),
   fs_reg(stage_prog_data->binding_table.ssbo_start));
- surf_index = bld.emit_uniformize(surf_index);
 
  brw_mark_surface_used(prog_data,
stage_prog_data->binding_table.ssbo_start +
-- 
2.6.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 8/8] i965/fs: Allow copy propagating into new surface access opcodes

2015-10-22 Thread Kristian Høgsberg Kristensen

Signed-off-by: Kristian Høgsberg Kristensen 
---
 src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
index 5589716..97e206d 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
@@ -612,6 +612,21 @@ fs_visitor::try_constant_propagate(fs_inst *inst, 
acp_entry *entry)
  }
  break;
 
+  case SHADER_OPCODE_UNTYPED_ATOMIC:
+  case SHADER_OPCODE_UNTYPED_SURFACE_READ:
+  case SHADER_OPCODE_UNTYPED_SURFACE_WRITE:
+  case SHADER_OPCODE_TYPED_ATOMIC:
+  case SHADER_OPCODE_TYPED_SURFACE_READ:
+  case SHADER_OPCODE_TYPED_SURFACE_WRITE:
+ /* We only propagate into the surface argument of the
+  * instruction. Everything else goes through LOAD_PAYLOAD.
+  */
+ if (i == 1) {
+inst->src[i] = val;
+progress = true;
+ }
+ break;
+
   case FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD:
   case SHADER_OPCODE_BROADCAST:
  inst->src[i] = val;
-- 
2.6.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 6/8] i965/fs: Drop offset_reg temporary in ssbo load

2015-10-22 Thread Kristian Høgsberg Kristensen

Now that we don't read each component one-by-one, we don't need the
temoprary vgrf for the offset. More importantly, this register was type
UD while the nir source was type D. This broke copy propagation and left
a redundant MOV in the generated code.

Signed-off-by: Kristian Høgsberg Kristensen 
---
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 00f200a..a82c616 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -1521,13 +1521,11 @@ fs_visitor::nir_emit_intrinsic(const fs_builder , 
nir_intrinsic_instr *instr
   }
 
   /* Get the offset to read from */
-  fs_reg offset_reg = vgrf(glsl_type::uint_type);
-  unsigned const_offset_bytes = 0;
+  fs_reg offset_reg;
   if (has_indirect) {
- bld.MOV(offset_reg, get_nir_src(instr->src[1]));
+ offset_reg = get_nir_src(instr->src[1]);
   } else {
- const_offset_bytes = instr->const_index[0];
- bld.MOV(offset_reg, fs_reg(const_offset_bytes));
+ offset_reg = fs_reg(instr->const_index[0]);
   }
 
   /* Read the vector */
-- 
2.6.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 3/8] i965/fs: Use unsigned immediate 0 when eliminating SHADER_OPCODE_FIND_LIVE_CHANNEL

2015-10-22 Thread Kristian Høgsberg Kristensen

The destination for SHADER_OPCODE_FIND_LIVE_CHANNEL is always a UD
register.  When we replace the opcode with a MOV, make sure we use a UD
immediate 0 so copy propagation doesn't bail because of non-matching
types.

Signed-off-by: Kristian Høgsberg Kristensen 
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 23d3418..695c211 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -2614,7 +2614,7 @@ fs_visitor::eliminate_find_live_channel()
   case SHADER_OPCODE_FIND_LIVE_CHANNEL:
  if (depth == 0) {
 inst->opcode = BRW_OPCODE_MOV;
-inst->src[0] = fs_reg(0);
+inst->src[0] = fs_reg(0u);
 inst->sources = 1;
 inst->force_writemask_all = true;
 progress = true;
-- 
2.6.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 2/8] i965/fs: Read all components of a SSBO field with one send

2015-10-22 Thread Kristian Høgsberg Kristensen

Instead of looping through single-component reads, read all components
in one go.

Reviewed-by: Iago Toral Quiroga 
Reviewed-by: Jordan Justen 
Signed-off-by: Kristian Høgsberg Kristensen 
---
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 25 +++--
 1 file changed, 7 insertions(+), 18 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index acb51c0..5dc63c9 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -1532,24 +1532,13 @@ fs_visitor::nir_emit_intrinsic(const fs_builder , 
nir_intrinsic_instr *instr
   }
 
   /* Read the vector */
-  for (int i = 0; i < instr->num_components; i++) {
- fs_reg read_result = emit_untyped_read(bld, surf_index, offset_reg,
-1 /* dims */, 1 /* size */,
-BRW_PREDICATE_NONE);
- read_result.type = dest.type;
- bld.MOV(dest, read_result);
- dest = offset(dest, bld, 1);
-
- /* Vector components are stored contiguous in memory */
- if (i < instr->num_components) {
-if (!has_indirect) {
-   const_offset_bytes += 4;
-   bld.MOV(offset_reg, fs_reg(const_offset_bytes));
-} else {
-   bld.ADD(offset_reg, offset_reg, brw_imm_ud(4));
-}
- }
-  }
+  fs_reg read_result = emit_untyped_read(bld, surf_index, offset_reg,
+ 1 /* dims */,
+ instr->num_components,
+ BRW_PREDICATE_NONE);
+  read_result.type = dest.type;
+  for (int i = 0; i < instr->num_components; i++)
+ bld.MOV(offset(dest, bld, i), offset(read_result, bld, i));
 
   break;
}
-- 
2.6.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 0/8] SSBO optimizations

2015-10-22 Thread Kristian Høgsberg Kristensen

Here's an updated and expanded ssbo optimization series. I found a bit
of low-hanging fruit around dynamic ssbo array indexing. I removed the
IMM shortcut in emit_uniformize() and added the constant propagation
for the read and write opcodes. The result is the same for constant
indexing, and it helps dynamic indexing a little bit.

There's still a redundant MOV in the dynamic case that I'm going to
punt on for now. We want to copy propagate the scalar surface index
into the send instruction, but that fails as that's considered
!can_do_source_mods(), but the surface index argument isn't part of
the payload and can accept misc strides and modifiers.

I also took a look at ssbo stores and made it write out contiguous
channels in the writemask together, in particular, the common case of
writing a vec4 goes from 4 to 1 write instruction.

Kristian Høgsberg Kristensen (8):
  i965: Don't use message headers for untyped reads
  i965/fs: Read all components of a SSBO field with one send
  i965/fs: Use unsigned immediate 0 when eliminating
SHADER_OPCODE_FIND_LIVE_CHANNEL
  i965/fs: Don't uniformize surface index twice
  i965/fs: Avoid scalar destinations in emit_uniformize()
  i965/fs: Drop offset_reg temporary in ssbo load
  i965/fs: Optimize ssbo stores
  i965/fs: Allow copy propagating into new surface access opcodes

 src/mesa/drivers/dri/i965/brw_eu_emit.c|  3 +-
 src/mesa/drivers/dri/i965/brw_fs.cpp   |  4 +-
 src/mesa/drivers/dri/i965/brw_fs_builder.h |  8 +-
 .../drivers/dri/i965/brw_fs_copy_propagation.cpp   | 15 
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp   | 91 +-
 5 files changed, 58 insertions(+), 63 deletions(-)

-- 
2.6.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] i965/fs: Disable CSE optimization for untyped & typed surface reads

2015-10-22 Thread Jordan Justen

On 2015-10-22 00:06:37, Iago Toral wrote:
> On Wed, 2015-10-21 at 23:24 -0700, Jordan Justen wrote:
> > On 2015-10-20 00:43:13, Iago Toral wrote:
> > > On Tue, 2015-10-20 at 00:12 -0700, Jordan Justen wrote:
> > > > An untyped surface read is volatile because it might be affected by a
> > > > write.
> > > > 
> > > > In the ES31-CTS.compute_shader.resources-max test, two back to back
> > > > read/modify/writes of an SSBO variable looked something like this:
> > > > 
> > > >   r1 = untyped_surface_read(ssbo_float)
> > > >   r2 = r1 + 1
> > > >   untyped_surface_write(ssbo_float, r2)
> > > >   r3 = untyped_surface_read(ssbo_float)
> > > >   r4 = r3 + 1
> > > >   untyped_surface_write(ssbo_float, r4)
> > > > 
> > > > And after CSE, we had:
> > > > 
> > > >   r1 = untyped_surface_read(ssbo_float)
> > > >   r2 = r1 + 1
> > > >   untyped_surface_write(ssbo_float, r2)
> > > >   r4 = r1 + 1
> > > >   untyped_surface_write(ssbo_float, r4)
> > > 
> > > Yeah, we cannot do CSE with SSBO loads. Patch looks good to me, but we
> > > should do the same in the vec4 CSE pass.
> > 
> > Yeah, I checked vec4 CSE. It looks like is_expression will
> > unconditionally return false for those opcodes.
> 
> Oh right.
> 
> > r-b?
> 
> Reviewed-by: Iago Toral Quiroga 
> 
> FWIW, my ssbo load optimization pass is trying to "undo" this since it
> is all about doing CSE for ssbo loads that are safe to CSE, that is,
> when we know that we don't have stores/atomics that write to the same
> offset or memory barriers in between. I am trying to implement that in
> NIR though, so we still need this, to prevent i965 from trying to CSE
> the remaining loads it sees, since thise would not be safe to CSE.
> 
> Also, as I mentioned in another e-mail, we did not notice this issue
> earlier was because there are a couple of problems in i965 that make it
> quite difficult that the CSE pass identifies identical SSBO loads at the
> moment, but that is bound to change as soon as those things get
> eventually fixed.

Yeah. I only saw this after applying some optimization patches from krh.

-Jordan
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/4] st/dri2: Add shared flag to missing locations

2015-10-22 Thread Erik Faye-Lund

On Thu, Oct 22, 2015 at 10:54 AM, Marek Olšák  wrote:
> On Thu, Oct 22, 2015 at 10:22 AM, Erik Faye-Lund  wrote:
>> On Wed, Oct 21, 2015 at 10:34 PM, Marek Olšák  wrote:
>>> On Wed, Oct 21, 2015 at 12:28 PM, Axel Davy  wrote:
 The PIPE_BIND_SHARED flag should be added whenever
 the resource may be shared with another process.

 In particular if the resource is imported, or may
 be exported, the flag should be used.
>>>
>>> This can't be enforced. EGL_MESA_image_dma_buf_export allows exporting
>>> any texture. Mesa can't know in advance if a texture will be exported.
>>
>> Could we not, at least in theory, crate a new texture and blit the old
>> one into it behind the scenes somehow when a texture gets exported?
>
> Sharing means textures are shared. There is no blitting allowed. Other
> users don't have to be notified that a shared texture has been
> rendered to.

Maybe I wasn't clear enough: I meant to migrate the texture data over
to a new shared home, not to create a copy.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] i965/fs: Disable CSE optimization for untyped & typed surface reads

2015-10-22 Thread Iago Toral

On Wed, 2015-10-21 at 23:24 -0700, Jordan Justen wrote:
> On 2015-10-20 00:43:13, Iago Toral wrote:
> > On Tue, 2015-10-20 at 00:12 -0700, Jordan Justen wrote:
> > > An untyped surface read is volatile because it might be affected by a
> > > write.
> > > 
> > > In the ES31-CTS.compute_shader.resources-max test, two back to back
> > > read/modify/writes of an SSBO variable looked something like this:
> > > 
> > >   r1 = untyped_surface_read(ssbo_float)
> > >   r2 = r1 + 1
> > >   untyped_surface_write(ssbo_float, r2)
> > >   r3 = untyped_surface_read(ssbo_float)
> > >   r4 = r3 + 1
> > >   untyped_surface_write(ssbo_float, r4)
> > > 
> > > And after CSE, we had:
> > > 
> > >   r1 = untyped_surface_read(ssbo_float)
> > >   r2 = r1 + 1
> > >   untyped_surface_write(ssbo_float, r2)
> > >   r4 = r1 + 1
> > >   untyped_surface_write(ssbo_float, r4)
> > 
> > Yeah, we cannot do CSE with SSBO loads. Patch looks good to me, but we
> > should do the same in the vec4 CSE pass.
> 
> Yeah, I checked vec4 CSE. It looks like is_expression will
> unconditionally return false for those opcodes.

Oh right.

> r-b?

Reviewed-by: Iago Toral Quiroga 

FWIW, my ssbo load optimization pass is trying to "undo" this since it
is all about doing CSE for ssbo loads that are safe to CSE, that is,
when we know that we don't have stores/atomics that write to the same
offset or memory barriers in between. I am trying to implement that in
NIR though, so we still need this, to prevent i965 from trying to CSE
the remaining loads it sees, since thise would not be safe to CSE.

Also, as I mentioned in another e-mail, we did not notice this issue
earlier was because there are a couple of problems in i965 that make it
quite difficult that the CSE pass identifies identical SSBO loads at the
moment, but that is bound to change as soon as those things get
eventually fixed.

Iago

> -Jordan
> 
> > 
> > > Signed-off-by: Jordan Justen 
> > > ---
> > >  src/mesa/drivers/dri/i965/brw_fs_cse.cpp |  3 ++-
> > >  src/mesa/drivers/dri/i965/brw_shader.cpp | 14 ++
> > >  src/mesa/drivers/dri/i965/brw_shader.h   |  6 ++
> > >  3 files changed, 22 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp 
> > > b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
> > > index c7628dc..3a28c8d 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
> > > +++ b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
> > > @@ -93,7 +93,8 @@ is_expression(const fs_visitor *v, const fs_inst *const 
> > > inst)
> > > case SHADER_OPCODE_LOAD_PAYLOAD:
> > >return !inst->is_copy_payload(v->alloc);
> > > default:
> > > -  return inst->is_send_from_grf() && !inst->has_side_effects();
> > > +  return inst->is_send_from_grf() && !inst->has_side_effects() &&
> > > + !inst->is_volatile();
> > > }
> > >  }
> > >  
> > > diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp 
> > > b/src/mesa/drivers/dri/i965/brw_shader.cpp
> > > index 2324b56..be911ed 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_shader.cpp
> > > +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
> > > @@ -969,6 +969,20 @@ backend_instruction::has_side_effects() const
> > > }
> > >  }
> > >  
> > > +bool
> > > +backend_instruction::is_volatile() const
> > > +{
> > > +   switch (opcode) {
> > > +   case SHADER_OPCODE_UNTYPED_SURFACE_READ:
> > > +   case SHADER_OPCODE_UNTYPED_SURFACE_READ_LOGICAL:
> > > +   case SHADER_OPCODE_TYPED_SURFACE_READ:
> > > +   case SHADER_OPCODE_TYPED_SURFACE_READ_LOGICAL:
> > > +  return true;
> > > +   default:
> > > +  return false;
> > > +   }
> > > +}
> > > +
> > >  #ifndef NDEBUG
> > >  static bool
> > >  inst_is_in_block(const bblock_t *block, const backend_instruction *inst)
> > > diff --git a/src/mesa/drivers/dri/i965/brw_shader.h 
> > > b/src/mesa/drivers/dri/i965/brw_shader.h
> > > index b33b08f..35ee210 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_shader.h
> > > +++ b/src/mesa/drivers/dri/i965/brw_shader.h
> > > @@ -115,6 +115,12 @@ struct backend_instruction : public exec_node {
> > >  * optimize these out unless you know what you are doing.
> > >  */
> > > bool has_side_effects() const;
> > > +
> > > +   /**
> > > +* True if the instruction might be affected by side effects of other
> > > +* instructions.
> > > +*/
> > > +   bool is_volatile() const;
> > >  #else
> > >  struct backend_instruction {
> > > struct exec_node link;
> > 
> > 
> 


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v3 1/7] radeonsi: Allocate buffers for DCC.

2015-10-22 Thread Marek Olšák

On Wed, Oct 21, 2015 at 9:56 AM, Axel Davy  wrote:
> On 21/10/2015 00:10, Bas Nieuwenhuizen wrote:
>>
>>
>> DCC is disabled for textures that can be shared as sharing the
>> DCC buffers has not been implemented yet.
>>
>>
>>   +   surf->dcc_enabled =  !(surf->flags & RADEON_SURF_Z_OR_SBUFFER) &&
>> +!(surf->flags & RADEON_SURF_SCANOUT) &&
>> +!compressed && AddrDccIn.numSamples <= 1;
>> +
>>
>
> Testing if a surface is scanout is not enough to avoid shared surfaces.
>
> In practice, it may be true currently mesa, and glamor via gbm, would use
> the scanout flag for shared
> buffers. It seems however a bit weak to rely on that.
>
> I suggest rather to use the pipe shared bind flag.
>
>  I noticed in some case of imported surfaces the bind flag is not
> advertised, I'm going to send a patch to fix that.

I think that using any BIND flag is weak. The best thing would be to
make some decisions in texture_from_handle and texture_get_handle
functions, not based on flags.

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/2] main: Remove interface block array index for doing the name comparison

2015-10-22 Thread Samuel Iglesias Gonsalvez

From ARB_program_query_interface spec:

"uint GetProgramResourceIndex(uint program, enum programInterface,
   const char *name);
 [...]
 If  exactly matches the name string of one of the active resources
 for , the index of the matched resource is returned.
 Additionally, if  would exactly match the name string of an active
 resource if "[0]" were appended to , the index of the matched
 resource is returned. [...]"

"A string provided to GetProgramResourceLocation or
 GetProgramResourceLocationIndex is considered to match an active variable
 if:
[...]
   * if the string identifies the base name of an active array, where the
 string would exactly match the name of the variable if the suffix
 "[0]" were appended to the string;
[...]
"

But this is only happening on those ARB_program_query_interface's queries.
For the rest of specs we need to keep old behavior. For that reason,
arb_program_interface_query boolean is added to the affected functions.

Fixes the following two dEQP-GLES31 tests:

dEQP-GLES31.functional.program_interface_query.shader_storage_block.resource_list.block_array
dEQP-GLES31.functional.program_interface_query.shader_storage_block.resource_list.block_array_single_element

Signed-off-by: Samuel Iglesias Gonsalvez 
Cc: Tapani Pälli 
---
 src/mesa/main/program_resource.c |  6 ++---
 src/mesa/main/shader_query.cpp   | 58 +++-
 src/mesa/main/shaderapi.c|  4 +--
 src/mesa/main/shaderapi.h|  9 ---
 src/mesa/main/uniforms.c |  6 ++---
 5 files changed, 60 insertions(+), 23 deletions(-)

diff --git a/src/mesa/main/program_resource.c b/src/mesa/main/program_resource.c
index eb71fdd..d029926 100644
--- a/src/mesa/main/program_resource.c
+++ b/src/mesa/main/program_resource.c
@@ -251,7 +251,7 @@ _mesa_GetProgramResourceIndex(GLuint program, GLenum 
programInterface,
case GL_UNIFORM_BLOCK:
case GL_SHADER_STORAGE_BLOCK:
   res = _mesa_program_resource_find_name(shProg, programInterface, name,
- _index);
+ _index, true);
   if (!res || array_index > 0)
  return GL_INVALID_INDEX;
 
@@ -377,7 +377,7 @@ _mesa_GetProgramResourceLocation(GLuint program, GLenum 
programInterface,
  goto fail;
}
 
-   return _mesa_program_resource_location(shProg, programInterface, name);
+   return _mesa_program_resource_location(shProg, programInterface, name, 
true);
 fail:
_mesa_error(ctx, GL_INVALID_ENUM, "glGetProgramResourceLocation(%s %s)",
_mesa_enum_to_string(programInterface), name);
@@ -411,5 +411,5 @@ _mesa_GetProgramResourceLocationIndex(GLuint program, 
GLenum programInterface,
}
 
return _mesa_program_resource_location_index(shProg, GL_PROGRAM_OUTPUT,
-name);
+name, true);
 }
diff --git a/src/mesa/main/shader_query.cpp b/src/mesa/main/shader_query.cpp
index 8182d3d..49766ca 100644
--- a/src/mesa/main/shader_query.cpp
+++ b/src/mesa/main/shader_query.cpp
@@ -218,7 +218,7 @@ _mesa_GetAttribLocation(GLhandleARB program, const 
GLcharARB * name)
unsigned array_index = 0;
struct gl_program_resource *res =
   _mesa_program_resource_find_name(shProg, GL_PROGRAM_INPUT, name,
-   _index);
+   _index, false);
 
if (!res)
   return -1;
@@ -367,7 +367,7 @@ _mesa_GetFragDataIndex(GLuint program, const GLchar *name)
   return -1;
 
return _mesa_program_resource_location_index(shProg, GL_PROGRAM_OUTPUT,
-name);
+name, false);
 }
 
 GLint GLAPIENTRY
@@ -404,7 +404,7 @@ _mesa_GetFragDataLocation(GLuint program, const GLchar 
*name)
unsigned array_index = 0;
struct gl_program_resource *res =
   _mesa_program_resource_find_name(shProg, GL_PROGRAM_OUTPUT, name,
-   _index);
+   _index, false);
 
if (!res)
   return -1;
@@ -533,9 +533,12 @@ valid_array_index(const GLchar *name, unsigned 
*array_index)
 struct gl_program_resource *
 _mesa_program_resource_find_name(struct gl_shader_program *shProg,
  GLenum programInterface, const char *name,
- unsigned *array_index)
+ unsigned *array_index,
+ bool arb_program_interface_query)
 {
struct gl_program_resource *res = shProg->ProgramResourceList;
+   const char *name_first_square_bracket = strchr(name, '[');
+
for (unsigned i = 0; i < shProg->NumProgramResourceList; i++, res++) {
   if (res->Type != programInterface)
  continue;
@@ -543,6 +546,33 @@

[Mesa-dev] [PATCH 2/2] glsl: fix GL_BUFFER_DATA_SIZE value for shader storage blocks with unsize arrays

2015-10-22 Thread Samuel Iglesias Gonsalvez

From ARB_program_interface_query:

"For the property of BUFFER_DATA_SIZE, then the implementation-dependent
 minimum total buffer object size, in basic machine units, required to hold
 all active variables associated with an active uniform block, shader
 storage block, or atomic counter buffer is written to .  If the
 final member of an active shader storage block is array with no declared
 size, the minimum buffer size is computed assuming the array was declared
 as an array with one element."

Fixes the following dEQP-GLES31 tests:

dEQP-GLES31.functional.program_interface_query.shader_storage_block.buffer_data_size.named_block
dEQP-GLES31.functional.program_interface_query.shader_storage_block.buffer_data_size.unnamed_block
dEQP-GLES31.functional.program_interface_query.shader_storage_block.buffer_data_size.block_array

Signed-off-by: Samuel Iglesias Gonsalvez 
---
 src/glsl/link_uniform_blocks.cpp | 15 ---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/src/glsl/link_uniform_blocks.cpp b/src/glsl/link_uniform_blocks.cpp
index 5285d8d..a10b44b 100644
--- a/src/glsl/link_uniform_blocks.cpp
+++ b/src/glsl/link_uniform_blocks.cpp
@@ -130,13 +130,22 @@ private:
 
   unsigned alignment = 0;
   unsigned size = 0;
-
+  /* From ARB_program_interface_query:
+   * "If the final member of an active shader storage block is array with
+   * no declared size, the minimum buffer size is computed assuming the
+   * array was declared as an array with one element."
+   *
+   * For that reason, we use the base type of the unsized array to 
calculate
+   * its size.
+   */
+  const glsl_type *type_for_size =
+ type->is_unsized_array() ? type->without_array() : type;
   if (packing == GLSL_INTERFACE_PACKING_STD430) {
  alignment = type->std430_base_alignment(v->RowMajor);
- size = type->std430_size(v->RowMajor);
+ size = type_for_size->std430_size(v->RowMajor);
   } else {
  alignment = type->std140_base_alignment(v->RowMajor);
- size = type->std140_size(v->RowMajor);
+ size = type_for_size->std140_size(v->RowMajor);
   }
 
   this->offset = glsl_align(this->offset, alignment);
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] i965/fs: Disable CSE optimization for untyped & typed surface reads

2015-10-22 Thread Jordan Justen

On 2015-10-20 00:43:13, Iago Toral wrote:
> On Tue, 2015-10-20 at 00:12 -0700, Jordan Justen wrote:
> > An untyped surface read is volatile because it might be affected by a
> > write.
> > 
> > In the ES31-CTS.compute_shader.resources-max test, two back to back
> > read/modify/writes of an SSBO variable looked something like this:
> > 
> >   r1 = untyped_surface_read(ssbo_float)
> >   r2 = r1 + 1
> >   untyped_surface_write(ssbo_float, r2)
> >   r3 = untyped_surface_read(ssbo_float)
> >   r4 = r3 + 1
> >   untyped_surface_write(ssbo_float, r4)
> > 
> > And after CSE, we had:
> > 
> >   r1 = untyped_surface_read(ssbo_float)
> >   r2 = r1 + 1
> >   untyped_surface_write(ssbo_float, r2)
> >   r4 = r1 + 1
> >   untyped_surface_write(ssbo_float, r4)
> 
> Yeah, we cannot do CSE with SSBO loads. Patch looks good to me, but we
> should do the same in the vec4 CSE pass.

Yeah, I checked vec4 CSE. It looks like is_expression will
unconditionally return false for those opcodes.

r-b?

-Jordan

> 
> > Signed-off-by: Jordan Justen 
> > ---
> >  src/mesa/drivers/dri/i965/brw_fs_cse.cpp |  3 ++-
> >  src/mesa/drivers/dri/i965/brw_shader.cpp | 14 ++
> >  src/mesa/drivers/dri/i965/brw_shader.h   |  6 ++
> >  3 files changed, 22 insertions(+), 1 deletion(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp 
> > b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
> > index c7628dc..3a28c8d 100644
> > --- a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
> > @@ -93,7 +93,8 @@ is_expression(const fs_visitor *v, const fs_inst *const 
> > inst)
> > case SHADER_OPCODE_LOAD_PAYLOAD:
> >return !inst->is_copy_payload(v->alloc);
> > default:
> > -  return inst->is_send_from_grf() && !inst->has_side_effects();
> > +  return inst->is_send_from_grf() && !inst->has_side_effects() &&
> > + !inst->is_volatile();
> > }
> >  }
> >  
> > diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp 
> > b/src/mesa/drivers/dri/i965/brw_shader.cpp
> > index 2324b56..be911ed 100644
> > --- a/src/mesa/drivers/dri/i965/brw_shader.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
> > @@ -969,6 +969,20 @@ backend_instruction::has_side_effects() const
> > }
> >  }
> >  
> > +bool
> > +backend_instruction::is_volatile() const
> > +{
> > +   switch (opcode) {
> > +   case SHADER_OPCODE_UNTYPED_SURFACE_READ:
> > +   case SHADER_OPCODE_UNTYPED_SURFACE_READ_LOGICAL:
> > +   case SHADER_OPCODE_TYPED_SURFACE_READ:
> > +   case SHADER_OPCODE_TYPED_SURFACE_READ_LOGICAL:
> > +  return true;
> > +   default:
> > +  return false;
> > +   }
> > +}
> > +
> >  #ifndef NDEBUG
> >  static bool
> >  inst_is_in_block(const bblock_t *block, const backend_instruction *inst)
> > diff --git a/src/mesa/drivers/dri/i965/brw_shader.h 
> > b/src/mesa/drivers/dri/i965/brw_shader.h
> > index b33b08f..35ee210 100644
> > --- a/src/mesa/drivers/dri/i965/brw_shader.h
> > +++ b/src/mesa/drivers/dri/i965/brw_shader.h
> > @@ -115,6 +115,12 @@ struct backend_instruction : public exec_node {
> >  * optimize these out unless you know what you are doing.
> >  */
> > bool has_side_effects() const;
> > +
> > +   /**
> > +* True if the instruction might be affected by side effects of other
> > +* instructions.
> > +*/
> > +   bool is_volatile() const;
> >  #else
> >  struct backend_instruction {
> > struct exec_node link;
> 
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] glsl: fix shader storage block member rules when adding program resources

2015-10-22 Thread Samuel Iglesias Gonsálvez



On 22/10/15 08:29, Timothy Arceri wrote:
> On Wed, 2015-10-21 at 12:18 +0200, Samuel Iglesias Gonsalvez wrote:
>> Commit f24e5e did not take into account arrays of named shader
>> storage blocks.
>>
>> Fixes 20 dEQP-GLES31.functional.ssbo.* tests:
>>
>> dEQP
>> -GLES31.functional.ssbo.layout.single_struct_array.per_block_buffer.s
>> hared_instance_array
>> dEQP
>> -GLES31.functional.ssbo.layout.single_struct_array.per_block_buffer.p
>> acked_instance_array
>> dEQP
>> -GLES31.functional.ssbo.layout.single_struct_array.per_block_buffer.s
>> td140_instance_array
>> dEQP
>> -GLES31.functional.ssbo.layout.single_struct_array.per_block_buffer.s
>> td430_instance_array
>> dEQP
>> -GLES31.functional.ssbo.layout.single_struct_array.single_buffer.shar
>> ed_instance_array
>> dEQP
>> -GLES31.functional.ssbo.layout.single_struct_array.single_buffer.pack
>> ed_instance_array
>> dEQP
>> -GLES31.functional.ssbo.layout.single_struct_array.single_buffer.std1
>> 40_instance_array
>> dEQP
>> -GLES31.functional.ssbo.layout.single_struct_array.single_buffer.std4
>> 30_instance_array
>> dEQP
>> -GLES31.functional.ssbo.layout.single_nested_struct_array.per_block_b
>> uffer.shared_instance_array
>> dEQP
>> -GLES31.functional.ssbo.layout.single_nested_struct_array.per_block_b
>> uffer.packed_instance_array
>> dEQP
>> -GLES31.functional.ssbo.layout.single_nested_struct_array.per_block_b
>> uffer.std140_instance_array
>> dEQP
>> -GLES31.functional.ssbo.layout.single_nested_struct_array.per_block_b
>> uffer.std430_instance_array
>> dEQP
>> -GLES31.functional.ssbo.layout.single_nested_struct_array.single_buff
>> er.shared_instance_array
>> dEQP
>> -GLES31.functional.ssbo.layout.single_nested_struct_array.single_buff
>> er.packed_instance_array
>> dEQP
>> -GLES31.functional.ssbo.layout.single_nested_struct_array.single_buff
>> er.std140_instance_array
>> dEQP
>> -GLES31.functional.ssbo.layout.single_nested_struct_array.single_buff
>> er.std430_instance_array
>> dEQP-GLES31.functional.ssbo.layout.random.all_per_block_buffers.2
>> dEQP-GLES31.functional.ssbo.layout.random.all_per_block_buffers.29
>> dEQP-GLES31.functional.ssbo.layout.random.all_per_block_buffers.33
>> dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.3
>>
>> Signed-off-by: Samuel Iglesias Gonsalvez 
>> ---
>>  src/glsl/linker.cpp | 34 +-
>>  1 file changed, 29 insertions(+), 5 deletions(-)
>>
>> diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
>> index 07ea0e0..6593e58 100644
>> --- a/src/glsl/linker.cpp
>> +++ b/src/glsl/linker.cpp
>> @@ -3138,6 +3138,9 @@ should_add_buffer_variable(struct
>> gl_shader_program *shProg,
>>  {
>> bool found_interface = false;
>> const char *block_name = NULL;
>> +   unsigned block_name_len = 0;
>> +   const char *first_dot = strchr(name, '.');
> 
> Maybe call this block_name_dot rather than reusing first_dot
> 
>> +   const char *first_square_bracket = strchr(name, '[');
>>  
>> /* These rules only apply to buffer variables. So we return
>>  * true for the rest of types.
>> @@ -3147,7 +3150,27 @@ should_add_buffer_variable(struct
>> gl_shader_program *shProg,
>>  
>> for (unsigned i = 0; i < shProg->NumBufferInterfaceBlocks; i++) {
>>block_name = shProg->BufferInterfaceBlocks[i].Name;
>> -  if (strncmp(block_name, name, strlen(block_name)) == 0) {
>> +  block_name_len = strlen(block_name);
>> +
>> +  const char *first_block_square_bracket = strchr(block_name,
>> '[');
>> +  if (first_block_square_bracket) {
>> + /* The block is part of an array of named interfaces,
>> +  * for the name comparison we ignore the "[x]" part.
>> +  */
>> + block_name_len -= strlen(first_block_square_bracket);
>> +  }
>> +
>> +  if (first_dot) {
>> + /* Check if the variable name starts with the interface
>> +  * name. The interface name (if present) should have the
>> +  * length than the interface block name we are comparing
>> to.
>> +  */
>> + unsigned len = strlen(name) - strlen(first_dot);
>> + if (len != block_name_len)
>> +continue;
>> +  }
>> +
>> +  if (strncmp(block_name, name, block_name_len) == 0) {
>>   found_interface = true;
>>   break;
>>}
>> @@ -3156,8 +3179,11 @@ should_add_buffer_variable(struct
>> gl_shader_program *shProg,
>> /* We remove the interface name from the buffer variable name,
>>  * including the dot that follows it.
>>  */
>> -   if (found_interface)
>> -  name = name + strlen(block_name) + 1;
>> +   if (found_interface) {
>> +  name = name + block_name_len + 1;
>> +  first_dot = strchr(name, '.');
>> +  first_square_bracket = strchr(name, '[');
>> +   }
>>  
>> /* From: ARB_program_interface_query extension:
>>  *
>> @@ -3166,8 +3192,6 @@ should_add_buffer_variable(struct
>> gl_shader_program *shProg,
>>  *   of its type.  For arrays of

Re: [Mesa-dev] [PATCH] glsl: remove excess location qualifier validation

2015-10-22 Thread Timothy Arceri

On Thu, 2015-10-22 at 09:56 +0300, Tapani Pälli wrote:
> On 10/22/2015 09:41 AM, Timothy Arceri wrote:
> > On Thu, 2015-10-22 at 08:55 +0300, Tapani Pälli wrote:
> > > On 10/22/2015 08:29 AM, Timothy Arceri wrote:
> > > > Location has never been able to be a negative value because it
> > > > has
> > > > always been validated in the parser.
> > > > 
> > > > Also the linker doesn't check for negatives like the comment
> > > > claims.
> > > Neither does the parser, if one utilizes negative explicit
> > > location,
> > > parser says:
> > > 
> > > error: syntax error, unexpected '-', expecting INTCONSTANT or
> > > UINTCONSTANT
> > > 
> > > I'm not sure if this is quite OK, it should rather accept the
> > > negative
> > > value and the fail here in this check you are about to remove?
> > Yes I did noticed this. However even if we changed it to accept a
> > negative value the is another check in the parser that would catch
> > this
> > before the checks I'm removing here.
> > 
> > if ($3 >= 0) {
> > $$.location = $3;
> > } else {
> >  _mesa_glsl_error(& @3, state, "invalid location %d specified",
> > $3);
> >   YYERROR;
> > }
> 
> OK, then removing the check for uniform loc should be fine. For the 
> attributes change, I'm not sure yet, I believe the reason for the 
> 'silliness' (mentioned in the commit) is that all the built-in 
> attributes have negative locations, we need to be careful to continue
> dealing gracefully with that.

This function is only valdating the qualifier, builtins don't seem to
set the qualifier flags in the ast, as far as I can tell they set the
fields in the ir directly.

For location the only place that sets the explicit location flag is the
parser and this validation function is only called when that is set, so
I don't believe this should be a problem.

   if (qual->flags.q.explicit_location) {
  validate_explicit_location(qual, var, state, loc);
   }

> 
> > I'm also working on ARB_enhanced_layouts which will allow negative
> > values to get passed the parser and will be moving the check out of
> > the
> > parser and into the ast.
> > 
> > This patch is a clean-up before doing that, as the attributes code
> > doesn't do the validation at all and the one for uniforms should be
> > shared with the attibutes code.
> > 
> > > > ---
> > > > 
> > > >No piglit regressions and an extra negative test sent for
> > > >ARB_explicit_uniform_location [1]
> > > > 
> > > >[1] http://patchwork.freedesktop.org/patch/62573/
> > > > 
> > > >src/glsl/ast_to_hir.cpp | 70 ---
> > > > -
> > > > -
> > > >1 file changed, 22 insertions(+), 48 deletions(-)
> > > > 
> > > > diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
> > > > index 8549d55..0306530 100644
> > > > --- a/src/glsl/ast_to_hir.cpp
> > > > +++ b/src/glsl/ast_to_hir.cpp
> > > > @@ -2422,21 +2422,6 @@ validate_explicit_location(const struct
> > > > ast_type_qualifier *qual,
> > > >  const struct gl_context *const ctx = state->ctx;
> > > >  unsigned max_loc = qual->location + var->type
> > > > ->uniform_locations() - 1;
> > > >
> > > > -  /* ARB_explicit_uniform_location specification states:
> > > > -   *
> > > > -   * "The explicitly defined locations and the
> > > > generated
> > > > locations
> > > > -   * must be in the range of 0 to
> > > > MAX_UNIFORM_LOCATIONS
> > > > minus one."
> > > > -   *
> > > > -   * "Valid locations for default-block uniform
> > > > variable
> > > > locations
> > > > -   * are in the range of 0 to the implementation
> > > > -defined
> > > > maximum
> > > > -   * number of uniform locations."
> > > > -   */
> > > > -  if (qual->location < 0) {
> > > > - _mesa_glsl_error(loc, state,
> > > > -  "explicit location < 0 for uniform
> > > > %s",
> > > > var->name);
> > > > - return;
> > > > -  }
> > > > -
> > > >  if (max_loc >= ctx
> > > > ->Const.MaxUserAssignableUniformLocations) {
> > > > _mesa_glsl_error(loc, state, "location(s) consumed
> > > > by
> > > > uniform %s "
> > > >  ">= MAX_UNIFORM_LOCATIONS (%u)",
> > > > var
> > > > ->name,
> > > > @@ -2527,41 +2512,30 @@ validate_explicit_location(const struct
> > > > ast_type_qualifier *qual,
> > > >   } else {
> > > >  var->data.explicit_location = true;
> > > >
> > > > -  /* This bit of silliness is needed because invalid
> > > > explicit
> > > > locations
> > > > -   * are supposed to be flagged during linking.  Small
> > > > negative values
> > > > -   * biased by VERT_ATTRIB_GENERIC0 or FRAG_RESULT_DATA0
> > > > could
> > > > alias
> > > > -   * built-in values (e.g., -16+VERT_ATTRIB_GENERIC0 =
> > > > VERT_ATTRIB_POS).
> > > > -   * The linker needs to be able to differentiate these
> > > > cases.
> > > >This
> > > > -   * ensures that negative values

Re: [Mesa-dev] [PATCH 2/4] st/dri2: Add shared flag to missing locations

2015-10-22 Thread Marek Olšák

On Thu, Oct 22, 2015 at 10:22 AM, Erik Faye-Lund  wrote:
> On Wed, Oct 21, 2015 at 10:34 PM, Marek Olšák  wrote:
>> On Wed, Oct 21, 2015 at 12:28 PM, Axel Davy  wrote:
>>> The PIPE_BIND_SHARED flag should be added whenever
>>> the resource may be shared with another process.
>>>
>>> In particular if the resource is imported, or may
>>> be exported, the flag should be used.
>>
>> This can't be enforced. EGL_MESA_image_dma_buf_export allows exporting
>> any texture. Mesa can't know in advance if a texture will be exported.
>
> Could we not, at least in theory, crate a new texture and blit the old
> one into it behind the scenes somehow when a texture gets exported?

Sharing means textures are shared. There is no blitting allowed. Other
users don't have to be notified that a shared texture has been
rendered to.

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] glsl: remove excess location qualifier validation

2015-10-22 Thread Tapani Pälli


On 10/22/2015 09:41 AM, Timothy Arceri wrote:

On Thu, 2015-10-22 at 08:55 +0300, Tapani Pälli wrote:

On 10/22/2015 08:29 AM, Timothy Arceri wrote:

Location has never been able to be a negative value because it has
always been validated in the parser.

Also the linker doesn't check for negatives like the comment
claims.

Neither does the parser, if one utilizes negative explicit location,
parser says:

error: syntax error, unexpected '-', expecting INTCONSTANT or
UINTCONSTANT

I'm not sure if this is quite OK, it should rather accept the
negative
value and the fail here in this check you are about to remove?

Yes I did noticed this. However even if we changed it to accept a
negative value the is another check in the parser that would catch this
before the checks I'm removing here.

if ($3 >= 0) {
$$.location = $3;
} else {
 _mesa_glsl_error(& @3, state, "invalid location %d specified", $3);
  YYERROR;
}


OK, then removing the check for uniform loc should be fine. For the 
attributes change, I'm not sure yet, I believe the reason for the 
'silliness' (mentioned in the commit) is that all the built-in 
attributes have negative locations, we need to be careful to continue 
dealing gracefully with that.



I'm also working on ARB_enhanced_layouts which will allow negative
values to get passed the parser and will be moving the check out of the
parser and into the ast.

This patch is a clean-up before doing that, as the attributes code
doesn't do the validation at all and the one for uniforms should be
shared with the attibutes code.


---

   No piglit regressions and an extra negative test sent for
   ARB_explicit_uniform_location [1]

   [1] http://patchwork.freedesktop.org/patch/62573/

   src/glsl/ast_to_hir.cpp | 70 
-
   1 file changed, 22 insertions(+), 48 deletions(-)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 8549d55..0306530 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -2422,21 +2422,6 @@ validate_explicit_location(const struct
ast_type_qualifier *qual,
 const struct gl_context *const ctx = state->ctx;
 unsigned max_loc = qual->location + var->type
->uniform_locations() - 1;
   
-  /* ARB_explicit_uniform_location specification states:

-   *
-   * "The explicitly defined locations and the generated
locations
-   * must be in the range of 0 to MAX_UNIFORM_LOCATIONS
minus one."
-   *
-   * "Valid locations for default-block uniform variable
locations
-   * are in the range of 0 to the implementation-defined
maximum
-   * number of uniform locations."
-   */
-  if (qual->location < 0) {
- _mesa_glsl_error(loc, state,
-  "explicit location < 0 for uniform %s",
var->name);
- return;
-  }
-
 if (max_loc >= ctx
->Const.MaxUserAssignableUniformLocations) {
_mesa_glsl_error(loc, state, "location(s) consumed by
uniform %s "
 ">= MAX_UNIFORM_LOCATIONS (%u)", var
->name,
@@ -2527,41 +2512,30 @@ validate_explicit_location(const struct
ast_type_qualifier *qual,
  } else {
 var->data.explicit_location = true;
   
-  /* This bit of silliness is needed because invalid explicit

locations
-   * are supposed to be flagged during linking.  Small
negative values
-   * biased by VERT_ATTRIB_GENERIC0 or FRAG_RESULT_DATA0 could
alias
-   * built-in values (e.g., -16+VERT_ATTRIB_GENERIC0 =
VERT_ATTRIB_POS).
-   * The linker needs to be able to differentiate these cases.
   This
-   * ensures that negative values stay negative.
-   */
-  if (qual->location >= 0) {
- switch (state->stage) {
- case MESA_SHADER_VERTEX:
-var->data.location = (var->data.mode ==
ir_var_shader_in)
-   ? (qual->location + VERT_ATTRIB_GENERIC0)
-   : (qual->location + VARYING_SLOT_VAR0);
-break;
+  switch (state->stage) {
+  case MESA_SHADER_VERTEX:
+ var->data.location = (var->data.mode == ir_var_shader_in)
+? (qual->location + VERT_ATTRIB_GENERIC0)
+: (qual->location + VARYING_SLOT_VAR0);
+ break;
   
- case MESA_SHADER_TESS_CTRL:

- case MESA_SHADER_TESS_EVAL:
- case MESA_SHADER_GEOMETRY:
-if (var->data.patch)
-   var->data.location = qual->location +
VARYING_SLOT_PATCH0;
-else
-   var->data.location = qual->location +
VARYING_SLOT_VAR0;
-break;
+  case MESA_SHADER_TESS_CTRL:
+  case MESA_SHADER_TESS_EVAL:
+  case MESA_SHADER_GEOMETRY:
+ if (var->data.patch)
+var->data.location = qual->location +
VARYING_SLOT_PATCH0;
+ else
+var->data.location = qual->location +
VARYING_SLOT_VAR0;
+ break;
   
- case MESA_SHADER_FRAGMENT:

-var->data.location =

Re: [Mesa-dev] [RFC 10/21] mesa: Remove equality check in helper functions

2015-10-22 Thread Erik Faye-Lund

On Tue, Oct 20, 2015 at 12:44 AM, Nanley Chery  wrote:
> From: Nanley Chery 
>
> Since the version numbers being compared are integral and we don't ever
> expect gl_context::Version to be equal to 0, subtract 1 from the rhs of
> the equation and perform the optimization of removing the equality check.

Is this really an improvement? Changing '>=' to '>' and then making
all versions off-by-one seems like trading less readable and debugable
code for no real upside...
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] i965/vec4: Initialize LOD to 0.0f for textureQueryLevels() and texture().

2015-10-22 Thread Iago Toral

Reviewed-by: Iago Toral Quiroga 

On Wed, 2015-10-21 at 12:30 -0700, Matt Turner wrote:
> We implement textureQueryLevels (which takes no arguments, save the
> sampler) using the resinfo message (which takes an argument of LOD).
> Without initializing it, we'd generate a MOV from the null register to
> load the LOD argument.
> 
> Essentially the same logic applies to texture. A vertex shader cannot
> compute derivatives and so cannot produce an LOD, so TXL with an LOD of
> 0.0 is used.
> ---
>  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 12 
>  1 file changed, 12 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> index c39f97e..b8f90f2 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> @@ -882,6 +882,18 @@ vec4_visitor::emit_texture(ir_texture_opcode op,
> uint32_t sampler,
> src_reg sampler_reg)
>  {
> +   /* The sampler can only meaningfully compute LOD for fragment shader
> +* messages. For all other stages, we change the opcode to TXL and 
> hardcode
> +* the LOD to 0.
> +*
> +* textureQueryLevels() is implemented in terms of TXS so we need to pass 
> a
> +* valid LOD argument.
> +*/
> +   if (op == ir_tex || op == ir_query_levels) {
> +  assert(lod.file == BAD_FILE);
> +  lod = src_reg(0.0f);
> +   }
> +
> enum opcode opcode;
> switch (op) {
> case ir_tex: opcode = SHADER_OPCODE_TXL; break;


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [RFC 07/21] mesa/extensions: Use _mesa_extension_supported()

2015-10-22 Thread Chad Versace

On Mon 19 Oct 2015, Nanley Chery wrote:
> From: Nanley Chery 
> 
> Replace open-coded checks for extension support with
> _mesa_extension_supported().
> 
> Signed-off-by: Nanley Chery 
> ---
>  src/mesa/main/extensions.c   | 54 
> 
>  src/mesa/main/extensions_table.h |  6 ++---
>  2 files changed, 14 insertions(+), 46 deletions(-)
> 
> diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c
> index 7137bc9..5d2eb1d 100644
> --- a/src/mesa/main/extensions.c
> +++ b/src/mesa/main/extensions.c
> @@ -49,8 +49,6 @@ enum {
> GL  = (1 << API_OPENGL_COMPAT) | (1 << API_OPENGL_CORE),
> ES1 = 1 << API_OPENGLES,
> ES2 = 1 << API_OPENGLES2,
> -   ES3 = 1 << (API_OPENGL_LAST + 1),
> -   ES31 = 1 << (API_OPENGL_LAST + 2),

I like the removal of the ES3 and ES31 tokens. There are only 4 GL APIs
(compat, core, es1, and es2), and the enum now better reflects that.

[...]

> @@ -570,25 +558,15 @@ _mesa_make_extension_string(struct gl_context *ctx)
>  GLuint
>  _mesa_get_extension_count(struct gl_context *ctx)
>  {

[...]

> /* only count once */
> if (ctx->Extensions.Count != 0)
>return ctx->Extensions.Count;

Ouch... that looks like a race condition that needs fixing.

> diff --git a/src/mesa/main/extensions_table.h 
> b/src/mesa/main/extensions_table.h
> index 24a0908..fecb402 100644
> --- a/src/mesa/main/extensions_table.h
> +++ b/src/mesa/main/extensions_table.h
> @@ -167,7 +167,7 @@ EXT(EXT_rescale_normal  , dummy_true
>  EXT(EXT_secondary_color , dummy_true 
> , GLL,  0,  0,  0,  0, 1999)
>  EXT(EXT_separate_shader_objects , dummy_true 
> ,ES2 ,  0,  0,  0,  0, 2013)
>  EXT(EXT_separate_specular_color , dummy_true 
> , GLL,  0,  0,  0,  0, 1997)
> -EXT(EXT_shader_integer_mix  , EXT_shader_integer_mix 
> , GL   | ES3 ,  0,  0,  0,  0, 2013)
> +EXT(EXT_shader_integer_mix  , EXT_shader_integer_mix 
> , GL   | ES2 ,  0,  0,  0, 30, 2013)
>  EXT(EXT_shadow_funcs, ARB_shadow 
> , GLL,  0,  0,  0,  0, 2002)
>  EXT(EXT_stencil_two_side, EXT_stencil_two_side   
> , GLL,  0,  0,  0,  0, 2001)
>  EXT(EXT_stencil_wrap, dummy_true 
> , GLL,  0,  0,  0,  0, 2002)
> @@ -206,7 +206,7 @@ EXT(EXT_transform_feedback  , 
> EXT_transform_feedback
>  EXT(EXT_unpack_subimage , dummy_true 
> ,ES2 ,  0,  0,  0,  0, 2011)
>  EXT(EXT_vertex_array_bgra   , EXT_vertex_array_bgra  
> , GL ,  0,  0,  0,  0, 2008)
>  EXT(EXT_vertex_array, dummy_true 
> , GLL,  0,  0,  0,  0, 1995)
> -EXT(EXT_color_buffer_float  , dummy_true 
> ,ES3 ,  0,  0,  0,  0, 2013)
> +EXT(EXT_color_buffer_float  , dummy_true 
> ,ES2 ,  0,  0,  0, 30, 2013)
>  
>  
>  EXT(OES_blend_equation_separate , EXT_blend_equation_separate
> ,  ES1   ,  0,  0,  0,  0, 2009)
> @@ -250,7 +250,7 @@ EXT(OES_texture_float_linear, 
> OES_texture_float_linear
>  EXT(OES_texture_half_float  , OES_texture_half_float 
> ,ES2 ,  0,  0,  0,  0, 2005)
>  EXT(OES_texture_half_float_linear   , OES_texture_half_float_linear  
> ,ES2 ,  0,  0,  0,  0, 2005)
>  EXT(OES_texture_mirrored_repeat , dummy_true 
> ,  ES1   ,  0,  0,  0,  0, 2005)
> -EXT(OES_texture_storage_multisample_2d_array, ARB_texture_multisample
> ,   ES31 ,  0,  0,  0,  0, 2014)
> +EXT(OES_texture_storage_multisample_2d_array, ARB_texture_multisample
> ,ES2 ,  0,  0,  0, 31, 2014)
>  EXT(OES_texture_npot, ARB_texture_non_power_of_two   
> ,  ES1 | ES2 ,  0,  0,  0,  0, 2005)
>  EXT(OES_vertex_array_object , dummy_true 
> ,  ES1 | ES2 ,  0,  0,  0,  0, 2010)

This patch looks good to me. Looking forward to v2 of the series.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] radeon/uvd: don't expose HEVC on old UVD hw (v2)

2015-10-22 Thread Alex Deucher

The section for UVD 2 and older was not updated
when HEVC support was added. Reported by Kano
on irc.

v2: integrate the UVD2 and older checks into the
main switch statement.

Signed-off-by: Alex Deucher 
Cc: mesa-sta...@lists.freedesktop.org
---
 src/gallium/drivers/radeon/radeon_video.c | 46 ---
 1 file changed, 17 insertions(+), 29 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_video.c 
b/src/gallium/drivers/radeon/radeon_video.c
index 3a1834b..a874270 100644
--- a/src/gallium/drivers/radeon/radeon_video.c
+++ b/src/gallium/drivers/radeon/radeon_video.c
@@ -205,6 +205,7 @@ int rvid_get_video_param(struct pipe_screen *screen,
 enum pipe_video_cap param)
 {
struct r600_common_screen *rscreen = (struct r600_common_screen 
*)screen;
+   enum pipe_video_format codec = u_reduce_video_profile(profile);
 
if (entrypoint == PIPE_VIDEO_ENTRYPOINT_ENCODE) {
switch (param) {
@@ -232,34 +233,17 @@ int rvid_get_video_param(struct pipe_screen *screen,
}
}
 
-   /* UVD 2.x limits */
-   if (rscreen->family < CHIP_PALM) {
-   enum pipe_video_format codec = u_reduce_video_profile(profile);
-   switch (param) {
-   case PIPE_VIDEO_CAP_SUPPORTED:
-   /* no support for MPEG4 */
-   return codec != PIPE_VIDEO_FORMAT_MPEG4 &&
-  /* FIXME: VC-1 simple/main profile is broken */
-  profile != PIPE_VIDEO_PROFILE_VC1_SIMPLE &&
-  profile != PIPE_VIDEO_PROFILE_VC1_MAIN;
-   case PIPE_VIDEO_CAP_PREFERS_INTERLACED:
-   case PIPE_VIDEO_CAP_SUPPORTS_INTERLACED:
-   /* MPEG2 only with shaders and no support for
-  interlacing on R6xx style UVD */
-   return codec != PIPE_VIDEO_FORMAT_MPEG12 &&
-  rscreen->family > CHIP_RV770;
-   default:
-   break;
-   }
-   }
-
switch (param) {
case PIPE_VIDEO_CAP_SUPPORTED:
-   switch (u_reduce_video_profile(profile)) {
+   switch (codec) {
case PIPE_VIDEO_FORMAT_MPEG12:
case PIPE_VIDEO_FORMAT_MPEG4:
case PIPE_VIDEO_FORMAT_MPEG4_AVC:
-   return entrypoint != PIPE_VIDEO_ENTRYPOINT_ENCODE;
+   if (rscreen->family < CHIP_PALM)
+   /* no support for MPEG4 */
+   return codec != PIPE_VIDEO_FORMAT_MPEG4;
+   else
+   return entrypoint != 
PIPE_VIDEO_ENTRYPOINT_ENCODE;
case PIPE_VIDEO_FORMAT_VC1:
/* FIXME: VC-1 simple/main profile is broken */
return profile == PIPE_VIDEO_PROFILE_VC1_ADVANCED &&
@@ -280,13 +264,17 @@ int rvid_get_video_param(struct pipe_screen *screen,
case PIPE_VIDEO_CAP_PREFERED_FORMAT:
return PIPE_FORMAT_NV12;
case PIPE_VIDEO_CAP_PREFERS_INTERLACED:
-   if (u_reduce_video_profile(profile) == PIPE_VIDEO_FORMAT_HEVC)
-   return false; //The hardware doesn't support interlaced 
HEVC.
-   return true;
case PIPE_VIDEO_CAP_SUPPORTS_INTERLACED:
-   if (u_reduce_video_profile(profile) == PIPE_VIDEO_FORMAT_HEVC)
-   return false; //The hardware doesn't support interlaced 
HEVC.
-   return true;
+   if (rscreen->family < CHIP_PALM) {
+   /* MPEG2 only with shaders and no support for
+  interlacing on R6xx style UVD */
+   return codec != PIPE_VIDEO_FORMAT_MPEG12 &&
+  rscreen->family > CHIP_RV770;
+   } else {
+   if (u_reduce_video_profile(profile) == 
PIPE_VIDEO_FORMAT_HEVC)
+   return false; //The hardware doesn't support 
interlaced HEVC.
+   return true;
+   }
case PIPE_VIDEO_CAP_SUPPORTS_PROGRESSIVE:
return true;
case PIPE_VIDEO_CAP_MAX_LEVEL:
-- 
1.8.3.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [RFC 10/21] mesa: Remove equality check in helper functions

2015-10-22 Thread Chad Versace

On Thu 22 Oct 2015, Emil Velikov wrote:
> On 22 October 2015 at 07:43, Erik Faye-Lund  wrote:
> > On Tue, Oct 20, 2015 at 12:44 AM, Nanley Chery  
> > wrote:
> >> From: Nanley Chery 
> >>
> >> Since the version numbers being compared are integral and we don't ever
> >> expect gl_context::Version to be equal to 0, subtract 1 from the rhs of
> >> the equation and perform the optimization of removing the equality check.
> >
> > Is this really an improvement? Changing '>=' to '>' and then making
> > all versions off-by-one seems like trading less readable and debugable
> > code for no real upside...
> 
> Pretty much my line of thinking. Perhaps it was applicable for earlier
> iterations of the series ?

That's my opinion too.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2] i965/vec4: track and use independently each flag channel

2015-10-22 Thread Matt Turner

On Thu, Oct 22, 2015 at 11:19 AM, Matt Turner  wrote:
> On Mon, Oct 19, 2015 at 10:38 AM, Alejandro Piñeiro
>  wrote:
>> vec4_live_variables tracks now each flag channel independently, so
>> vec4_dead_code_eliminate can update the writemask of null registers,
>> based on which component are alive at the moment. This would allow
>> vec4_cmod_propagation to optimize out several movs involving null
>> registers.
>>
>> v2: added support to track each flag channel independently at vec4
>> live_variables, as v1 assumed that it was already doing it, as
>> pointed by Francisco Jerez
>> ---
>>
>> I was tempted to split this commit in two, one for tracking each
>> channel independently at vec4_live_variables, and other for using it
>> at vec4_dead_code_eliminate, but in the end I concluded that made more
>> sense as one commit, as are changes tightly tied.
>>
>> FWIW, the shader-db numbers for the optimization patch ("i965/vec4:
>> adding vec4_cmod_propagation optimization", 2/5) changed, being now
>> the following:
>>
>> total instructions in shared programs: 6240413 -> 6235841 (-0.07%)
>> instructions in affected programs: 401933 -> 397361 (-1.14%)
>> total loops in shared programs:1979 -> 1979 (0.00%)
>> helped:2265
>> HURT:  0
>>
>> That are slightly worse that the numbers for the first version of the
>> patch, before adding that extra check. I will use this numbers if this
>> patch is approved.
>>
>> FWIW, without this change on live_variables and dead_code eliminate,
>> the numbers on the optimization would be the following ones:
>>
>> total instructions in shared programs: 6240413 -> 6240253 (-0.00%)
>> instructions in affected programs: 18965 -> 18805 (-0.84%)
>> total loops in shared programs:1979 -> 1979 (0.00%)
>> helped:160
>> HURT:
>>
>>
>>  src/mesa/drivers/dri/i965/brw_ir_vec4.h| 26 ++
>>  .../dri/i965/brw_vec4_dead_code_eliminate.cpp  | 31 
>> +++---
>>  .../drivers/dri/i965/brw_vec4_live_variables.cpp   | 14 ++
>>  3 files changed, 57 insertions(+), 14 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_ir_vec4.h 
>> b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
>> index 96dd633..934d7b1 100644
>> --- a/src/mesa/drivers/dri/i965/brw_ir_vec4.h
>> +++ b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
>> @@ -185,6 +185,32 @@ public:
>>return predicate || opcode == VS_OPCODE_UNPACK_FLAGS_SIMD4X2;
>> }
>>
>> +   bool reads_flag(unsigned c)
>> +   {
>> +  if (!reads_flag())
>> + return false;
>> +
>> +  switch(predicate) {
>
> Space between switch and (
>
>> +  case BRW_PREDICATE_NONE:
>> + return false;
>> + break;
>
> No need for break since it returns early (same applies below).
>
>> +  case BRW_PREDICATE_ALIGN16_REPLICATE_X:
>> + return (c == 0 ? true : false);
>
> Simply return c == 0. (same below)
>
>> + break;
>> +  case BRW_PREDICATE_ALIGN16_REPLICATE_Y:
>> + return (c == 1 ? true : false);
>> + break;
>> +  case BRW_PREDICATE_ALIGN16_REPLICATE_Z:
>> + return (c == 2 ? true : false);
>> + break;
>> +  case BRW_PREDICATE_ALIGN16_REPLICATE_W:
>> + return (c == 3 ? true : false);
>> + break;
>> +  default:
>> + return true;
>> +  }
>> +   }
>> +
>> bool writes_flag()
>> {
>>return (conditional_mod && (opcode != BRW_OPCODE_SEL &&
>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp 
>> b/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp
>> index 8fc7a36..d4216fd 100644
>> --- a/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp
>> @@ -78,13 +78,19 @@ vec4_visitor::dead_code_eliminate()
>>   sizeof(BITSET_WORD));
>>
>>foreach_inst_in_block_reverse(vec4_instruction, inst, block) {
>> - if (inst->dst.file == GRF && !inst->has_side_effects()) {
>> + if ((inst->dst.file == GRF && !inst->has_side_effects()) ||
>> + (inst->dst.is_null() && inst->writes_flag())){
>>  bool result_live[4] = { false };
>>
>> -for (unsigned i = 0; i < inst->regs_written; i++) {
>> -   for (int c = 0; c < 4; c++)
>> -  result_live[c] |= BITSET_TEST(
>> - live, var_from_reg(alloc, offset(inst->dst, i), c));
>> +if (inst->dst.file == GRF) {
>> +   for (unsigned i = 0; i < inst->regs_written; i++) {
>> +  for (int c = 0; c < 4; c++)
>> + result_live[c] |= BITSET_TEST(
>> +live, var_from_reg(alloc, offset(inst->dst, i), c));
>> +   }
>> +} else {
>> +   for (unsigned c = 0; c < 4; c++)
>> +

Re: [Mesa-dev] [RFC 05/21] mesa/extensions: Add extension::version

2015-10-22 Thread Chad Versace

On Mon 19 Oct 2015, Nanley Chery wrote:
> From: Nanley Chery 
> 
> Enable limiting advertised extension support by context version with
> finer granularity. GLuint is chosen over smaller datatypes because,
> when this field is eventually used, usage of this datatype provides
> the smallest .text size of the compiled library.
> 
> Signed-off-by: Nanley Chery 
> ---
>  src/mesa/main/extensions.c   |  15 +-
>  src/mesa/main/extensions_table.h | 626 
> +++
>  2 files changed, 326 insertions(+), 315 deletions(-)


> @@ -83,8 +88,14 @@ struct extension {
>   * \brief Table of supported OpenGL extensions for all API's.
>   */
>  static const struct extension extension_table[] = {
> -#define EXT(name_str, driver_cap, api_flags, ) \
> -{ .name = "GL_" #name_str, .offset = o(driver_cap), .api_set = 
> api_flags, .year = },
> +#define EXT(name_str, driver_cap, api_flags, gll_ver, glc_ver, gles_ver, 
> gles2_ver, ) \
> +{ .name = "GL_" #name_str, .offset = o(driver_cap), .api_set = 
> api_flags, \
> +  .version = { \
> +[API_OPENGL_COMPAT] = gll_ver, \
> +[API_OPENGL_CORE]   = glc_ver, \
> +[API_OPENGLES]  = gles_ver, \
> +[API_OPENGLES2] = gles2_ver, \
> +   }, .year = },

Please place '.year' and the final '}' each on its own line. That makes
the macro easier to read.

>  #include "extensions_table.h"
>  #undef EXT
>  };
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/9] i965: Fill out instruction list.

2015-10-22 Thread Matt Turner

On Thu, Oct 22, 2015 at 3:58 AM, Emil Velikov  wrote:
> On 21 October 2015 at 23:58, Matt Turner  wrote:
>> Add some instructions: illegal, movi, sends, sendsc.
>>
>> Remove some instructions with reused opcodes: msave, mrestore, push,
>> pop, goto. I did have some gross code for disassembling opcodes
>> per-generation, but there's very little meaningful overlap so it's
>> probably not needed.
>> ---
>>  src/mesa/drivers/dri/i965/brw_defines.h  | 37 
>> ++--
>>  src/mesa/drivers/dri/i965/brw_disasm.c   | 16 --
>>  src/mesa/drivers/dri/i965/brw_shader.cpp |  2 +-
>>  3 files changed, 41 insertions(+), 14 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
>> b/src/mesa/drivers/dri/i965/brw_defines.h
>> index 393f17a..26fc0af 100644
>> --- a/src/mesa/drivers/dri/i965/brw_defines.h
>> +++ b/src/mesa/drivers/dri/i965/brw_defines.h
>> @@ -838,43 +838,62 @@ enum PACKED brw_horizontal_stride {
>>
>>  enum opcode {
>> /* These are the actual hardware opcodes. */
>> +   BRW_OPCODE_ILLEGAL = 0,
>> BRW_OPCODE_MOV =1,
>> BRW_OPCODE_SEL =2,
>> +   BRW_OPCODE_MOVI =   3,   /**< G45+ */
>> BRW_OPCODE_NOT =4,
>> BRW_OPCODE_AND =5,
>> BRW_OPCODE_OR = 6,
>> BRW_OPCODE_XOR =7,
>> BRW_OPCODE_SHR =8,
>> BRW_OPCODE_SHL =9,
>> +   // BRW_OPCODE_DIM = 10,  /**< Gen7.5 only */ /* Reused */
>> +   // BRW_OPCODE_SMOV =10,  /**< Gen8+   */ /* Reused */
>> +   /* Reserved - 11 */
>> BRW_OPCODE_ASR =12,
>> +   /* Reserved - 13-15 */
>> BRW_OPCODE_CMP =16,
>> BRW_OPCODE_CMPN =   17,
>> BRW_OPCODE_CSEL =   18,  /**< Gen8+ */
>> BRW_OPCODE_F32TO16 = 19,  /**< Gen7 only */
>> BRW_OPCODE_F16TO32 = 20,  /**< Gen7 only */
>> +   /* Reserved - 21-22 */
>> BRW_OPCODE_BFREV =  23,  /**< Gen7+ */
>> BRW_OPCODE_BFE =24,  /**< Gen7+ */
>> BRW_OPCODE_BFI1 =   25,  /**< Gen7+ */
>> BRW_OPCODE_BFI2 =   26,  /**< Gen7+ */
>> +   /* Reserved - 27-31 */
>> BRW_OPCODE_JMPI =   32,
>> +   // BRW_OPCODE_BRD = 33,  /**< Gen7+ */
>> BRW_OPCODE_IF = 34,
>> -   BRW_OPCODE_IFF =35,  /**< Pre-Gen6 */
>> +   BRW_OPCODE_IFF =35,  /**< Pre-Gen6*/ /* Reused */
>> +   // BRW_OPCODE_BRC = 35,  /**< Gen7+   */ /* Reused */
>> BRW_OPCODE_ELSE =   36,
>> BRW_OPCODE_ENDIF =  37,
>> -   BRW_OPCODE_DO = 38,
>> +   BRW_OPCODE_DO = 38,  /**< Pre-Gen6*/ /* Reused */
>> +   // BRW_OPCODE_CASE =38,  /**< Gen6 only   */ /* Reused */
>> BRW_OPCODE_WHILE =  39,
>> BRW_OPCODE_BREAK =  40,
>> BRW_OPCODE_CONTINUE = 41,
>> BRW_OPCODE_HALT =   42,
>> -   BRW_OPCODE_MSAVE =  44,  /**< Pre-Gen6 */
>> -   BRW_OPCODE_MRESTORE = 45, /**< Pre-Gen6 */
>> -   BRW_OPCODE_PUSH =   46,  /**< Pre-Gen6 */
>> -   BRW_OPCODE_GOTO =   46,  /**< Gen8+*/
>> -   BRW_OPCODE_POP =47,  /**< Pre-Gen6 */
>> +   // BRW_OPCODE_CALLA =   43,  /**< Gen7.5+ */
>> +   // BRW_OPCODE_MSAVE =   44,  /**< Pre-Gen6*/ /* Reused */
>> +   // BRW_OPCODE_CALL =44,  /**< Gen6+   */ /* Reused */
>> +   // BRW_OPCODE_MREST =   45,  /**< Pre-Gen6*/ /* Reused */
>> +   // BRW_OPCODE_RET = 45,  /**< Gen6+   */ /* Reused */
>> +   // BRW_OPCODE_PUSH =46,  /**< Pre-Gen6*/ /* Reused */
>> +   // BRW_OPCODE_FORK =46,  /**< Gen6 only   */ /* Reused */
>> +   // BRW_OPCODE_GOTO =46,  /**< Gen8+   */ /* Reused */
>> +   // BRW_OPCODE_POP = 47,  /**< Pre-Gen6*/
>> BRW_OPCODE_WAIT =   48,
>> BRW_OPCODE_SEND =   49,
>> BRW_OPCODE_SENDC =  50,
>> +   BRW_OPCODE_SENDS =  51,  /**< Gen9+ */
>> +   BRW_OPCODE_SENDSC = 52,  /**< Gen9+ */
>> +   /* Reserved 53-55 */
>> BRW_OPCODE_MATH =   56,  /**< Gen6+ */
>> +   /* Reserved 57-63 */
>> BRW_OPCODE_ADD =64,
>> BRW_OPCODE_MUL =65,
>> BRW_OPCODE_AVG =66,
>> @@ -893,14 +912,18 @@ enum opcode {
>> BRW_OPCODE_SUBB =   79,  /**< Gen7+ */
>> BRW_OPCODE_SAD2 =   80,
>> BRW_OPCODE_SADA2 =  81,
>> +   /* Reserved 82-83 */
>> BRW_OPCODE_DP4 =84,
>> BRW_OPCODE_DPH =85,
>> BRW_OPCODE_DP3 =86,
>> BRW_OPCODE_DP2 =87,
>> +   /* Reserved 88 */
>> BRW_OPCODE_LINE =   89,
>> BRW_OPCODE_PLN =90,  /**< G45+ */
>> BRW_OPCODE_MAD =91,  /**< Gen6+ */
>> BRW_OPCODE_LRP =92,  /**< Gen6+ */
>> +   // BRW_OPCODE_MADM =93,  /**< Gen8+ */
>> +   /* Reserved 94-124 */
>> BRW_OPCODE_NENOP =  125, /**< G45 only */
>> BRW_OPCODE_NOP =126,
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_disasm.c 
>> b/src/mesa/drivers/dri/i965/brw_disasm.c
>> index db23a18..c2dac7c 100644
>> --- a/src/mesa/drivers/dri/i965/brw_disasm.c
>> +++ b/src/mesa/drivers/dri/i965/brw_disasm.c
>> @@ -34,6 +34,7 @@
>>
>>  const struct opcode_desc opcode_descs[128] = {
> Out of curiosity: I cannot see an opcode numerically equal to 127, is
>

Re: [Mesa-dev] [RFC 08/21] mesa/extensions: Replace extension::api_set with ::version

2015-10-22 Thread Matt Turner

On Thu, Oct 22, 2015 at 11:24 AM, Chad Versace  wrote:
> On Mon 19 Oct 2015, Nanley Chery wrote:
>> From: Nanley Chery 
>>
>> The api_set field has no users outside of _mesa_extension_supported().
>> Remove it and allow the version field to take its place.
>>
>> The brunt of the transformation was performed with the following vim 
>> commands:
>> s/\(GL [^,]\+\),\s*\d*,\s*\d*\(,\s*\d*\)\(,\s*\d*\)/\1, GLL, GLC\2\3/g
>> s/\(GLL [^,]\+\)\,\s*\d*/\1, GLL/g
>> s/\(GLC [^,]\+\)\(,\s*\d*\),\s*\d*\(,\s*\d*\)\(,\s*\d*\)/\1\2, GLC\3\4/g
>> s/\( ES1[^,]*\)\(,\s*\(\w\|\d\)\+\)\(,\s*\(\w\|\d\)\+\),\s*\d*/\1\2\4, ES1/g
>> s/\( 
>> ES2[^,]*\)\(,\s*\(\w\|\d\)\+\)\(,\s*\(\w\|\d\)\+\)\(,\s*\(\w\|\d\)\+\),\s*\d*/\1\2\4\6,
>>  ES2/g
>>
>> Signed-off-by: Nanley Chery 
>> ---
>>  src/mesa/main/extensions.c   |  21 +-
>>  src/mesa/main/extensions_table.h | 636 
>> ---
>>  2 files changed, 326 insertions(+), 331 deletions(-)
>
> [...]
>
>> diff --git a/src/mesa/main/extensions_table.h 
>> b/src/mesa/main/extensions_table.h
>
> [...]
>
>> +#define GLL 0
>> +#define GLC 0
>> +#define ES1 0
>> +#define ES2 0
>> +#define  x ~0
>
>> +EXT(EXT_abgr, dummy_true
>>  , GLL, GLC,  x ,  x , 1995)
>> +EXT(EXT_bgra, dummy_true
>>  , GLL,  x ,  x ,  x , 1995)
>> +EXT(EXT_blend_color , EXT_blend_color   
>>  , GLL,  x ,  x ,  x , 1995)
>> +EXT(EXT_blend_equation_separate , EXT_blend_equation_separate   
>>  , GLL, GLC,  x ,  x , 2003)
>> +EXT(EXT_blend_func_separate , EXT_blend_func_separate   
>>  , GLL,  x ,  x ,  x , 1999)
>> +EXT(EXT_discard_framebuffer , dummy_true
>>  ,  x ,  x , ES1, ES2, 2009)
>> +EXT(EXT_blend_minmax, EXT_blend_minmax  
>>  , GLL,  x , ES1, ES2, 1995)
>
> I like the idea of this patch. It's a cleanup that needs to happen. But,
> the little details confuse me.
>
> When I see GLL, GLC, ES1, and ES2 as the entries in the version columns,
> the entries confuse me because I expect the tokens to all have different
> values. Otherwise, why would the tokens have different names?
>
> I think the table would be less confusing if you used the macros
>   #define x ~0
>   #define Y 0
> or
>   #define n ~0
>   #define Y 0
> or
>   #define n ~0
>   #define y 0
> or anything like that. I like the n/Y the best because the letters are
> mnenomic and the contrast of uppercase to lowercase is easy to see when
> scanning the table.

I get what you're saying, but Nanley's code as is allows you to see
exactly which APIs an extension exists in without looking up the
column headers.

With your approach, a reader would need to know that the columns are
ordered GL compat, GL core, ES1, ES2, if I've understood your
suggestion.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [RFC 09/21] mesa: Generate a helper function for each extension

2015-10-22 Thread Chad Versace

On Mon 19 Oct 2015, Nanley Chery wrote:
> From: Nanley Chery 
> 
> Generate functions which determine if an extension is supported in the
> current context. Initially, enums were going to be explicitly used with
> _mesa_extension_supported(). The idea to embed the function and enums
> into generated helper functions was suggested by Kristian Høgsberg.
> 
> For performance, the function body no longer uses
> _mesa_extension_supported() and, as suggested by Chad Versace, the
> functions are also declared static inline.
> 
> Signed-off-by: Nanley Chery 
> ---
>  src/mesa/main/context.h|  1 +
>  src/mesa/main/extensions.c | 22 +-
>  src/mesa/main/extensions.h | 39 +++
>  3 files changed, 41 insertions(+), 21 deletions(-)

[...]

> diff --git a/src/mesa/main/extensions.h b/src/mesa/main/extensions.h
> index 595512a..24cc04d 100644
> --- a/src/mesa/main/extensions.h
> +++ b/src/mesa/main/extensions.h
> @@ -55,6 +55,45 @@ _mesa_get_extension_count(struct gl_context *ctx);
>  extern const GLubyte *
>  _mesa_get_enabled_extension(struct gl_context *ctx, GLuint index);
>  
> +
> +/**
> + * \brief An element of the \c extension_table.
> + */
> +struct extension {

In addition to Marek's comment, the struct should be prefixed too.

> +   /** Name of extension, such as "GL_ARB_depth_clamp". */
> +   const char *name;
> +
> +   /** Offset (in bytes) of the corresponding member in struct 
> gl_extensions. */
> +   size_t offset;
> +
> +   /** Minimum version the extension requires for the given API
> +* (see gl_api defined in mtypes.h)
> +*/
> +   GLuint version[API_OPENGL_LAST + 1];
> +
> +   /** Year the extension was proposed or approved.  Used to sort the 
> +* extension string chronologically. */
> +   uint16_t year;
> +} extern const extension_table[];

[...]

> +/** Checks if the context suports a user-facing extension */
> +#define EXT(name_str, driver_cap, ...) \
> +static inline bool _mesa_has_##name_str(const struct gl_context *ctx) { \
> +  return ctx->Extensions.driver_cap && (ctx->Version >= \
> + extension_table[MESA_EXTENSION_##name_str].version[ctx->API]); \
> +}
> +#include "extensions_table.h"
> +#undef EXT

The function should be formatted like this, to follow Mesa style:

  |static inline bool
  |_mesa_has_foo(...)
  |{
  |   ...
  |}
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [RFC 08/21] mesa/extensions: Replace extension::api_set with ::version

2015-10-22 Thread Chad Versace

On Thu 22 Oct 2015, Matt Turner wrote:
> On Thu, Oct 22, 2015 at 11:24 AM, Chad Versace  wrote:
> > On Mon 19 Oct 2015, Nanley Chery wrote:
> >> From: Nanley Chery 
> >>
> >> The api_set field has no users outside of _mesa_extension_supported().
> >> Remove it and allow the version field to take its place.
> >>
> >> The brunt of the transformation was performed with the following vim 
> >> commands:
> >> s/\(GL [^,]\+\),\s*\d*,\s*\d*\(,\s*\d*\)\(,\s*\d*\)/\1, GLL, GLC\2\3/g
> >> s/\(GLL [^,]\+\)\,\s*\d*/\1, GLL/g
> >> s/\(GLC [^,]\+\)\(,\s*\d*\),\s*\d*\(,\s*\d*\)\(,\s*\d*\)/\1\2, GLC\3\4/g
> >> s/\( ES1[^,]*\)\(,\s*\(\w\|\d\)\+\)\(,\s*\(\w\|\d\)\+\),\s*\d*/\1\2\4, 
> >> ES1/g
> >> s/\( 
> >> ES2[^,]*\)\(,\s*\(\w\|\d\)\+\)\(,\s*\(\w\|\d\)\+\)\(,\s*\(\w\|\d\)\+\),\s*\d*/\1\2\4\6,
> >>  ES2/g
> >>
> >> Signed-off-by: Nanley Chery 
> >> ---
> >>  src/mesa/main/extensions.c   |  21 +-
> >>  src/mesa/main/extensions_table.h | 636 
> >> ---
> >>  2 files changed, 326 insertions(+), 331 deletions(-)
> >
> > [...]
> >
> >> diff --git a/src/mesa/main/extensions_table.h 
> >> b/src/mesa/main/extensions_table.h
> >
> > [...]
> >
> >> +#define GLL 0
> >> +#define GLC 0
> >> +#define ES1 0
> >> +#define ES2 0
> >> +#define  x ~0
> >
> >> +EXT(EXT_abgr, dummy_true  
> >>, GLL, GLC,  x ,  x , 1995)
> >> +EXT(EXT_bgra, dummy_true  
> >>, GLL,  x ,  x ,  x , 1995)
> >> +EXT(EXT_blend_color , EXT_blend_color 
> >>, GLL,  x ,  x ,  x , 1995)
> >> +EXT(EXT_blend_equation_separate , EXT_blend_equation_separate 
> >>, GLL, GLC,  x ,  x , 2003)
> >> +EXT(EXT_blend_func_separate , EXT_blend_func_separate 
> >>, GLL,  x ,  x ,  x , 1999)
> >> +EXT(EXT_discard_framebuffer , dummy_true  
> >>,  x ,  x , ES1, ES2, 2009)
> >> +EXT(EXT_blend_minmax, EXT_blend_minmax
> >>, GLL,  x , ES1, ES2, 1995)
> >
> > I like the idea of this patch. It's a cleanup that needs to happen. But,
> > the little details confuse me.
> >
> > When I see GLL, GLC, ES1, and ES2 as the entries in the version columns,
> > the entries confuse me because I expect the tokens to all have different
> > values. Otherwise, why would the tokens have different names?
> >
> > I think the table would be less confusing if you used the macros
> >   #define x ~0
> >   #define Y 0
> > or
> >   #define n ~0
> >   #define Y 0
> > or
> >   #define n ~0
> >   #define y 0
> > or anything like that. I like the n/Y the best because the letters are
> > mnenomic and the contrast of uppercase to lowercase is easy to see when
> > scanning the table.
> 
> I get what you're saying, but Nanley's code as is allows you to see
> exactly which APIs an extension exists in without looking up the
> column headers.

You make a good point.

> With your approach, a reader would need to know that the columns are
> ordered GL compat, GL core, ES1, ES2, if I've understood your
> suggestion.

You understood my suggestion correctly.

In the giant table contained in brw_surface_formats.c, we solve the
readability problem by inserting a banner comment at semi-regular intervals,
about every 40 lines. Like this:

   SF( x,  x,  x,  x,  x,  x,  Y,  x,  x, R8G8_SSCALED)
   SF( x,  x,  x,  x,  x,  x,  Y,  x,  x, R8G8_USCALED)
/* smpl filt shad CK  RT  AB  VB  SO  color */
   SF( x,  x,  x,  x,  x,  x,  Y,  x,  x, R16_SSCALED)
   SF( x,  x,  x,  x,  x,  x,  Y,  x,  x, R16_USCALED)
   SF(50, 50,  x,  x,  x,  x,  x,  x,  x, P8A8_UNORM_PALETTE0)
   SF(50, 50,  x,  x,  x,  x,  x,  x,  x, P8A8_UNORM_PALETTE1)

But, as you pointed out, Nanley's solution obviates the need for the banner
comment. And banner comments are clumsy. So I drop my suggestion.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/4] i965/fs: Use type-W for immediate in SampleID setup.

2015-10-22 Thread Anuj Phogat

On Wed, Oct 21, 2015 at 10:05 AM, Matt Turner  wrote:
> Not a functional difference, but register is loaded with a signed
> immediate (V) and added to a signed type (D) producing a signed result
> (D).
>
> Also change the type of g0 to allow for compaction.
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp   | 4 ++--
>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 2 +-
>  2 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 49323eb..f9c78df 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -1273,7 +1273,7 @@ fs_visitor::emit_sampleid_setup()
> if (key->compute_sample_id) {
>fs_reg t1 = vgrf(glsl_type::int_type);
>fs_reg t2 = vgrf(glsl_type::int_type);
> -  t2.type = BRW_REGISTER_TYPE_UW;
> +  t2.type = BRW_REGISTER_TYPE_W;
>
>/* The PS will be run in MSDISPMODE_PERSAMPLE. For example with
> * 8x multisampling, subspan 0 will represent sample N (where N
> @@ -1295,7 +1295,7 @@ fs_visitor::emit_sampleid_setup()
> * subspan 1, and finally sample 1 of subspan 1.
> */
>abld.exec_all()
> -  .AND(t1, fs_reg(retype(brw_vec1_grf(0, 0), BRW_REGISTER_TYPE_UD)),
> +  .AND(t1, fs_reg(retype(brw_vec1_grf(0, 0), BRW_REGISTER_TYPE_D)),
> fs_reg(0xc0));
>abld.exec_all().SHR(t1, t1, fs_reg(5));
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> index 13c495c..9a5992e1 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> @@ -1429,7 +1429,7 @@ fs_generator::generate_set_sample_id(fs_inst *inst,
> brw_set_default_exec_size(p, BRW_EXECUTE_8);
> brw_set_default_compression_control(p, BRW_COMPRESSION_NONE);
> brw_set_default_mask_control(p, BRW_MASK_DISABLE);
> -   struct brw_reg reg = retype(stride(src1, 1, 4, 0), BRW_REGISTER_TYPE_UW);
> +   struct brw_reg reg = stride(src1, 1, 4, 0);
> if (dispatch_width == 8) {
>brw_ADD(p, dst, src0, reg);
> } else if (dispatch_width == 16) {
> --
> 2.4.9
>

Verified the generated instructions on IVB. Series is:
Reviewed-by: Anuj Phogat 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 5/8] i965/fs: Avoid scalar destinations in emit_uniformize()

2015-10-22 Thread Matt Turner

On Thu, Oct 22, 2015 at 11:37 AM, Kristian Høgsberg Kristensen
 wrote:
> The scalar destination registers break copy propagation. Instead compute
> the results to a regular register and then reference a component when we
> later use the result as a source.

It might be hairy to get it working, but making copy propagation
handle scalar destinations is the "right" way forward.

If we commit this patch, and later implement support for copy
propagating with scalar destinations, how will we remember to change
this back?
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] radeon/uvd: don't expose HEVC on old UVD hw (v3)

2015-10-22 Thread Alex Deucher

The section for UVD 2 and older was not updated
when HEVC support was added. Reported by Kano
on irc.

v2: integrate the UVD2 and older checks into the
main switch statement.
v3: handle encode checking as well.  Encode is
already checked in the top case statement, so
drop encode checks in the lower case statement.

Signed-off-by: Alex Deucher 
Cc: mesa-sta...@lists.freedesktop.org
---
 src/gallium/drivers/radeon/radeon_video.c | 50 +++
 1 file changed, 18 insertions(+), 32 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_video.c 
b/src/gallium/drivers/radeon/radeon_video.c
index 3a1834b..32bfc32 100644
--- a/src/gallium/drivers/radeon/radeon_video.c
+++ b/src/gallium/drivers/radeon/radeon_video.c
@@ -205,11 +205,12 @@ int rvid_get_video_param(struct pipe_screen *screen,
 enum pipe_video_cap param)
 {
struct r600_common_screen *rscreen = (struct r600_common_screen 
*)screen;
+   enum pipe_video_format codec = u_reduce_video_profile(profile);
 
if (entrypoint == PIPE_VIDEO_ENTRYPOINT_ENCODE) {
switch (param) {
case PIPE_VIDEO_CAP_SUPPORTED:
-   return u_reduce_video_profile(profile) == 
PIPE_VIDEO_FORMAT_MPEG4_AVC &&
+   return codec == PIPE_VIDEO_FORMAT_MPEG4_AVC &&
rvce_is_fw_version_supported(rscreen);
case PIPE_VIDEO_CAP_NPOT_TEXTURES:
return 1;
@@ -232,38 +233,19 @@ int rvid_get_video_param(struct pipe_screen *screen,
}
}
 
-   /* UVD 2.x limits */
-   if (rscreen->family < CHIP_PALM) {
-   enum pipe_video_format codec = u_reduce_video_profile(profile);
-   switch (param) {
-   case PIPE_VIDEO_CAP_SUPPORTED:
-   /* no support for MPEG4 */
-   return codec != PIPE_VIDEO_FORMAT_MPEG4 &&
-  /* FIXME: VC-1 simple/main profile is broken */
-  profile != PIPE_VIDEO_PROFILE_VC1_SIMPLE &&
-  profile != PIPE_VIDEO_PROFILE_VC1_MAIN;
-   case PIPE_VIDEO_CAP_PREFERS_INTERLACED:
-   case PIPE_VIDEO_CAP_SUPPORTS_INTERLACED:
-   /* MPEG2 only with shaders and no support for
-  interlacing on R6xx style UVD */
-   return codec != PIPE_VIDEO_FORMAT_MPEG12 &&
-  rscreen->family > CHIP_RV770;
-   default:
-   break;
-   }
-   }
-
switch (param) {
case PIPE_VIDEO_CAP_SUPPORTED:
-   switch (u_reduce_video_profile(profile)) {
+   switch (codec) {
case PIPE_VIDEO_FORMAT_MPEG12:
case PIPE_VIDEO_FORMAT_MPEG4:
case PIPE_VIDEO_FORMAT_MPEG4_AVC:
-   return entrypoint != PIPE_VIDEO_ENTRYPOINT_ENCODE;
+   if (rscreen->family < CHIP_PALM)
+   /* no support for MPEG4 */
+   return codec != PIPE_VIDEO_FORMAT_MPEG4;
+   return true;
case PIPE_VIDEO_FORMAT_VC1:
/* FIXME: VC-1 simple/main profile is broken */
-   return profile == PIPE_VIDEO_PROFILE_VC1_ADVANCED &&
-  entrypoint != PIPE_VIDEO_ENTRYPOINT_ENCODE;
+   return profile == PIPE_VIDEO_PROFILE_VC1_ADVANCED;
case PIPE_VIDEO_FORMAT_HEVC:
/* Carrizo only supports HEVC Main */
return rscreen->family >= CHIP_CARRIZO &&
@@ -280,13 +262,17 @@ int rvid_get_video_param(struct pipe_screen *screen,
case PIPE_VIDEO_CAP_PREFERED_FORMAT:
return PIPE_FORMAT_NV12;
case PIPE_VIDEO_CAP_PREFERS_INTERLACED:
-   if (u_reduce_video_profile(profile) == PIPE_VIDEO_FORMAT_HEVC)
-   return false; //The hardware doesn't support interlaced 
HEVC.
-   return true;
case PIPE_VIDEO_CAP_SUPPORTS_INTERLACED:
-   if (u_reduce_video_profile(profile) == PIPE_VIDEO_FORMAT_HEVC)
-   return false; //The hardware doesn't support interlaced 
HEVC.
-   return true;
+   if (rscreen->family < CHIP_PALM) {
+   /* MPEG2 only with shaders and no support for
+  interlacing on R6xx style UVD */
+   return codec != PIPE_VIDEO_FORMAT_MPEG12 &&
+  rscreen->family > CHIP_RV770;
+   } else {
+   if (u_reduce_video_profile(profile) == 
PIPE_VIDEO_FORMAT_HEVC)
+   return false; //The firmware doesn't support 
interlaced HEVC.
+   return true;
+   }

Re: [Mesa-dev] [PATCH] radeon/uvd: don't expose HEVC on old UVD hw (v3)

2015-10-22 Thread Christian König


On 22.10.2015 21:35, Alex Deucher wrote:

The section for UVD 2 and older was not updated
when HEVC support was added. Reported by Kano
on irc.

v2: integrate the UVD2 and older checks into the
main switch statement.
v3: handle encode checking as well.  Encode is
already checked in the top case statement, so
drop encode checks in the lower case statement.

Signed-off-by: Alex Deucher 
Cc: mesa-sta...@lists.freedesktop.org


Reviewed-by: Christian König 


---
  src/gallium/drivers/radeon/radeon_video.c | 50 +++
  1 file changed, 18 insertions(+), 32 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_video.c 
b/src/gallium/drivers/radeon/radeon_video.c
index 3a1834b..32bfc32 100644
--- a/src/gallium/drivers/radeon/radeon_video.c
+++ b/src/gallium/drivers/radeon/radeon_video.c
@@ -205,11 +205,12 @@ int rvid_get_video_param(struct pipe_screen *screen,
 enum pipe_video_cap param)
  {
struct r600_common_screen *rscreen = (struct r600_common_screen 
*)screen;
+   enum pipe_video_format codec = u_reduce_video_profile(profile);
  
  	if (entrypoint == PIPE_VIDEO_ENTRYPOINT_ENCODE) {

switch (param) {
case PIPE_VIDEO_CAP_SUPPORTED:
-   return u_reduce_video_profile(profile) == 
PIPE_VIDEO_FORMAT_MPEG4_AVC &&
+   return codec == PIPE_VIDEO_FORMAT_MPEG4_AVC &&
rvce_is_fw_version_supported(rscreen);
case PIPE_VIDEO_CAP_NPOT_TEXTURES:
return 1;
@@ -232,38 +233,19 @@ int rvid_get_video_param(struct pipe_screen *screen,
}
}
  
-	/* UVD 2.x limits */

-   if (rscreen->family < CHIP_PALM) {
-   enum pipe_video_format codec = u_reduce_video_profile(profile);
-   switch (param) {
-   case PIPE_VIDEO_CAP_SUPPORTED:
-   /* no support for MPEG4 */
-   return codec != PIPE_VIDEO_FORMAT_MPEG4 &&
-  /* FIXME: VC-1 simple/main profile is broken */
-  profile != PIPE_VIDEO_PROFILE_VC1_SIMPLE &&
-  profile != PIPE_VIDEO_PROFILE_VC1_MAIN;
-   case PIPE_VIDEO_CAP_PREFERS_INTERLACED:
-   case PIPE_VIDEO_CAP_SUPPORTS_INTERLACED:
-   /* MPEG2 only with shaders and no support for
-  interlacing on R6xx style UVD */
-   return codec != PIPE_VIDEO_FORMAT_MPEG12 &&
-  rscreen->family > CHIP_RV770;
-   default:
-   break;
-   }
-   }
-
switch (param) {
case PIPE_VIDEO_CAP_SUPPORTED:
-   switch (u_reduce_video_profile(profile)) {
+   switch (codec) {
case PIPE_VIDEO_FORMAT_MPEG12:
case PIPE_VIDEO_FORMAT_MPEG4:
case PIPE_VIDEO_FORMAT_MPEG4_AVC:
-   return entrypoint != PIPE_VIDEO_ENTRYPOINT_ENCODE;
+   if (rscreen->family < CHIP_PALM)
+   /* no support for MPEG4 */
+   return codec != PIPE_VIDEO_FORMAT_MPEG4;
+   return true;
case PIPE_VIDEO_FORMAT_VC1:
/* FIXME: VC-1 simple/main profile is broken */
-   return profile == PIPE_VIDEO_PROFILE_VC1_ADVANCED &&
-  entrypoint != PIPE_VIDEO_ENTRYPOINT_ENCODE;
+   return profile == PIPE_VIDEO_PROFILE_VC1_ADVANCED;
case PIPE_VIDEO_FORMAT_HEVC:
/* Carrizo only supports HEVC Main */
return rscreen->family >= CHIP_CARRIZO &&
@@ -280,13 +262,17 @@ int rvid_get_video_param(struct pipe_screen *screen,
case PIPE_VIDEO_CAP_PREFERED_FORMAT:
return PIPE_FORMAT_NV12;
case PIPE_VIDEO_CAP_PREFERS_INTERLACED:
-   if (u_reduce_video_profile(profile) == PIPE_VIDEO_FORMAT_HEVC)
-   return false; //The hardware doesn't support interlaced 
HEVC.
-   return true;
case PIPE_VIDEO_CAP_SUPPORTS_INTERLACED:
-   if (u_reduce_video_profile(profile) == PIPE_VIDEO_FORMAT_HEVC)
-   return false; //The hardware doesn't support interlaced 
HEVC.
-   return true;
+   if (rscreen->family < CHIP_PALM) {
+   /* MPEG2 only with shaders and no support for
+  interlacing on R6xx style UVD */
+   return codec != PIPE_VIDEO_FORMAT_MPEG12 &&
+  rscreen->family > CHIP_RV770;
+   } else {
+   if (u_reduce_video_profile(profile) == 
PIPE_VIDEO_FORMAT_HEVC)
+   return false; //The

Re: [Mesa-dev] [RFC 16/21] mesa: Fix ARB_texture_compression_bptc functionality leaks

2015-10-22 Thread Chad Versace

On Tue 20 Oct 2015, Nanley Chery wrote:
> On Tue, Oct 20, 2015 at 8:37 AM, Marek Olšák  wrote:
> 
> > NAK. I'd like this extension in compatibility contexts. The fact the
> > spec requires OpenGL 3.1 was just authors' laziness.
> >
> >
> I had thought it might be the case that some specs may unecessarily require
> a certain version,
> when in fact a longer list of dependencies would suffice instead. I'll look
> into this issue. If it's
> the right way to go, I can allow all versions of GL compatibility by simply
> replacing the '31'
> with 'GLL' in a v2.

I second Marek's request. Let's advertise this extension in
compatibility contexts.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [RFC 08/21] mesa/extensions: Replace extension::api_set with ::version

2015-10-22 Thread Chad Versace

On Mon 19 Oct 2015, Nanley Chery wrote:
> From: Nanley Chery 
> 
> The api_set field has no users outside of _mesa_extension_supported().
> Remove it and allow the version field to take its place.
> 
> The brunt of the transformation was performed with the following vim commands:
> s/\(GL [^,]\+\),\s*\d*,\s*\d*\(,\s*\d*\)\(,\s*\d*\)/\1, GLL, GLC\2\3/g
> s/\(GLL [^,]\+\)\,\s*\d*/\1, GLL/g
> s/\(GLC [^,]\+\)\(,\s*\d*\),\s*\d*\(,\s*\d*\)\(,\s*\d*\)/\1\2, GLC\3\4/g
> s/\( ES1[^,]*\)\(,\s*\(\w\|\d\)\+\)\(,\s*\(\w\|\d\)\+\),\s*\d*/\1\2\4, ES1/g
> s/\( 
> ES2[^,]*\)\(,\s*\(\w\|\d\)\+\)\(,\s*\(\w\|\d\)\+\)\(,\s*\(\w\|\d\)\+\),\s*\d*/\1\2\4\6,
>  ES2/g
> 
> Signed-off-by: Nanley Chery 
> ---
>  src/mesa/main/extensions.c   |  21 +-
>  src/mesa/main/extensions_table.h | 636 
> ---
>  2 files changed, 326 insertions(+), 331 deletions(-)

[...]

> diff --git a/src/mesa/main/extensions_table.h 
> b/src/mesa/main/extensions_table.h

[...]

> +#define GLL 0
> +#define GLC 0
> +#define ES1 0
> +#define ES2 0
> +#define  x ~0

> +EXT(EXT_abgr, dummy_true 
> , GLL, GLC,  x ,  x , 1995)
> +EXT(EXT_bgra, dummy_true 
> , GLL,  x ,  x ,  x , 1995)
> +EXT(EXT_blend_color , EXT_blend_color
> , GLL,  x ,  x ,  x , 1995)
> +EXT(EXT_blend_equation_separate , EXT_blend_equation_separate
> , GLL, GLC,  x ,  x , 2003)
> +EXT(EXT_blend_func_separate , EXT_blend_func_separate
> , GLL,  x ,  x ,  x , 1999)
> +EXT(EXT_discard_framebuffer , dummy_true 
> ,  x ,  x , ES1, ES2, 2009)
> +EXT(EXT_blend_minmax, EXT_blend_minmax   
> , GLL,  x , ES1, ES2, 1995)

I like the idea of this patch. It's a cleanup that needs to happen. But,
the little details confuse me.

When I see GLL, GLC, ES1, and ES2 as the entries in the version columns,
the entries confuse me because I expect the tokens to all have different
values. Otherwise, why would the tokens have different names?

I think the table would be less confusing if you used the macros
  #define x ~0
  #define Y 0
or
  #define n ~0
  #define Y 0
or
  #define n ~0
  #define y 0
or anything like that. I like the n/Y the best because the letters are
mnenomic and the contrast of uppercase to lowercase is easy to see when
scanning the table.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 3/8] i965/fs: Use unsigned immediate 0 when eliminating SHADER_OPCODE_FIND_LIVE_CHANNEL

2015-10-22 Thread Matt Turner

Reviewed-by: Matt Turner 

(though, there's really no reason for copy propagation to be this stupid :)
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] radeon/uvd: don't expose HEVC on old UVD hw (v2)

2015-10-22 Thread Christian König


On 22.10.2015 19:50, Alex Deucher wrote:

The section for UVD 2 and older was not updated
when HEVC support was added. Reported by Kano
on irc.

v2: integrate the UVD2 and older checks into the
main switch statement.

Signed-off-by: Alex Deucher 
Cc: mesa-sta...@lists.freedesktop.org
---
  src/gallium/drivers/radeon/radeon_video.c | 46 ---
  1 file changed, 17 insertions(+), 29 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_video.c 
b/src/gallium/drivers/radeon/radeon_video.c
index 3a1834b..a874270 100644
--- a/src/gallium/drivers/radeon/radeon_video.c
+++ b/src/gallium/drivers/radeon/radeon_video.c
@@ -205,6 +205,7 @@ int rvid_get_video_param(struct pipe_screen *screen,
 enum pipe_video_cap param)
  {
struct r600_common_screen *rscreen = (struct r600_common_screen 
*)screen;
+   enum pipe_video_format codec = u_reduce_video_profile(profile);
  
  	if (entrypoint == PIPE_VIDEO_ENTRYPOINT_ENCODE) {

switch (param) {
@@ -232,34 +233,17 @@ int rvid_get_video_param(struct pipe_screen *screen,
}
}
  
-	/* UVD 2.x limits */

-   if (rscreen->family < CHIP_PALM) {
-   enum pipe_video_format codec = u_reduce_video_profile(profile);
-   switch (param) {
-   case PIPE_VIDEO_CAP_SUPPORTED:
-   /* no support for MPEG4 */
-   return codec != PIPE_VIDEO_FORMAT_MPEG4 &&
-  /* FIXME: VC-1 simple/main profile is broken */
-  profile != PIPE_VIDEO_PROFILE_VC1_SIMPLE &&
-  profile != PIPE_VIDEO_PROFILE_VC1_MAIN;
-   case PIPE_VIDEO_CAP_PREFERS_INTERLACED:
-   case PIPE_VIDEO_CAP_SUPPORTS_INTERLACED:
-   /* MPEG2 only with shaders and no support for
-  interlacing on R6xx style UVD */
-   return codec != PIPE_VIDEO_FORMAT_MPEG12 &&
-  rscreen->family > CHIP_RV770;
-   default:
-   break;
-   }
-   }
-
switch (param) {
case PIPE_VIDEO_CAP_SUPPORTED:
-   switch (u_reduce_video_profile(profile)) {
+   switch (codec) {
case PIPE_VIDEO_FORMAT_MPEG12:
case PIPE_VIDEO_FORMAT_MPEG4:
case PIPE_VIDEO_FORMAT_MPEG4_AVC:
-   return entrypoint != PIPE_VIDEO_ENTRYPOINT_ENCODE;
+   if (rscreen->family < CHIP_PALM)
+   /* no support for MPEG4 */
+   return codec != PIPE_VIDEO_FORMAT_MPEG4;
+   else
+   return entrypoint != 
PIPE_VIDEO_ENTRYPOINT_ENCODE;
On UVD 2.x we don't support video encoding either. So the check should 
probably be:


if (entrypoint == PIPE_VIDEO_ENTRYPOINT_ENCODE)
return false;

if (rscreen->family < CHIP_PALM)
/* no support for MPEG4 */
return codec != PIPE_VIDEO_FORMAT_MPEG4;

return true;


case PIPE_VIDEO_FORMAT_VC1:
/* FIXME: VC-1 simple/main profile is broken */
return profile == PIPE_VIDEO_PROFILE_VC1_ADVANCED &&
@@ -280,13 +264,17 @@ int rvid_get_video_param(struct pipe_screen *screen,
case PIPE_VIDEO_CAP_PREFERED_FORMAT:
return PIPE_FORMAT_NV12;
case PIPE_VIDEO_CAP_PREFERS_INTERLACED:
-   if (u_reduce_video_profile(profile) == PIPE_VIDEO_FORMAT_HEVC)
-   return false; //The hardware doesn't support interlaced 
HEVC.
-   return true;
case PIPE_VIDEO_CAP_SUPPORTS_INTERLACED:
-   if (u_reduce_video_profile(profile) == PIPE_VIDEO_FORMAT_HEVC)
-   return false; //The hardware doesn't support interlaced 
HEVC.
-   return true;
+   if (rscreen->family < CHIP_PALM) {
+   /* MPEG2 only with shaders and no support for
+  interlacing on R6xx style UVD */
+   return codec != PIPE_VIDEO_FORMAT_MPEG12 &&
+  rscreen->family > CHIP_RV770;
+   } else {
+   if (u_reduce_video_profile(profile) == 
PIPE_VIDEO_FORMAT_HEVC)
+   return false; //The hardware doesn't support 
interlaced HEVC.


Actually it's the firmware which doesn't support this and IIRC there are 
plans to get it working.


Regards,
Christian.


+   return true;
+   }
case PIPE_VIDEO_CAP_SUPPORTS_PROGRESSIVE:
return true;
case PIPE_VIDEO_CAP_MAX_LEVEL:


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 7/9] i965: Add initial assembly validation pass.

2015-10-22 Thread Matt Turner

On Thu, Oct 22, 2015 at 4:15 AM, Emil Velikov  wrote:
> On 21 October 2015 at 23:58, Matt Turner  wrote:
>> Initially just checks that sources are non-NULL, which would have
>> alerted us to the problem fixed by commit 6c846dc5.
>
> What are the chances of combining this with and/or removing
> fs_visitor::validate() ? Afaict both of these validations are at
> different level and we do not have enough information within
> brw_validate_instructions(), do we ?

Right, they'll validate different sets of things (that aren't possible
for the other to validate), so both are useful and complementary.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 1/8] i965: Don't use message headers for untyped reads

2015-10-22 Thread Kristian Høgsberg Kristensen

We always set the mask to 0x, which is what it defaults to when no
header is present. Let's drop the header instead.

v2: Only remove header for untyped reads. Typed reads always need the
header.

Reviewed-by: Francisco Jerez 
Reviewed-by: Jordan Justen 
Signed-off-by: Kristian Høgsberg Kristensen 
---
 src/mesa/drivers/dri/i965/brw_eu_emit.c | 3 +--
 src/mesa/drivers/dri/i965/brw_fs.cpp| 2 +-
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c 
b/src/mesa/drivers/dri/i965/brw_eu_emit.c
index bf2fee9..ebd811f 100644
--- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
@@ -2906,11 +2906,10 @@ brw_untyped_surface_read(struct brw_codegen *p,
const unsigned sfid = (devinfo->gen >= 8 || devinfo->is_haswell ?
   HSW_SFID_DATAPORT_DATA_CACHE_1 :
   GEN7_SFID_DATAPORT_DATA_CACHE);
-   const bool align1 = (brw_inst_access_mode(devinfo, p->current) == 
BRW_ALIGN_1);
struct brw_inst *insn = brw_send_indirect_surface_message(
   p, sfid, dst, payload, surface, msg_length,
   brw_surface_payload_size(p, num_channels, true, true),
-  align1);
+  false);
 
brw_set_dp_untyped_surface_read_message(
   p, insn, num_channels);
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 1c96cf2..23d3418 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -4070,7 +4070,7 @@ fs_visitor::lower_logical_sends()
   case SHADER_OPCODE_UNTYPED_SURFACE_READ_LOGICAL:
  lower_surface_logical_send(ibld, inst,
 SHADER_OPCODE_UNTYPED_SURFACE_READ,
-fs_reg(0x));
+fs_reg());
  break;
 
   case SHADER_OPCODE_UNTYPED_SURFACE_WRITE_LOGICAL:
-- 
2.6.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 7/8] i965/fs: Optimize ssbo stores

2015-10-22 Thread Kristian Høgsberg Kristensen

Write groups of enabled components together.

Signed-off-by: Kristian Høgsberg Kristensen 
---
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 56 +++-
 1 file changed, 26 insertions(+), 30 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index a82c616..5fbb32f 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -1,3 +1,4 @@
+
 /*
  * Copyright © 2010 Intel Corporation
  *
@@ -1759,45 +1760,40 @@ fs_visitor::nir_emit_intrinsic(const fs_builder , 
nir_intrinsic_instr *instr
nir->info.num_ssbos - 1);
   }
 
-  /* Offset */
-  fs_reg offset_reg = vgrf(glsl_type::uint_type);
-  unsigned const_offset_bytes = 0;
-  if (has_indirect) {
- bld.MOV(offset_reg, get_nir_src(instr->src[2]));
-  } else {
- const_offset_bytes = instr->const_index[0];
- bld.MOV(offset_reg, fs_reg(const_offset_bytes));
-  }
-
   /* Value */
   fs_reg val_reg = get_nir_src(instr->src[0]);
 
   /* Writemask */
   unsigned writemask = instr->const_index[1];
 
-  /* Write each component present in the writemask */
-  unsigned skipped_channels = 0;
-  for (int i = 0; i < instr->num_components; i++) {
- int component_mask = 1 << i;
- if (writemask & component_mask) {
-if (skipped_channels) {
-   if (!has_indirect) {
-  const_offset_bytes += 4 * skipped_channels;
-  bld.MOV(offset_reg, fs_reg(const_offset_bytes));
-   } else {
-  bld.ADD(offset_reg, offset_reg,
-   brw_imm_ud(4 * skipped_channels));
-   }
-   skipped_channels = 0;
-}
+  /* Combine groups of consecutive enabled channels in one write
+   * message. We use ffs to find the first enabled channel and then ffs on
+   * the bit-inverse, down-shifted writemask to determine the length of
+   * the block of enabled bits.
+   */
+  while (writemask) {
+ unsigned first_component = ffs(writemask) - 1;
+ unsigned length = ffs(~(writemask >> first_component)) - 1;
+ fs_reg offset_reg;
 
-emit_untyped_write(bld, surf_index, offset_reg,
-   offset(val_reg, bld, i),
-   1 /* dims */, 1 /* size */,
-   BRW_PREDICATE_NONE);
+ if (!has_indirect) {
+offset_reg = fs_reg(instr->const_index[0] + 4 * first_component);
+ } else {
+offset_reg = vgrf(glsl_type::uint_type);
+bld.ADD(offset_reg,
+retype(get_nir_src(instr->src[2]), BRW_REGISTER_TYPE_UD),
+fs_reg(4 * first_component));
  }
 
- skipped_channels++;
+ emit_untyped_write(bld, surf_index, offset_reg,
+offset(val_reg, bld, first_component),
+1 /* dims */, length,
+BRW_PREDICATE_NONE);
+
+ /* Clear the bits in the writemask that we just wrote, then try
+  * again to see if more channels are left.
+  */
+ writemask &= (15 << (first_component + length));
   }
   break;
}
-- 
2.6.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 4/8] i965/fs: Don't uniformize surface index twice

2015-10-22 Thread Kristian Høgsberg

On Thu, Oct 22, 2015 at 12:11 PM, Matt Turner  wrote:
> On Thu, Oct 22, 2015 at 11:36 AM, Kristian Høgsberg Kristensen
>  wrote:
>> The emit_untyped_read and emit_untyped_write helpers already uniformize
>> the surface index argument. No need to do it before calling them.
>>
>> Signed-off-by: Kristian Høgsberg Kristensen 
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 2 --
>>  1 file changed, 2 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
>> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> index 5dc63c9..00f200a 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> @@ -1511,7 +1511,6 @@ fs_visitor::nir_emit_intrinsic(const fs_builder , 
>> nir_intrinsic_instr *instr
>>   surf_index = vgrf(glsl_type::uint_type);
>>   bld.ADD(surf_index, get_nir_src(instr->src[0]),
>>   fs_reg(stage_prog_data->binding_table.ssbo_start));
>> - surf_index = bld.emit_uniformize(surf_index);
>
> Looks correct -- I see that emit_send() does call uniformize.
>
> But when this was added in commit b234537, it apparently fixed piglit
> tests. Presumably we gained an additional uniformize in the transition
> to the builder infrastructure, making this one no longer necessary?
>
> If that's correct,

Yup, that was for UBOs which don't go through the surface builder infrastructure

Kristian.

> Reviewed-by: Matt Turner 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [RFC 15/21] mesa: Fix EXT_texture_sRGB functionality leaks

2015-10-22 Thread Chad Versace

On Mon 19 Oct 2015, Nanley Chery wrote:
> From: Nanley Chery 
> 
> Stop leaks into the following contexts:
>* GLES in _mesa_base_tex_format() and lookup_view_class().
>* Pre-1.1 GL legacy contexts in all uses.
> 
> Stop allowing compressed sRGB formats as valid formats in GLES3
> contexts. I realized this was happening when CTS failures occured after
> fixing the extension functionality leak with the helper function.

Do you mean that this patch *fixes* CTS failures? If so, which CTS for
which GLES version?  There are so many CTS's in the world :(
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [RFC 06/21] mesa/extensions: Create _mesa_extension_supported()

2015-10-22 Thread Chad Versace

On Thu 22 Oct 2015, Emil Velikov wrote:
> On 19 October 2015 at 23:36, Nanley Chery  wrote:
> > From: Nanley Chery 
> >
> > Create a function which determines if an extension is supported in the
> > current context.
> >
> > Signed-off-by: Nanley Chery 
> > ---
> >  src/mesa/main/extensions.c | 17 +
> >  1 file changed, 17 insertions(+)
> >
> > diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c
> > index 390e026..7137bc9 100644
> > --- a/src/mesa/main/extensions.c
> > +++ b/src/mesa/main/extensions.c
> > @@ -423,6 +423,23 @@ typedef unsigned short extension_index;
> >
> >
> >  /**
> > + * Given an extension enum, return whether or not the extension is 
> > supported
> > + * dependent on the following factors:
> > + * There's driver support and the OpenGL/ES version is at least that
> > + * specified in the extension_table.
> > + */
> > +static inline bool
> > +_mesa_extension_supported(const struct gl_context *ctx, extension_index ei)
> > +{
> > +   const bool *base = (bool *) >Extensions;
> > +   const struct extension *i = extension_table + ei;
> > +   const uint8_t api_set = 1 << ctx->API;
> > +   return (i->api_set & api_set) &&
> > +  (ctx->Version >= i->version[ctx->API]) &&
> > +  base[i->offset];
> Bikeshed: I realise that you're copying most of these, but wouldn't it
> be better if we use more common/intuitive variable names ?
> 
> ei -> i or idx
> i -> ext

I second Emil's suggestion.

Please insert an empty line between the block of variable declarations
and the return statement.

Also, it improves the readability if you rename the local variable
'api_set' to 'api_bit', because it's not really a set. It's just one
bit.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 4/8] i965/fs: Don't uniformize surface index twice

2015-10-22 Thread Matt Turner

On Thu, Oct 22, 2015 at 11:36 AM, Kristian Høgsberg Kristensen
 wrote:
> The emit_untyped_read and emit_untyped_write helpers already uniformize
> the surface index argument. No need to do it before calling them.
>
> Signed-off-by: Kristian Høgsberg Kristensen 
> ---
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index 5dc63c9..00f200a 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> @@ -1511,7 +1511,6 @@ fs_visitor::nir_emit_intrinsic(const fs_builder , 
> nir_intrinsic_instr *instr
>   surf_index = vgrf(glsl_type::uint_type);
>   bld.ADD(surf_index, get_nir_src(instr->src[0]),
>   fs_reg(stage_prog_data->binding_table.ssbo_start));
> - surf_index = bld.emit_uniformize(surf_index);

Looks correct -- I see that emit_send() does call uniformize.

But when this was added in commit b234537, it apparently fixed piglit
tests. Presumably we gained an additional uniformize in the transition
to the builder infrastructure, making this one no longer necessary?

If that's correct,

Reviewed-by: Matt Turner 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] glsl: remove excess location qualifier validation

2015-10-22 Thread Timothy Arceri

On Thu, 2015-10-22 at 08:55 +0300, Tapani Pälli wrote:
> On 10/22/2015 08:29 AM, Timothy Arceri wrote:
> > Location has never been able to be a negative value because it has
> > always been validated in the parser.
> > 
> > Also the linker doesn't check for negatives like the comment
> > claims.
> 
> Neither does the parser, if one utilizes negative explicit location, 
> parser says:
> 
> error: syntax error, unexpected '-', expecting INTCONSTANT or
> UINTCONSTANT
> 
> I'm not sure if this is quite OK, it should rather accept the
> negative 
> value and the fail here in this check you are about to remove?

Yes I did noticed this. However even if we changed it to accept a
negative value the is another check in the parser that would catch this
before the checks I'm removing here.

if ($3 >= 0) {
   $$.location = $3;
} else {
_mesa_glsl_error(& @3, state, "invalid location %d specified", $3);
 YYERROR;
}

I'm also working on ARB_enhanced_layouts which will allow negative
values to get passed the parser and will be moving the check out of the
parser and into the ast.

This patch is a clean-up before doing that, as the attributes code
doesn't do the validation at all and the one for uniforms should be
shared with the attibutes code.

> 
> > ---
> > 
> >   No piglit regressions and an extra negative test sent for
> >   ARB_explicit_uniform_location [1]
> > 
> >   [1] http://patchwork.freedesktop.org/patch/62573/
> > 
> >   src/glsl/ast_to_hir.cpp | 70 
> > -
> >   1 file changed, 22 insertions(+), 48 deletions(-)
> > 
> > diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
> > index 8549d55..0306530 100644
> > --- a/src/glsl/ast_to_hir.cpp
> > +++ b/src/glsl/ast_to_hir.cpp
> > @@ -2422,21 +2422,6 @@ validate_explicit_location(const struct
> > ast_type_qualifier *qual,
> > const struct gl_context *const ctx = state->ctx;
> > unsigned max_loc = qual->location + var->type
> > ->uniform_locations() - 1;
> >   
> > -  /* ARB_explicit_uniform_location specification states:
> > -   *
> > -   * "The explicitly defined locations and the generated
> > locations
> > -   * must be in the range of 0 to MAX_UNIFORM_LOCATIONS
> > minus one."
> > -   *
> > -   * "Valid locations for default-block uniform variable
> > locations
> > -   * are in the range of 0 to the implementation-defined
> > maximum
> > -   * number of uniform locations."
> > -   */
> > -  if (qual->location < 0) {
> > - _mesa_glsl_error(loc, state,
> > -  "explicit location < 0 for uniform %s",
> > var->name);
> > - return;
> > -  }
> > -
> > if (max_loc >= ctx
> > ->Const.MaxUserAssignableUniformLocations) {
> >_mesa_glsl_error(loc, state, "location(s) consumed by
> > uniform %s "
> > ">= MAX_UNIFORM_LOCATIONS (%u)", var
> > ->name,
> > @@ -2527,41 +2512,30 @@ validate_explicit_location(const struct
> > ast_type_qualifier *qual,
> >  } else {
> > var->data.explicit_location = true;
> >   
> > -  /* This bit of silliness is needed because invalid explicit
> > locations
> > -   * are supposed to be flagged during linking.  Small
> > negative values
> > -   * biased by VERT_ATTRIB_GENERIC0 or FRAG_RESULT_DATA0 could
> > alias
> > -   * built-in values (e.g., -16+VERT_ATTRIB_GENERIC0 =
> > VERT_ATTRIB_POS).
> > -   * The linker needs to be able to differentiate these cases.
> >   This
> > -   * ensures that negative values stay negative.
> > -   */
> > -  if (qual->location >= 0) {
> > - switch (state->stage) {
> > - case MESA_SHADER_VERTEX:
> > -var->data.location = (var->data.mode ==
> > ir_var_shader_in)
> > -   ? (qual->location + VERT_ATTRIB_GENERIC0)
> > -   : (qual->location + VARYING_SLOT_VAR0);
> > -break;
> > +  switch (state->stage) {
> > +  case MESA_SHADER_VERTEX:
> > + var->data.location = (var->data.mode == ir_var_shader_in)
> > +? (qual->location + VERT_ATTRIB_GENERIC0)
> > +: (qual->location + VARYING_SLOT_VAR0);
> > + break;
> >   
> > - case MESA_SHADER_TESS_CTRL:
> > - case MESA_SHADER_TESS_EVAL:
> > - case MESA_SHADER_GEOMETRY:
> > -if (var->data.patch)
> > -   var->data.location = qual->location +
> > VARYING_SLOT_PATCH0;
> > -else
> > -   var->data.location = qual->location +
> > VARYING_SLOT_VAR0;
> > -break;
> > +  case MESA_SHADER_TESS_CTRL:
> > +  case MESA_SHADER_TESS_EVAL:
> > +  case MESA_SHADER_GEOMETRY:
> > + if (var->data.patch)
> > +var->data.location = qual->location +
> > VARYING_SLOT_PATCH0;
> > + else
> > +var->data.location = qual->location +
> > VARYING_SLOT_VAR0;
> > + break;
> >   
> > -

Re: [Mesa-dev] [PATCH 2/9] i965: Fill out instruction list.

2015-10-22 Thread Emil Velikov

On 21 October 2015 at 23:58, Matt Turner  wrote:
> Add some instructions: illegal, movi, sends, sendsc.
>
> Remove some instructions with reused opcodes: msave, mrestore, push,
> pop, goto. I did have some gross code for disassembling opcodes
> per-generation, but there's very little meaningful overlap so it's
> probably not needed.
> ---
>  src/mesa/drivers/dri/i965/brw_defines.h  | 37 
> ++--
>  src/mesa/drivers/dri/i965/brw_disasm.c   | 16 --
>  src/mesa/drivers/dri/i965/brw_shader.cpp |  2 +-
>  3 files changed, 41 insertions(+), 14 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
> b/src/mesa/drivers/dri/i965/brw_defines.h
> index 393f17a..26fc0af 100644
> --- a/src/mesa/drivers/dri/i965/brw_defines.h
> +++ b/src/mesa/drivers/dri/i965/brw_defines.h
> @@ -838,43 +838,62 @@ enum PACKED brw_horizontal_stride {
>
>  enum opcode {
> /* These are the actual hardware opcodes. */
> +   BRW_OPCODE_ILLEGAL = 0,
> BRW_OPCODE_MOV =1,
> BRW_OPCODE_SEL =2,
> +   BRW_OPCODE_MOVI =   3,   /**< G45+ */
> BRW_OPCODE_NOT =4,
> BRW_OPCODE_AND =5,
> BRW_OPCODE_OR = 6,
> BRW_OPCODE_XOR =7,
> BRW_OPCODE_SHR =8,
> BRW_OPCODE_SHL =9,
> +   // BRW_OPCODE_DIM = 10,  /**< Gen7.5 only */ /* Reused */
> +   // BRW_OPCODE_SMOV =10,  /**< Gen8+   */ /* Reused */
> +   /* Reserved - 11 */
> BRW_OPCODE_ASR =12,
> +   /* Reserved - 13-15 */
> BRW_OPCODE_CMP =16,
> BRW_OPCODE_CMPN =   17,
> BRW_OPCODE_CSEL =   18,  /**< Gen8+ */
> BRW_OPCODE_F32TO16 = 19,  /**< Gen7 only */
> BRW_OPCODE_F16TO32 = 20,  /**< Gen7 only */
> +   /* Reserved - 21-22 */
> BRW_OPCODE_BFREV =  23,  /**< Gen7+ */
> BRW_OPCODE_BFE =24,  /**< Gen7+ */
> BRW_OPCODE_BFI1 =   25,  /**< Gen7+ */
> BRW_OPCODE_BFI2 =   26,  /**< Gen7+ */
> +   /* Reserved - 27-31 */
> BRW_OPCODE_JMPI =   32,
> +   // BRW_OPCODE_BRD = 33,  /**< Gen7+ */
> BRW_OPCODE_IF = 34,
> -   BRW_OPCODE_IFF =35,  /**< Pre-Gen6 */
> +   BRW_OPCODE_IFF =35,  /**< Pre-Gen6*/ /* Reused */
> +   // BRW_OPCODE_BRC = 35,  /**< Gen7+   */ /* Reused */
> BRW_OPCODE_ELSE =   36,
> BRW_OPCODE_ENDIF =  37,
> -   BRW_OPCODE_DO = 38,
> +   BRW_OPCODE_DO = 38,  /**< Pre-Gen6*/ /* Reused */
> +   // BRW_OPCODE_CASE =38,  /**< Gen6 only   */ /* Reused */
> BRW_OPCODE_WHILE =  39,
> BRW_OPCODE_BREAK =  40,
> BRW_OPCODE_CONTINUE = 41,
> BRW_OPCODE_HALT =   42,
> -   BRW_OPCODE_MSAVE =  44,  /**< Pre-Gen6 */
> -   BRW_OPCODE_MRESTORE = 45, /**< Pre-Gen6 */
> -   BRW_OPCODE_PUSH =   46,  /**< Pre-Gen6 */
> -   BRW_OPCODE_GOTO =   46,  /**< Gen8+*/
> -   BRW_OPCODE_POP =47,  /**< Pre-Gen6 */
> +   // BRW_OPCODE_CALLA =   43,  /**< Gen7.5+ */
> +   // BRW_OPCODE_MSAVE =   44,  /**< Pre-Gen6*/ /* Reused */
> +   // BRW_OPCODE_CALL =44,  /**< Gen6+   */ /* Reused */
> +   // BRW_OPCODE_MREST =   45,  /**< Pre-Gen6*/ /* Reused */
> +   // BRW_OPCODE_RET = 45,  /**< Gen6+   */ /* Reused */
> +   // BRW_OPCODE_PUSH =46,  /**< Pre-Gen6*/ /* Reused */
> +   // BRW_OPCODE_FORK =46,  /**< Gen6 only   */ /* Reused */
> +   // BRW_OPCODE_GOTO =46,  /**< Gen8+   */ /* Reused */
> +   // BRW_OPCODE_POP = 47,  /**< Pre-Gen6*/
> BRW_OPCODE_WAIT =   48,
> BRW_OPCODE_SEND =   49,
> BRW_OPCODE_SENDC =  50,
> +   BRW_OPCODE_SENDS =  51,  /**< Gen9+ */
> +   BRW_OPCODE_SENDSC = 52,  /**< Gen9+ */
> +   /* Reserved 53-55 */
> BRW_OPCODE_MATH =   56,  /**< Gen6+ */
> +   /* Reserved 57-63 */
> BRW_OPCODE_ADD =64,
> BRW_OPCODE_MUL =65,
> BRW_OPCODE_AVG =66,
> @@ -893,14 +912,18 @@ enum opcode {
> BRW_OPCODE_SUBB =   79,  /**< Gen7+ */
> BRW_OPCODE_SAD2 =   80,
> BRW_OPCODE_SADA2 =  81,
> +   /* Reserved 82-83 */
> BRW_OPCODE_DP4 =84,
> BRW_OPCODE_DPH =85,
> BRW_OPCODE_DP3 =86,
> BRW_OPCODE_DP2 =87,
> +   /* Reserved 88 */
> BRW_OPCODE_LINE =   89,
> BRW_OPCODE_PLN =90,  /**< G45+ */
> BRW_OPCODE_MAD =91,  /**< Gen6+ */
> BRW_OPCODE_LRP =92,  /**< Gen6+ */
> +   // BRW_OPCODE_MADM =93,  /**< Gen8+ */
> +   /* Reserved 94-124 */
> BRW_OPCODE_NENOP =  125, /**< G45 only */
> BRW_OPCODE_NOP =126,
>
> diff --git a/src/mesa/drivers/dri/i965/brw_disasm.c 
> b/src/mesa/drivers/dri/i965/brw_disasm.c
> index db23a18..c2dac7c 100644
> --- a/src/mesa/drivers/dri/i965/brw_disasm.c
> +++ b/src/mesa/drivers/dri/i965/brw_disasm.c
> @@ -34,6 +34,7 @@
>
>  const struct opcode_desc opcode_descs[128] = {
Out of curiosity: I cannot see an opcode numerically equal to 127, is
there one ? What if one drops the explicit array size here (and in
later patch) in order to get a lovely build error (by adding a
static_assert) ?

I might be over-thinking it though :-)
-Emil

Re: [Mesa-dev] [PATCH 1/2] main: Remove interface block array index for doing the name comparison

2015-10-22 Thread Timothy Arceri

On Thu, 2015-10-22 at 11:01 +0200, Samuel Iglesias Gonsalvez wrote:
> From ARB_program_query_interface spec:
> 
> "uint GetProgramResourceIndex(uint program, enum programInterface,
>const char *name);
>  [...]
>  If  exactly matches the name string of one of the active
> resources
>  for , the index of the matched resource is
> returned.
>  Additionally, if  would exactly match the name string of an
> active
>  resource if "[0]" were appended to , the index of the matched
>  resource is returned. [...]"
> 
> "A string provided to GetProgramResourceLocation or
>  GetProgramResourceLocationIndex is considered to match an active
> variable
>  if:
> [...]
>* if the string identifies the base name of an active array, where
> the
>  string would exactly match the name of the variable if the
> suffix
>  "[0]" were appended to the string;
> [...]
> "
> 
> But this is only happening on those ARB_program_query_interface's
> queries.
> For the rest of specs we need to keep old behavior. For that reason,
> arb_program_interface_query boolean is added to the affected
> functions.
> 
> Fixes the following two dEQP-GLES31 tests:
> 
> dEQP
> -GLES31.functional.program_interface_query.shader_storage_block.resou
> rce_list.block_array
> dEQP
> -GLES31.functional.program_interface_query.shader_storage_block.resou
> rce_list.block_array_single_element
> Signed-off-by: Samuel Iglesias Gonsalvez 
> Cc: Tapani Pälli 
> ---
>  src/mesa/main/program_resource.c |  6 ++---
>  src/mesa/main/shader_query.cpp   | 58
> +++-
>  src/mesa/main/shaderapi.c|  4 +--
>  src/mesa/main/shaderapi.h|  9 ---
>  src/mesa/main/uniforms.c |  6 ++---
>  5 files changed, 60 insertions(+), 23 deletions(-)
> 
> diff --git a/src/mesa/main/program_resource.c
> b/src/mesa/main/program_resource.c
> index eb71fdd..d029926 100644
> --- a/src/mesa/main/program_resource.c
> +++ b/src/mesa/main/program_resource.c
> @@ -251,7 +251,7 @@ _mesa_GetProgramResourceIndex(GLuint program,
> GLenum programInterface,
> case GL_UNIFORM_BLOCK:
> case GL_SHADER_STORAGE_BLOCK:
>res = _mesa_program_resource_find_name(shProg,
> programInterface, name,
> - _index);
> + _index, true);
>if (!res || array_index > 0)
>   return GL_INVALID_INDEX;
>  
> @@ -377,7 +377,7 @@ _mesa_GetProgramResourceLocation(GLuint program,
> GLenum programInterface,
>   goto fail;
> }
>  
> -   return _mesa_program_resource_location(shProg, programInterface,
> name);
> +   return _mesa_program_resource_location(shProg, programInterface,
> name, true);
>  fail:
> _mesa_error(ctx, GL_INVALID_ENUM,
> "glGetProgramResourceLocation(%s %s)",
> _mesa_enum_to_string(programInterface), name);
> @@ -411,5 +411,5 @@ _mesa_GetProgramResourceLocationIndex(GLuint
> program, GLenum programInterface,
> }
>  
> return _mesa_program_resource_location_index(shProg,
> GL_PROGRAM_OUTPUT,
> -name);
> +name, true);
>  }
> diff --git a/src/mesa/main/shader_query.cpp
> b/src/mesa/main/shader_query.cpp
> index 8182d3d..49766ca 100644
> --- a/src/mesa/main/shader_query.cpp
> +++ b/src/mesa/main/shader_query.cpp
> @@ -218,7 +218,7 @@ _mesa_GetAttribLocation(GLhandleARB program,
> const GLcharARB * name)
> unsigned array_index = 0;
> struct gl_program_resource *res =
>_mesa_program_resource_find_name(shProg, GL_PROGRAM_INPUT,
> name,
> -   _index);
> +   _index, false);
>  
> if (!res)
>return -1;
> @@ -367,7 +367,7 @@ _mesa_GetFragDataIndex(GLuint program, const
> GLchar *name)
>return -1;
>  
> return _mesa_program_resource_location_index(shProg,
> GL_PROGRAM_OUTPUT,
> -name);
> +name, false);
>  }
>  
>  GLint GLAPIENTRY
> @@ -404,7 +404,7 @@ _mesa_GetFragDataLocation(GLuint program, const
> GLchar *name)
> unsigned array_index = 0;
> struct gl_program_resource *res =
>_mesa_program_resource_find_name(shProg, GL_PROGRAM_OUTPUT,
> name,
> -   _index);
> +   _index, false);
>  
> if (!res)
>return -1;
> @@ -533,9 +533,12 @@ valid_array_index(const GLchar *name, unsigned
> *array_index)
>  struct gl_program_resource *
>  _mesa_program_resource_find_name(struct gl_shader_program *shProg,
>   GLenum programInterface, const char
> *name,
> - unsigned *array_index)
> + unsigned *array_index,
> + bool

[Mesa-dev] [PATCH 2/2] nir/instr_set: allow rewrite of SSBO loads

2015-10-22 Thread Iago Toral Quiroga

So we can effectively CSE the ones that are safe to reuse. Makes a shader
such as this:

buffer SSBO {
mat4 sm4;
};

uniform mat4 um4;

void main() {
sm4 *= um4;
}

go from 16 SSBO loads to only 4.
---
 src/glsl/nir/nir_instr_set.c | 24 
 1 file changed, 20 insertions(+), 4 deletions(-)

diff --git a/src/glsl/nir/nir_instr_set.c b/src/glsl/nir/nir_instr_set.c
index d3f939f..470683f 100644
--- a/src/glsl/nir/nir_instr_set.c
+++ b/src/glsl/nir/nir_instr_set.c
@@ -398,6 +398,13 @@ dest_is_ssa(nir_dest *dest, void *data)
return dest->is_ssa;
 }
 
+static bool
+is_ssbo_load(nir_intrinsic_instr *instr)
+{
+   return instr->intrinsic == nir_intrinsic_load_ssbo ||
+  instr->intrinsic == nir_intrinsic_load_ssbo_indirect;
+}
+
 /* This function determines if uses of an instruction can safely be rewritten
  * to use another identical instruction instead. Note that this function must
  * be kept in sync with hash_instr() and nir_instrs_equal() -- only
@@ -428,11 +435,20 @@ instr_can_rewrite(nir_instr *instr)
   return true;
}
case nir_instr_type_intrinsic: {
+  nir_intrinsic_instr *intrinsic = nir_instr_as_intrinsic(instr);
   const nir_intrinsic_info *info =
- _intrinsic_infos[nir_instr_as_intrinsic(instr)->intrinsic];
-  return (info->flags & NIR_INTRINSIC_CAN_ELIMINATE) &&
- (info->flags & NIR_INTRINSIC_CAN_REORDER) &&
- info->num_variables == 0; /* not implemented yet */
+ _intrinsic_infos[intrinsic->intrinsic];
+  bool can_eliminate_and_reorder =
+ (info->flags & NIR_INTRINSIC_CAN_ELIMINATE) &&
+ (info->flags & NIR_INTRINSIC_CAN_REORDER) &&
+ info->num_variables == 0; /* not implemented yet */
+
+  /* SSBO loads are special: they can be reordered only in certain
+   * situations, so they don't set NIR_INTRINSIC_CAN_REORDER, however,
+   * we do want to CSE them in the cases where it is safe, so return
+   * true for them here.
+   */
+  return can_eliminate_and_reorder ? true : is_ssbo_load(intrinsic);
}
case nir_instr_type_call:
case nir_instr_type_jump:
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 0/2] Nir: Allow CSE of SSBO loads

2015-10-22 Thread Iago Toral Quiroga

I implemented this first as a separate optimization pass in GLSL IR [1], but
Curro pointed out that this being pretty much a restricted form of a CSE pass
it would probably make more sense to do it inside CSE (and we no longer have
a CSE pass in GLSL IR).

Unlike other things we CSE in NIR, in the case of SSBO loads we need to make
sure that we invalidate previous entries in the set in the presence of
conflicting instructions (i.e. SSBO writes to the same block and offset) or
in the presence of memory barriers.

If this is accepted I intend to extend this to also cover image reads, which
follow similar behavior.

No regressions observed in piglit or dEQP's SSBO functional tests.

[1] http://lists.freedesktop.org/archives/mesa-dev/2015-October/097718.html

Iago Toral Quiroga (2):
  nir/cse: invalidate SSBO loads in presence of ssbo writes or memory
barriers
  nir/instr_set: allow rewrite of SSBO loads

 src/glsl/nir/nir_instr_set.c |  24 ++--
 src/glsl/nir/nir_opt_cse.c   | 142 +++
 2 files changed, 162 insertions(+), 4 deletions(-)

-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/2] main: Remove interface block array index for doing the name comparison

2015-10-22 Thread Samuel Iglesias Gonsálvez



On 22/10/15 13:08, Tapani Pälli wrote:
> On 10/22/2015 12:01 PM, Samuel Iglesias Gonsalvez wrote:
>>  From ARB_program_query_interface spec:
>>
>> "uint GetProgramResourceIndex(uint program, enum programInterface,
>> const char *name);
>>   [...]
>>   If  exactly matches the name string of one of the active
>> resources
>>   for , the index of the matched resource is returned.
>>   Additionally, if  would exactly match the name string of an
>> active
>>   resource if "[0]" were appended to , the index of the matched
>>   resource is returned. [...]"
>>
>> "A string provided to GetProgramResourceLocation or
>>   GetProgramResourceLocationIndex is considered to match an active
>> variable
>>   if:
>> [...]
>> * if the string identifies the base name of an active array, where
>> the
>>   string would exactly match the name of the variable if the suffix
>>   "[0]" were appended to the string;
>> [...]
>> "
>>
>> But this is only happening on those ARB_program_query_interface's
>> queries.
>> For the rest of specs we need to keep old behavior. For that reason,
>> arb_program_interface_query boolean is added to the affected functions.
> 
> This seems strange as the extension spec (and GL core spec) has set of
> different examples that state the new queries to match behavior of old
> functions (using words "are equivalent to calling"), this is why our
> tests also cross-check that new and old functions return same values.
> I'll need to have a look at tests below.
> 
> As example extension spec says GetFragDataIndex to be equivalent to
> calling  GetProgramResourceLocationIndex(program, PROGRAM_OUTPUT, name).
> 

OK, I am going to review that. Thanks for sharing this information.

Sam

>> Fixes the following two dEQP-GLES31 tests:
>>
>> dEQP-GLES31.functional.program_interface_query.shader_storage_block.resource_list.block_array
>>
>> dEQP-GLES31.functional.program_interface_query.shader_storage_block.resource_list.block_array_single_element
>>
>>
>> Signed-off-by: Samuel Iglesias Gonsalvez 
>> Cc: Tapani Pälli 
>> ---
>>   src/mesa/main/program_resource.c |  6 ++---
>>   src/mesa/main/shader_query.cpp   | 58
>> +++-
>>   src/mesa/main/shaderapi.c|  4 +--
>>   src/mesa/main/shaderapi.h|  9 ---
>>   src/mesa/main/uniforms.c |  6 ++---
>>   5 files changed, 60 insertions(+), 23 deletions(-)
>>
>> diff --git a/src/mesa/main/program_resource.c
>> b/src/mesa/main/program_resource.c
>> index eb71fdd..d029926 100644
>> --- a/src/mesa/main/program_resource.c
>> +++ b/src/mesa/main/program_resource.c
>> @@ -251,7 +251,7 @@ _mesa_GetProgramResourceIndex(GLuint program,
>> GLenum programInterface,
>>  case GL_UNIFORM_BLOCK:
>>  case GL_SHADER_STORAGE_BLOCK:
>> res = _mesa_program_resource_find_name(shProg,
>> programInterface, name,
>> - _index);
>> + _index, true);
>> if (!res || array_index > 0)
>>return GL_INVALID_INDEX;
>>   @@ -377,7 +377,7 @@ _mesa_GetProgramResourceLocation(GLuint program,
>> GLenum programInterface,
>>goto fail;
>>  }
>>   -   return _mesa_program_resource_location(shProg, programInterface,
>> name);
>> +   return _mesa_program_resource_location(shProg, programInterface,
>> name, true);
>>   fail:
>>  _mesa_error(ctx, GL_INVALID_ENUM,
>> "glGetProgramResourceLocation(%s %s)",
>>  _mesa_enum_to_string(programInterface), name);
>> @@ -411,5 +411,5 @@ _mesa_GetProgramResourceLocationIndex(GLuint
>> program, GLenum programInterface,
>>  }
>>return _mesa_program_resource_location_index(shProg,
>> GL_PROGRAM_OUTPUT,
>> -name);
>> +name, true);
>>   }
>> diff --git a/src/mesa/main/shader_query.cpp
>> b/src/mesa/main/shader_query.cpp
>> index 8182d3d..49766ca 100644
>> --- a/src/mesa/main/shader_query.cpp
>> +++ b/src/mesa/main/shader_query.cpp
>> @@ -218,7 +218,7 @@ _mesa_GetAttribLocation(GLhandleARB program, const
>> GLcharARB * name)
>>  unsigned array_index = 0;
>>  struct gl_program_resource *res =
>> _mesa_program_resource_find_name(shProg, GL_PROGRAM_INPUT, name,
>> -   _index);
>> +   _index, false);
>>if (!res)
>> return -1;
>> @@ -367,7 +367,7 @@ _mesa_GetFragDataIndex(GLuint program, const
>> GLchar *name)
>> return -1;
>>return _mesa_program_resource_location_index(shProg,
>> GL_PROGRAM_OUTPUT,
>> -name);
>> +name, false);
>>   }
>> GLint GLAPIENTRY
>> @@ -404,7 +404,7 @@ _mesa_GetFragDataLocation(GLuint program, const
>> GLchar *name)
>>  unsigned

Re: [Mesa-dev] [PATCH 1/2] main: Remove interface block array index for doing the name comparison

2015-10-22 Thread Tapani Pälli


On 10/22/2015 12:01 PM, Samuel Iglesias Gonsalvez wrote:

 From ARB_program_query_interface spec:

"uint GetProgramResourceIndex(uint program, enum programInterface,
const char *name);
  [...]
  If  exactly matches the name string of one of the active resources
  for , the index of the matched resource is returned.
  Additionally, if  would exactly match the name string of an active
  resource if "[0]" were appended to , the index of the matched
  resource is returned. [...]"

"A string provided to GetProgramResourceLocation or
  GetProgramResourceLocationIndex is considered to match an active variable
  if:
[...]
* if the string identifies the base name of an active array, where the
  string would exactly match the name of the variable if the suffix
  "[0]" were appended to the string;
[...]
"

But this is only happening on those ARB_program_query_interface's queries.
For the rest of specs we need to keep old behavior. For that reason,
arb_program_interface_query boolean is added to the affected functions.


This seems strange as the extension spec (and GL core spec) has set of 
different examples that state the new queries to match behavior of old 
functions (using words "are equivalent to calling"), this is why our 
tests also cross-check that new and old functions return same values. 
I'll need to have a look at tests below.


As example extension spec says GetFragDataIndex to be equivalent to 
calling  GetProgramResourceLocationIndex(program, PROGRAM_OUTPUT, name).



Fixes the following two dEQP-GLES31 tests:

dEQP-GLES31.functional.program_interface_query.shader_storage_block.resource_list.block_array
dEQP-GLES31.functional.program_interface_query.shader_storage_block.resource_list.block_array_single_element

Signed-off-by: Samuel Iglesias Gonsalvez 
Cc: Tapani Pälli 
---
  src/mesa/main/program_resource.c |  6 ++---
  src/mesa/main/shader_query.cpp   | 58 +++-
  src/mesa/main/shaderapi.c|  4 +--
  src/mesa/main/shaderapi.h|  9 ---
  src/mesa/main/uniforms.c |  6 ++---
  5 files changed, 60 insertions(+), 23 deletions(-)

diff --git a/src/mesa/main/program_resource.c b/src/mesa/main/program_resource.c
index eb71fdd..d029926 100644
--- a/src/mesa/main/program_resource.c
+++ b/src/mesa/main/program_resource.c
@@ -251,7 +251,7 @@ _mesa_GetProgramResourceIndex(GLuint program, GLenum 
programInterface,
 case GL_UNIFORM_BLOCK:
 case GL_SHADER_STORAGE_BLOCK:
res = _mesa_program_resource_find_name(shProg, programInterface, name,
- _index);
+ _index, true);
if (!res || array_index > 0)
   return GL_INVALID_INDEX;
  
@@ -377,7 +377,7 @@ _mesa_GetProgramResourceLocation(GLuint program, GLenum programInterface,

   goto fail;
 }
  
-   return _mesa_program_resource_location(shProg, programInterface, name);

+   return _mesa_program_resource_location(shProg, programInterface, name, 
true);
  fail:
 _mesa_error(ctx, GL_INVALID_ENUM, "glGetProgramResourceLocation(%s %s)",
 _mesa_enum_to_string(programInterface), name);
@@ -411,5 +411,5 @@ _mesa_GetProgramResourceLocationIndex(GLuint program, 
GLenum programInterface,
 }
  
 return _mesa_program_resource_location_index(shProg, GL_PROGRAM_OUTPUT,

-name);
+name, true);
  }
diff --git a/src/mesa/main/shader_query.cpp b/src/mesa/main/shader_query.cpp
index 8182d3d..49766ca 100644
--- a/src/mesa/main/shader_query.cpp
+++ b/src/mesa/main/shader_query.cpp
@@ -218,7 +218,7 @@ _mesa_GetAttribLocation(GLhandleARB program, const 
GLcharARB * name)
 unsigned array_index = 0;
 struct gl_program_resource *res =
_mesa_program_resource_find_name(shProg, GL_PROGRAM_INPUT, name,
-   _index);
+   _index, false);
  
 if (!res)

return -1;
@@ -367,7 +367,7 @@ _mesa_GetFragDataIndex(GLuint program, const GLchar *name)
return -1;
  
 return _mesa_program_resource_location_index(shProg, GL_PROGRAM_OUTPUT,

-name);
+name, false);
  }
  
  GLint GLAPIENTRY

@@ -404,7 +404,7 @@ _mesa_GetFragDataLocation(GLuint program, const GLchar 
*name)
 unsigned array_index = 0;
 struct gl_program_resource *res =
_mesa_program_resource_find_name(shProg, GL_PROGRAM_OUTPUT, name,
-   _index);
+   _index, false);
  
 if (!res)

return -1;
@@ -533,9 +533,12 @@ valid_array_index(const GLchar *name, unsigned 
*array_index)
  struct gl_program_resource *
  _mesa_program_resource_find_name(struct

Re: [Mesa-dev] [PATCH 7/9] i965: Add initial assembly validation pass.

2015-10-22 Thread Emil Velikov

On 21 October 2015 at 23:58, Matt Turner  wrote:
> Initially just checks that sources are non-NULL, which would have
> alerted us to the problem fixed by commit 6c846dc5.

What are the chances of combining this with and/or removing
fs_visitor::validate() ? Afaict both of these validations are at
different level and we do not have enough information within
brw_validate_instructions(), do we ?

Thanks
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/2] main: Remove interface block array index for doing the name comparison

2015-10-22 Thread Samuel Iglesias Gonsálvez



On 22/10/15 13:06, Timothy Arceri wrote:
> On Thu, 2015-10-22 at 11:01 +0200, Samuel Iglesias Gonsalvez wrote:
>> From ARB_program_query_interface spec:
>>
>> "uint GetProgramResourceIndex(uint program, enum programInterface,
>>const char *name);
>>  [...]
>>  If  exactly matches the name string of one of the active
>> resources
>>  for , the index of the matched resource is
>> returned.
>>  Additionally, if  would exactly match the name string of an
>> active
>>  resource if "[0]" were appended to , the index of the matched
>>  resource is returned. [...]"
>>
>> "A string provided to GetProgramResourceLocation or
>>  GetProgramResourceLocationIndex is considered to match an active
>> variable
>>  if:
>> [...]
>>* if the string identifies the base name of an active array, where
>> the
>>  string would exactly match the name of the variable if the
>> suffix
>>  "[0]" were appended to the string;
>> [...]
>> "
>>
>> But this is only happening on those ARB_program_query_interface's
>> queries.
>> For the rest of specs we need to keep old behavior. For that reason,
>> arb_program_interface_query boolean is added to the affected
>> functions.
>>
>> Fixes the following two dEQP-GLES31 tests:
>>
>> dEQP
>> -GLES31.functional.program_interface_query.shader_storage_block.resou
>> rce_list.block_array
>> dEQP
>> -GLES31.functional.program_interface_query.shader_storage_block.resou
>> rce_list.block_array_single_element
>> Signed-off-by: Samuel Iglesias Gonsalvez 
>> Cc: Tapani Pälli 
>> ---
>>  src/mesa/main/program_resource.c |  6 ++---
>>  src/mesa/main/shader_query.cpp   | 58
>> +++-
>>  src/mesa/main/shaderapi.c|  4 +--
>>  src/mesa/main/shaderapi.h|  9 ---
>>  src/mesa/main/uniforms.c |  6 ++---
>>  5 files changed, 60 insertions(+), 23 deletions(-)
>>
>> diff --git a/src/mesa/main/program_resource.c
>> b/src/mesa/main/program_resource.c
>> index eb71fdd..d029926 100644
>> --- a/src/mesa/main/program_resource.c
>> +++ b/src/mesa/main/program_resource.c
>> @@ -251,7 +251,7 @@ _mesa_GetProgramResourceIndex(GLuint program,
>> GLenum programInterface,
>> case GL_UNIFORM_BLOCK:
>> case GL_SHADER_STORAGE_BLOCK:
>>res = _mesa_program_resource_find_name(shProg,
>> programInterface, name,
>> - _index);
>> + _index, true);
>>if (!res || array_index > 0)
>>   return GL_INVALID_INDEX;
>>  
>> @@ -377,7 +377,7 @@ _mesa_GetProgramResourceLocation(GLuint program,
>> GLenum programInterface,
>>   goto fail;
>> }
>>  
>> -   return _mesa_program_resource_location(shProg, programInterface,
>> name);
>> +   return _mesa_program_resource_location(shProg, programInterface,
>> name, true);
>>  fail:
>> _mesa_error(ctx, GL_INVALID_ENUM,
>> "glGetProgramResourceLocation(%s %s)",
>> _mesa_enum_to_string(programInterface), name);
>> @@ -411,5 +411,5 @@ _mesa_GetProgramResourceLocationIndex(GLuint
>> program, GLenum programInterface,
>> }
>>  
>> return _mesa_program_resource_location_index(shProg,
>> GL_PROGRAM_OUTPUT,
>> -name);
>> +name, true);
>>  }
>> diff --git a/src/mesa/main/shader_query.cpp
>> b/src/mesa/main/shader_query.cpp
>> index 8182d3d..49766ca 100644
>> --- a/src/mesa/main/shader_query.cpp
>> +++ b/src/mesa/main/shader_query.cpp
>> @@ -218,7 +218,7 @@ _mesa_GetAttribLocation(GLhandleARB program,
>> const GLcharARB * name)
>> unsigned array_index = 0;
>> struct gl_program_resource *res =
>>_mesa_program_resource_find_name(shProg, GL_PROGRAM_INPUT,
>> name,
>> -   _index);
>> +   _index, false);
>>  
>> if (!res)
>>return -1;
>> @@ -367,7 +367,7 @@ _mesa_GetFragDataIndex(GLuint program, const
>> GLchar *name)
>>return -1;
>>  
>> return _mesa_program_resource_location_index(shProg,
>> GL_PROGRAM_OUTPUT,
>> -name);
>> +name, false);
>>  }
>>  
>>  GLint GLAPIENTRY
>> @@ -404,7 +404,7 @@ _mesa_GetFragDataLocation(GLuint program, const
>> GLchar *name)
>> unsigned array_index = 0;
>> struct gl_program_resource *res =
>>_mesa_program_resource_find_name(shProg, GL_PROGRAM_OUTPUT,
>> name,
>> -   _index);
>> +   _index, false);
>>  
>> if (!res)
>>return -1;
>> @@ -533,9 +533,12 @@ valid_array_index(const GLchar *name, unsigned
>> *array_index)
>>  struct gl_program_resource *
>>  _mesa_program_resource_find_name(struct gl_shader_program *shProg,
>>   GLenum programInterface, const char

[Mesa-dev] [PATCH 1/2] nir/cse: invalidate SSBO loads in presence of ssbo writes or memory barriers

2015-10-22 Thread Iago Toral Quiroga

In the next commit we are going to allow CSE of SSBO loads, so we need
to make sure that we consider how SSBO stores/atomics invalidate previous
loads.
---
 src/glsl/nir/nir_opt_cse.c | 142 +
 1 file changed, 142 insertions(+)

diff --git a/src/glsl/nir/nir_opt_cse.c b/src/glsl/nir/nir_opt_cse.c
index 93a6635..4a57eca 100644
--- a/src/glsl/nir/nir_opt_cse.c
+++ b/src/glsl/nir/nir_opt_cse.c
@@ -32,6 +32,143 @@
  * Implements common subexpression elimination
  */
 
+static bool
+is_ssbo_atomic_intrinsic_opcode(unsigned opcode)
+{
+   switch (opcode) {
+   case nir_intrinsic_ssbo_atomic_add:
+   case nir_intrinsic_ssbo_atomic_imin:
+   case nir_intrinsic_ssbo_atomic_umin:
+   case nir_intrinsic_ssbo_atomic_imax:
+   case nir_intrinsic_ssbo_atomic_umax:
+   case nir_intrinsic_ssbo_atomic_and:
+   case nir_intrinsic_ssbo_atomic_or:
+   case nir_intrinsic_ssbo_atomic_xor:
+   case nir_intrinsic_ssbo_atomic_exchange:
+   case nir_intrinsic_ssbo_atomic_comp_swap:
+  return true;
+   default:
+  return false;
+   }
+}
+
+static bool
+is_indirect_ssbo_write(nir_intrinsic_instr *instr)
+{
+   return instr->intrinsic == nir_intrinsic_store_ssbo_indirect ||
+  is_ssbo_atomic_intrinsic_opcode(instr->intrinsic);
+}
+
+static bool
+is_ssbo_write(nir_instr *instr)
+{
+   if (instr->type != nir_instr_type_intrinsic)
+  return false;
+
+   nir_intrinsic_instr *intrinsic = nir_instr_as_intrinsic(instr);
+   if (is_ssbo_atomic_intrinsic_opcode(intrinsic->intrinsic))
+  return true;
+
+   return (intrinsic->intrinsic == nir_intrinsic_store_ssbo ||
+   intrinsic->intrinsic == nir_intrinsic_store_ssbo_indirect);
+}
+
+static nir_src *
+get_indirect_ssbo_store_offset(nir_intrinsic_instr *instr)
+{
+   assert(is_indirect_ssbo_write(instr));
+
+   if (instr->intrinsic == nir_intrinsic_store_ssbo_indirect)
+  return >src[2];
+
+   /* atomic */
+   return >src[1];
+}
+
+static bool
+is_memory_barrier(nir_instr *instr)
+{
+   if (instr->type != nir_instr_type_intrinsic)
+  return false;
+
+   nir_intrinsic_instr *intrinsic = nir_instr_as_intrinsic(instr);
+   return intrinsic->intrinsic == nir_intrinsic_memory_barrier;
+}
+
+/*
+ * Removes any SSBO load from the set that get invalidated by this
+ * SSBO store/atomic instruction.
+ */
+static void
+invalidate_ssbo_loads(nir_intrinsic_instr *store, struct set *instr_set)
+{
+   nir_src *store_block = >src[1];
+   for (struct set_entry *entry = _mesa_set_next_entry(instr_set, NULL);
+entry != NULL; entry = _mesa_set_next_entry(instr_set, entry)) {
+
+  nir_instr *cached = (nir_instr *) entry->key;
+  if (cached->type != nir_instr_type_intrinsic)
+ continue;
+
+  nir_intrinsic_instr *load = nir_instr_as_intrinsic(cached);
+  if (load->intrinsic != nir_intrinsic_load_ssbo &&
+  load->intrinsic != nir_intrinsic_load_ssbo_indirect)
+ continue;
+
+  /* if blocks don't match then the load is still valid */
+  nir_src *load_block = >src[0];
+  if ((load_block->ssa->parent_instr->type ==
+   store_block->ssa->parent_instr->type) &&
+  !nir_srcs_equal(*store_block, *load_block))
+ continue;
+
+  /* indirect store / direct load (or viceversa): assume they can conflict 
*/
+  if ((is_indirect_ssbo_write(store) &&
+   load->intrinsic == nir_intrinsic_load_ssbo) ||
+  (store->intrinsic == nir_intrinsic_store_ssbo &&
+   load->intrinsic == nir_intrinsic_load_ssbo_indirect)) {
+ nir_instr_set_remove(instr_set, (nir_instr *)entry->key);
+  }
+  /* indirect store and load: check if src offsets match */
+  else if (is_indirect_ssbo_write(store) &&
+   load->intrinsic == nir_intrinsic_load_ssbo_indirect) {
+ nir_src *store_offset = get_indirect_ssbo_store_offset(store);
+ nir_src *load_offset = >src[1];
+ if (nir_srcs_equal(*store_offset, *load_offset)) {
+nir_instr_set_remove(instr_set, (nir_instr *)entry->key);
+ }
+  }
+  /* direct store and load: check if const_index offsets match */
+  else if (store->intrinsic == nir_intrinsic_store_ssbo &&
+   load->intrinsic == nir_intrinsic_load_ssbo) {
+ unsigned store_offset = store->const_index[0];
+ unsigned load_offset = load->const_index[0];
+ if (store_offset == load_offset) {
+nir_instr_set_remove(instr_set, (nir_instr *)entry->key);
+ }
+  }
+   }
+}
+
+static void
+invalidate_all_ssbo_loads(struct set *instr_set)
+{
+   for (struct set_entry *entry = _mesa_set_next_entry(instr_set, NULL);
+entry != NULL;
+entry = _mesa_set_next_entry(instr_set, entry)) {
+  nir_instr *cached = (nir_instr *) entry->key;
+  if (cached->type != nir_instr_type_intrinsic)
+ continue;
+
+  nir_intrinsic_instr *intrinsic = nir_instr_as_intrinsic(cached);
+  if (intrinsic->intrinsic !=

[Mesa-dev] [PATCH] mesa: clamp MaxLevel for immutable textures at initialization

2015-10-22 Thread Nicolai Hähnle

The same clamping already happens for glTexParameteri. This change
also fixes a bug in mipmap generation, see
https://bugs.freedesktop.org/show_bug.cgi?id=91993

piglit test cases have been submitted for review (as additions to
arb_texture_storage-texture-storage and arb_texture_view-max-level).
---
 src/mesa/main/textureview.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/mesa/main/textureview.c b/src/mesa/main/textureview.c
index 04b7d73..b6eaa77 100644
--- a/src/mesa/main/textureview.c
+++ b/src/mesa/main/textureview.c
@@ -408,6 +408,8 @@ _mesa_set_texture_view_state(struct gl_context *ctx,
   texObj->NumLayers = 6;
   break;
}
+
+   texObj->MaxLevel = MIN2(texObj->MaxLevel, texObj->ImmutableLevels - 1);
 }
 
 /**
@@ -680,6 +682,7 @@ _mesa_TextureView(GLuint texture, GLenum target, GLuint 
origtexture,
texObj->NumLayers = newViewNumLayers;
texObj->Immutable = GL_TRUE;
texObj->ImmutableLevels = origTexObj->ImmutableLevels;
+   texObj->MaxLevel = MIN2(texObj->MaxLevel, texObj->ImmutableLevels - 1);
texObj->Target = target;
texObj->TargetIndex = _mesa_tex_target_to_index(ctx, target);
assert(texObj->TargetIndex < NUM_TEXTURE_TARGETS);
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/6] [v2] i965/fs: Enumerate logical fb writes arguments

2015-10-22 Thread Francisco Jerez

Ben Widawsky  writes:

> Gen9 adds the ability to write out a stencil value, so we need to expand the
> virtual payload by one. Abstracting this now makes that change easier to read.
>
> I was admittedly confused early on about some of the hardcoding. If people
> believe the resulting code is inferior, I am not super attached to the patch.
>
> v2:
> Remove explicit numbering from the enumeration (Matt).
> Use a real naming scheme, and reference it in the opcode definition (Curro)
>   - LOGICAL_SRC_SRC_DEPTH kinda sucks... but it's consistent
> Add a missed hardcoded logical position in get_lowered_simd_width (Ben)
> Add an assertion to make sure the component numbering is correct (Ben)
>
> Cc: Matt Turner 
> Cc: Francisco Jerez 
> Signed-off-by: Ben Widawsky 
> ---
>  src/mesa/drivers/dri/i965/brw_defines.h  | 22 +-
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 24 +---
>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp |  1 +
>  3 files changed, 27 insertions(+), 20 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
> b/src/mesa/drivers/dri/i965/brw_defines.h
> index e61ad54..a2f59ea 100644
> --- a/src/mesa/drivers/dri/i965/brw_defines.h
> +++ b/src/mesa/drivers/dri/i965/brw_defines.h
> @@ -911,15 +911,9 @@ enum opcode {
>  
> /**
>  * Same as FS_OPCODE_FB_WRITE but expects its arguments separately as
> -* individual sources instead of as a single payload blob:
> -*
> -* Source 0: [required] Color 0.
> -* Source 1: [optional] Color 1 (for dual source blend messages).
> -* Source 2: [optional] Src0 Alpha.
> -* Source 3: [optional] Source Depth (gl_FragDepth)
> -* Source 4: [optional (gen4-5)] Destination Depth passthrough from thread
> -* Source 5: [optional] Sample Mask (gl_SampleMask).
> -* Source 6: [required] Number of color components (as a UD immediate).
> +* individual sources instead of as a single payload blob. The
> +* position/ordering of the arguments are defined by the enum
> +* fb_write_logical_srcs.
>  */
> FS_OPCODE_FB_WRITE_LOGICAL,
>  
> @@ -1318,6 +1312,16 @@ enum brw_urb_write_flags {
>BRW_URB_WRITE_ALLOCATE | BRW_URB_WRITE_COMPLETE,
>  };
>  
> +enum fb_write_logical_srcs {
> +   FB_WRITE_LOGICAL_SRC_COLOR0,  /* REQUIRED */
> +   FB_WRITE_LOGICAL_SRC_COLOR1,  /* for dual source blend messages */
> +   FB_WRITE_LOGICAL_SRC_SRC0_ALPHA,
> +   FB_WRITE_LOGICAL_SRC_SRC_DEPTH,   /* gl_FragDepth */
> +   FB_WRITE_LOGICAL_SRC_DST_DEPTH,   /* GEN4-5: passthrough from thread */
> +   FB_WRITE_LOGICAL_SRC_OMASK,   /* Sample Mask (gl_SampleMask) */
> +   FB_WRITE_LOGICAL_SRC_COMPONENTS,  /* REQUIRED */
> +};
> +
>  #ifdef __cplusplus
>  /**
>   * Allow brw_urb_write_flags enums to be ORed together.
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index da90467..ef06a70 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -695,10 +695,10 @@ fs_inst::components_read(unsigned i) const
>return 2;
>  
> case FS_OPCODE_FB_WRITE_LOGICAL:
> -  assert(src[6].file == IMM);
> +  assert(src[FB_WRITE_LOGICAL_SRC_COMPONENTS].file == IMM);
>/* First/second FB write color. */
>if (i < 2)
> - return src[6].fixed_hw_reg.dw1.ud;
> + return src[FB_WRITE_LOGICAL_SRC_COMPONENTS].fixed_hw_reg.dw1.ud;
>else
>   return 1;
>  
> @@ -3339,15 +3339,16 @@ lower_fb_write_logical_send(const fs_builder , 
> fs_inst *inst,
>  const brw_wm_prog_key *key,
>  const fs_visitor::thread_payload )
>  {
> -   assert(inst->src[6].file == IMM);
> +   assert(inst->src[FB_WRITE_LOGICAL_SRC_COMPONENTS].file == IMM);
> const brw_device_info *devinfo = bld.shader->devinfo;
> -   const fs_reg  = inst->src[0];
> -   const fs_reg  = inst->src[1];
> -   const fs_reg _alpha = inst->src[2];
> -   const fs_reg _depth = inst->src[3];
> -   const fs_reg _depth = inst->src[4];
> -   fs_reg sample_mask = inst->src[5];
> -   const unsigned components = inst->src[6].fixed_hw_reg.dw1.ud;
> +   const fs_reg  = inst->src[FB_WRITE_LOGICAL_SRC_COLOR0];
> +   const fs_reg  = inst->src[FB_WRITE_LOGICAL_SRC_COLOR1];
> +   const fs_reg _alpha = inst->src[FB_WRITE_LOGICAL_SRC_SRC0_ALPHA];
> +   const fs_reg _depth = inst->src[FB_WRITE_LOGICAL_SRC_SRC_DEPTH];
> +   const fs_reg _depth = inst->src[FB_WRITE_LOGICAL_SRC_DST_DEPTH];
> +   fs_reg sample_mask = inst->src[FB_WRITE_LOGICAL_SRC_OMASK];
> +   const unsigned components =
> +  inst->src[FB_WRITE_LOGICAL_SRC_COMPONENTS].fixed_hw_reg.dw1.ud;
>  
> /* We can potentially have a message length of up to 15, so we have to set
>  * base_mrf to either 0 or 1 in order to fit in m0..m15.
> @@ -4175,7 +4176,8 @@ get_lowered_simd_width(const struct

[Mesa-dev] [PATCH 3/4] gallivm: fix sampling with texture offsets in SoA path

2015-10-22 Thread sroland

From: Roland Scheidegger 

When using nearest filtering and clamp / clamp to edge wrapping results could
be wrong for negative offsets. Fix this by adding the offset before doing
the conversion to int coords (could also use floor instead of trunc int
conversion but probably more complex on "typical" cpu).

This fixes the piglit texwrap offset failures with this filter/wrap combo
(which only leaves the linear/mirror repeat combination broken).
---
 src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c 
b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
index b5c06b6..125505e 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
@@ -567,12 +567,13 @@ lp_build_sample_wrap_nearest(struct 
lp_build_sample_context *bld,
  coord = lp_build_mul(coord_bld, coord, length_f);
   }
 
+  if (offset) {
+ offset = lp_build_int_to_float(coord_bld, offset);
+ coord = lp_build_add(coord_bld, coord, offset);
+  }
   /* floor */
   /* use itrunc instead since we clamp to 0 anyway */
   icoord = lp_build_itrunc(coord_bld, coord);
-  if (offset) {
- icoord = lp_build_add(int_coord_bld, icoord, offset);
-  }
 
   /* clamp to [0, length - 1]. */
   icoord = lp_build_clamp(int_coord_bld, icoord, int_coord_bld->zero,
@@ -2586,6 +2587,10 @@ lp_build_sample_soa_code(struct gallivm_state *gallivm,
   derived_sampler_state.wrap_s = PIPE_TEX_WRAP_CLAMP_TO_EDGE;
   derived_sampler_state.wrap_t = PIPE_TEX_WRAP_CLAMP_TO_EDGE;
}
+   /*
+* We could force CLAMP to CLAMP_TO_EDGE here if min/mag filter is nearest,
+* so AoS path could be used. Not sure it's worth the trouble...
+*/
 
min_img_filter = derived_sampler_state.min_img_filter;
mag_img_filter = derived_sampler_state.mag_img_filter;
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [RFC 06/21] mesa/extensions: Create _mesa_extension_supported()

2015-10-22 Thread Nanley Chery

On Thu, Oct 22, 2015 at 3:19 AM, Emil Velikov 
wrote:

> On 19 October 2015 at 23:36, Nanley Chery  wrote:
> > From: Nanley Chery 
> >
> > Create a function which determines if an extension is supported in the
> > current context.
> >
> > Signed-off-by: Nanley Chery 
> > ---
> >  src/mesa/main/extensions.c | 17 +
> >  1 file changed, 17 insertions(+)
> >
> > diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c
> > index 390e026..7137bc9 100644
> > --- a/src/mesa/main/extensions.c
> > +++ b/src/mesa/main/extensions.c
> > @@ -423,6 +423,23 @@ typedef unsigned short extension_index;
> >
> >
> >  /**
> > + * Given an extension enum, return whether or not the extension is
> supported
> > + * dependent on the following factors:
> > + * There's driver support and the OpenGL/ES version is at least that
> > + * specified in the extension_table.
> > + */
> > +static inline bool
> > +_mesa_extension_supported(const struct gl_context *ctx, extension_index
> ei)
> > +{
> > +   const bool *base = (bool *) >Extensions;
> > +   const struct extension *i = extension_table + ei;
> > +   const uint8_t api_set = 1 << ctx->API;
> > +   return (i->api_set & api_set) &&
> > +  (ctx->Version >= i->version[ctx->API]) &&
> > +  base[i->offset];
> Bikeshed: I realise that you're copying most of these, but wouldn't it
> be better if we use more common/intuitive variable names ?
>
> ei -> i or idx
> i -> ext
>
>
Yes, I was mostly copying what was around. Thanks for the variable name
suggestions. I'll update this function.


> -Emil
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [RFC 06/21] mesa/extensions: Create _mesa_extension_supported()

2015-10-22 Thread Nanley Chery

On Thu, Oct 22, 2015 at 11:08 AM, Chad Versace 
wrote:

> On Thu 22 Oct 2015, Emil Velikov wrote:
> > On 19 October 2015 at 23:36, Nanley Chery  wrote:
> > > From: Nanley Chery 
> > >
> > > Create a function which determines if an extension is supported in the
> > > current context.
> > >
> > > Signed-off-by: Nanley Chery 
> > > ---
> > >  src/mesa/main/extensions.c | 17 +
> > >  1 file changed, 17 insertions(+)
> > >
> > > diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c
> > > index 390e026..7137bc9 100644
> > > --- a/src/mesa/main/extensions.c
> > > +++ b/src/mesa/main/extensions.c
> > > @@ -423,6 +423,23 @@ typedef unsigned short extension_index;
> > >
> > >
> > >  /**
> > > + * Given an extension enum, return whether or not the extension is
> supported
> > > + * dependent on the following factors:
> > > + * There's driver support and the OpenGL/ES version is at least that
> > > + * specified in the extension_table.
> > > + */
> > > +static inline bool
> > > +_mesa_extension_supported(const struct gl_context *ctx,
> extension_index ei)
> > > +{
> > > +   const bool *base = (bool *) >Extensions;
> > > +   const struct extension *i = extension_table + ei;
> > > +   const uint8_t api_set = 1 << ctx->API;
> > > +   return (i->api_set & api_set) &&
> > > +  (ctx->Version >= i->version[ctx->API]) &&
> > > +  base[i->offset];
> > Bikeshed: I realise that you're copying most of these, but wouldn't it
> > be better if we use more common/intuitive variable names ?
> >
> > ei -> i or idx
> > i -> ext
>
> I second Emil's suggestion.
>

> Please insert an empty line between the block of variable declarations
> and the return statement.
>
> Also, it improves the readability if you rename the local variable
> 'api_set' to 'api_bit', because it's not really a set. It's just one
> bit.
>

I'll incorporate both changes.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] i965: check inst->predicate when clearing flag_live at dead code eliminate

2015-10-22 Thread Alejandro Piñeiro

Detected by Matt Turner while reviewing commit
a59359ecd22154cc2b3f88bb8c599f21af8a3934
---
 src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp   | 2 +-
 src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
index 4b5548a..1eaf147 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
@@ -105,7 +105,7 @@ fs_visitor::dead_code_eliminate()
 }
  }
 
- if (inst->writes_flag()) {
+ if (inst->writes_flag() && !inst->predicate) {
 BITSET_CLEAR(flag_live, inst->flag_subreg);
  }
 
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp
index 284e0a8..e8a51d6 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp
@@ -145,7 +145,7 @@ vec4_visitor::dead_code_eliminate()
 }
  }
 
- if (inst->writes_flag()) {
+ if (inst->writes_flag() && !inst->predicate) {
 for (unsigned c = 0; c < 4; c++)
BITSET_CLEAR(flag_live, c);
  }
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [RFC 03/21] mesa/extensions: Wrap array entries in macros

2015-10-22 Thread Nanley Chery

On Thu, Oct 22, 2015 at 3:06 AM, Emil Velikov 
wrote:

> On 20 October 2015 at 16:43, Nanley Chery  wrote:
> > On Tue, Oct 20, 2015 at 8:16 AM, Marek Olšák  wrote:
> >>
> >> Also, the FIXME comment should be on its own line.
> >>
> >
> > I moved it aside to make editing the table easier. However, since the
> > formatting of the
> > table is unlikely to change much after this series, I agree that I should
> > move it back to
> > its original position.
> >
> Actually the designated initalisers should be fine in core mesa. If in
> doubt wrt MSVC compat, just grep MSVC.*COMPAT through whole of mesa.
> The 2013 version adds support for this feature.
> Alternatively feel free to ask Brian/Jose, as they have a fair bit of
> experience in the area.
>
>
That's good news. Thanks for the information.


> That aside, I'm in favour of keeping the comments as is. The editing
> comment does not apply imho.
>
>
Since, there isn't a unanimous opinion on this, I'll leave the FIXME
to save some rebasing time. Are you referring to my git comment?
Or to my previous reply about why I moved the FIXME?

-Emil
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/3] i965/fs: Make emit_uniformize a no-op for immediates

2015-10-22 Thread Kristian Høgsberg

On Tue, Oct 20, 2015 at 11:56 AM, Francisco Jerez  wrote:
> Kristian Høgsberg  writes:
>
>> On Tue, Oct 20, 2015 at 3:16 AM, Francisco Jerez  
>> wrote:
>>> Kristian Høgsberg  writes:
>>>
 On Mon, Oct 19, 2015 at 4:19 AM, Francisco Jerez  
 wrote:
> Neil Roberts  writes:
>
>> Just a thought, would it be better to move this check into the
>> eliminate_find_live_channel optimisation? That way it could catch
>> sources that become immediates through later optimisations. One problem
>> with this that I've seen before is that eliminating the
>> FIND_LIVE_CHANNEL doesn't cause the subsequent BROADCAST to be
>> eliminated because the copy propagation doesn't work.
>
> I believe in this particular case the BROADCAST instruction should
> already be eliminated (by opt_algebraic), because its first source is an
> immediate, so this optimization would in fact be redundant with
> opt_algebraic+constant propagation if it weren't because the latter is
> unable to handle scalar copies.

 Yes, is happens as you say: first the FIND_LIVE_CHANNEL gets
 eliminated because it's outside control flow
 (fs_visitor::eliminate_find_live_channel). Then the broadcast gets
 reduced to a MOV in opt_algebraic because src0 is uniform (immediate
 constant). The problem then is the dst of the MOV has stride 0 which
 can_propagate_from() then bails on.

> Another possibility that would likely
> be more general than this (because it would handle cases in which, as
> Neil said, the argument becomes an immediate only after optimization,
> and because it would also avoid the issue with liveness analysis
> extending the live ranges of scalar values incorrectly [1]) would be to
> extend the destination register of the BROADCAST instructions to a
> vector [2] and then add a case to the switch statement of
> try_constant_propagate() so that the immediate MOV resulting from
> opt_algebraic is propagated into surface instructions (see attachment).

 I wrote exactly this code to make constant propagate work, plus adding
 the extra opcodes to the switch statement. It works and I could
 certainly send that out after this, but

 1) This doesn't mean we shouldn't do the if (src.file == IMM)
 shortcut. It saves the compiler a bit of work in the very common case
 of
 non-indirect buffer access.

 2) I'm not even sure it makes sense to extend copy-propagation to do
 this (which is why I went back to just the IMM test). Anything that
 would be an immediate at this point should be an immediate, if not
 we're missing something in nir.

>>> Still this doesn't address the root of the problem, which is that
>>> emit_uniformize() emits scalar code that the rest of the compiler is not
>>> able to handle properly.
>>
>> This is not a case of papering over the root cause, it's about not
>> creating the root cause in the first place. The output of
>> emit_uniformize() always ends up as either a surface or a sampler
>> index, which we only look at later in the generator. There are no
>> other cases where the result of emit_uniformize() might be part of an
>> expression that we can copy propagate or otherwise optimize. If the
>> input to emit_uniformize() isn't an immediate where it could be, nir
>> optimization needs fixing. So if we add these two lines to
>> emit_uniformize() to pass immediates straight through, we avoid
>> generating code that we have to extend the copy prop pass to handle.
>>
>
> Kristian, there are legitimate uses of emit_uniformize() in which the
> argument is not an immediate but still can be optimized out later on --
> E.g. for images it will frequently be a uniform register, or a
> non-constant expression calculated within uniform control flow (in which
> case the FIND_LIVE_CHANNEL instruction emitted here will be reduced to a
> constant MOV by eliminate_find_live_chaso nnel()).  In such cases we still
> want copy-propagation to kick in, but it wont because of the problem I
> was talking about with scalar writes.  Even if the instructions emitted
> by emit_uniformize() cannot be optimized out, liveness analysis will
> overestimate the live ranges of the temporaries used by
> emit_uniformize() for the same reason, potentially causing the register
> allocator to spill or run out of registers.

Yeah, fair point. I went and took a look at non-constant surface index
and sent out a new series that make that work better as well as ssbo
write optimizations. I took out the imm shortcut and put in the
generic optimizations. It makes non-const surface index a little
better, but as I wrote in the cover letter, it still doesn't work that
well as we don't propagate the surface index into the send
instructions.

> Anyway don't take me wrong,

[Mesa-dev] [Bug 38869] Cannot create a valid pBuffer with height and/or width of zero

2015-10-22 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=38869

--- Comment #5 from Corentin Wallez  ---
Bumping this as I've run into this bug (feature?) while porting ANGLE to run on
GLX. When ANGLE is initialized we make the context current on a dummy PBuffer
created with {None} as attributes, which makes it a (0, 0)-sized Pbuffer.

Doing so on Mesa 10.5.9 on a HD Graphics 5500 causes the following to be
thrown:
  X Error of failed request:  BadValue (integer parameter out of range for
operation)
Major opcode of failed request:  53 (X_CreatePixmap)

It seems that the driver might be trying to back the PBuffer using a Pixmap of
the same size. The documentation of XCreatePixmap says that the width and
height should be non-zero or BadValue is generated.

Let me know if you need more information.

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 4/4] gallivm: fix tex offsets with mirror repeat linear

2015-10-22 Thread sroland

From: Roland Scheidegger 

Can't see why anyone would ever want to use this, but it was clearly broken.
This fixes the piglit texwrap offset test using this combination.
---
 src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c 
b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
index 125505e..26bfa0d 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
@@ -405,16 +405,17 @@ lp_build_sample_wrap_linear(struct 
lp_build_sample_context *bld,
   break;
 
case PIPE_TEX_WRAP_MIRROR_REPEAT:
+  if (offset) {
+ offset = lp_build_int_to_float(coord_bld, offset);
+ offset = lp_build_div(coord_bld, offset, length_f);
+ coord = lp_build_add(coord_bld, coord, offset);
+  }
   /* compute mirror function */
   coord = lp_build_coord_mirror(bld, coord);
 
   /* scale coord to length */
   coord = lp_build_mul(coord_bld, coord, length_f);
   coord = lp_build_sub(coord_bld, coord, half);
-  if (offset) {
- offset = lp_build_int_to_float(coord_bld, offset);
- coord = lp_build_add(coord_bld, coord, offset);
-  }
 
   /* convert to int, compute lerp weight */
   lp_build_ifloor_fract(coord_bld, coord, , );
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/4] softpipe: fix using non-zero layer in non-array view from array resource

2015-10-22 Thread sroland

From: Roland Scheidegger 

For vertex/geometry shader sampling, this is the same as for llvmpipe - just
use the original resource target.
For fragment shader sampling though (which does not use first-layer based mip
offsets) adjust the sampling code to use first_layer in the non-array cases.
While here also fix up some code which looked wrong wrt buffer texel fetch
(no piglit change).
---
 src/gallium/drivers/softpipe/sp_state_sampler.c |  8 +++
 src/gallium/drivers/softpipe/sp_tex_sample.c| 32 +
 2 files changed, 31 insertions(+), 9 deletions(-)

diff --git a/src/gallium/drivers/softpipe/sp_state_sampler.c 
b/src/gallium/drivers/softpipe/sp_state_sampler.c
index d7a3360..23ec4ef 100644
--- a/src/gallium/drivers/softpipe/sp_state_sampler.c
+++ b/src/gallium/drivers/softpipe/sp_state_sampler.c
@@ -214,10 +214,10 @@ prepare_shader_sampling(
   row_stride[j] = sp_tex->stride[j];
   img_stride[j] = sp_tex->img_stride[j];
}
-   if (view->target == PIPE_TEXTURE_1D_ARRAY ||
-   view->target == PIPE_TEXTURE_2D_ARRAY ||
-   view->target == PIPE_TEXTURE_CUBE ||
-   view->target == PIPE_TEXTURE_CUBE_ARRAY) {
+   if (tex->target == PIPE_TEXTURE_1D_ARRAY ||
+   tex->target == PIPE_TEXTURE_2D_ARRAY ||
+   tex->target == PIPE_TEXTURE_CUBE ||
+   tex->target == PIPE_TEXTURE_CUBE_ARRAY) {
   num_layers = view->u.tex.last_layer - 
view->u.tex.first_layer + 1;
   for (j = first_level; j <= last_level; j++) {
  mip_offsets[j] += view->u.tex.first_layer *
diff --git a/src/gallium/drivers/softpipe/sp_tex_sample.c 
b/src/gallium/drivers/softpipe/sp_tex_sample.c
index 8a09350..e3e28a3 100644
--- a/src/gallium/drivers/softpipe/sp_tex_sample.c
+++ b/src/gallium/drivers/softpipe/sp_tex_sample.c
@@ -1033,6 +1033,7 @@ img_filter_2d_linear_repeat_POT(const struct 
sp_sampler_view *sp_sview,
   
addr.value = 0;
addr.bits.level = args->level;
+   addr.bits.z = sp_sview->base.u.tex.first_layer;
 
/* Can we fetch all four at once:
 */
@@ -1081,6 +1082,7 @@ img_filter_2d_nearest_repeat_POT(const struct 
sp_sampler_view *sp_sview,
 
addr.value = 0;
addr.bits.level = args->level;
+   addr.bits.z = sp_sview->base.u.tex.first_layer;
 
out = get_texel_2d_no_border(sp_sview, addr, x0, y0);
for (c = 0; c < TGSI_QUAD_SIZE; c++)
@@ -,6 +1113,7 @@ img_filter_2d_nearest_clamp_POT(const struct 
sp_sampler_view *sp_sview,
 
addr.value = 0;
addr.bits.level = args->level;
+   addr.bits.z = sp_sview->base.u.tex.first_layer;
 
x0 = util_ifloor(u);
if (x0 < 0) 
@@ -1154,7 +1157,8 @@ img_filter_1d_nearest(const struct sp_sampler_view 
*sp_sview,
 
sp_samp->nearest_texcoord_s(args->s, width, args->offset[0], );
 
-   out = get_texel_2d(sp_sview, sp_samp, addr, x, 0);
+   out = get_texel_1d_array(sp_sview, sp_samp, addr, x,
+sp_sview->base.u.tex.first_layer);
for (c = 0; c < TGSI_QUAD_SIZE; c++)
   rgba[TGSI_NUM_CHANNELS*c] = out[c];
 
@@ -1215,6 +1219,7 @@ img_filter_2d_nearest(const struct sp_sampler_view 
*sp_sview,
  
addr.value = 0;
addr.bits.level = args->level;
+   addr.bits.z = sp_sview->base.u.tex.first_layer;
 
sp_samp->nearest_texcoord_s(args->s, width, args->offset[0], );
sp_samp->nearest_texcoord_t(args->t, height, args->offset[1], );
@@ -1396,8 +1401,10 @@ img_filter_1d_linear(const struct sp_sampler_view 
*sp_sview,
 
sp_samp->linear_texcoord_s(args->s, width, args->offset[0], , , );
 
-   tx0 = get_texel_2d(sp_sview, sp_samp, addr, x0, 0);
-   tx1 = get_texel_2d(sp_sview, sp_samp, addr, x1, 0);
+   tx0 = get_texel_1d_array(sp_sview, sp_samp, addr, x0,
+sp_sview->base.u.tex.first_layer);
+   tx1 = get_texel_1d_array(sp_sview, sp_samp, addr, x1,
+sp_sview->base.u.tex.first_layer);
 
/* interpolate R, G, B, A */
for (c = 0; c < TGSI_QUAD_SIZE; c++)
@@ -1523,6 +1530,7 @@ img_filter_2d_linear(const struct sp_sampler_view 
*sp_sview,
 
addr.value = 0;
addr.bits.level = args->level;
+   addr.bits.z = sp_sview->base.u.tex.first_layer;
 
sp_samp->linear_texcoord_s(args->s, width,  args->offset[0], , , );
sp_samp->linear_texcoord_t(args->t, height, args->offset[1], , , );
@@ -3252,10 +3260,22 @@ sp_get_texels(const struct sp_sampler_view *sp_sview,
 
switch (sp_sview->base.target) {
case PIPE_BUFFER:
+  for (j = 0; j < TGSI_QUAD_SIZE; j++) {
+ const int x = CLAMP(v_i[j] + offset[0] +
+ sp_sview->base.u.buf.first_element,
+ sp_sview->base.u.buf.first_element,
+ sp_sview->base.u.buf.last_element);
+ tx = get_texel_2d_no_border(sp_sview, addr, x, 0);
+ for (c = 0; c < 4; c++) {
+

[Mesa-dev] [PATCH 1/4] llvmpipe: fix using non-zero layer in non-array view from array resource

2015-10-22 Thread sroland

From: Roland Scheidegger 

Just need to use resource target not view target when calculating
first-layer based mip offsets. (This is a gl specific problem since
d3d10 does not distinguish between non-array and array resources neither
at the resource nor view level, only at the shader level.)
Fixes new piglit arb_texture_view sampling-2d-array-as-2d-layer test.
---
 src/gallium/drivers/llvmpipe/lp_setup.c | 8 
 src/gallium/drivers/llvmpipe/lp_state_sampler.c | 8 
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/src/gallium/drivers/llvmpipe/lp_setup.c 
b/src/gallium/drivers/llvmpipe/lp_setup.c
index 4c8167a..1778b13 100644
--- a/src/gallium/drivers/llvmpipe/lp_setup.c
+++ b/src/gallium/drivers/llvmpipe/lp_setup.c
@@ -854,10 +854,10 @@ lp_setup_set_fragment_sampler_views(struct 
lp_setup_context *setup,
  jit_tex->img_stride[j] = lp_tex->img_stride[j];
   }
 
-  if (view->target == PIPE_TEXTURE_1D_ARRAY ||
-  view->target == PIPE_TEXTURE_2D_ARRAY ||
-  view->target == PIPE_TEXTURE_CUBE ||
-  view->target == PIPE_TEXTURE_CUBE_ARRAY) {
+  if (res->target == PIPE_TEXTURE_1D_ARRAY ||
+  res->target == PIPE_TEXTURE_2D_ARRAY ||
+  res->target == PIPE_TEXTURE_CUBE ||
+  res->target == PIPE_TEXTURE_CUBE_ARRAY) {
  /*
   * For array textures, we don't have first_layer, instead
   * adjust last_layer (stored as depth) plus the mip level 
offsets
diff --git a/src/gallium/drivers/llvmpipe/lp_state_sampler.c 
b/src/gallium/drivers/llvmpipe/lp_state_sampler.c
index b205f02..1e05587 100644
--- a/src/gallium/drivers/llvmpipe/lp_state_sampler.c
+++ b/src/gallium/drivers/llvmpipe/lp_state_sampler.c
@@ -275,10 +275,10 @@ prepare_shader_sampling(
   row_stride[j] = lp_tex->row_stride[j];
   img_stride[j] = lp_tex->img_stride[j];
}
-   if (view->target == PIPE_TEXTURE_1D_ARRAY ||
-   view->target == PIPE_TEXTURE_2D_ARRAY ||
-   view->target == PIPE_TEXTURE_CUBE ||
-   view->target == PIPE_TEXTURE_CUBE_ARRAY) {
+   if (tex->target == PIPE_TEXTURE_1D_ARRAY ||
+   tex->target == PIPE_TEXTURE_2D_ARRAY ||
+   tex->target == PIPE_TEXTURE_CUBE ||
+   tex->target == PIPE_TEXTURE_CUBE_ARRAY) {
   num_layers = view->u.tex.last_layer - 
view->u.tex.first_layer + 1;
   for (j = first_level; j <= last_level; j++) {
  mip_offsets[j] += view->u.tex.first_layer *
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] Introducing OpenSWR: High performance software rasterizer

2015-10-22 Thread Jose Fonseca


On 22/10/15 00:43, Rowley, Timothy O wrote:



On Oct 20, 2015, at 5:58 PM, Jose Fonseca  wrote:

Thanks for the explanations.  It's closer now, but still a bit of gap:

$ KNOB_MAX_THREADS_PER_CORE=0 ./gloss
SWR create screen!
This processor supports AVX2.
--> numThreads = 3
1102 frames in 5.002 seconds = 220.312 FPS
1133 frames in 5.001 seconds = 226.555 FPS
1130 frames in 5.002 seconds = 225.91 FPS
^C
$ GALLIUM_DRIVER=llvmpipe LP_NUM_THREADS=2 ./gloss
1456 frames in 5 seconds = 291.2 FPS
1617 frames in 5.003 seconds = 323.206 FPS
1571 frames in 5.002 seconds = 314.074 FPS


A bit more of an apples to apples comparison might be single-threaded llvmpipe 
(LP_NUM_THREADS=1) and single-threaded swr (KNOB_SINGLE_THREADED=1).  Running 
gloss and glxgears (another favorite “benchmark” :) ) under these conditions 
show swr running a bit slower, though a little closer than your numbers.



Indeed that seems a better comparison.

$ KNOB_SINGLE_THREADED=1 ./gloss
SWR create screen!
This processor supports AVX2.
733 frames in 5.003 seconds = 146.512 FPS
787 frames in 5.004 seconds = 157.274 FPS
793 frames in 5.005 seconds = 158.442 FPS
799 frames in 5.001 seconds = 159.768 FPS
787 frames in 5.005 seconds = 157.243 FPS
$ GALLIUM_DRIVER=llvmpipe LP_NUM_THREADS=0 ./gloss
939 frames in 5.002 seconds = 187.725 FPS
1032 frames in 5.001 seconds = 206.359 FPS
1017 frames in 5.002 seconds = 203.319 FPS
1021 frames in 5 seconds = 204.2 FPS
1039 frames in 5.002 seconds = 207.717 FPS

> Examining performance traces, we think swr’s concept of hot-tiles, 
the working memory representation of the render target, and the 
associated load/store functions contribute to most of the difference. 
We might be able to optimize those conversions; additionally fast clear 
would help these demos.  For larger workloads this small per-frame cost 
doesn’t really affect the performance.



These initial observations from you and others regarding performance have been 
interesting.  Our performance work has been with large workloads on high core 
count configurations, where while some of the decisions such as a dedicated 
core for the application/API might have cost performance a bit, the percentage 
is much less than on the dual and quad core processors.  We’ll look into some 
changes/tuning that will benefit both extremes, though we might have to end up 
conceding that llvmpipe will be faster at glxgears. :-)


I don't care for gears -- it practically measure present/blit rate --, 
but gloss spite simple is sensitive to texturing performance.



Final thoughts: I understand this project has its own history, but I echo what 
Roland said -- it would be nice to unify with llvmpipe at one point, in some 
way or fashion.  Our (VMware's) focus has been desktop composition, but there's 
no reason why a single SW renderer can't satisfy both ends of the spectrum, 
especially for JIT enable renderers, since they can emit at runtime the code 
most suited for the workload.


We would be happy for someone to take some of the ideas from swr to speed up 
llvmpipe, but for now our development will continue on the swr core and driver. 
 We’re not planning on replacing llvmpipe - its intent of working on any 
architecture is admirable.  In the ideal world the solution would be something 
that combines the best traits of both rasterizers, but at this point the 
shortest path to having a performant solution for our customers is with swr.


Fair enough.

They do share a lot already, Mesa, gallium statetracker, and gallivm. 
If further development in openswr is planned, it might require to jump 
through a few hoops, but I think it's worth to figure out what would 
take to get this merged into master so that, whenever there are 
interface changes, openswer won't get the short stick.



That said, it's really nice seeing Mesa and Gallium enabling this sort of 
experiments with SW rendering.


Yes, we were quite happy with how fast we were able to get a new driver 
functioning with gallium.  The major thing slowing us was the documentation, 
which is not uniform in coverage.  There was a lot of reading other drivers’ 
source to figure out how things were supposed to work.


Yes, that's a fair comment.

Jose
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 5/8] i965/fs: Avoid scalar destinations in emit_uniformize()

2015-10-22 Thread Kristian Høgsberg

On Thu, Oct 22, 2015 at 12:24 PM, Matt Turner  wrote:
> On Thu, Oct 22, 2015 at 11:37 AM, Kristian Høgsberg Kristensen
>  wrote:
>> The scalar destination registers break copy propagation. Instead compute
>> the results to a regular register and then reference a component when we
>> later use the result as a source.
>
> It might be hairy to get it working, but making copy propagation
> handle scalar destinations is the "right" way forward.
>
> If we commit this patch, and later implement support for copy
> propagating with scalar destinations, how will we remember to change
> this back?

What's the harm in leaving it in?

Kristian
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] st/mesa: don't load state parameters if there are none

2015-10-22 Thread Brian Paul


On 10/22/2015 05:02 PM, Marek Olšák wrote:

From: Marek Olšák 

Out of 7063 shaders from my shader-db:
- 6564 (93%) shaders don't have any state parameters.
- 347 (5%) shaders have 1 state parameter for WPOS lowering.
- The remaining 2% have more state parameters, usually matrices.
---
  src/mesa/state_tracker/st_atom_constbuf.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/mesa/state_tracker/st_atom_constbuf.c 
b/src/mesa/state_tracker/st_atom_constbuf.c
index acaa85d..20f8b3d 100644
--- a/src/mesa/state_tracker/st_atom_constbuf.c
+++ b/src/mesa/state_tracker/st_atom_constbuf.c
@@ -73,7 +73,8 @@ void st_upload_constants( struct st_context *st,
 * the parameters list are explicitly set by the user with glUniform,
 * glProgramParameter(), etc.
 */
-  _mesa_load_state_parameters(st->ctx, params);
+  if (params->StateFlags)
+ _mesa_load_state_parameters(st->ctx, params);

/* We always need to get a new buffer, to keep the drivers simple and
 * avoid gratuitous rendering synchronization.



Reviewed-by: Brian Paul 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/2] util/format: add helper util_format_is_snorm8

2015-10-22 Thread Marek Olšák

From: Marek Olšák 

---
 src/gallium/auxiliary/util/u_format.c | 19 +++
 src/gallium/auxiliary/util/u_format.h |  3 +++
 2 files changed, 22 insertions(+)

diff --git a/src/gallium/auxiliary/util/u_format.c 
b/src/gallium/auxiliary/util/u_format.c
index c1ce408..79630bf 100644
--- a/src/gallium/auxiliary/util/u_format.c
+++ b/src/gallium/auxiliary/util/u_format.c
@@ -170,6 +170,25 @@ util_format_is_snorm(enum pipe_format format)
 }
 
 boolean
+util_format_is_snorm8(enum pipe_format format)
+{
+   const struct util_format_description *desc = 
util_format_description(format);
+   int i;
+
+   if (desc->is_mixed)
+  return FALSE;
+
+   i = util_format_get_first_non_void_channel(format);
+   if (i == -1)
+  return FALSE;
+
+   return desc->channel[i].type == UTIL_FORMAT_TYPE_SIGNED &&
+  !desc->channel[i].pure_integer &&
+  desc->channel[i].normalized &&
+  desc->channel[i].size == 8;
+}
+
+boolean
 util_format_is_luminance_alpha(enum pipe_format format)
 {
const struct util_format_description *desc =
diff --git a/src/gallium/auxiliary/util/u_format.h 
b/src/gallium/auxiliary/util/u_format.h
index 42b39ff..a1b1b28 100644
--- a/src/gallium/auxiliary/util/u_format.h
+++ b/src/gallium/auxiliary/util/u_format.h
@@ -686,6 +686,9 @@ util_format_is_pure_uint(enum pipe_format format);
 boolean
 util_format_is_snorm(enum pipe_format format);
 
+boolean
+util_format_is_snorm8(enum pipe_format format);
+
 /**
  * Check if the src format can be blitted to the destination format with
  * a simple memcpy.  For example, blitting from RGBA to RGBx is OK, but not
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/2] radeonsi: only apply the SNORM blit workaround to *8_SNORM

2015-10-22 Thread Marek Olšák

From: Marek Olšák 

Like the comment says. This fixes DCC, which doesn't like blitting RG16
as RGBA8.
---
 src/gallium/drivers/radeonsi/si_blit.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_blit.c 
b/src/gallium/drivers/radeonsi/si_blit.c
index 082ea85..aa526ee 100644
--- a/src/gallium/drivers/radeonsi/si_blit.c
+++ b/src/gallium/drivers/radeonsi/si_blit.c
@@ -536,7 +536,7 @@ void si_resource_copy_region(struct pipe_context *ctx,
src_force_level = src_level;
} else if (!util_blitter_is_copy_supported(sctx->blitter, dst, src) ||
   /* also *8_SNORM has precision issues, use UNORM instead */
-  util_format_is_snorm(src->format)) {
+  util_format_is_snorm8(src->format)) {
if (util_format_is_subsampled_422(src->format)) {
src_templ.format = PIPE_FORMAT_R8G8B8A8_UINT;
dst_templ.format = PIPE_FORMAT_R8G8B8A8_UINT;
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [RFC 04/21] mesa/extensions: Move entries entries to seperate file

2015-10-22 Thread Nanley Chery

On Thu, Oct 22, 2015 at 3:13 AM, Emil Velikov 
wrote:

> On 19 October 2015 at 23:36, Nanley Chery  wrote:
> > From: Nanley Chery 
> >
> > With this infrastructure set in place, we can now reuse the entries to
> > generate useful code.
> >
> > Signed-off-by: Nanley Chery 
> > ---
> >  src/mesa/main/extensions.c   | 321
> +--
> >  src/mesa/main/extensions_table.h | 320
> ++
> >  2 files changed, 321 insertions(+), 320 deletions(-)
> >  create mode 100644 src/mesa/main/extensions_table.h
> >
> As a general rule of thumb if you add any file new in mesa, it should
> be referenced in a Makefile.* somewhere. Please squash the following
>
> diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources
> index ed9848c..f86590b 100644
> --- a/src/mesa/Makefile.sources
> +++ b/src/mesa/Makefile.sources
> @@ -77,6 +77,7 @@ MAIN_FILES = \
>main/execmem.c \
>main/extensions.c \
>main/extensions.h \
> +   main/extensions_table.h \
>main/fbobject.c \
>main/fbobject.h \
>main/feedback.c \
>
>
Will do. Thanks for the tip.


>
> -Emil
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] st/mesa: fix mipmap generation for immutable textures with incomplete pyramids

2015-10-22 Thread Nicolai Hähnle

(This is an alternative to my previous patch, "mesa: clamp MaxLevel for
immutable textures at initialization"; this patch has no opinion about
how the spec should be interpreted.)

Without the clamping by NumLevels, the state tracker would reallocate the
texture storage (incorrect) and even fail to copy the base level image
after reallocation, leading to the graphical glitch of
https://bugs.freedesktop.org/show_bug.cgi?id=91993 .

A piglit test has been submitted for review as well (subtest of
arb_texture_storage-texture-storage).
---
 src/mesa/state_tracker/st_gen_mipmap.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/mesa/state_tracker/st_gen_mipmap.c 
b/src/mesa/state_tracker/st_gen_mipmap.c
index 26e1c21..3125b2a 100644
--- a/src/mesa/state_tracker/st_gen_mipmap.c
+++ b/src/mesa/state_tracker/st_gen_mipmap.c
@@ -61,6 +61,8 @@ compute_num_levels(struct gl_context *ctx,
 
numLevels = texObj->BaseLevel + baseImage->MaxNumLevels;
numLevels = MIN2(numLevels, (GLuint) texObj->MaxLevel + 1);
+   if (texObj->Immutable)
+  numLevels = MIN2(numLevels, texObj->NumLevels);
assert(numLevels >= 1);
 
return numLevels;
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [RFC 20/21] mesa: Remove gl_extensions::dummy

2015-10-22 Thread Nanley Chery

On Thu, Oct 22, 2015 at 3:40 AM, Emil Velikov 
wrote:

> On 19 October 2015 at 23:44, Nanley Chery  wrote:
> > From: Nanley Chery 
> >
> > This variable existed to provide an unsigned error value for
> > name_to_offset(). Since o(extension_sentinel) is also a valid unsigned
> > error value, save space and replace this mysterious variable with
> > a less mysterious one (extension_sentinel).
> >
> That's the first time I see someone calling 0, a "unsigned error
> value". Perhaps reword the commit message a bit ? The comment/function
> description should be updated as well.
>
>
How's the following?

This variable existed to enable 0 as a valid error value for
name_to_offset(). Since o(extension_sentinel) is also a valid
error value, save space and replace this mysterious variable
with a less mysterious one (extension_sentinel).


> -Emil
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 92625] [BYT] piglit glx_arb_sync_control@timing subtests fail

2015-10-22 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=92625

Bug ID: 92625
   Summary: [BYT] piglit glx_arb_sync_control@timing subtests fail
   Product: Mesa
   Version: unspecified
  Hardware: Other
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: GLX
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: jairo.daniel.miramontes.ca...@intel.com
QA Contact: mesa-dev@lists.freedesktop.org

Created attachment 119123
  --> https://bugs.freedesktop.org/attachment.cgi?id=119123=edit
Dmesg.log

The following Subtests of the glx_arb_sync_control@timing test case are failing
on the below configuration.

<%glx@glx_arb_sync_control@timing -divisor 1
glx@glx_arb_sync_control@timing -divisor 2
glx@glx_arb_sync_control@timing -fullscreen -divisor 1
glx@glx_arb_sync_control@timing -msc-delta 1
glx@glx_arb_sync_control@timing -msc-delta 2
glx@glx_arb_sync_control@timing -waitformsc -divisor 1
glx@glx_arb_sync_control@timing -waitformsc -msc-delta 1%>


==Hardware configuration==
--

-- Platform: --
Processor: Intel(R) Celeron(R) CPU  N2820  @ 2.13GHz
Model: Toshiba Satellite C55t
Memory : 4GB

-- Software --
Linux distribution: Ubuntu 14.04.03 LTS 64Bits
BIOS: 1.10


==Test Environment==
--
kernel: 4.3.0-rc5-drm-intel-nightly
xorg-server-1.17.2
libdrm-2.4.65
xf86-video-intel-2.99.917
mesa-11.1.0-devel (git 6f39546)
libva-1.6.1
intel-driver-1.6.1
cairo-1.14.2

Additional information :
dmesg output (option "drm.debug=0x06")

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/7] gallium/util: add a test for NULL fragment shaders

2015-10-22 Thread Marek Olšák

From: Marek Olšák 

Just to validate that radeonsi doesn't crash.
---
 src/gallium/auxiliary/util/u_tests.c | 38 
 1 file changed, 38 insertions(+)

diff --git a/src/gallium/auxiliary/util/u_tests.c 
b/src/gallium/auxiliary/util/u_tests.c
index a94e5cc..006dfa9 100644
--- a/src/gallium/auxiliary/util/u_tests.c
+++ b/src/gallium/auxiliary/util/u_tests.c
@@ -450,6 +450,43 @@ null_constant_buffer(struct pipe_context *ctx)
util_report_result(pass);
 }
 
+static void
+null_fragment_shader(struct pipe_context *ctx)
+{
+   struct cso_context *cso;
+   struct pipe_resource *cb;
+   void *vs;
+   struct pipe_rasterizer_state rs = {0};
+   struct pipe_query *query;
+   union pipe_query_result qresult;
+
+   cso = cso_create_context(ctx);
+   cb = util_create_texture2d(ctx->screen, 256, 256,
+  PIPE_FORMAT_R8G8B8A8_UNORM);
+   util_set_common_states_and_clear(cso, ctx, cb);
+
+   /* No rasterization. */
+   rs.rasterizer_discard = 1;
+   cso_set_rasterizer(cso, );
+
+   vs = util_set_passthrough_vertex_shader(cso, ctx, false);
+
+   query = ctx->create_query(ctx, PIPE_QUERY_PRIMITIVES_GENERATED, 0);
+   ctx->begin_query(ctx, query);
+   util_draw_fullscreen_quad(cso);
+   ctx->end_query(ctx, query);
+   ctx->get_query_result(ctx, query, true, );
+
+   /* Cleanup. */
+   cso_destroy_context(cso);
+   ctx->delete_vs_state(ctx, vs);
+   ctx->destroy_query(ctx, query);
+   pipe_resource_reference(, NULL);
+
+   /* Check PRIMITIVES_GENERATED. */
+   util_report_result(qresult.u64 == 2);
+}
+
 /**
  * Run all tests. This should be run with a clean context after
  * context_create.
@@ -459,6 +496,7 @@ util_run_tests(struct pipe_screen *screen)
 {
struct pipe_context *ctx = screen->context_create(screen, NULL, 0);
 
+   null_fragment_shader(ctx);
tgsi_vs_window_space_position(ctx);
null_sampler_view(ctx, TGSI_TEXTURE_2D);
null_sampler_view(ctx, TGSI_TEXTURE_BUFFER);
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 7/7] radeonsi: don't unbind shaders in delete_shader_selector

2015-10-22 Thread Marek Olšák

From: Marek Olšák 

It's moot with shaders that any context can use and release.
---
 src/gallium/drivers/radeonsi/si_shader.c|  3 ++
 src/gallium/drivers/radeonsi/si_state_shaders.c | 40 -
 2 files changed, 3 insertions(+), 40 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index a119cbd..fe1ca67 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -4253,6 +4253,9 @@ void si_shader_destroy(struct si_shader *shader)
FREE(shader->gs_copy_shader);
}
 
+if (shader->pm4)
+   si_pm4_free_state_simple(shader->pm4);
+
if (shader->scratch_bo)
r600_resource_reference(>scratch_bo, NULL);
 
diff --git a/src/gallium/drivers/radeonsi/si_state_shaders.c 
b/src/gallium/drivers/radeonsi/si_state_shaders.c
index 478b851..3e44451 100644
--- a/src/gallium/drivers/radeonsi/si_state_shaders.c
+++ b/src/gallium/drivers/radeonsi/si_state_shaders.c
@@ -914,51 +914,11 @@ static void si_bind_ps_shader(struct pipe_context *ctx, 
void *state)
 
 static void si_delete_shader_selector(struct pipe_context *ctx, void *state)
 {
-   struct si_context *sctx = (struct si_context *)ctx;
struct si_shader_selector *sel = (struct si_shader_selector *)state;
struct si_shader *p = sel->first_variant, *c;
-   struct si_shader_ctx_state *current_shader[SI_NUM_SHADERS] = {
-   [PIPE_SHADER_VERTEX] = >vs_shader,
-   [PIPE_SHADER_TESS_CTRL] = >tcs_shader,
-   [PIPE_SHADER_TESS_EVAL] = >tes_shader,
-   [PIPE_SHADER_GEOMETRY] = >gs_shader,
-   [PIPE_SHADER_FRAGMENT] = >ps_shader,
-   };
-
-   if (current_shader[sel->type]->cso == sel) {
-   current_shader[sel->type]->cso = NULL;
-   current_shader[sel->type]->current = NULL;
-   }
 
while (p) {
c = p->next_variant;
-   switch (sel->type) {
-   case PIPE_SHADER_VERTEX:
-   if (p->key.vs.as_ls)
-   si_pm4_delete_state(sctx, ls, p->pm4);
-   else if (p->key.vs.as_es)
-   si_pm4_delete_state(sctx, es, p->pm4);
-   else
-   si_pm4_delete_state(sctx, vs, p->pm4);
-   break;
-   case PIPE_SHADER_TESS_CTRL:
-   si_pm4_delete_state(sctx, hs, p->pm4);
-   break;
-   case PIPE_SHADER_TESS_EVAL:
-   if (p->key.tes.as_es)
-   si_pm4_delete_state(sctx, es, p->pm4);
-   else
-   si_pm4_delete_state(sctx, vs, p->pm4);
-   break;
-   case PIPE_SHADER_GEOMETRY:
-   si_pm4_delete_state(sctx, gs, p->pm4);
-   si_pm4_delete_state(sctx, vs, p->gs_copy_shader->pm4);
-   break;
-   case PIPE_SHADER_FRAGMENT:
-   si_pm4_delete_state(sctx, ps, p->pm4);
-   break;
-   }
-
si_shader_destroy(p);
free(p);
p = c;
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 4/7] radeonsi: allow unbinding pixel shaders and remove the dummy shader

2015-10-22 Thread Marek Olšák

From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_pipe.c  |  2 --
 src/gallium/drivers/radeonsi/si_pipe.h  |  3 ---
 src/gallium/drivers/radeonsi/si_state_shaders.c | 18 +-
 3 files changed, 1 insertion(+), 22 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
b/src/gallium/drivers/radeonsi/si_pipe.c
index 5f910c9..e211e92 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -55,8 +55,6 @@ static void si_destroy_context(struct pipe_context *context)
 
if (sctx->pstipple_sampler_state)
sctx->b.b.delete_sampler_state(>b.b, 
sctx->pstipple_sampler_state);
-   if (sctx->dummy_pixel_shader)
-   sctx->b.b.delete_fs_state(>b.b, sctx->dummy_pixel_shader);
if (sctx->fixed_func_tcs_shader.cso)
sctx->b.b.delete_tcs_state(>b.b, 
sctx->fixed_func_tcs_shader.cso);
if (sctx->custom_dsa_flush)
diff --git a/src/gallium/drivers/radeonsi/si_pipe.h 
b/src/gallium/drivers/radeonsi/si_pipe.h
index d7a2282..6305724 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -203,9 +203,6 @@ struct si_context {
struct si_pm4_state *init_config;
boolinit_config_has_vgt_flush;
struct si_pm4_state *vgt_shader_config[4];
-   /* With rasterizer discard, there doesn't have to be a pixel shader.
-* In that case, we bind this one: */
-   void*dummy_pixel_shader;
 
/* shaders */
struct si_shader_ctx_state  ps_shader;
diff --git a/src/gallium/drivers/radeonsi/si_state_shaders.c 
b/src/gallium/drivers/radeonsi/si_state_shaders.c
index ce8d1cc..25cba84 100644
--- a/src/gallium/drivers/radeonsi/si_state_shaders.c
+++ b/src/gallium/drivers/radeonsi/si_state_shaders.c
@@ -864,16 +864,6 @@ static void si_bind_tes_shader(struct pipe_context *ctx, 
void *state)
si_update_viewports_and_scissors(sctx);
 }
 
-static void si_make_dummy_ps(struct si_context *sctx)
-{
-   if (!sctx->dummy_pixel_shader) {
-   sctx->dummy_pixel_shader =
-   util_make_fragment_cloneinput_shader(>b.b, 0,
-
TGSI_SEMANTIC_GENERIC,
-
TGSI_INTERPOLATE_CONSTANT);
-   }
-}
-
 static void si_bind_ps_shader(struct pipe_context *ctx, void *state)
 {
struct si_context *sctx = (struct si_context *)ctx;
@@ -883,14 +873,8 @@ static void si_bind_ps_shader(struct pipe_context *ctx, 
void *state)
if (sctx->ps_shader.cso == sel)
return;
 
-   /* use a dummy shader if binding a NULL shader */
-   if (!sel) {
-   si_make_dummy_ps(sctx);
-   sel = sctx->dummy_pixel_shader;
-   }
-
sctx->ps_shader.cso = sel;
-   sctx->ps_shader.current = sel->first_variant;
+   sctx->ps_shader.current = sel ? sel->first_variant : NULL;
si_mark_atom_dirty(sctx, >cb_target_mask);
 }
 
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 5/7] radeonsi: allow unbinding vertex shaders

2015-10-22 Thread Marek Olšák

From: Marek Olšák 

Draw calls without a vertex shader are skipped.
---
 src/gallium/drivers/radeonsi/si_state_shaders.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state_shaders.c 
b/src/gallium/drivers/radeonsi/si_state_shaders.c
index 25cba84..4a3a04c 100644
--- a/src/gallium/drivers/radeonsi/si_state_shaders.c
+++ b/src/gallium/drivers/radeonsi/si_state_shaders.c
@@ -799,11 +799,11 @@ static void si_bind_vs_shader(struct pipe_context *ctx, 
void *state)
struct si_context *sctx = (struct si_context *)ctx;
struct si_shader_selector *sel = state;
 
-   if (sctx->vs_shader.cso == sel || !sel)
+   if (sctx->vs_shader.cso == sel)
return;
 
sctx->vs_shader.cso = sel;
-   sctx->vs_shader.current = sel->first_variant;
+   sctx->vs_shader.current = sel ? sel->first_variant : NULL;
si_mark_atom_dirty(sctx, >clip_regs);
si_update_viewports_and_scissors(sctx);
 }
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 0/7] RadeonSI: Unbind shaders properly

2015-10-22 Thread Marek Olšák

This series unbinds shaders properly in bind_(vs/fs/etc.)_state, because any 
context can release them now, therefore we can't rely on unbinding in 
delete_shader. This is basically a fix for sharable shader.

It also gets rid of the dummy pixel shader. Instead, the driver just skips draw 
calls that have rasterizer_discard==0 and ps==NULL set at the same time. 
rasterizer_discard==1 and ps==NULL works as expected.

Please review.

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/7] radeonsi: add checks for a NULL pixel shader

2015-10-22 Thread Marek Olšák

From: Marek Olšák 

This will allow removing the dummy PS.
---
 src/gallium/drivers/radeonsi/si_state.c |  1 +
 src/gallium/drivers/radeonsi/si_state_shaders.c | 73 ++---
 2 files changed, 42 insertions(+), 32 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index 243bdc6..8cd3cd2 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -266,6 +266,7 @@ static void si_emit_cb_target_mask(struct si_context *sctx, 
struct r600_atom *at
 * Reproducible with Unigine Heaven 4.0 and drirc missing.
 */
if (blend->dual_src_blend &&
+   sctx->ps_shader.cso &&
(sctx->ps_shader.cso->ps_colors_written & 0x3) != 0x3)
mask = 0;
 
diff --git a/src/gallium/drivers/radeonsi/si_state_shaders.c 
b/src/gallium/drivers/radeonsi/si_state_shaders.c
index eea00e0..ce8d1cc 100644
--- a/src/gallium/drivers/radeonsi/si_state_shaders.c
+++ b/src/gallium/drivers/radeonsi/si_state_shaders.c
@@ -956,13 +956,15 @@ static void si_emit_spi_map(struct si_context *sctx, 
struct r600_atom *atom)
struct radeon_winsys_cs *cs = sctx->b.rings.gfx.cs;
struct si_shader *ps = sctx->ps_shader.current;
struct si_shader *vs = si_get_vs_state(sctx);
-   struct tgsi_shader_info *psinfo = >selector->info;
+   struct tgsi_shader_info *psinfo;
struct tgsi_shader_info *vsinfo = >selector->info;
unsigned i, j, tmp, num_written = 0;
 
-   if (!ps->nparam)
+   if (!ps || !ps->nparam)
return;
 
+   psinfo = >selector->info;
+
radeon_set_context_reg_seq(cs, R_028644_SPI_PS_INPUT_CNTL_0, 
ps->nparam);
 
for (i = 0; i < psinfo->num_inputs; i++) {
@@ -1025,7 +1027,12 @@ static void si_emit_spi_ps_input(struct si_context 
*sctx, struct r600_atom *atom
 {
struct radeon_winsys_cs *cs = sctx->b.rings.gfx.cs;
struct si_shader *ps = sctx->ps_shader.current;
-   unsigned input_ena = ps->spi_ps_input_ena;
+   unsigned input_ena;
+
+   if (!ps)
+   return;
+
+   input_ena = ps->spi_ps_input_ena;
 
/* we need to enable at least one of them, otherwise we hang the GPU */
assert(G_0286CC_PERSP_SAMPLE_ENA(input_ena) ||
@@ -1531,23 +1538,38 @@ bool si_update_shaders(struct si_context *sctx)
 
si_update_vgt_shader_config(sctx);
 
-   r = si_shader_select(ctx, >ps_shader);
-   if (r)
-   return false;
-   si_pm4_bind_state(sctx, ps, sctx->ps_shader.current->pm4);
-
-   if (si_pm4_state_changed(sctx, ps) || si_pm4_state_changed(sctx, vs) ||
-   sctx->sprite_coord_enable != rs->sprite_coord_enable ||
-   sctx->flatshade != rs->flatshade) {
-   sctx->sprite_coord_enable = rs->sprite_coord_enable;
-   sctx->flatshade = rs->flatshade;
-   si_mark_atom_dirty(sctx, >spi_map);
-   }
+   if (sctx->ps_shader.cso) {
+   r = si_shader_select(ctx, >ps_shader);
+   if (r)
+   return false;
+   si_pm4_bind_state(sctx, ps, sctx->ps_shader.current->pm4);
+
+   if (si_pm4_state_changed(sctx, ps) || 
si_pm4_state_changed(sctx, vs) ||
+   sctx->sprite_coord_enable != rs->sprite_coord_enable ||
+   sctx->flatshade != rs->flatshade) {
+   sctx->sprite_coord_enable = rs->sprite_coord_enable;
+   sctx->flatshade = rs->flatshade;
+   si_mark_atom_dirty(sctx, >spi_map);
+   }
+
+   if (si_pm4_state_changed(sctx, ps) ||
+   sctx->force_persample_interp != rs->force_persample_interp) 
{
+   sctx->force_persample_interp = 
rs->force_persample_interp;
+   si_mark_atom_dirty(sctx, >spi_ps_input);
+   }
 
-   if (si_pm4_state_changed(sctx, ps) ||
-   sctx->force_persample_interp != rs->force_persample_interp) {
-   sctx->force_persample_interp = rs->force_persample_interp;
-   si_mark_atom_dirty(sctx, >spi_ps_input);
+   if (sctx->ps_db_shader_control != 
sctx->ps_shader.current->db_shader_control) {
+   sctx->ps_db_shader_control = 
sctx->ps_shader.current->db_shader_control;
+   si_mark_atom_dirty(sctx, >db_render_state);
+   }
+
+   if (sctx->smoothing_enabled != 
sctx->ps_shader.current->key.ps.poly_line_smoothing) {
+   sctx->smoothing_enabled = 
sctx->ps_shader.current->key.ps.poly_line_smoothing;
+   si_mark_atom_dirty(sctx, >msaa_config);
+
+   if (sctx->b.chip_class == SI)
+   si_mark_atom_dirty(sctx, 
>db_render_state);
+   }
}
 
if (si_pm4_state_changed(sctx, ls) ||
@@ -1559,19

[Mesa-dev] [PATCH 3/7] radeonsi: add draw_vbo check for a NULL pixel shader

2015-10-22 Thread Marek Olšák

From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_state.c  | 1 +
 src/gallium/drivers/radeonsi/si_state.h  | 1 +
 src/gallium/drivers/radeonsi/si_state_draw.c | 7 ++-
 3 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index 8cd3cd2..131fc06 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -698,6 +698,7 @@ static void *si_create_rs_state(struct pipe_context *ctx,
rs->clamp_fragment_color = state->clamp_fragment_color;
rs->flatshade = state->flatshade;
rs->sprite_coord_enable = state->sprite_coord_enable;
+   rs->rasterizer_discard = state->rasterizer_discard;
rs->pa_sc_line_stipple = state->line_stipple_enable ?

S_028A0C_LINE_PATTERN(state->line_stipple_pattern) |

S_028A0C_REPEAT_COUNT(state->line_stipple_factor) : 0;
diff --git a/src/gallium/drivers/radeonsi/si_state.h 
b/src/gallium/drivers/radeonsi/si_state.h
index fba6619..8b9a311 100644
--- a/src/gallium/drivers/radeonsi/si_state.h
+++ b/src/gallium/drivers/radeonsi/si_state.h
@@ -61,6 +61,7 @@ struct si_state_rasterizer {
boolpoly_smooth;
booluses_poly_offset;
boolclamp_fragment_color;
+   boolrasterizer_discard;
 };
 
 struct si_dsa_stencil_ref_part {
diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
b/src/gallium/drivers/radeonsi/si_state_draw.c
index ce6c98c..59845c4 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -728,6 +728,7 @@ static void si_get_draw_start_count(struct si_context *sctx,
 void si_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info *info)
 {
struct si_context *sctx = (struct si_context *)ctx;
+   struct si_state_rasterizer *rs = sctx->queued.named.rasterizer;
struct pipe_index_buffer ib = {};
unsigned mask;
 
@@ -735,7 +736,11 @@ void si_draw_vbo(struct pipe_context *ctx, const struct 
pipe_draw_info *info)
(info->indexed || !info->count_from_stream_output))
return;
 
-   if (!sctx->ps_shader.cso || !sctx->vs_shader.cso) {
+   if (!sctx->vs_shader.cso) {
+   assert(0);
+   return;
+   }
+   if (!sctx->ps_shader.cso && (!rs || !rs->rasterizer_discard)) {
assert(0);
return;
}
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 6/7] radeonsi: properly unbind shader states

2015-10-22 Thread Marek Olšák

From: Marek Olšák 

Any context can release a shader now, so we can't rely on unbinding
in delete_shader_selector.
---
 src/gallium/drivers/radeonsi/si_state_shaders.c | 34 +
 1 file changed, 34 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_state_shaders.c 
b/src/gallium/drivers/radeonsi/si_state_shaders.c
index 4a3a04c..478b851 100644
--- a/src/gallium/drivers/radeonsi/si_state_shaders.c
+++ b/src/gallium/drivers/radeonsi/si_state_shaders.c
@@ -798,10 +798,24 @@ static void si_bind_vs_shader(struct pipe_context *ctx, 
void *state)
 {
struct si_context *sctx = (struct si_context *)ctx;
struct si_shader_selector *sel = state;
+   struct si_shader *old = sctx->vs_shader.current;
 
if (sctx->vs_shader.cso == sel)
return;
 
+   if (old) {
+   if (old->key.vs.as_es) {
+   sctx->queued.named.es = NULL;
+   sctx->emitted.named.es = NULL;
+   } else if (old->key.vs.as_ls) {
+   sctx->queued.named.ls = NULL;
+   sctx->emitted.named.ls = NULL;
+   } else {
+   sctx->queued.named.vs = NULL;
+   sctx->emitted.named.vs = NULL;
+   }
+   }
+
sctx->vs_shader.cso = sel;
sctx->vs_shader.current = sel ? sel->first_variant : NULL;
si_mark_atom_dirty(sctx, >clip_regs);
@@ -817,6 +831,9 @@ static void si_bind_gs_shader(struct pipe_context *ctx, 
void *state)
if (sctx->gs_shader.cso == sel)
return;
 
+   sctx->queued.named.gs = NULL;
+   sctx->emitted.named.gs = NULL;
+
sctx->gs_shader.cso = sel;
sctx->gs_shader.current = sel ? sel->first_variant : NULL;
si_mark_atom_dirty(sctx, >clip_regs);
@@ -836,6 +853,9 @@ static void si_bind_tcs_shader(struct pipe_context *ctx, 
void *state)
if (sctx->tcs_shader.cso == sel)
return;
 
+   sctx->queued.named.hs = NULL;
+   sctx->emitted.named.hs = NULL;
+
sctx->tcs_shader.cso = sel;
sctx->tcs_shader.current = sel ? sel->first_variant : NULL;
 
@@ -848,10 +868,21 @@ static void si_bind_tes_shader(struct pipe_context *ctx, 
void *state)
struct si_context *sctx = (struct si_context *)ctx;
struct si_shader_selector *sel = state;
bool enable_changed = !!sctx->tes_shader.cso != !!sel;
+   struct si_shader *old = sctx->vs_shader.current;
 
if (sctx->tes_shader.cso == sel)
return;
 
+   if (old) {
+   if (old->key.tes.as_es) {
+   sctx->queued.named.es = NULL;
+   sctx->emitted.named.es = NULL;
+   } else {
+   sctx->queued.named.vs = NULL;
+   sctx->emitted.named.vs = NULL;
+   }
+   }
+
sctx->tes_shader.cso = sel;
sctx->tes_shader.current = sel ? sel->first_variant : NULL;
si_mark_atom_dirty(sctx, >clip_regs);
@@ -873,6 +904,9 @@ static void si_bind_ps_shader(struct pipe_context *ctx, 
void *state)
if (sctx->ps_shader.cso == sel)
return;
 
+   sctx->queued.named.ps = NULL;
+   sctx->emitted.named.ps = NULL;
+
sctx->ps_shader.cso = sel;
sctx->ps_shader.current = sel ? sel->first_variant : NULL;
si_mark_atom_dirty(sctx, >cb_target_mask);
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 4/7] radeonsi: add one more SWITCH_ON_EOI requirement for Hawaii and VI

2015-10-22 Thread Marek Olšák

From: Marek Olšák 

The VI condition depends on geometry shaders and MAX_PRIMGRP_IN_WAVE.
---
 src/gallium/drivers/radeonsi/si_state_draw.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
b/src/gallium/drivers/radeonsi/si_state_draw.c
index 1aa5456..3b606b2 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -223,6 +223,7 @@ static unsigned si_get_ia_multi_vgt_param(struct si_context 
*sctx,
struct si_state_rasterizer *rs = sctx->queued.named.rasterizer;
unsigned prim = info->mode;
unsigned primgroup_size = 128; /* recommended without a GS */
+   unsigned max_primgroup_in_wave = 2;
 
/* SWITCH_ON_EOP(0) is always preferable. */
bool wd_switch_on_eop = false;
@@ -296,6 +297,13 @@ static unsigned si_get_ia_multi_vgt_param(struct 
si_context *sctx,
if (sctx->b.screen->info.max_se > 2 && !wd_switch_on_eop)
ia_switch_on_eoi = true;
 
+   /* Required by Hawaii and, for some special cases, by VI. */
+   if (ia_switch_on_eoi &&
+   (sctx->b.family == CHIP_HAWAII ||
+(sctx->b.chip_class == VI &&
+ (sctx->gs_shader.cso || max_primgroup_in_wave != 2
+   partial_vs_wave = true;
+
/* Instancing bug on Bonaire. */
if (sctx->b.family == CHIP_BONAIRE && ia_switch_on_eoi &&
(info->indirect || info->instance_count > 1))
@@ -319,7 +327,8 @@ static unsigned si_get_ia_multi_vgt_param(struct si_context 
*sctx,
S_028AA8_PARTIAL_ES_WAVE_ON(partial_es_wave) |
S_028AA8_PRIMGROUP_SIZE(primgroup_size - 1) |
S_028AA8_WD_SWITCH_ON_EOP(sctx->b.chip_class >= CIK ? 
wd_switch_on_eop : 0) |
-   S_028AA8_MAX_PRIMGRP_IN_WAVE(sctx->b.chip_class >= VI ? 2 : 0);
+   S_028AA8_MAX_PRIMGRP_IN_WAVE(sctx->b.chip_class >= VI ?
+max_primgroup_in_wave : 0);
 }
 
 static unsigned si_get_ls_hs_config(struct si_context *sctx,
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/7] radeonsi: add SWITCH_ON_EOI requirement for 4 SE parts

2015-10-22 Thread Marek Olšák

From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_state_draw.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
b/src/gallium/drivers/radeonsi/si_state_draw.c
index 635fb5f..96448ee 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -292,6 +292,10 @@ static unsigned si_get_ia_multi_vgt_param(struct 
si_context *sctx,
if (info->count_from_stream_output)
wd_switch_on_eop = true;
 
+   /* Required on CIK and later. */
+   if (sctx->b.screen->info.max_se > 2 && !wd_switch_on_eop)
+   ia_switch_on_eoi = true;
+
/* If the WD switch is false, the IA switch must be false too. 
*/
assert(wd_switch_on_eop || !ia_switch_on_eop);
}
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/7] radeonsi: remove unnecessary PARTIAL_VS_WAVE setting for streamout

2015-10-22 Thread Marek Olšák

From: Marek Olšák 

hardware does this automatically
---
 src/gallium/drivers/radeonsi/si_state_draw.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
b/src/gallium/drivers/radeonsi/si_state_draw.c
index 59845c4..635fb5f 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -269,10 +269,6 @@ static unsigned si_get_ia_multi_vgt_param(struct 
si_context *sctx,
wd_switch_on_eop = true;
}
 
-   if (sctx->b.streamout.streamout_enabled ||
-   sctx->b.streamout.prims_gen_query_enabled)
-   partial_vs_wave = true;
-
if (sctx->b.chip_class >= CIK) {
/* WD_SWITCH_ON_EOP has no effect on GPUs with less than
 * 4 shader engines. Set 1 to pass the assertion below.
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 0/7] RadeonSI: Fixes for IA_MULTI_VGT_PARAM

2015-10-22 Thread Marek Olšák

These are from internal documentation. It doesn't say why things must be done 
this way, just that drivers should do it.

Please review.

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 5/7] radeonsi: make PARTIAL_ES_WAVE globally dependent on SWITCH_ON_EOI

2015-10-22 Thread Marek Olšák

From: Marek Olšák 

This catches the other cases that enable SWITCH_ON_EOI.
---
 src/gallium/drivers/radeonsi/si_state_draw.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
b/src/gallium/drivers/radeonsi/si_state_draw.c
index 3b606b2..3962003 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -247,13 +247,10 @@ static unsigned si_get_ia_multi_vgt_param(struct 
si_context *sctx,
/* primgroup_size must be set to a multiple of NUM_PATCHES */
primgroup_size = (primgroup_size / num_patches) * num_patches;
 
-   /* SWITCH_ON_EOI must be set if PrimID is used.
-* If SWITCH_ON_EOI is set, PARTIAL_ES_WAVE must be set too. */
+   /* SWITCH_ON_EOI must be set if PrimID is used. */
if ((sctx->tcs_shader.cso && 
sctx->tcs_shader.cso->info.uses_primid) ||
-   sctx->tes_shader.cso->info.uses_primid) {
+   sctx->tes_shader.cso->info.uses_primid)
ia_switch_on_eoi = true;
-   partial_es_wave = true;
-   }
 
/* Bug with tessellation and GS on Bonaire and older 2 SE 
chips. */
if ((sctx->b.family == CHIP_TAHITI ||
@@ -313,6 +310,10 @@ static unsigned si_get_ia_multi_vgt_param(struct 
si_context *sctx,
assert(wd_switch_on_eop || !ia_switch_on_eop);
}
 
+   /* If SWITCH_ON_EOI is set, PARTIAL_ES_WAVE must be set too. */
+   if (ia_switch_on_eoi)
+   partial_es_wave = true;
+
/* Hw bug with single-primitive instances and SWITCH_ON_EOI
 * on multi-SE chips. */
if (sctx->b.screen->info.max_se >= 2 && ia_switch_on_eoi &&
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 6/7] radeonsi: merge two ifs setting WD_SWITCH_ON_EOP

2015-10-22 Thread Marek Olšák

From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_state_draw.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
b/src/gallium/drivers/radeonsi/si_state_draw.c
index 3962003..e58e474 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -276,7 +276,8 @@ static unsigned si_get_ia_multi_vgt_param(struct si_context 
*sctx,
prim == PIPE_PRIM_LINE_LOOP ||
prim == PIPE_PRIM_TRIANGLE_FAN ||
prim == PIPE_PRIM_TRIANGLE_STRIP_ADJACENCY ||
-   info->primitive_restart)
+   info->primitive_restart ||
+   info->count_from_stream_output)
wd_switch_on_eop = true;
 
/* Hawaii hangs if instancing is enabled and WD_SWITCH_ON_EOP 
is 0.
@@ -286,10 +287,6 @@ static unsigned si_get_ia_multi_vgt_param(struct 
si_context *sctx,
(info->indirect || info->instance_count > 1))
wd_switch_on_eop = true;
 
-   /* USE_OPAQUE doesn't work when WD_SWITCH_ON_EOP is 0. */
-   if (info->count_from_stream_output)
-   wd_switch_on_eop = true;
-
/* Required on CIK and later. */
if (sctx->b.screen->info.max_se > 2 && !wd_switch_on_eop)
ia_switch_on_eoi = true;
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 7/7] radeonsi: add another requirement for PARTIAL_ES_WAVE

2015-10-22 Thread Marek Olšák

From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_pipe.c   | 29 +++-
 src/gallium/drivers/radeonsi/si_pipe.h   |  2 ++
 src/gallium/drivers/radeonsi/si_state.c  |  2 +-
 src/gallium/drivers/radeonsi/si_state_draw.c |  4 
 4 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
b/src/gallium/drivers/radeonsi/si_pipe.c
index e211e92..e653799 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -576,6 +576,32 @@ static bool si_initialize_pipe_config(struct si_screen 
*sscreen)
return true;
 }
 
+static bool si_init_gs_info(struct si_screen *sscreen)
+{
+   switch (sscreen->b.family) {
+   case CHIP_OLAND:
+   case CHIP_HAINAN:
+   case CHIP_KAVERI:
+   case CHIP_KABINI:
+   case CHIP_MULLINS:
+   case CHIP_ICELAND:
+   case CHIP_CARRIZO:
+   sscreen->gs_table_depth = 16;
+   return true;
+   case CHIP_TAHITI:
+   case CHIP_PITCAIRN:
+   case CHIP_VERDE:
+   case CHIP_BONAIRE:
+   case CHIP_HAWAII:
+   case CHIP_TONGA:
+   case CHIP_FIJI:
+   sscreen->gs_table_depth = 32;
+   return true;
+   default:
+   return false;
+   }
+}
+
 struct pipe_screen *radeonsi_screen_create(struct radeon_winsys *ws)
 {
struct si_screen *sscreen = CALLOC_STRUCT(si_screen);
@@ -593,7 +619,8 @@ struct pipe_screen *radeonsi_screen_create(struct 
radeon_winsys *ws)
sscreen->b.b.resource_create = r600_resource_create_common;
 
if (!r600_common_screen_init(>b, ws) ||
-   !si_initialize_pipe_config(sscreen)) {
+   !si_initialize_pipe_config(sscreen) ||
+   !si_init_gs_info(sscreen)) {
FREE(sscreen);
return NULL;
}
diff --git a/src/gallium/drivers/radeonsi/si_pipe.h 
b/src/gallium/drivers/radeonsi/si_pipe.h
index 6305724..5352977 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -42,6 +42,7 @@
 #define SI_BASE_VERTEX_UNKNOWN INT_MIN
 #define SI_RESTART_INDEX_UNKNOWN INT_MIN
 #define SI_NUM_SMOOTH_AA_SAMPLES 8
+#define SI_GS_PER_ES 128
 
 /* Instruction cache. */
 #define SI_CONTEXT_INV_ICACHE  (R600_CONTEXT_PRIVATE_FLAG << 0)
@@ -85,6 +86,7 @@ struct si_compute;
 
 struct si_screen {
struct r600_common_screen   b;
+   unsignedgs_table_depth;
 };
 
 struct si_blend_color {
diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index 131fc06..5345d88 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -3264,7 +3264,7 @@ static void si_init_config(struct si_context *sctx)
si_pm4_set_reg(pm4, R_028A1C_VGT_HOS_MIN_TESS_LEVEL, fui(0));
 
/* FIXME calculate these values somehow ??? */
-   si_pm4_set_reg(pm4, R_028A54_VGT_GS_PER_ES, 0x80);
+   si_pm4_set_reg(pm4, R_028A54_VGT_GS_PER_ES, SI_GS_PER_ES);
si_pm4_set_reg(pm4, R_028A58_VGT_ES_PER_GS, 0x40);
si_pm4_set_reg(pm4, R_028A5C_VGT_GS_PER_VS, 0x2);
 
diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
b/src/gallium/drivers/radeonsi/si_state_draw.c
index e58e474..822e2d5 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -311,6 +311,10 @@ static unsigned si_get_ia_multi_vgt_param(struct 
si_context *sctx,
if (ia_switch_on_eoi)
partial_es_wave = true;
 
+   /* GS requirement. */
+   if (SI_GS_PER_ES / primgroup_size >= sctx->screen->gs_table_depth - 3)
+   partial_es_wave = true;
+
/* Hw bug with single-primitive instances and SWITCH_ON_EOI
 * on multi-SE chips. */
if (sctx->b.screen->info.max_se >= 2 && ia_switch_on_eoi &&
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 3/7] radeonsi: only apply the instancing bug workaround to Bonaire

2015-10-22 Thread Marek Olšák

From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_state_draw.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
b/src/gallium/drivers/radeonsi/si_state_draw.c
index 96448ee..1aa5456 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -296,6 +296,11 @@ static unsigned si_get_ia_multi_vgt_param(struct 
si_context *sctx,
if (sctx->b.screen->info.max_se > 2 && !wd_switch_on_eop)
ia_switch_on_eoi = true;
 
+   /* Instancing bug on Bonaire. */
+   if (sctx->b.family == CHIP_BONAIRE && ia_switch_on_eoi &&
+   (info->indirect || info->instance_count > 1))
+   partial_vs_wave = true;
+
/* If the WD switch is false, the IA switch must be false too. 
*/
assert(wd_switch_on_eop || !ia_switch_on_eop);
}
@@ -308,11 +313,6 @@ static unsigned si_get_ia_multi_vgt_param(struct 
si_context *sctx,
  u_prims_for_vertices(info->mode, info->count) <= 1)))
sctx->b.flags |= SI_CONTEXT_VGT_FLUSH;
 
-   /* Instancing bug on 2 SE chips. */
-   if (sctx->b.screen->info.max_se == 2 && ia_switch_on_eoi &&
-   (info->indirect || info->instance_count > 1))
-   partial_vs_wave = true;
-
return S_028AA8_SWITCH_ON_EOP(ia_switch_on_eop) |
S_028AA8_SWITCH_ON_EOI(ia_switch_on_eoi) |
S_028AA8_PARTIAL_VS_WAVE_ON(partial_vs_wave) |
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nir/validate: Add better validation of load/store types

2015-10-22 Thread Connor Abbott

Reviewed-by: Connor Abbott 

On Thu, Oct 22, 2015 at 7:53 PM, Jason Ekstrand  wrote:
> ---
>  src/glsl/nir/nir_validate.c | 16 ++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/src/glsl/nir/nir_validate.c b/src/glsl/nir/nir_validate.c
> index c6fedf9..a42e830 100644
> --- a/src/glsl/nir/nir_validate.c
> +++ b/src/glsl/nir/nir_validate.c
> @@ -398,15 +398,27 @@ validate_intrinsic_instr(nir_intrinsic_instr *instr, 
> validate_state *state)
> }
>
> switch (instr->intrinsic) {
> -   case nir_intrinsic_load_var:
> +   case nir_intrinsic_load_var: {
> +  const struct glsl_type *type =
> + nir_deref_tail(>variables[0]->deref)->type;
> +  assert(glsl_type_is_vector_or_scalar(type));
> +  assert(instr->num_components == glsl_get_vector_elements(type));
>assert(instr->variables[0]->var->data.mode != nir_var_shader_out);
>break;
> -   case nir_intrinsic_store_var:
> +   }
> +   case nir_intrinsic_store_var: {
> +  const struct glsl_type *type =
> + nir_deref_tail(>variables[0]->deref)->type;
> +  assert(glsl_type_is_vector_or_scalar(type));
> +  assert(instr->num_components == glsl_get_vector_elements(type));
>assert(instr->variables[0]->var->data.mode != nir_var_shader_in &&
>   instr->variables[0]->var->data.mode != nir_var_uniform &&
>   instr->variables[0]->var->data.mode != nir_var_shader_storage);
>break;
> +   }
> case nir_intrinsic_copy_var:
> +  assert(nir_deref_tail(>variables[0]->deref)->type ==
> + nir_deref_tail(>variables[1]->deref)->type);
>assert(instr->variables[0]->var->data.mode != nir_var_shader_in &&
>   instr->variables[0]->var->data.mode != nir_var_uniform &&
>   instr->variables[0]->var->data.mode != nir_var_shader_storage);
> --
> 2.5.0.400.gff86faf
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 92626] [BYT] piglit glx@glx-query-drawable subtests failing

2015-10-22 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=92626

Bug ID: 92626
   Summary: [BYT] piglit glx@glx-query-drawable subtests failing
   Product: Mesa
   Version: unspecified
  Hardware: Other
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: GLX
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: jairo.daniel.miramontes.ca...@intel.com
QA Contact: mesa-dev@lists.freedesktop.org

Created attachment 119124
  --> https://bugs.freedesktop.org/attachment.cgi?id=119124=edit
dmesg log

The following Subtests of the glx@glx-query-drawable test case are failing on
the below configuration.

<%
glx@glx-query-drawable-glx_height
glx@glx-query-drawable-glx_width
%>


==Hardware configuration==
--

-- Platform: --
Processor: Intel(R) Celeron(R) CPU  N2820  @ 2.13GHz
Model: Toshiba Satellite C55t
Memory : 4GB

-- Software --
Linux distribution: Ubuntu 14.04.03 LTS 64Bits
BIOS: 1.10


==Test Environment==
--
kernel: 4.3.0-rc5-drm-intel-nightly
xorg-server-1.17.2
libdrm-2.4.65
xf86-video-intel-2.99.917
mesa-11.1.0-devel (git 6f39546)
libva-1.6.1
intel-driver-1.6.1
cairo-1.14.2

Additional information :
dmesg output (option "drm.debug=0x06")

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [RFC 03/21] mesa/extensions: Wrap array entries in macros

2015-10-22 Thread Marek Olšák

On Fri, Oct 23, 2015 at 12:22 AM, Nanley Chery  wrote:
>
>
> On Thu, Oct 22, 2015 at 3:06 AM, Emil Velikov 
> wrote:
>>
>> On 20 October 2015 at 16:43, Nanley Chery  wrote:
>> > On Tue, Oct 20, 2015 at 8:16 AM, Marek Olšák  wrote:
>> >>
>> >> Also, the FIXME comment should be on its own line.
>> >>
>> >
>> > I moved it aside to make editing the table easier. However, since the
>> > formatting of the
>> > table is unlikely to change much after this series, I agree that I
>> > should
>> > move it back to
>> > its original position.
>> >
>> Actually the designated initalisers should be fine in core mesa. If in
>> doubt wrt MSVC compat, just grep MSVC.*COMPAT through whole of mesa.
>> The 2013 version adds support for this feature.
>> Alternatively feel free to ask Brian/Jose, as they have a fair bit of
>> experience in the area.
>>
>
> That's good news. Thanks for the information.
>
>>
>> That aside, I'm in favour of keeping the comments as is. The editing
>> comment does not apply imho.
>>
>
> Since, there isn't a unanimous opinion on this, I'll leave the FIXME
> to save some rebasing time. Are you referring to my git comment?
> Or to my previous reply about why I moved the FIXME?

Yeah, feel free to fix the FIXME later in the series.

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] nir/validate: Add better validation of load/store types

2015-10-22 Thread Jason Ekstrand

---
 src/glsl/nir/nir_validate.c | 16 ++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/src/glsl/nir/nir_validate.c b/src/glsl/nir/nir_validate.c
index c6fedf9..a42e830 100644
--- a/src/glsl/nir/nir_validate.c
+++ b/src/glsl/nir/nir_validate.c
@@ -398,15 +398,27 @@ validate_intrinsic_instr(nir_intrinsic_instr *instr, 
validate_state *state)
}
 
switch (instr->intrinsic) {
-   case nir_intrinsic_load_var:
+   case nir_intrinsic_load_var: {
+  const struct glsl_type *type =
+ nir_deref_tail(>variables[0]->deref)->type;
+  assert(glsl_type_is_vector_or_scalar(type));
+  assert(instr->num_components == glsl_get_vector_elements(type));
   assert(instr->variables[0]->var->data.mode != nir_var_shader_out);
   break;
-   case nir_intrinsic_store_var:
+   }
+   case nir_intrinsic_store_var: {
+  const struct glsl_type *type =
+ nir_deref_tail(>variables[0]->deref)->type;
+  assert(glsl_type_is_vector_or_scalar(type));
+  assert(instr->num_components == glsl_get_vector_elements(type));
   assert(instr->variables[0]->var->data.mode != nir_var_shader_in &&
  instr->variables[0]->var->data.mode != nir_var_uniform &&
  instr->variables[0]->var->data.mode != nir_var_shader_storage);
   break;
+   }
case nir_intrinsic_copy_var:
+  assert(nir_deref_tail(>variables[0]->deref)->type ==
+ nir_deref_tail(>variables[1]->deref)->type);
   assert(instr->variables[0]->var->data.mode != nir_var_shader_in &&
  instr->variables[0]->var->data.mode != nir_var_uniform &&
  instr->variables[0]->var->data.mode != nir_var_shader_storage);
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/2] util/format: add helper util_format_is_snorm8

2015-10-22 Thread Michel Dänzer

On 23.10.2015 08:20, Marek Olšák wrote:
> From: Marek Olšák 

This series is

Reviewed-by: Michel Dänzer 


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 0/7] RadeonSI: Fixes for IA_MULTI_VGT_PARAM

2015-10-22 Thread Michel Dänzer

On 23.10.2015 08:20, Marek Olšák wrote:
> These are from internal documentation. It doesn't say why things must be done 
> this way, just that drivers should do it.
> 
> Please review.

The series is

Reviewed-by: Michel Dänzer 


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] st/mesa: don't load state parameters if there are none

2015-10-22 Thread Marek Olšák

From: Marek Olšák 

Out of 7063 shaders from my shader-db:
- 6564 (93%) shaders don't have any state parameters.
- 347 (5%) shaders have 1 state parameter for WPOS lowering.
- The remaining 2% have more state parameters, usually matrices.
---
 src/mesa/state_tracker/st_atom_constbuf.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/mesa/state_tracker/st_atom_constbuf.c 
b/src/mesa/state_tracker/st_atom_constbuf.c
index acaa85d..20f8b3d 100644
--- a/src/mesa/state_tracker/st_atom_constbuf.c
+++ b/src/mesa/state_tracker/st_atom_constbuf.c
@@ -73,7 +73,8 @@ void st_upload_constants( struct st_context *st,
* the parameters list are explicitly set by the user with glUniform,
* glProgramParameter(), etc.
*/
-  _mesa_load_state_parameters(st->ctx, params);
+  if (params->StateFlags)
+ _mesa_load_state_parameters(st->ctx, params);
 
   /* We always need to get a new buffer, to keep the drivers simple and
* avoid gratuitous rendering synchronization.
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

1 2 >

1 - 100 of 153 matches

Mail list logo