Re: [Mesa-dev] [PATCH] draw: handle edge flags in llvm path

2015-12-15 Thread Brian Paul

On 12/14/2015 08:38 PM, srol...@vmware.com wrote:

From: Roland Scheidegger 

We just ignored them altogether. While this feature is rather old-fashioned
supporting it is actually rather trivial.
This fixes the associated piglit tests (2 gl-1.0-edgeflag, 2 gl-2.0-edgeflag
and all (7) of point-vertex-id).
---
  src/gallium/auxiliary/draw/draw_llvm.c | 77 +++---
  .../draw/draw_pt_fetch_shade_pipeline_llvm.c   |  1 +
  2 files changed, 54 insertions(+), 24 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_llvm.c 
b/src/gallium/auxiliary/draw/draw_llvm.c
index a966e45..18a3d81 100644
--- a/src/gallium/auxiliary/draw/draw_llvm.c
+++ b/src/gallium/auxiliary/draw/draw_llvm.c
@@ -880,7 +880,8 @@ store_aos_array(struct gallivm_state *gallivm,
  LLVMValueRef* aos,
  int attrib,
  int num_outputs,
-LLVMValueRef clipmask)
+LLVMValueRef clipmask,
+boolean need_edgeflag)
  {
 LLVMBuilderRef builder = gallivm->builder;
 LLVMValueRef attr_index = lp_build_const_int32(gallivm, attrib);
@@ -912,8 +913,14 @@ store_aos_array(struct gallivm_state *gallivm,
 */
assert(DRAW_TOTAL_CLIP_PLANES==14);
/* initialize vertex id:16 = 0x, pad:1 = 0, edgeflag:1 = 1 */
-  vertex_id_pad_edgeflag = (0x << 16) | (1 << DRAW_TOTAL_CLIP_PLANES);
-  val = lp_build_const_int_vec(gallivm, lp_int_type(soa_type), 
vertex_id_pad_edgeflag);
+  if (!need_edgeflag) {
+ vertex_id_pad_edgeflag = (0x << 16) | (1 << 
DRAW_TOTAL_CLIP_PLANES);
+  }
+  else {
+ vertex_id_pad_edgeflag = (0x << 16);
+  }
+  val = lp_build_const_int_vec(gallivm, lp_int_type(soa_type),
+   vertex_id_pad_edgeflag);
/* OR with the clipmask */
cliptmp = LLVMBuildOr(builder, val, clipmask, "");
for (i = 0; i < vector_length; i++) {
@@ -943,7 +950,7 @@ convert_to_aos(struct gallivm_state *gallivm,
 LLVMValueRef clipmask,
 int num_outputs,
 struct lp_type soa_type,
-   boolean have_clipdist)
+   boolean need_edgeflag)
  {
 LLVMBuilderRef builder = gallivm->builder;
 unsigned chan, attrib, i;
@@ -999,7 +1006,8 @@ convert_to_aos(struct gallivm_state *gallivm,
aos,
attrib,
num_outputs,
-  clipmask);
+  clipmask,
+  need_edgeflag);
 }
  #if DEBUG_STORE
 lp_build_printf(gallivm, "   # storing end\n");
@@ -1135,11 +1143,7 @@ generate_clipmask(struct draw_llvm *llvm,
struct gallivm_state *gallivm,
struct lp_type vs_type,
LLVMValueRef (*outputs)[TGSI_NUM_CHANNELS],
-  boolean clip_xy,
-  boolean clip_z,
-  boolean clip_user,
-  boolean clip_halfz,
-  unsigned ucp_enable,
+  struct draw_llvm_variant_key *key,
LLVMValueRef context_ptr,
boolean *have_clipdist)
  {
@@ -1155,7 +1159,9 @@ generate_clipmask(struct draw_llvm *llvm,
 const unsigned pos = llvm->draw->vs.position_output;
 const unsigned cv = llvm->draw->vs.clipvertex_output;
 int num_written_clipdistance = 
llvm->draw->vs.vertex_shader->info.num_written_clipdistance;
-   bool have_cd = false;
+   boolean have_cd = false;
+   boolean clip_user = key->clip_user;
+   unsigned ucp_enable = key->ucp_enable;
 unsigned cd[2];

 cd[0] = llvm->draw->vs.clipdistance_output[0];
@@ -1196,7 +1202,7 @@ generate_clipmask(struct draw_llvm *llvm,
 }

 /* Cliptest, for hardwired planes */
-   if (clip_xy) {
+   if (key->clip_xy) {
/* plane 1 */
test = lp_build_compare(gallivm, f32_type, PIPE_FUNC_GREATER, pos_x , 
pos_w);
temp = shift;
@@ -1224,9 +1230,9 @@ generate_clipmask(struct draw_llvm *llvm,
mask = LLVMBuildOr(builder, mask, test, "");
 }

-   if (clip_z) {
+   if (key->clip_z) {
temp = lp_build_const_int_vec(gallivm, i32_type, 16);
-  if (clip_halfz) {
+  if (key->clip_halfz) {
   /* plane 5 */
   test = lp_build_compare(gallivm, f32_type, PIPE_FUNC_GREATER, zero, 
pos_z);
   test = LLVMBuildAnd(builder, test, temp, "");
@@ -1313,6 +1319,20 @@ generate_clipmask(struct draw_llvm *llvm,
   }
}
 }
+   if (key->need_edgeflags) {
+  /*
+   * This isn't really part of clipmask but stored the same in vertex
+   * header later, so do it here.
+   */
+  unsigned edge_attr = llvm->draw->vs.edgeflag_output;
+  LLVMValueRef one = lp_build_const_vec(gallivm, f32_type, 1.0);
+  LLVMValueRef edgeflag = LLVMBuildLoad(builder, outputs[edge_attr][0], 
"");
+  test = lp_build_compare(gallivm, f32_type, 

[Mesa-dev] [Bug 92570] 10 bit h264 OMX UVD decode outputs NV12

2015-12-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92570

--- Comment #3 from Andy Furniss  ---
(In reply to Andy Furniss from comment #2)

> If so why not output nv16 or something else 10 bit?

lol at me re-reading this and remembering that nv16 is 8 bit 422.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/3] i965/fs: do not disable the FS unit in the presence of shader storage

2015-12-15 Thread Jason Ekstrand
On Tue, Dec 15, 2015 at 9:30 AM, Francisco Jerez  wrote:
> Jason Ekstrand  writes:
>
>> On Dec 15, 2015 3:52 AM, "Iago Toral Quiroga"  wrote:
>>>
>>> We want to make sure that the driver does not disable the FS unit if
>>> the shader code only has SSBO writes (i.e. no color or depth output).
>>>
>>> We could go a step further and check if the shader storage is actually
>>> used for writing, but does not seem worth the trouble. Also, we do the
>>> same thing for atomic buffers.
>>>
>>> Fixes the following CTS test:
>>> ES31-CTS.shader_storage_buffer_object.advanced-usage-sync-vsfs
>>> ---
>>>  src/mesa/drivers/dri/i965/gen7_wm_state.c | 3 ++-
>>>  src/mesa/drivers/dri/i965/gen8_ps_state.c | 1 +
>>>  2 files changed, 3 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/gen7_wm_state.c
>> b/src/mesa/drivers/dri/i965/gen7_wm_state.c
>>> index 06d5e65..d292b13 100644
>>> --- a/src/mesa/drivers/dri/i965/gen7_wm_state.c
>>> +++ b/src/mesa/drivers/dri/i965/gen7_wm_state.c
>>> @@ -77,7 +77,8 @@ upload_wm_state(struct brw_context *brw)
>>>dw1 |= GEN7_WM_KILL_ENABLE;
>>> }
>>>
>>> -   if (_mesa_active_fragment_shader_has_atomic_ops(>ctx)) {
>>> +   if (_mesa_active_fragment_shader_has_atomic_ops(>ctx ) ||
>>> +   _mesa_active_fragment_shader_has_shader_storage(>ctx)) {
>>
>> Ugh... We also need to be checking for images.
>>
>
> The same bit is set when the shader has images or a bunch of other
> things a couple of lines below.  No idea why atomic counters are handled
> separately.

Right.  So I guess this series is correct if not optimal.  I think I'm
still a fan of has_side_effects, but I don't care too much how it's
done as long as we make some effort to be consistent.

>> How about we change it to active_fragment_shader_has_side_effects and make
>> it check all three?
>>
>>>dw1 |= GEN7_WM_DISPATCH_ENABLE;
>>> }
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/gen8_ps_state.c
>> b/src/mesa/drivers/dri/i965/gen8_ps_state.c
>>> index 945f710..8769269 100644
>>> --- a/src/mesa/drivers/dri/i965/gen8_ps_state.c
>>> +++ b/src/mesa/drivers/dri/i965/gen8_ps_state.c
>>> @@ -91,6 +91,7 @@ gen8_upload_ps_extra(struct brw_context *brw,
>>>  * BRW_NEW_FS_PROG_DATA | BRW_NEW_FRAGMENT_PROGRAM | _NEW_BUFFERS |
>> _NEW_COLOR
>>>  */
>>> if ((_mesa_active_fragment_shader_has_atomic_ops(>ctx) ||
>>> +_mesa_active_fragment_shader_has_shader_storage(>ctx) ||
>>>  prog_data->base.nr_image_params) &&
>>> !brw_color_buffer_write_enabled(brw))
>>>dw1 |= GEN8_PSX_SHADER_HAS_UAV;
>>> --
>>> 1.9.1
>>>
>>> ___
>>> mesa-dev mailing list
>>> mesa-dev@lists.freedesktop.org
>>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] draw: handle edge flags in llvm path

2015-12-15 Thread Roland Scheidegger
Am 15.12.2015 um 17:25 schrieb Brian Paul:
> On 12/14/2015 08:38 PM, srol...@vmware.com wrote:
>> From: Roland Scheidegger 
>>
>> We just ignored them altogether. While this feature is rather
>> old-fashioned
>> supporting it is actually rather trivial.
>> This fixes the associated piglit tests (2 gl-1.0-edgeflag, 2
>> gl-2.0-edgeflag
>> and all (7) of point-vertex-id).
>> ---
>>   src/gallium/auxiliary/draw/draw_llvm.c | 77
>> +++---
>>   .../draw/draw_pt_fetch_shade_pipeline_llvm.c   |  1 +
>>   2 files changed, 54 insertions(+), 24 deletions(-)
>>
>> diff --git a/src/gallium/auxiliary/draw/draw_llvm.c
>> b/src/gallium/auxiliary/draw/draw_llvm.c
>> index a966e45..18a3d81 100644
>> --- a/src/gallium/auxiliary/draw/draw_llvm.c
>> +++ b/src/gallium/auxiliary/draw/draw_llvm.c
>> @@ -880,7 +880,8 @@ store_aos_array(struct gallivm_state *gallivm,
>>   LLVMValueRef* aos,
>>   int attrib,
>>   int num_outputs,
>> -LLVMValueRef clipmask)
>> +LLVMValueRef clipmask,
>> +boolean need_edgeflag)
>>   {
>>  LLVMBuilderRef builder = gallivm->builder;
>>  LLVMValueRef attr_index = lp_build_const_int32(gallivm, attrib);
>> @@ -912,8 +913,14 @@ store_aos_array(struct gallivm_state *gallivm,
>>  */
>> assert(DRAW_TOTAL_CLIP_PLANES==14);
>> /* initialize vertex id:16 = 0x, pad:1 = 0, edgeflag:1 = 1 */
>> -  vertex_id_pad_edgeflag = (0x << 16) | (1 <<
>> DRAW_TOTAL_CLIP_PLANES);
>> -  val = lp_build_const_int_vec(gallivm, lp_int_type(soa_type),
>> vertex_id_pad_edgeflag);
>> +  if (!need_edgeflag) {
>> + vertex_id_pad_edgeflag = (0x << 16) | (1 <<
>> DRAW_TOTAL_CLIP_PLANES);
>> +  }
>> +  else {
>> + vertex_id_pad_edgeflag = (0x << 16);
>> +  }
>> +  val = lp_build_const_int_vec(gallivm, lp_int_type(soa_type),
>> +   vertex_id_pad_edgeflag);
>> /* OR with the clipmask */
>> cliptmp = LLVMBuildOr(builder, val, clipmask, "");
>> for (i = 0; i < vector_length; i++) {
>> @@ -943,7 +950,7 @@ convert_to_aos(struct gallivm_state *gallivm,
>>  LLVMValueRef clipmask,
>>  int num_outputs,
>>  struct lp_type soa_type,
>> -   boolean have_clipdist)
>> +   boolean need_edgeflag)
>>   {
>>  LLVMBuilderRef builder = gallivm->builder;
>>  unsigned chan, attrib, i;
>> @@ -999,7 +1006,8 @@ convert_to_aos(struct gallivm_state *gallivm,
>> aos,
>> attrib,
>> num_outputs,
>> -  clipmask);
>> +  clipmask,
>> +  need_edgeflag);
>>  }
>>   #if DEBUG_STORE
>>  lp_build_printf(gallivm, "   # storing end\n");
>> @@ -1135,11 +1143,7 @@ generate_clipmask(struct draw_llvm *llvm,
>> struct gallivm_state *gallivm,
>> struct lp_type vs_type,
>> LLVMValueRef (*outputs)[TGSI_NUM_CHANNELS],
>> -  boolean clip_xy,
>> -  boolean clip_z,
>> -  boolean clip_user,
>> -  boolean clip_halfz,
>> -  unsigned ucp_enable,
>> +  struct draw_llvm_variant_key *key,
>> LLVMValueRef context_ptr,
>> boolean *have_clipdist)
>>   {
>> @@ -1155,7 +1159,9 @@ generate_clipmask(struct draw_llvm *llvm,
>>  const unsigned pos = llvm->draw->vs.position_output;
>>  const unsigned cv = llvm->draw->vs.clipvertex_output;
>>  int num_written_clipdistance =
>> llvm->draw->vs.vertex_shader->info.num_written_clipdistance;
>> -   bool have_cd = false;
>> +   boolean have_cd = false;
>> +   boolean clip_user = key->clip_user;
>> +   unsigned ucp_enable = key->ucp_enable;
>>  unsigned cd[2];
>>
>>  cd[0] = llvm->draw->vs.clipdistance_output[0];
>> @@ -1196,7 +1202,7 @@ generate_clipmask(struct draw_llvm *llvm,
>>  }
>>
>>  /* Cliptest, for hardwired planes */
>> -   if (clip_xy) {
>> +   if (key->clip_xy) {
>> /* plane 1 */
>> test = lp_build_compare(gallivm, f32_type, PIPE_FUNC_GREATER,
>> pos_x , pos_w);
>> temp = shift;
>> @@ -1224,9 +1230,9 @@ generate_clipmask(struct draw_llvm *llvm,
>> mask = LLVMBuildOr(builder, mask, test, "");
>>  }
>>
>> -   if (clip_z) {
>> +   if (key->clip_z) {
>> temp = lp_build_const_int_vec(gallivm, i32_type, 16);
>> -  if (clip_halfz) {
>> +  if (key->clip_halfz) {
>>/* plane 5 */
>>test = lp_build_compare(gallivm, f32_type,
>> PIPE_FUNC_GREATER, zero, pos_z);
>>test = LLVMBuildAnd(builder, test, temp, "");
>> @@ -1313,6 +1319,20 @@ generate_clipmask(struct draw_llvm *llvm,
>>}
>> }
>>  }
>> +   if (key->need_edgeflags) {
>> +  

Re: [Mesa-dev] [PATCH 1/8] nir: Silence missing field initializer warnings for nir_src

2015-12-15 Thread Ian Romanick
On 12/14/2015 07:36 PM, Jason Ekstrand wrote:
> On Mon, Dec 14, 2015 at 5:12 PM, Ian Romanick  wrote:
>> On 12/14/2015 04:39 PM, Ilia Mirkin wrote:
>>> On Mon, Dec 14, 2015 at 7:28 PM, Ian Romanick  wrote:
 On 12/14/2015 03:38 PM, Ilia Mirkin wrote:
> It's a pretty standard feature of compilers to init things to 0 and
> not have the full structure specified like that... what compiler are
> you seeing these with? Can we just fix the glitch with a
> -Wno-stupid-warnings?

 I have observed this with several versions of GCC.

 In C, you can avoid this with a trailing comma like:

 #define NIR_SRC_INIT (nir_src) { { NULL }, }

 However, nir.h is also used in some C++ code where that doesn't help.

 To be honest, I'm not a big fan of these macros.  Without C99 designated
 initalizers, maintaining initializers like these (or the ones in
 src/glsl/builtin_variables.cpp) is a real pain.  We can't use those, and
 we can't use C++ constructors.  We have no good options available. :(

 I thought about replacing them with a static inline function that
 returns a zero-initialized struct.  The compiler should generate the
 same code.  However, that doesn't work with uses like those in patch 3.

 I'm also a little curious why you didn't raise this issue when I sent
 these patches out in August.  I removed the patch from the series that
 you objected to back then.
>>>
>>> I have absolutely no recollection of any of that. Perhaps I saw "nir"
>>> and thought to myself, "don't care, let them do whatever, this won't
>>> ever affect me". Which is a sentiment I'm happy to continue with, by
>>> the way.
>>
>> Fair enough. :)  The patch I removed was one that removed the gl_context
>> parameter from a function in dd_function_table.
>>
>> http://patchwork.freedesktop.org/patch/58048/
>>
>>> I know that doing
>>>
>>> x = {}
>>>
>>> is a gcc extension, but I thought that {0} should always work (with
>>> enough {} nesting in case the first element is a struct). Perhaps it
>>
>> {0} is, basically what we're doing now, and GCC complains about it with
>> -Wmissing-field-initializers or -Wextra.  When we added C-style struct
> 
> I'm not a big fan of spending time fixing warnings that you have to
> add -Wextra to get.  However, if there are C++ issues, then those
> definitely need to get fixed.

Those options found real bugs in builtin_variables.cpp, and I'm a big
fan of that.

>> and array initializers to GLSL, we discussed adding this sort of
>> implicit zero initialization.  I did some digging in the C89 and C99
>> specs, and I have some recollection that in this case the missing fields
>> get undefined values... but, starting with C99, {0, } implicitly
>> initializes the missing fields to zero.  I also seem to recall that bit
>> of weirdness in C is why quite a few people were opposed to adding it to
>> GLSL.  This was several years ago, so my memory may not be completely
>> reliable.
>>
>>> doesn't in C++? I could believe that, although I'd be surprised.
>>
>> The initializer support in C++ intentionally quite a bit more primitive
>> than in C99.  The language designers want you to use constructors
>> whether it's the best tool for the job or not... which is why there are
>> no designated initializers.
> 
> So, I've got a patch somewhere that switches based on __cplusplus and
> defines NIR_SRC_INIT as either the C99 thing or nir_src() for C++.

I thought about doing something like that too.  Having to maintain and
keep in sync two separate versions of the initializer / constructor
doesn't sound like a maintainable solution either.  At best, it's the
kind of thing that I expect someone to see in a year, say "WTF?", and
submit a patch to change.

At worst, in a year we decide to add some field to nir_src that isn't
zero initialized, and we forget to update one of the initializers... and
end up with a hard to find bug.

> Would that solve this problem?  There was also a bug recently about us
> not building with oricle studio that it would probably fix.  If so,
> let's do that rather than a gigantic mess of braces and zeros.

We explicitly removed support for Oracle Studio, so that's not a
consideration.

> --Jason
> 
>>> Anyways, didn't mean to stir the pot too much, just thought there
>>> might be a simpler way out of all this.
>>
>> Well, there are. :) We just can't use them due to some combination of
>> MSVC, C++, and C99.
>>
>>> Cheers,
>>>
>>>   -ilia
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] draw: handle edge flags in llvm path

2015-12-15 Thread sroland
From: Roland Scheidegger 

We just ignored them altogether. While this feature is rather old-fashioned
supporting it is actually rather trivial.
This fixes the associated piglit tests (2 gl-1.0-edgeflag, 2 gl-2.0-edgeflag
and all (7) of point-vertex-id).

v2: comment fixes, and make the use of the edgeflag in clipmask consistent
with when it's actually there (should be impossible to hit a case where the
difference would actually matter but still...)
---
 src/gallium/auxiliary/draw/draw_llvm.c | 86 +++---
 .../draw/draw_pt_fetch_shade_pipeline_llvm.c   |  1 +
 2 files changed, 61 insertions(+), 26 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_llvm.c 
b/src/gallium/auxiliary/draw/draw_llvm.c
index a966e45..89ed045 100644
--- a/src/gallium/auxiliary/draw/draw_llvm.c
+++ b/src/gallium/auxiliary/draw/draw_llvm.c
@@ -880,7 +880,8 @@ store_aos_array(struct gallivm_state *gallivm,
 LLVMValueRef* aos,
 int attrib,
 int num_outputs,
-LLVMValueRef clipmask)
+LLVMValueRef clipmask,
+boolean need_edgeflag)
 {
LLVMBuilderRef builder = gallivm->builder;
LLVMValueRef attr_index = lp_build_const_int32(gallivm, attrib);
@@ -912,8 +913,14 @@ store_aos_array(struct gallivm_state *gallivm,
*/
   assert(DRAW_TOTAL_CLIP_PLANES==14);
   /* initialize vertex id:16 = 0x, pad:1 = 0, edgeflag:1 = 1 */
-  vertex_id_pad_edgeflag = (0x << 16) | (1 << DRAW_TOTAL_CLIP_PLANES);
-  val = lp_build_const_int_vec(gallivm, lp_int_type(soa_type), 
vertex_id_pad_edgeflag);
+  if (!need_edgeflag) {
+ vertex_id_pad_edgeflag = (0x << 16) | (1 << 
DRAW_TOTAL_CLIP_PLANES);
+  }
+  else {
+ vertex_id_pad_edgeflag = (0x << 16);
+  }
+  val = lp_build_const_int_vec(gallivm, lp_int_type(soa_type),
+   vertex_id_pad_edgeflag);
   /* OR with the clipmask */
   cliptmp = LLVMBuildOr(builder, val, clipmask, "");
   for (i = 0; i < vector_length; i++) {
@@ -943,7 +950,7 @@ convert_to_aos(struct gallivm_state *gallivm,
LLVMValueRef clipmask,
int num_outputs,
struct lp_type soa_type,
-   boolean have_clipdist)
+   boolean need_edgeflag)
 {
LLVMBuilderRef builder = gallivm->builder;
unsigned chan, attrib, i;
@@ -999,7 +1006,8 @@ convert_to_aos(struct gallivm_state *gallivm,
   aos,
   attrib,
   num_outputs,
-  clipmask);
+  clipmask,
+  need_edgeflag);
}
 #if DEBUG_STORE
lp_build_printf(gallivm, "   # storing end\n");
@@ -1135,11 +1143,7 @@ generate_clipmask(struct draw_llvm *llvm,
   struct gallivm_state *gallivm,
   struct lp_type vs_type,
   LLVMValueRef (*outputs)[TGSI_NUM_CHANNELS],
-  boolean clip_xy,
-  boolean clip_z,
-  boolean clip_user,
-  boolean clip_halfz,
-  unsigned ucp_enable,
+  struct draw_llvm_variant_key *key,
   LLVMValueRef context_ptr,
   boolean *have_clipdist)
 {
@@ -1155,7 +1159,9 @@ generate_clipmask(struct draw_llvm *llvm,
const unsigned pos = llvm->draw->vs.position_output;
const unsigned cv = llvm->draw->vs.clipvertex_output;
int num_written_clipdistance = 
llvm->draw->vs.vertex_shader->info.num_written_clipdistance;
-   bool have_cd = false;
+   boolean have_cd = false;
+   boolean clip_user = key->clip_user;
+   unsigned ucp_enable = key->ucp_enable;
unsigned cd[2];
 
cd[0] = llvm->draw->vs.clipdistance_output[0];
@@ -1196,7 +1202,11 @@ generate_clipmask(struct draw_llvm *llvm,
}
 
/* Cliptest, for hardwired planes */
-   if (clip_xy) {
+   /*
+* XXX should take guardband into account (currently not in key).
+* Otherwise might run the draw pipeline stages for nothing.
+*/
+   if (key->clip_xy) {
   /* plane 1 */
   test = lp_build_compare(gallivm, f32_type, PIPE_FUNC_GREATER, pos_x , 
pos_w);
   temp = shift;
@@ -1224,9 +1234,9 @@ generate_clipmask(struct draw_llvm *llvm,
   mask = LLVMBuildOr(builder, mask, test, "");
}
 
-   if (clip_z) {
+   if (key->clip_z) {
   temp = lp_build_const_int_vec(gallivm, i32_type, 16);
-  if (clip_halfz) {
+  if (key->clip_halfz) {
  /* plane 5 */
  test = lp_build_compare(gallivm, f32_type, PIPE_FUNC_GREATER, zero, 
pos_z);
  test = LLVMBuildAnd(builder, test, temp, "");
@@ -1313,6 +1323,20 @@ generate_clipmask(struct draw_llvm *llvm,
  }
   }
}
+   if (key->need_edgeflags) {
+  /*
+   * This isn't really part of clipmask but stored the same in vertex
+   * header later, so do it here.
+   */
+  unsigned 

Re: [Mesa-dev] [PATCH] draw: handle edge flags in llvm path

2015-12-15 Thread Brian Paul


Reviewed-by: Brian Paul 


On 12/15/2015 10:06 AM, srol...@vmware.com wrote:

From: Roland Scheidegger 

We just ignored them altogether. While this feature is rather old-fashioned
supporting it is actually rather trivial.
This fixes the associated piglit tests (2 gl-1.0-edgeflag, 2 gl-2.0-edgeflag
and all (7) of point-vertex-id).

v2: comment fixes, and make the use of the edgeflag in clipmask consistent
with when it's actually there (should be impossible to hit a case where the
difference would actually matter but still...)
---
  src/gallium/auxiliary/draw/draw_llvm.c | 86 +++---
  .../draw/draw_pt_fetch_shade_pipeline_llvm.c   |  1 +
  2 files changed, 61 insertions(+), 26 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_llvm.c 
b/src/gallium/auxiliary/draw/draw_llvm.c
index a966e45..89ed045 100644
--- a/src/gallium/auxiliary/draw/draw_llvm.c
+++ b/src/gallium/auxiliary/draw/draw_llvm.c
@@ -880,7 +880,8 @@ store_aos_array(struct gallivm_state *gallivm,
  LLVMValueRef* aos,
  int attrib,
  int num_outputs,
-LLVMValueRef clipmask)
+LLVMValueRef clipmask,
+boolean need_edgeflag)
  {
 LLVMBuilderRef builder = gallivm->builder;
 LLVMValueRef attr_index = lp_build_const_int32(gallivm, attrib);
@@ -912,8 +913,14 @@ store_aos_array(struct gallivm_state *gallivm,
 */
assert(DRAW_TOTAL_CLIP_PLANES==14);
/* initialize vertex id:16 = 0x, pad:1 = 0, edgeflag:1 = 1 */
-  vertex_id_pad_edgeflag = (0x << 16) | (1 << DRAW_TOTAL_CLIP_PLANES);
-  val = lp_build_const_int_vec(gallivm, lp_int_type(soa_type), 
vertex_id_pad_edgeflag);
+  if (!need_edgeflag) {
+ vertex_id_pad_edgeflag = (0x << 16) | (1 << 
DRAW_TOTAL_CLIP_PLANES);
+  }
+  else {
+ vertex_id_pad_edgeflag = (0x << 16);
+  }
+  val = lp_build_const_int_vec(gallivm, lp_int_type(soa_type),
+   vertex_id_pad_edgeflag);
/* OR with the clipmask */
cliptmp = LLVMBuildOr(builder, val, clipmask, "");
for (i = 0; i < vector_length; i++) {
@@ -943,7 +950,7 @@ convert_to_aos(struct gallivm_state *gallivm,
 LLVMValueRef clipmask,
 int num_outputs,
 struct lp_type soa_type,
-   boolean have_clipdist)
+   boolean need_edgeflag)
  {
 LLVMBuilderRef builder = gallivm->builder;
 unsigned chan, attrib, i;
@@ -999,7 +1006,8 @@ convert_to_aos(struct gallivm_state *gallivm,
aos,
attrib,
num_outputs,
-  clipmask);
+  clipmask,
+  need_edgeflag);
 }
  #if DEBUG_STORE
 lp_build_printf(gallivm, "   # storing end\n");
@@ -1135,11 +1143,7 @@ generate_clipmask(struct draw_llvm *llvm,
struct gallivm_state *gallivm,
struct lp_type vs_type,
LLVMValueRef (*outputs)[TGSI_NUM_CHANNELS],
-  boolean clip_xy,
-  boolean clip_z,
-  boolean clip_user,
-  boolean clip_halfz,
-  unsigned ucp_enable,
+  struct draw_llvm_variant_key *key,
LLVMValueRef context_ptr,
boolean *have_clipdist)
  {
@@ -1155,7 +1159,9 @@ generate_clipmask(struct draw_llvm *llvm,
 const unsigned pos = llvm->draw->vs.position_output;
 const unsigned cv = llvm->draw->vs.clipvertex_output;
 int num_written_clipdistance = 
llvm->draw->vs.vertex_shader->info.num_written_clipdistance;
-   bool have_cd = false;
+   boolean have_cd = false;
+   boolean clip_user = key->clip_user;
+   unsigned ucp_enable = key->ucp_enable;
 unsigned cd[2];

 cd[0] = llvm->draw->vs.clipdistance_output[0];
@@ -1196,7 +1202,11 @@ generate_clipmask(struct draw_llvm *llvm,
 }

 /* Cliptest, for hardwired planes */
-   if (clip_xy) {
+   /*
+* XXX should take guardband into account (currently not in key).
+* Otherwise might run the draw pipeline stages for nothing.
+*/
+   if (key->clip_xy) {
/* plane 1 */
test = lp_build_compare(gallivm, f32_type, PIPE_FUNC_GREATER, pos_x , 
pos_w);
temp = shift;
@@ -1224,9 +1234,9 @@ generate_clipmask(struct draw_llvm *llvm,
mask = LLVMBuildOr(builder, mask, test, "");
 }

-   if (clip_z) {
+   if (key->clip_z) {
temp = lp_build_const_int_vec(gallivm, i32_type, 16);
-  if (clip_halfz) {
+  if (key->clip_halfz) {
   /* plane 5 */
   test = lp_build_compare(gallivm, f32_type, PIPE_FUNC_GREATER, zero, 
pos_z);
   test = LLVMBuildAnd(builder, test, temp, "");
@@ -1313,6 +1323,20 @@ generate_clipmask(struct draw_llvm *llvm,
   }
}
 }
+   if (key->need_edgeflags) {
+   

Re: [Mesa-dev] [PATCH 2/3] i965/fs: do not disable the FS unit in the presence of shader storage

2015-12-15 Thread Francisco Jerez
Jason Ekstrand  writes:

> On Dec 15, 2015 3:52 AM, "Iago Toral Quiroga"  wrote:
>>
>> We want to make sure that the driver does not disable the FS unit if
>> the shader code only has SSBO writes (i.e. no color or depth output).
>>
>> We could go a step further and check if the shader storage is actually
>> used for writing, but does not seem worth the trouble. Also, we do the
>> same thing for atomic buffers.
>>
>> Fixes the following CTS test:
>> ES31-CTS.shader_storage_buffer_object.advanced-usage-sync-vsfs
>> ---
>>  src/mesa/drivers/dri/i965/gen7_wm_state.c | 3 ++-
>>  src/mesa/drivers/dri/i965/gen8_ps_state.c | 1 +
>>  2 files changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/gen7_wm_state.c
> b/src/mesa/drivers/dri/i965/gen7_wm_state.c
>> index 06d5e65..d292b13 100644
>> --- a/src/mesa/drivers/dri/i965/gen7_wm_state.c
>> +++ b/src/mesa/drivers/dri/i965/gen7_wm_state.c
>> @@ -77,7 +77,8 @@ upload_wm_state(struct brw_context *brw)
>>dw1 |= GEN7_WM_KILL_ENABLE;
>> }
>>
>> -   if (_mesa_active_fragment_shader_has_atomic_ops(>ctx)) {
>> +   if (_mesa_active_fragment_shader_has_atomic_ops(>ctx ) ||
>> +   _mesa_active_fragment_shader_has_shader_storage(>ctx)) {
>
> Ugh... We also need to be checking for images.
>

The same bit is set when the shader has images or a bunch of other
things a couple of lines below.  No idea why atomic counters are handled
separately.

> How about we change it to active_fragment_shader_has_side_effects and make
> it check all three?
>
>>dw1 |= GEN7_WM_DISPATCH_ENABLE;
>> }
>>
>> diff --git a/src/mesa/drivers/dri/i965/gen8_ps_state.c
> b/src/mesa/drivers/dri/i965/gen8_ps_state.c
>> index 945f710..8769269 100644
>> --- a/src/mesa/drivers/dri/i965/gen8_ps_state.c
>> +++ b/src/mesa/drivers/dri/i965/gen8_ps_state.c
>> @@ -91,6 +91,7 @@ gen8_upload_ps_extra(struct brw_context *brw,
>>  * BRW_NEW_FS_PROG_DATA | BRW_NEW_FRAGMENT_PROGRAM | _NEW_BUFFERS |
> _NEW_COLOR
>>  */
>> if ((_mesa_active_fragment_shader_has_atomic_ops(>ctx) ||
>> +_mesa_active_fragment_shader_has_shader_storage(>ctx) ||
>>  prog_data->base.nr_image_params) &&
>> !brw_color_buffer_write_enabled(brw))
>>dw1 |= GEN8_PSX_SHADER_HAS_UAV;
>> --
>> 1.9.1
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/7] mesa: Add core mesa support for GL_ARB_shader_draw_parameters

2015-12-15 Thread Anuj Phogat
On Tue, Dec 15, 2015 at 12:28 AM, Kristian Høgsberg Kristensen
 wrote:
> ---
>  src/glsl/builtin_variables.cpp  |  5 +
>  src/glsl/glsl_parser_extras.cpp |  1 +
>  src/glsl/glsl_parser_extras.h   |  2 ++
>  src/glsl/nir/nir.c  |  8 
>  src/glsl/nir/nir_intrinsics.h   |  2 ++
>  src/glsl/nir/shader_enums.h | 20 
>  src/glsl/standalone_scaffolding.cpp |  1 +
>  src/mesa/main/extensions_table.h|  1 +
>  src/mesa/main/mtypes.h  |  1 +
>  9 files changed, 41 insertions(+)
>
> diff --git a/src/glsl/builtin_variables.cpp b/src/glsl/builtin_variables.cpp
> index e8eab80..e82c99e 100644
> --- a/src/glsl/builtin_variables.cpp
> +++ b/src/glsl/builtin_variables.cpp
> @@ -951,6 +951,11 @@ builtin_variable_generator::generate_vs_special_vars()
>add_system_value(SYSTEM_VALUE_INSTANCE_ID, int_t, "gl_InstanceIDARB");
> if (state->ARB_draw_instanced_enable || state->is_version(140, 300))
>add_system_value(SYSTEM_VALUE_INSTANCE_ID, int_t, "gl_InstanceID");
> +   if (state->ARB_shader_draw_parameters_enable) {
> +  add_system_value(SYSTEM_VALUE_BASE_VERTEX, int_t, "gl_BaseVertexARB");
> +  add_system_value(SYSTEM_VALUE_BASE_INSTANCE, int_t, 
> "gl_BaseInstanceARB");
> +  add_system_value(SYSTEM_VALUE_DRAW_ID, int_t, "gl_DrawIDARB");
> +   }
> if (state->AMD_vertex_shader_layer_enable) {
>var = add_output(VARYING_SLOT_LAYER, int_t, "gl_Layer");
>var->data.interpolation = INTERP_QUALIFIER_FLAT;
> diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp
> index 29cf0c6..8c46f14 100644
> --- a/src/glsl/glsl_parser_extras.cpp
> +++ b/src/glsl/glsl_parser_extras.cpp
> @@ -608,6 +608,7 @@ static const _mesa_glsl_extension 
> _mesa_glsl_supported_extensions[] = {
> EXT(ARB_shader_atomic_counters,   true,  false, 
> ARB_shader_atomic_counters),
> EXT(ARB_shader_bit_encoding,  true,  false, 
> ARB_shader_bit_encoding),
> EXT(ARB_shader_clock, true,  false, ARB_shader_clock),
> +   EXT(ARB_shader_draw_parameters,   true,  false, 
> ARB_shader_draw_parameters),
> EXT(ARB_shader_image_load_store,  true,  false, 
> ARB_shader_image_load_store),
> EXT(ARB_shader_image_size,true,  false, 
> ARB_shader_image_size),
> EXT(ARB_shader_precision, true,  false, 
> ARB_shader_precision),
> diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h
> index a4bda77..afb99af 100644
> --- a/src/glsl/glsl_parser_extras.h
> +++ b/src/glsl/glsl_parser_extras.h
> @@ -536,6 +536,8 @@ struct _mesa_glsl_parse_state {
> bool ARB_shader_bit_encoding_warn;
> bool ARB_shader_clock_enable;
> bool ARB_shader_clock_warn;
> +   bool ARB_shader_draw_parameters_enable;
> +   bool ARB_shader_draw_parameters_warn;
> bool ARB_shader_image_load_store_enable;
> bool ARB_shader_image_load_store_warn;
> bool ARB_shader_image_size_enable;
> diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c
> index 35fc1de..4b70e7c 100644
> --- a/src/glsl/nir/nir.c
> +++ b/src/glsl/nir/nir.c
> @@ -1588,6 +1588,10 @@ nir_intrinsic_from_system_value(gl_system_value val)
>return nir_intrinsic_load_vertex_id;
> case SYSTEM_VALUE_INSTANCE_ID:
>return nir_intrinsic_load_instance_id;
> +   case SYSTEM_VALUE_DRAW_ID:
> +  return nir_intrinsic_load_draw_id;
> +   case SYSTEM_VALUE_BASE_INSTANCE:
> +  return nir_intrinsic_load_base_instance;
> case SYSTEM_VALUE_VERTEX_ID_ZERO_BASE:
>return nir_intrinsic_load_vertex_id_zero_base;
> case SYSTEM_VALUE_BASE_VERTEX:
> @@ -1633,6 +1637,10 @@ nir_system_value_from_intrinsic(nir_intrinsic_op 
> intrin)
>return SYSTEM_VALUE_VERTEX_ID;
> case nir_intrinsic_load_instance_id:
>return SYSTEM_VALUE_INSTANCE_ID;
> +   case nir_intrinsic_load_draw_id:
> +  return SYSTEM_VALUE_DRAW_ID;
> +   case nir_intrinsic_load_base_instance:
> +  return SYSTEM_VALUE_BASE_INSTANCE;
> case nir_intrinsic_load_vertex_id_zero_base:
>return SYSTEM_VALUE_VERTEX_ID_ZERO_BASE;
> case nir_intrinsic_load_base_vertex:
> diff --git a/src/glsl/nir/nir_intrinsics.h b/src/glsl/nir/nir_intrinsics.h
> index 9811fb3..917c805 100644
> --- a/src/glsl/nir/nir_intrinsics.h
> +++ b/src/glsl/nir/nir_intrinsics.h
> @@ -239,6 +239,8 @@ SYSTEM_VALUE(vertex_id, 1, 0)
>  SYSTEM_VALUE(vertex_id_zero_base, 1, 0)
>  SYSTEM_VALUE(base_vertex, 1, 0)
>  SYSTEM_VALUE(instance_id, 1, 0)
> +SYSTEM_VALUE(base_instance, 1, 0)
> +SYSTEM_VALUE(draw_id, 1, 0)
>  SYSTEM_VALUE(sample_id, 1, 0)
>  SYSTEM_VALUE(sample_pos, 2, 0)
>  SYSTEM_VALUE(sample_mask_in, 1, 0)
> diff --git a/src/glsl/nir/shader_enums.h b/src/glsl/nir/shader_enums.h
> index dd0e0ba..0be217c 100644
> --- a/src/glsl/nir/shader_enums.h
> +++ b/src/glsl/nir/shader_enums.h
> @@ -379,6 +379,26 @@ typedef enum
>  * \sa SYSTEM_VALUE_VERTEX_ID, 

Re: [Mesa-dev] [PATCH 1/7] mesa/vbo: Add draw_id field to struct _mesa_prim

2015-12-15 Thread Anuj Phogat
On Tue, Dec 15, 2015 at 12:28 AM, Kristian Høgsberg Kristensen
 wrote:
> The drivers will need this for passing in gl_DrawIDARB. For indirect
> multidraw calls, we get the prim array and prim[i].draw_id == i and is
> redundant. But for non-indirect calls, we get one primitive at a time
> and need the draw_id field.
> ---
>  src/mesa/vbo/vbo.h| 1 +
>  src/mesa/vbo/vbo_exec_array.c | 5 +
>  2 files changed, 6 insertions(+)
>
> diff --git a/src/mesa/vbo/vbo.h b/src/mesa/vbo/vbo.h
> index 00e843c..cef3b8c 100644
> --- a/src/mesa/vbo/vbo.h
> +++ b/src/mesa/vbo/vbo.h
> @@ -58,6 +58,7 @@ struct _mesa_prim {
> GLint basevertex;
> GLuint num_instances;
> GLuint base_instance;
> +   GLuint draw_id;
>
> GLsizeiptr indirect_offset;
>  };
> diff --git a/src/mesa/vbo/vbo_exec_array.c b/src/mesa/vbo/vbo_exec_array.c
> index e27fdd9..7ff78dc 100644
> --- a/src/mesa/vbo/vbo_exec_array.c
> +++ b/src/mesa/vbo/vbo_exec_array.c
> @@ -1,3 +1,4 @@
> +
>  /**
>   *
>   * Copyright 2003 VMware, Inc.
> @@ -1341,6 +1342,7 @@ vbo_validated_multidrawelements(struct gl_context *ctx, 
> GLenum mode,
>  prim[i].indexed = 1;
>   prim[i].num_instances = 1;
>   prim[i].base_instance = 0;
> + prim[i].draw_id = i;
>   prim[i].is_indirect = 0;
>  if (basevertex != NULL)
> prim[i].basevertex = basevertex[i];
> @@ -1371,6 +1373,7 @@ vbo_validated_multidrawelements(struct gl_context *ctx, 
> GLenum mode,
>  prim[0].indexed = 1;
>   prim[0].num_instances = 1;
>   prim[0].base_instance = 0;
> + prim[0].draw_id = i;
>   prim[0].is_indirect = 0;
>  if (basevertex != NULL)
> prim[0].basevertex = basevertex[i];
> @@ -1598,6 +1601,7 @@ vbo_validated_multidrawarraysindirect(struct gl_context 
> *ctx,
>prim[i].mode = mode;
>prim[i].indirect_offset = offset;
>prim[i].is_indirect = 1;
> +  prim[i].draw_id = i;
> }
>
> check_buffers_are_unmapped(exec->array.inputs);
> @@ -1684,6 +1688,7 @@ vbo_validated_multidrawelementsindirect(struct 
> gl_context *ctx,
>prim[i].indexed = 1;
>prim[i].indirect_offset = offset;
>prim[i].is_indirect = 1;
> +  prim[i].draw_id = i;
> }
>
> check_buffers_are_unmapped(exec->array.inputs);
> --
> 2.5.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reviewed-by: Anuj Phogat 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/7] i965: Add support for gl_BaseVertexARB and gl_BaseInstanceARB

2015-12-15 Thread Anuj Phogat
On Tue, Dec 15, 2015 at 12:28 AM, Kristian Høgsberg Kristensen
 wrote:
> We already have gl_BaseVertexARB in the .x component of the SGVS vec4
> and plug gl_BaseInstanceARB into the last free component (.y).
> ---
>  src/mesa/drivers/dri/i965/brw_compiler.h  |  2 ++
>  src/mesa/drivers/dri/i965/brw_context.h   |  9 --
>  src/mesa/drivers/dri/i965/brw_draw.c  | 12 ++--
>  src/mesa/drivers/dri/i965/brw_draw_upload.c   | 35 
> ++-
>  src/mesa/drivers/dri/i965/brw_fs.cpp  |  3 +-
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp  | 10 ++-
>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp  |  6 +++-
>  src/mesa/drivers/dri/i965/brw_vec4.cpp| 12 ++--
>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp| 10 ++-
>  src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp |  6 +++-
>  src/mesa/drivers/dri/i965/gen8_draw_upload.c  | 35 
> ++-
>  11 files changed, 102 insertions(+), 38 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_compiler.h 
> b/src/mesa/drivers/dri/i965/brw_compiler.h
> index 218d9c7..58ee966 100644
> --- a/src/mesa/drivers/dri/i965/brw_compiler.h
> +++ b/src/mesa/drivers/dri/i965/brw_compiler.h
> @@ -547,6 +547,8 @@ struct brw_vs_prog_data {
>
> bool uses_vertexid;
> bool uses_instanceid;
> +   bool uses_basevertex;
> +   bool uses_baseinstance;
Missed bool uses_drawid ?
>  };
>
>  struct brw_tcs_prog_data
> diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
> b/src/mesa/drivers/dri/i965/brw_context.h
> index a845541..1378402 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.h
> +++ b/src/mesa/drivers/dri/i965/brw_context.h
> @@ -905,8 +905,13 @@ struct brw_context
> uint32_t pma_stall_bits;
>
> struct {
> -  /** The value of gl_BaseVertex for the current _mesa_prim. */
> -  int gl_basevertex;
> +  struct {
> + /** The value of gl_BaseVertex for the current _mesa_prim. */
> + int gl_basevertex;
> +
> + /** The value of gl_BaseInstance for the current _mesa_prim. */
> + int gl_baseinstance;
> +  } params;
Missed gl_drawid and gl_drawid_bo ?
>
>/**
> * Buffer and offset used for GL_ARB_shader_draw_parameters
> diff --git a/src/mesa/drivers/dri/i965/brw_draw.c 
> b/src/mesa/drivers/dri/i965/brw_draw.c
> index 8398471..298ac06 100644
> --- a/src/mesa/drivers/dri/i965/brw_draw.c
> +++ b/src/mesa/drivers/dri/i965/brw_draw.c
> @@ -491,9 +491,9 @@ brw_try_draw_prims(struct gl_context *ctx,
>   }
>}
>
> -  brw->draw.gl_basevertex =
> +  brw->draw.params.gl_basevertex =
>   prims[i].indexed ? prims[i].basevertex : prims[i].start;
> -
> +  brw->draw.params.gl_baseinstance = prims[i].base_instance;
>drm_intel_bo_unreference(brw->draw.draw_params_bo);
>
>if (prims[i].is_indirect) {
> @@ -511,6 +511,14 @@ brw_try_draw_prims(struct gl_context *ctx,
>   brw->draw.draw_params_offset = 0;
>}
>
> +  /* gl_DrawID always needs its own vertex buffer since it's not part of
> +   * the indirect parameter buffer. */
> +  if (brw->vs.prog_data->uses_drawid) {
> + brw->draw.gl_drawid = prims[i].drawid;
brw->draw.gl_drawid = prims[i].draw_id;
> + drm_intel_bo_unreference(brw->draw.draw_id_bo);
> + brw->ctx.NewDriverState |= BRW_NEW_VERTICES;
> +  }
> +
>if (brw->gen < 6)
>  brw_set_prim(brw, [i]);
>else
> diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c 
> b/src/mesa/drivers/dri/i965/brw_draw_upload.c
> index ea0f6f2..ccf963c 100644
> --- a/src/mesa/drivers/dri/i965/brw_draw_upload.c
> +++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c
> @@ -592,8 +592,10 @@ void
>  brw_prepare_shader_draw_parameters(struct brw_context *brw)
>  {
> /* For non-indirect draws, upload gl_BaseVertex. */
> -   if (brw->vs.prog_data->uses_vertexid && brw->draw.draw_params_bo == NULL) 
> {
> -  intel_upload_data(brw, >draw.gl_basevertex, 4, 4,
> +   if ((brw->vs.prog_data->uses_basevertex ||
> +brw->vs.prog_data->uses_baseinstance) &&
> +   brw->draw.draw_params_bo == NULL) {
> +  intel_upload_data(brw, >draw.params, sizeof(brw->draw.params), 4,
> >draw.draw_params_bo,
>  >draw.draw_params_offset);
> }
> @@ -658,7 +660,8 @@ brw_emit_vertices(struct brw_context *brw)
> brw_emit_query_begin(brw);
>
> unsigned nr_elements = brw->vb.nr_enabled;
> -   if (brw->vs.prog_data->uses_vertexid || 
> brw->vs.prog_data->uses_instanceid)
> +   if (brw->vs.prog_data->uses_vertexid || 
> brw->vs.prog_data->uses_instanceid ||
> +   brw->vs.prog_data->uses_basevertex || 
> brw->vs.prog_data->uses_baseinstance)
>++nr_elements;
>
> /* If the VS doesn't read any inputs (calculating vertex position from
> @@ -693,8 +696,10 @@ brw_emit_vertices(struct brw_context *brw)
> /* Now emit VB and VEP 

Re: [Mesa-dev] [PATCH] i965/gen8/cs: fix constant push buffer

2015-12-15 Thread Jordan Justen
Ah! I had just also discovered this issue yesterday in some related
work but I didn't get the chance to try the CTS yet! :)

For the subject I had: "Gen 8 requires 64 byte alignment for push
constant data"

On 2015-12-15 03:55:15, Iago Toral Quiroga wrote:
> Page 502 of the Command Reference Broadwell PRM says that CURBE Total
> Data Length must be 64-bit aligned.

I think both the base and the size alignments are bumped from 32 to
64. Could you add the base address? How about giving the
volume/chapter/section in the spec reference rather than the page
number?

Also, could you update the call to brw_state_batch to also use 64 byte
alignment for the base on gen8+?

-Jordan

> 
> Fixes the following CTS tests:
> ES31-CTS.shader_storage_buffer_object.basic-atomic-case1-cs
> ES31-CTS.shader_storage_buffer_object.basic-operations-case1-cs
> ES31-CTS.shader_storage_buffer_object.basic-operations-case2-cs
> ES31-CTS.shader_storage_buffer_object.basic-stdLayout_UBO_SSBO-case2-cs
> ES31-CTS.shader_storage_buffer_object.advanced-write-fragment-cs
> ES31-CTS.shader_storage_buffer_object.advanced-indirectAddressing-case2-cs
> ES31-CTS.shader_storage_buffer_object.advanced-matrix-cs
> ---
>  src/mesa/drivers/dri/i965/gen7_cs_state.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/gen7_cs_state.c 
> b/src/mesa/drivers/dri/i965/gen7_cs_state.c
> index 1fde69c..dbd1967 100644
> --- a/src/mesa/drivers/dri/i965/gen7_cs_state.c
> +++ b/src/mesa/drivers/dri/i965/gen7_cs_state.c
> @@ -77,7 +77,8 @@ brw_upload_cs_state(struct brw_context *brw)
>  
> unsigned push_constant_data_size =
>(prog_data->nr_params + local_id_dwords) * sizeof(gl_constant_value);
> -   unsigned reg_aligned_constant_size = ALIGN(push_constant_data_size, 32);
> +   unsigned reg_aligned_constant_size =
> +  ALIGN(push_constant_data_size, brw->gen < 8 ? 32 : 64);
> unsigned push_constant_regs = reg_aligned_constant_size / 32;
> unsigned threads = get_cs_thread_count(cs_prog_data);
>  
> @@ -241,7 +242,8 @@ brw_upload_cs_push_constants(struct brw_context *brw,
>  
>const unsigned push_constant_data_size =
>   (local_id_dwords + prog_data->nr_params) * 
> sizeof(gl_constant_value);
> -  const unsigned reg_aligned_constant_size = 
> ALIGN(push_constant_data_size, 32);
> +  const unsigned reg_aligned_constant_size =
> + ALIGN(push_constant_data_size, brw->gen < 8 ? 32 : 64);
>const unsigned param_aligned_count =
>   reg_aligned_constant_size / sizeof(*param);
>  
> -- 
> 1.9.1
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/7] mesa: Add core mesa support for GL_ARB_shader_draw_parameters

2015-12-15 Thread Kristian Høgsberg Kristensen
---
 src/glsl/builtin_variables.cpp  |  5 +
 src/glsl/glsl_parser_extras.cpp |  1 +
 src/glsl/glsl_parser_extras.h   |  2 ++
 src/glsl/nir/nir.c  |  8 
 src/glsl/nir/nir_intrinsics.h   |  2 ++
 src/glsl/nir/shader_enums.h | 20 
 src/glsl/standalone_scaffolding.cpp |  1 +
 src/mesa/main/extensions_table.h|  1 +
 src/mesa/main/mtypes.h  |  1 +
 9 files changed, 41 insertions(+)

diff --git a/src/glsl/builtin_variables.cpp b/src/glsl/builtin_variables.cpp
index e8eab80..e82c99e 100644
--- a/src/glsl/builtin_variables.cpp
+++ b/src/glsl/builtin_variables.cpp
@@ -951,6 +951,11 @@ builtin_variable_generator::generate_vs_special_vars()
   add_system_value(SYSTEM_VALUE_INSTANCE_ID, int_t, "gl_InstanceIDARB");
if (state->ARB_draw_instanced_enable || state->is_version(140, 300))
   add_system_value(SYSTEM_VALUE_INSTANCE_ID, int_t, "gl_InstanceID");
+   if (state->ARB_shader_draw_parameters_enable) {
+  add_system_value(SYSTEM_VALUE_BASE_VERTEX, int_t, "gl_BaseVertexARB");
+  add_system_value(SYSTEM_VALUE_BASE_INSTANCE, int_t, 
"gl_BaseInstanceARB");
+  add_system_value(SYSTEM_VALUE_DRAW_ID, int_t, "gl_DrawIDARB");
+   }
if (state->AMD_vertex_shader_layer_enable) {
   var = add_output(VARYING_SLOT_LAYER, int_t, "gl_Layer");
   var->data.interpolation = INTERP_QUALIFIER_FLAT;
diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp
index 29cf0c6..8c46f14 100644
--- a/src/glsl/glsl_parser_extras.cpp
+++ b/src/glsl/glsl_parser_extras.cpp
@@ -608,6 +608,7 @@ static const _mesa_glsl_extension 
_mesa_glsl_supported_extensions[] = {
EXT(ARB_shader_atomic_counters,   true,  false, 
ARB_shader_atomic_counters),
EXT(ARB_shader_bit_encoding,  true,  false, 
ARB_shader_bit_encoding),
EXT(ARB_shader_clock, true,  false, ARB_shader_clock),
+   EXT(ARB_shader_draw_parameters,   true,  false, 
ARB_shader_draw_parameters),
EXT(ARB_shader_image_load_store,  true,  false, 
ARB_shader_image_load_store),
EXT(ARB_shader_image_size,true,  false, 
ARB_shader_image_size),
EXT(ARB_shader_precision, true,  false, 
ARB_shader_precision),
diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h
index a4bda77..afb99af 100644
--- a/src/glsl/glsl_parser_extras.h
+++ b/src/glsl/glsl_parser_extras.h
@@ -536,6 +536,8 @@ struct _mesa_glsl_parse_state {
bool ARB_shader_bit_encoding_warn;
bool ARB_shader_clock_enable;
bool ARB_shader_clock_warn;
+   bool ARB_shader_draw_parameters_enable;
+   bool ARB_shader_draw_parameters_warn;
bool ARB_shader_image_load_store_enable;
bool ARB_shader_image_load_store_warn;
bool ARB_shader_image_size_enable;
diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c
index 35fc1de..4b70e7c 100644
--- a/src/glsl/nir/nir.c
+++ b/src/glsl/nir/nir.c
@@ -1588,6 +1588,10 @@ nir_intrinsic_from_system_value(gl_system_value val)
   return nir_intrinsic_load_vertex_id;
case SYSTEM_VALUE_INSTANCE_ID:
   return nir_intrinsic_load_instance_id;
+   case SYSTEM_VALUE_DRAW_ID:
+  return nir_intrinsic_load_draw_id;
+   case SYSTEM_VALUE_BASE_INSTANCE:
+  return nir_intrinsic_load_base_instance;
case SYSTEM_VALUE_VERTEX_ID_ZERO_BASE:
   return nir_intrinsic_load_vertex_id_zero_base;
case SYSTEM_VALUE_BASE_VERTEX:
@@ -1633,6 +1637,10 @@ nir_system_value_from_intrinsic(nir_intrinsic_op intrin)
   return SYSTEM_VALUE_VERTEX_ID;
case nir_intrinsic_load_instance_id:
   return SYSTEM_VALUE_INSTANCE_ID;
+   case nir_intrinsic_load_draw_id:
+  return SYSTEM_VALUE_DRAW_ID;
+   case nir_intrinsic_load_base_instance:
+  return SYSTEM_VALUE_BASE_INSTANCE;
case nir_intrinsic_load_vertex_id_zero_base:
   return SYSTEM_VALUE_VERTEX_ID_ZERO_BASE;
case nir_intrinsic_load_base_vertex:
diff --git a/src/glsl/nir/nir_intrinsics.h b/src/glsl/nir/nir_intrinsics.h
index 9811fb3..917c805 100644
--- a/src/glsl/nir/nir_intrinsics.h
+++ b/src/glsl/nir/nir_intrinsics.h
@@ -239,6 +239,8 @@ SYSTEM_VALUE(vertex_id, 1, 0)
 SYSTEM_VALUE(vertex_id_zero_base, 1, 0)
 SYSTEM_VALUE(base_vertex, 1, 0)
 SYSTEM_VALUE(instance_id, 1, 0)
+SYSTEM_VALUE(base_instance, 1, 0)
+SYSTEM_VALUE(draw_id, 1, 0)
 SYSTEM_VALUE(sample_id, 1, 0)
 SYSTEM_VALUE(sample_pos, 2, 0)
 SYSTEM_VALUE(sample_mask_in, 1, 0)
diff --git a/src/glsl/nir/shader_enums.h b/src/glsl/nir/shader_enums.h
index dd0e0ba..0be217c 100644
--- a/src/glsl/nir/shader_enums.h
+++ b/src/glsl/nir/shader_enums.h
@@ -379,6 +379,26 @@ typedef enum
 * \sa SYSTEM_VALUE_VERTEX_ID, SYSTEM_VALUE_VERTEX_ID_ZERO_BASE
 */
SYSTEM_VALUE_BASE_VERTEX,
+
+   /**
+* Value of \c baseinstance passed to instanced draw entry points
+*
+* \sa SYSTEM_VALUE_INSTANCE_ID
+*/
+   SYSTEM_VALUE_BASE_INSTANCE,
+
+   /**
+* From _ARB_shader_draw_parameters:
+*
+*   

[Mesa-dev] [PATCH 5/7] i965: Add support for gl_DrawIDARB and enable extension

2015-12-15 Thread Kristian Høgsberg Kristensen
We have to break open a new vec4 for gl_DrawIDARB. We've used up all
space in the vec4 we use for SGVS and gl_DrawIDARB has to come from its
own separate vertex buffer anyway.  This is because we point the vb for
base vertex and base instance into the draw parameter BO for indirect
draw calls, but the draw id is generated by mesa in a different buffer.
---
 src/mesa/drivers/dri/i965/brw_compiler.h  |  1 +
 src/mesa/drivers/dri/i965/brw_context.h   |  9 +
 src/mesa/drivers/dri/i965/brw_draw.c  |  8 ++--
 src/mesa/drivers/dri/i965/brw_draw_upload.c   | 45 ++-
 src/mesa/drivers/dri/i965/brw_fs.cpp  |  2 +
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp  | 10 -
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp  | 10 +
 src/mesa/drivers/dri/i965/brw_vec4.cpp|  8 +++-
 src/mesa/drivers/dri/i965/brw_vec4_nir.cpp| 10 -
 src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp |  5 +++
 src/mesa/drivers/dri/i965/gen8_draw_upload.c  | 34 -
 src/mesa/drivers/dri/i965/intel_extensions.c  |  1 +
 12 files changed, 132 insertions(+), 11 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_compiler.h 
b/src/mesa/drivers/dri/i965/brw_compiler.h
index 58ee966..2333f4a 100644
--- a/src/mesa/drivers/dri/i965/brw_compiler.h
+++ b/src/mesa/drivers/dri/i965/brw_compiler.h
@@ -549,6 +549,7 @@ struct brw_vs_prog_data {
bool uses_instanceid;
bool uses_basevertex;
bool uses_baseinstance;
+   bool uses_drawid;
 };
 
 struct brw_tcs_prog_data
diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 1378402..97ebf06 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -919,6 +919,15 @@ struct brw_context
*/
   drm_intel_bo *draw_params_bo;
   uint32_t draw_params_offset;
+
+  /**
+   * The value of gl_DrawID for the current _mesa_prim. This always comes
+   * in from it's own vertex buffer since it's not part of the indirect
+   * draw parameters.
+   */
+  int gl_drawid;
+  drm_intel_bo *draw_id_bo;
+  uint32_t draw_id_offset;
} draw;
 
struct {
diff --git a/src/mesa/drivers/dri/i965/brw_draw.c 
b/src/mesa/drivers/dri/i965/brw_draw.c
index 298ac06..b0710c67 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -513,11 +513,9 @@ brw_try_draw_prims(struct gl_context *ctx,
 
   /* gl_DrawID always needs its own vertex buffer since it's not part of
* the indirect parameter buffer. */
-  if (brw->vs.prog_data->uses_drawid) {
- brw->draw.gl_drawid = prims[i].drawid;
- drm_intel_bo_unreference(brw->draw.draw_id_bo);
- brw->ctx.NewDriverState |= BRW_NEW_VERTICES;
-  }
+  brw->draw.gl_drawid = prims[i].draw_id;
+  drm_intel_bo_unreference(brw->draw.draw_id_bo);
+  brw->ctx.NewDriverState |= BRW_NEW_VERTICES;
 
   if (brw->gen < 6)
 brw_set_prim(brw, [i]);
diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c 
b/src/mesa/drivers/dri/i965/brw_draw_upload.c
index ccf963c..e601190 100644
--- a/src/mesa/drivers/dri/i965/brw_draw_upload.c
+++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c
@@ -599,6 +599,12 @@ brw_prepare_shader_draw_parameters(struct brw_context *brw)
>draw.draw_params_bo,
 >draw.draw_params_offset);
}
+
+   if (brw->vs.prog_data->uses_drawid) {
+  intel_upload_data(brw, >draw.gl_drawid, 
sizeof(brw->draw.gl_drawid), 4,
+   >draw.draw_id_bo,
+>draw.draw_id_offset);
+   }
 }
 
 /**
@@ -663,6 +669,8 @@ brw_emit_vertices(struct brw_context *brw)
if (brw->vs.prog_data->uses_vertexid || brw->vs.prog_data->uses_instanceid 
||
brw->vs.prog_data->uses_basevertex || 
brw->vs.prog_data->uses_baseinstance)
   ++nr_elements;
+   if (brw->vs.prog_data->uses_drawid)
+  nr_elements++;
 
/* If the VS doesn't read any inputs (calculating vertex position from
 * a state variable for some reason, for example), emit a single pad
@@ -699,7 +707,8 @@ brw_emit_vertices(struct brw_context *brw)
const bool uses_draw_params =
   brw->vs.prog_data->uses_basevertex ||
   brw->vs.prog_data->uses_baseinstance;
-   const unsigned nr_buffers = brw->vb.nr_buffers + uses_draw_params;
+   const unsigned nr_buffers = brw->vb.nr_buffers +
+  uses_draw_params + brw->vs.prog_data->uses_drawid;
 
if (nr_buffers) {
   if (brw->gen >= 6) {
@@ -726,6 +735,16 @@ brw_emit_vertices(struct brw_context *brw)
   0,  /* stride */
   0); /* step rate */
   }
+
+  if (brw->vs.prog_data->uses_drawid) {
+ EMIT_VERTEX_BUFFER_STATE(brw, brw->vb.nr_buffers + 1,
+  brw->draw.draw_id_bo,
+  

[Mesa-dev] [PATCH 7/7] i965: Reduce vertex state reemission

2015-12-15 Thread Kristian Høgsberg Kristensen
We can inspect VS prog_data for iterations i > 0, and only flag
BRW_NEW_VERTICES when one of our system values change.

This change also flags BRW_NEW_VERTICES in one case we were missing
before: if we're doing an indirect draw, prims[i].basevertex is always 0
and the real base vertex value is in the indirect parameter
buffer. Thus, if a program uses base vertex or base instance, and the
draw call is indirect, flag BRW_NEW_VERTICES.  A new piglit test,
spec/ARB_shader_draw_parameters/drawid-indirect-vertexid tests this.
---
 src/mesa/drivers/dri/i965/brw_draw.c | 44 
 1 file changed, 40 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_draw.c 
b/src/mesa/drivers/dri/i965/brw_draw.c
index b0710c67..9e400ca 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -491,9 +491,44 @@ brw_try_draw_prims(struct gl_context *ctx,
  }
   }
 
-  brw->draw.params.gl_basevertex =
+  /* Determine if we need to flag BRW_NEW_VERTICES for updating the
+   * gl_BaseVertexARB, gl_BaseInstanceARB or gl_DrawIDARB values. As
+   * above, we don't need to check first iteration, since the flag is set
+   * before the loop. We also can't rely on vs prog_data in the first
+   * iteration, but after drawing once, we've uploaded the programs and
+   * can look at prog_data.
+   *
+   * Despite the prims[] name, eache iteration correspond to a draw call
+   * from a glMulti* style draw call. We need to re-upload vertex state if
+   *
+   *  1) the program uses gl_DrawIDARB (changes every iteration),
+   *
+   *  2) the program uses gl_BaseVertexARB or gl_BaseInstanceARB and the
+   * draw call is indirect (meaning we can't check if the value change
+   * or not), or
+   *
+   *  3) the program uses gl_BaseVertexARB or gl_BaseInstanceARB and the
+   *  value changed
+   */
+  const int new_basevertex =
  prims[i].indexed ? prims[i].basevertex : prims[i].start;
-  brw->draw.params.gl_baseinstance = prims[i].base_instance;
+  const int new_baseinstance = prims[i].base_instance;
+  if (i > 0) {
+ const bool uses_draw_parameters =
+brw->vs.prog_data->uses_basevertex ||
+brw->vs.prog_data->uses_baseinstance;
+
+ if (brw->vs.prog_data->uses_drawid ||
+ (uses_draw_parameters && prims[i].is_indirect) ||
+ (brw->vs.prog_data->uses_basevertex &&
+  brw->draw.params.gl_basevertex != new_basevertex) ||
+ (brw->vs.prog_data->uses_baseinstance &&
+  brw->draw.params.gl_baseinstance != new_baseinstance))
+brw->ctx.NewDriverState |= BRW_NEW_VERTICES;
+  }
+
+  brw->draw.params.gl_basevertex = new_basevertex;
+  brw->draw.params.gl_baseinstance = new_baseinstance;
   drm_intel_bo_unreference(brw->draw.draw_params_bo);
 
   if (prims[i].is_indirect) {
@@ -512,10 +547,11 @@ brw_try_draw_prims(struct gl_context *ctx,
   }
 
   /* gl_DrawID always needs its own vertex buffer since it's not part of
-   * the indirect parameter buffer. */
+   * the indirect parameter buffer.
+   */
   brw->draw.gl_drawid = prims[i].draw_id;
   drm_intel_bo_unreference(brw->draw.draw_id_bo);
-  brw->ctx.NewDriverState |= BRW_NEW_VERTICES;
+  brw->draw.draw_id_bo = NULL;
 
   if (brw->gen < 6)
 brw_set_prim(brw, [i]);
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 4/7] i965: Add support for gl_BaseVertexARB and gl_BaseInstanceARB

2015-12-15 Thread Kristian Høgsberg Kristensen
We already have gl_BaseVertexARB in the .x component of the SGVS vec4
and plug gl_BaseInstanceARB into the last free component (.y).
---
 src/mesa/drivers/dri/i965/brw_compiler.h  |  2 ++
 src/mesa/drivers/dri/i965/brw_context.h   |  9 --
 src/mesa/drivers/dri/i965/brw_draw.c  | 12 ++--
 src/mesa/drivers/dri/i965/brw_draw_upload.c   | 35 ++-
 src/mesa/drivers/dri/i965/brw_fs.cpp  |  3 +-
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp  | 10 ++-
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp  |  6 +++-
 src/mesa/drivers/dri/i965/brw_vec4.cpp| 12 ++--
 src/mesa/drivers/dri/i965/brw_vec4_nir.cpp| 10 ++-
 src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp |  6 +++-
 src/mesa/drivers/dri/i965/gen8_draw_upload.c  | 35 ++-
 11 files changed, 102 insertions(+), 38 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_compiler.h 
b/src/mesa/drivers/dri/i965/brw_compiler.h
index 218d9c7..58ee966 100644
--- a/src/mesa/drivers/dri/i965/brw_compiler.h
+++ b/src/mesa/drivers/dri/i965/brw_compiler.h
@@ -547,6 +547,8 @@ struct brw_vs_prog_data {
 
bool uses_vertexid;
bool uses_instanceid;
+   bool uses_basevertex;
+   bool uses_baseinstance;
 };
 
 struct brw_tcs_prog_data
diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index a845541..1378402 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -905,8 +905,13 @@ struct brw_context
uint32_t pma_stall_bits;
 
struct {
-  /** The value of gl_BaseVertex for the current _mesa_prim. */
-  int gl_basevertex;
+  struct {
+ /** The value of gl_BaseVertex for the current _mesa_prim. */
+ int gl_basevertex;
+
+ /** The value of gl_BaseInstance for the current _mesa_prim. */
+ int gl_baseinstance;
+  } params;
 
   /**
* Buffer and offset used for GL_ARB_shader_draw_parameters
diff --git a/src/mesa/drivers/dri/i965/brw_draw.c 
b/src/mesa/drivers/dri/i965/brw_draw.c
index 8398471..298ac06 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -491,9 +491,9 @@ brw_try_draw_prims(struct gl_context *ctx,
  }
   }
 
-  brw->draw.gl_basevertex =
+  brw->draw.params.gl_basevertex =
  prims[i].indexed ? prims[i].basevertex : prims[i].start;
-
+  brw->draw.params.gl_baseinstance = prims[i].base_instance;
   drm_intel_bo_unreference(brw->draw.draw_params_bo);
 
   if (prims[i].is_indirect) {
@@ -511,6 +511,14 @@ brw_try_draw_prims(struct gl_context *ctx,
  brw->draw.draw_params_offset = 0;
   }
 
+  /* gl_DrawID always needs its own vertex buffer since it's not part of
+   * the indirect parameter buffer. */
+  if (brw->vs.prog_data->uses_drawid) {
+ brw->draw.gl_drawid = prims[i].drawid;
+ drm_intel_bo_unreference(brw->draw.draw_id_bo);
+ brw->ctx.NewDriverState |= BRW_NEW_VERTICES;
+  }
+
   if (brw->gen < 6)
 brw_set_prim(brw, [i]);
   else
diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c 
b/src/mesa/drivers/dri/i965/brw_draw_upload.c
index ea0f6f2..ccf963c 100644
--- a/src/mesa/drivers/dri/i965/brw_draw_upload.c
+++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c
@@ -592,8 +592,10 @@ void
 brw_prepare_shader_draw_parameters(struct brw_context *brw)
 {
/* For non-indirect draws, upload gl_BaseVertex. */
-   if (brw->vs.prog_data->uses_vertexid && brw->draw.draw_params_bo == NULL) {
-  intel_upload_data(brw, >draw.gl_basevertex, 4, 4,
+   if ((brw->vs.prog_data->uses_basevertex ||
+brw->vs.prog_data->uses_baseinstance) &&
+   brw->draw.draw_params_bo == NULL) {
+  intel_upload_data(brw, >draw.params, sizeof(brw->draw.params), 4,
>draw.draw_params_bo,
 >draw.draw_params_offset);
}
@@ -658,7 +660,8 @@ brw_emit_vertices(struct brw_context *brw)
brw_emit_query_begin(brw);
 
unsigned nr_elements = brw->vb.nr_enabled;
-   if (brw->vs.prog_data->uses_vertexid || brw->vs.prog_data->uses_instanceid)
+   if (brw->vs.prog_data->uses_vertexid || brw->vs.prog_data->uses_instanceid 
||
+   brw->vs.prog_data->uses_basevertex || 
brw->vs.prog_data->uses_baseinstance)
   ++nr_elements;
 
/* If the VS doesn't read any inputs (calculating vertex position from
@@ -693,8 +696,10 @@ brw_emit_vertices(struct brw_context *brw)
/* Now emit VB and VEP state packets.
 */
 
-   unsigned nr_buffers =
-  brw->vb.nr_buffers + brw->vs.prog_data->uses_vertexid;
+   const bool uses_draw_params =
+  brw->vs.prog_data->uses_basevertex ||
+  brw->vs.prog_data->uses_baseinstance;
+   const unsigned nr_buffers = brw->vb.nr_buffers + uses_draw_params;
 
if (nr_buffers) {
   if (brw->gen >= 6) {
@@ -713,7 +718,7 @@ brw_emit_vertices(struct brw_context 

[Mesa-dev] [PATCH 3/7] i965: Assert that SYSTEM_VALUE_VERTEX_ID gets lowered

2015-12-15 Thread Kristian Høgsberg Kristensen
fs_visitor::emit_vs_system_value() looks like it's trying to handle
SYSTEM_VALUE_VERTEX_ID, but we should never see that value in the
backend.
---
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 68f2548..d5193a9 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -46,6 +46,7 @@ fs_visitor::emit_vs_system_value(int location)
   vs_prog_data->uses_vertexid = true;
   break;
case SYSTEM_VALUE_VERTEX_ID:
+  unreachable("should have been lowered");
case SYSTEM_VALUE_VERTEX_ID_ZERO_BASE:
   reg->reg_offset = 2;
   vs_prog_data->uses_vertexid = true;
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 0/7] GL_ARB_shader_draw_parameters

2015-12-15 Thread Kristian Høgsberg Kristensen
Hi,

Here's 7 patches to implement GL_ARB_shader_draw_parameters:

  https://www.opengl.org/registry/specs/ARB/shader_draw_parameters.txt

and I have few new piglit tests for the extension as well.

Kristian

Kristian Høgsberg Kristensen (7):
  mesa/vbo: Add draw_id field to struct _mesa_prim
  mesa: Add core mesa support for GL_ARB_shader_draw_parameters
  i965: Assert that SYSTEM_VALUE_VERTEX_ID gets lowered
  i965: Add support for gl_BaseVertexARB and gl_BaseInstanceARB
  i965: Add support for gl_DrawIDARB and enable extension
  nir: Teach nir_opt_algebraic about adding and subtracting the same
thing
  i965: Reduce vertex state reemission

 src/glsl/builtin_variables.cpp|  5 ++
 src/glsl/glsl_parser_extras.cpp   |  1 +
 src/glsl/glsl_parser_extras.h |  2 +
 src/glsl/nir/nir.c|  8 +++
 src/glsl/nir/nir_intrinsics.h |  2 +
 src/glsl/nir/nir_opt_algebraic.py |  4 ++
 src/glsl/nir/shader_enums.h   | 20 ++
 src/glsl/standalone_scaffolding.cpp   |  1 +
 src/mesa/drivers/dri/i965/brw_compiler.h  |  3 +
 src/mesa/drivers/dri/i965/brw_context.h   | 18 +-
 src/mesa/drivers/dri/i965/brw_draw.c  | 44 -
 src/mesa/drivers/dri/i965/brw_draw_upload.c   | 78 +++
 src/mesa/drivers/dri/i965/brw_fs.cpp  |  5 +-
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp  | 18 +-
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp  | 17 -
 src/mesa/drivers/dri/i965/brw_vec4.cpp| 20 +-
 src/mesa/drivers/dri/i965/brw_vec4_nir.cpp| 18 +-
 src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp | 11 +++-
 src/mesa/drivers/dri/i965/gen8_draw_upload.c  | 65 +++
 src/mesa/drivers/dri/i965/intel_extensions.c  |  1 +
 src/mesa/main/extensions_table.h  |  1 +
 src/mesa/main/mtypes.h|  1 +
 src/mesa/vbo/vbo.h|  1 +
 src/mesa/vbo/vbo_exec_array.c |  5 ++
 24 files changed, 311 insertions(+), 38 deletions(-)

-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/7] mesa/vbo: Add draw_id field to struct _mesa_prim

2015-12-15 Thread Kristian Høgsberg Kristensen
The drivers will need this for passing in gl_DrawIDARB. For indirect
multidraw calls, we get the prim array and prim[i].draw_id == i and is
redundant. But for non-indirect calls, we get one primitive at a time
and need the draw_id field.
---
 src/mesa/vbo/vbo.h| 1 +
 src/mesa/vbo/vbo_exec_array.c | 5 +
 2 files changed, 6 insertions(+)

diff --git a/src/mesa/vbo/vbo.h b/src/mesa/vbo/vbo.h
index 00e843c..cef3b8c 100644
--- a/src/mesa/vbo/vbo.h
+++ b/src/mesa/vbo/vbo.h
@@ -58,6 +58,7 @@ struct _mesa_prim {
GLint basevertex;
GLuint num_instances;
GLuint base_instance;
+   GLuint draw_id;
 
GLsizeiptr indirect_offset;
 };
diff --git a/src/mesa/vbo/vbo_exec_array.c b/src/mesa/vbo/vbo_exec_array.c
index e27fdd9..7ff78dc 100644
--- a/src/mesa/vbo/vbo_exec_array.c
+++ b/src/mesa/vbo/vbo_exec_array.c
@@ -1,3 +1,4 @@
+
 /**
  * 
  * Copyright 2003 VMware, Inc.
@@ -1341,6 +1342,7 @@ vbo_validated_multidrawelements(struct gl_context *ctx, 
GLenum mode,
 prim[i].indexed = 1;
  prim[i].num_instances = 1;
  prim[i].base_instance = 0;
+ prim[i].draw_id = i;
  prim[i].is_indirect = 0;
 if (basevertex != NULL)
prim[i].basevertex = basevertex[i];
@@ -1371,6 +1373,7 @@ vbo_validated_multidrawelements(struct gl_context *ctx, 
GLenum mode,
 prim[0].indexed = 1;
  prim[0].num_instances = 1;
  prim[0].base_instance = 0;
+ prim[0].draw_id = i;
  prim[0].is_indirect = 0;
 if (basevertex != NULL)
prim[0].basevertex = basevertex[i];
@@ -1598,6 +1601,7 @@ vbo_validated_multidrawarraysindirect(struct gl_context 
*ctx,
   prim[i].mode = mode;
   prim[i].indirect_offset = offset;
   prim[i].is_indirect = 1;
+  prim[i].draw_id = i;
}
 
check_buffers_are_unmapped(exec->array.inputs);
@@ -1684,6 +1688,7 @@ vbo_validated_multidrawelementsindirect(struct gl_context 
*ctx,
   prim[i].indexed = 1;
   prim[i].indirect_offset = offset;
   prim[i].is_indirect = 1;
+  prim[i].draw_id = i;
}
 
check_buffers_are_unmapped(exec->array.inputs);
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 6/7] nir: Teach nir_opt_algebraic about adding and subtracting the same thing

2015-12-15 Thread Kristian Høgsberg Kristensen
This optimizes a + b - b to just a. Modest shader-db results (BDW):

  total instructions in shared programs: 7842452 -> 7841862 (-0.01%)
  instructions in affected programs: 61938 -> 61348 (-0.95%)
  total loops in shared programs:2131 -> 2131 (0.00%)
  helped:263
  HURT:  0
  GAINED:0
  LOST:  0

but the optimization turns

  gl_VertexID - gl_BaseVertexARB

into just a reference to SYSTEM_VALUE_VERTEX_ID_ZERO_BASE, which the
i965 hardware supports natively. That means we can avoid using the
internal vertex buffer for gl_BaseVertexARB in this case.
---
 src/glsl/nir/nir_opt_algebraic.py | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/glsl/nir/nir_opt_algebraic.py 
b/src/glsl/nir/nir_opt_algebraic.py
index cb715c0..1fdad3d 100644
--- a/src/glsl/nir/nir_opt_algebraic.py
+++ b/src/glsl/nir/nir_opt_algebraic.py
@@ -62,6 +62,10 @@ optimizations = [
(('iadd', ('imul', a, b), ('imul', a, c)), ('imul', a, ('iadd', b, c))),
(('fadd', ('fneg', a), a), 0.0),
(('iadd', ('ineg', a), a), 0),
+   (('iadd', ('ineg', a), ('iadd', a, b)), b),
+   (('iadd', a, ('iadd', ('ineg', a), b)), b),
+   (('fadd', ('fneg', a), ('fadd', a, b)), b),
+   (('fadd', a, ('fadd', ('fneg', a), b)), b),
(('fmul', a, 0.0), 0.0),
(('imul', a, 0), 0),
(('umul_unorm_4x8', a, 0), 0),
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2] mesa: fix interface matching done in validate_io

2015-12-15 Thread Timothy Arceri
On Tue, 2015-12-15 at 07:58 +0200, Tapani Pälli wrote:
> On 12/15/2015 03:31 AM, Timothy Arceri wrote:
> > On Mon, 2015-12-14 at 10:29 +0200, Tapani Pälli wrote:
> > > Patch makes following changes for interface matching:
> > > 
> > > - do not try to match builtin variables
> > > - handle swizzle in input name, as example 'a.z' should
> > >   match with 'a'
> > > - check that amount of inputs and outputs matches
> > > 
> > > These changes make interface matching tests to work in:
> > > ES31-CTS.sepshaderobjs.StateInteraction
> > > 
> > > Test does not still pass completely due to errors in rendering
> > > output. IMO this is unrelated to interface matching.
> > > 
> > > v2: add spec reference, return true on desktop since we do not
> > >  have failing cases for it, inputs and outputs amount do not
> > >  need to match on desktop.
> > > 
> > > Signed-off-by: Tapani Pälli 
> > Hi Tapani,
> > 
> > Just a general comment first.
> > 
> > I think we should first move _mesa_validate_pipeline_io() and
> >   validate_io() to src/mesa/main/pipelineobj.c I don't think it
> > belongs
> > here right?
> 
> Sure, it can be done now. Original intention was to use program 
> resources and that is why it ended up being here.
> 
> > 
> > > ---
> > >   src/mesa/main/shader_query.cpp | 54
> > > ++
> > >   1 file changed, 50 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/src/mesa/main/shader_query.cpp
> > > b/src/mesa/main/shader_query.cpp
> > > index ced10a9..bc01b97 100644
> > > --- a/src/mesa/main/shader_query.cpp
> > > +++ b/src/mesa/main/shader_query.cpp
> > > @@ -1377,19 +1377,38 @@ validate_io(const struct gl_shader
> > > *input_stage,
> > >   const struct gl_shader *output_stage, bool isES)
> > >   {
> > >  assert(input_stage && output_stage);
> > > +   unsigned inputs = 0, outputs = 0;
> > > +
> > > +   /* Currently no matching done for desktop. */
> > I think the spec quote should be moved here as it applies to all
> > the
> > rules in the function then you can also have the comment explaining
> > why
> > validation for desktop it not done.
> 
> OK
> 
> > I've also filed a spec bug for desktop for the reasons I outlined
> > in
> > irc previously. It would be great if you could quote the bug here
> > also.
> > Something like:
> > 
> > /* FIXME: Update once Khronos spec bug #15331 is resolved. */
> 
> Sure, will add.
> 
> > > +   if (!isES)
> > > +  return true;
> > >   
> > >  /* For each output in a, find input in b and do any required
> > > checks. */
> > >  foreach_in_list(ir_instruction, out, input_stage->ir) {
> > > ir_variable *out_var = out->as_variable();
> > 
> > It's existing code but it would also be nice to have a patch that
> > renames input_stage/output_stage to producer_stage/consumer_stage
> > this
> > it what they are called in the linker code. Maybe its just me but
> > getting the outputs from input_stage just looks wrong.
> 
> OK, can change this.
> 
> > 
> > > -  if (!out_var || out_var->data.mode != ir_var_shader_out)
> > > +  if (!out_var || out_var->data.mode != ir_var_shader_out ||
> > > +  is_gl_identifier(out_var->name))
> > >continue;
> > >   
> > > +  outputs++;
> > > +
> > > +  inputs = 0;
> > > foreach_in_list(ir_instruction, in, output_stage->ir) {
> > Two comments here:
> > 
> > 1. Take a look at cross_validate_outputs_to_inputs() in
> > link_varyings.cpp for a way to avoid the nested loop? Although it
> > may
> > cause even more overhaed using the symbol table not sure.
> 
> I don't know if symbol table can be trusted as variables that get 
> optimized away or changed in some way are still there. Only way to be
> sure is to iterate IR or use resource list. Also, symbol table gets
> destroyed after linking. My first implementation was using a hash but
> that was also bad idea because variables names do not necessarily
> match 
> exactly.


The code in in cross_validate_outputs_to_inputs() doesn't use *the*
symbol table it use *a* which it builds from iterating over the
producers IR. But I guess it will have the same problem as the hash
table. I wonder why the linking code uses it rather than a plain hash
table.


> 
> > 2. Take a look at the same function for matching via explicit
> > location.
> > Does the CTS not test for mismatched explicit locations? Maybe we
> > should create a piglit test for this as your existing code doesn't
> > take
> > into account explicit locations.
> 
> No, I haven't seen this test using explicit locations. This patch
> also 
> makes the interface matching pass.

Right but it would break any varyings with explicit locations that
don't have a matching names which is legal.

"An output variable is considered to match an input variable in the
subsequent shader if:

  –the two variables match in name, type, and qualification; or
  –the two variables are declared with the same 

Re: [Mesa-dev] [PATCH 2/8] st/va: cleanup filter color standard handling

2015-12-15 Thread Emil Velikov
On 11 December 2015 at 12:33, Christian König  wrote:
> From: Christian König 
>
> Signed-off-by: Christian König 
> ---
>  src/gallium/state_trackers/va/surface.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/src/gallium/state_trackers/va/surface.c 
> b/src/gallium/state_trackers/va/surface.c
> index c052c8f..4a18a6f 100644
> --- a/src/gallium/state_trackers/va/surface.c
> +++ b/src/gallium/state_trackers/va/surface.c
> @@ -697,11 +697,11 @@ vlVaQueryVideoProcFilterCaps(VADriverContextP ctx, 
> VAContextID context,
> return VA_STATUS_SUCCESS;
>  }
>
> -static VAProcColorStandardType 
> vpp_input_color_standards[VAProcColorStandardCount] = {
> +static VAProcColorStandardType vpp_input_color_standards[] = {
> VAProcColorStandardBT601
>  };
>
> -static VAProcColorStandardType 
> vpp_output_color_standards[VAProcColorStandardCount] = {
> +static VAProcColorStandardType vpp_output_color_standards[] = {
> VAProcColorStandardBT601
>  };
>
I was going to suggest to constifying them while we're here, yet it
seems that the VAAPI will just discard them. The whole API seems to
have only a few const qualifiers :-(

Reviewed-by: Emil Velikov 

-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] r600: fix viewport clipping magic.

2015-12-15 Thread Dave Airlie
> I have to NAK this series, but I was able to find something about the issue.
>
> If oViewport is used, VGT_REUSE_OFF must disable reuse. That's the correct 
> fix.
>
> If oViewport is constant, reuse can be enabled, but
> VTE_VPORT_PROVOKE_DISABLE must be set.

Okay I can confirm setting VGT_REUSE_OFF fixed the bug. I haven't got time
this week to smash out real patches and test them, but I'll try and
get to it soon.

Dave.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 13/15] i965/fs: Add support for MOV_INDIRECT on pre-Broadwell hardware

2015-12-15 Thread Abdiel Janulgue


On 12/10/2015 06:23 AM, Jason Ekstrand wrote:
> While we're at it, we also add support for the possibility that the
> indirect is, in fact, a constant.  This shouldn't happen in the common case
> (if it does, that means NIR failed to constant-fold something), but it's
> possible so we should handle it.

Perhaps this should re-ordered before patch 3?

> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp   |  4 ++
>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 51 
> +++---
>  2 files changed, 42 insertions(+), 13 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 9eaf8d0..a2ec03e 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -4424,6 +4424,10 @@ get_lowered_simd_width(const struct brw_device_info 
> *devinfo,
> case SHADER_OPCODE_TYPED_SURFACE_WRITE_LOGICAL:
>return 8;
>  
> +   case SHADER_OPCODE_MOV_INDIRECT:
> +  /* Prior to Broadwell, we only have 8 address subregisters */
> +  return devinfo->gen < 8 ? 8 : inst->exec_size;
> +
> default:
>return inst->exec_size;
> }
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> index d86eee1..7fa6d84 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> @@ -351,22 +351,47 @@ fs_generator::generate_mov_indirect(fs_inst *inst,
>  
> unsigned imm_byte_offset = reg.nr * REG_SIZE + reg.subnr;
>  
> -   /* We use VxH indirect addressing, clobbering a0.0 through a0.7. */
> -   struct brw_reg addr = vec8(brw_address_reg(0));
> +   if (indirect_byte_offset.file == BRW_IMMEDIATE_VALUE) {
> +  imm_byte_offset += indirect_byte_offset.ud;
>  
> -   /* The destination stride of an instruction (in bytes) must be greater
> -* than or equal to the size of the rest of the instruction.  Since the
> -* address register is of type UW, we can't use a D-type instruction.
> -* In order to get around this, re re-type to UW and use a stride.
> -*/
> -   indirect_byte_offset =
> -  retype(spread(indirect_byte_offset, 2), BRW_REGISTER_TYPE_UW);
> +  reg.nr = imm_byte_offset / REG_SIZE;
> +  reg.subnr = imm_byte_offset % REG_SIZE;
> +  brw_MOV(p, dst, reg);
> +   } else {
> +  /* Prior to Broadwell, there are only 8 address registers. */
> +  assert(inst->exec_size == 8 || devinfo->gen >= 8);
>  
> -   /* Prior to Broadwell, there are only 8 address registers. */
> -   assert(inst->exec_size == 8 || devinfo->gen >= 8);
> +  /* We use VxH indirect addressing, clobbering a0.0 through a0.7. */
> +  struct brw_reg addr = vec8(brw_address_reg(0));
>  
> -   brw_MOV(p, addr, indirect_byte_offset);
> -   brw_MOV(p, dst, retype(brw_VxH_indirect(0, imm_byte_offset), dst.type));
> +  /* The destination stride of an instruction (in bytes) must be greater
> +   * than or equal to the size of the rest of the instruction.  Since the
> +   * address register is of type UW, we can't use a D-type instruction.
> +   * In order to get around this, re re-type to UW and use a stride.
> +   */
> +  indirect_byte_offset =
> + retype(spread(indirect_byte_offset, 2), BRW_REGISTER_TYPE_UW);
> +
> +  if (devinfo->gen < 8) {
> + /* Prior to broadwell, we have a restriction that the bottom 5 bits
> +  * of the base offset and the bottom 5 bits of the indirect must add
> +  * to less than 32.  In other words, the hardware needs to be able 
> to
> +  * add the bottom five bits of the two to get the subnumber and add
> +  * the next 7 bits of each to get the actual register number.  Since
> +  * the indirect may cause us to cross a register boundary, this 
> makes
> +  * it almost useless.  We could try and do something clever where we
> +  * use a actual base offset if base_offset % 32 == 0 but that would
> +  * mean we were generating different code depending on the base
> +  * offset.  Instead, for the sake of consistency, we'll just do the
> +  * add ourselves.
> +  */
> + brw_ADD(p, addr, indirect_byte_offset, brw_imm_uw(imm_byte_offset));
> + brw_MOV(p, dst, retype(brw_VxH_indirect(0, 0), dst.type));
> +  } else {
> + brw_MOV(p, addr, indirect_byte_offset);
> + brw_MOV(p, dst, retype(brw_VxH_indirect(0, imm_byte_offset), 
> dst.type));
> +  }
> +   }
>  }
>  
>  void
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 91724] GL/gl_mangle.h misses symbols from GLES/gl.h

2015-12-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=91724

--- Comment #4 from Frederic Devernay  ---
I updated the gist for the newest Mesa release:
https://gist.github.com/devernay/71f3d7661d910e6494a9

Note that, despite what Emil said in
http://lists.freedesktop.org/archives/mesa-dev/2014-December/072575.html using
dlopen(RTLD_LOCAL) may not be a viable option, for example if both the system's
libGL and the Mesa libGL depend on libraries with the same soname (llvm or X11
for example) but which are incompatible for some reason (in my case, I am
loading Mesa from a plugin, and I don't know what the host application has
loaded before - the only safe way is to use a mangled Mesa).

In short, RTLD_LOCAL works for the loaded library (in my case, it is the
plugin), but not its dependencies.
See https://sourceware.org/ml/libc-help/2014-08/msg00042.html for a full
explanation.

So please, don't remove the option to mangle Mesa symbols, unless there is a
viable and portable possibility to load a non-mangled mesa together with the
system libGL.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v4 1/1] i965: Do not overwrite optimizer dumps

2015-12-15 Thread Juan A. Suarez Romero
On Thu, 2015-12-10 at 09:47 -0800, Matt Turner wrote:
> Assuming that the cause is indeed non-orthogonal state changes, yes.
> But I never saw an answer to that question.
> 
> Reviewed-by: Matt Turner 


After rebasing and testing against master (11.1-branchpoint-653-
g5c5ad4d) I can't reproduce this issue anymore.

Optimizations both in brw vec4 and fs are just called once, so we don't
get steps overwritten.


So I think I'll keep this patch unpushed for now.


Thanks anyway.


J.A.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 6/8] st/va: remove fence handling

2015-12-15 Thread Christian König

Are you sure the flush after calling the compositor is really necessary?

That clearly looks odd, but if it works I'm fine with keeping that for now.

Regards,
Christian.

On 15.12.2015 10:06, Julien Isorce wrote:

And the attachment :)

On 15 December 2015 at 09:06, Julien Isorce > wrote:


Hi Christian,

I tried your v2.

I had to apply attached change on top of your patch. (the one in
buffer.c to avoid crashing, the one postproc.c otherwise same
behavior as the v1 of this patch). Note that I export the RGB-like
surface (the one that vpp output), not the NV12 one that come from
the decoder directly.

Cheers
Julien

On 14 December 2015 at 10:11, Christian König
> wrote:


Also note that in this pipeline, HW decoding is done with
nouveau driver and rendering is done with intel. dmabuf in
between.

Yeah, I already thought that somebody is using it like this.
I'm not sure if this is actually supposed to work because we
don't have proper synchronization between kernel drivers with
DMA-buf jet.


Maybe the idea of the patch is good but something is still wrong.

While it is not the proper solution I would say let's keep the
pipeline draining during exporting the handle for now if
that's really necessary for your use case. Please test the
attached patch.

Coding the patch I've just noticed that there wasn't a
pipe->flush() before exporting the handle. Does it work as
well if you just flush the pipeline without waiting for the
commands to be finished?

Regards,
Christian.


On 14.12.2015 10:14, Julien Isorce wrote:

Hi Christian,

I have tested this patch but then the displayed video is
garbage (mostly white and sometimes just garbage). It also
stall the nouveau driver which requires to reboot but I guess
this is another issue.
I tested with:
GST_GL_WINDOW=x11 GST_GL_PLATFORM=egl GST_GL_API=gles2
GST_DEBUG=2 LIBVA_DRIVER_NAME=gallium gst-launch-1.0 filesrc
location=simpson.mp4 ! qtdemux ! vaapidecodebin ! glimagesink

(to test that you need my gstreamer-vaapi and gstgl branches
on my github but I would not waste time to try them since
they should be merged upstream at some point)

Also note that in this pipeline, HW decoding is done with
nouveau driver and rendering is done with intel. dmabuf in
between.

Maybe the idea of the patch is good but something is still wrong.
I can test any update if it helps.

Cheers
Julien




On 11 December 2015 at 12:33, Christian König
> wrote:

From: Christian König >

It's nonsense to drain the pipeline like this.

Signed-off-by: Christian König >
---
 src/gallium/state_trackers/va/buffer.c|  5 -
 src/gallium/state_trackers/va/image.c |  1 -
 src/gallium/state_trackers/va/postproc.c  |  6 --
 src/gallium/state_trackers/va/surface.c   | 10 +-
 src/gallium/state_trackers/va/va_private.h |  2 --
 5 files changed, 1 insertion(+), 23 deletions(-)

diff --git a/src/gallium/state_trackers/va/buffer.c
b/src/gallium/state_trackers/va/buffer.c
index 769305e..2ec187c 100644
--- a/src/gallium/state_trackers/va/buffer.c
+++ b/src/gallium/state_trackers/va/buffer.c
@@ -257,11 +257,6 @@
vlVaAcquireBufferHandle(VADriverContextP ctx, VABufferID
buf_id,

screen = VL_VA_PSCREEN(ctx);

-   if (buf->derived_surface.fence) {
- screen->fence_finish(screen,
buf->derived_surface.fence, PIPE_TIMEOUT_INFINITE);
- screen->fence_reference(screen,
>derived_surface.fence, NULL);
-   }
-
if (buf->export_refcount > 0) {
   if (buf->export_state.mem_type != mem_type)
  return VA_STATUS_ERROR_INVALID_PARAMETER;
diff --git a/src/gallium/state_trackers/va/image.c
b/src/gallium/state_trackers/va/image.c
index ae07da8..58c9ff7 100644
--- a/src/gallium/state_trackers/va/image.c
+++ b/src/gallium/state_trackers/va/image.c
@@ -266,7 +266,6 @@ vlVaDeriveImage(VADriverContextP ctx,
VASurfaceID surface, VAImage *image)
img_buf->type = VAImageBufferType;

Re: [Mesa-dev] [PATCH] clover: Fix build against LLVM 3.8 SVN >= r255078

2015-12-15 Thread Michel Dänzer
On 15.12.2015 09:17, Ilia Mirkin wrote:
> On Wed, Dec 9, 2015 at 5:30 AM, Francisco Jerez  wrote:
>> Michel Dänzer  writes:
>>
>>> From: Michel Dänzer 
>>>
>>> Signed-off-by: Michel Dänzer 
>>
>> Looks OK to me,
>> Reviewed-by: Francisco Jerez 
>>
>>> ---
>>>  src/gallium/state_trackers/clover/llvm/invocation.cpp | 4 
>>>  1 file changed, 4 insertions(+)
>>>
>>> diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp 
>>> b/src/gallium/state_trackers/clover/llvm/invocation.cpp
>>> index 3b37f08..4d11c24 100644
>>> --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
>>> +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
>>> @@ -661,7 +661,11 @@ namespace {
>>>
>>>if (dump_asm) {
>>>   LLVMSetTargetMachineAsmVerbosity(tm, true);
>>> +#if HAVE_LLVM >= 0x0308
>>> + LLVMModuleRef debug_mod = wrap(llvm::CloneModule(mod).release());
>>> +#else
>>>   LLVMModuleRef debug_mod = wrap(llvm::CloneModule(mod));
>>> +#endif
>>>   emit_code(tm, debug_mod, LLVMAssemblyFile, _buffer, r_log);
>>>   buffer_size = LLVMGetBufferSize(out_buffer);
>>>   buffer_data = LLVMGetBufferStart(out_buffer);
> 
> Emil, consider cherry-picking this into 11.1 and perhaps even 11.0 to
> save people from unnecessary compilation trouble. This is commit
> b4a03e7f8f upstream.

FWIW, I still think that's a bad idea at this point: Supporting
unreleased snapshots of LLVM simply isn't feasible on stable Mesa
branches — the next similar breakage can appear in LLVM SVN anytime.

To help people running into this, maybe stable Mesa branches could get a
change to configure.ac which aborts with a descriptive error message
when trying to build against a version of LLVM which isn't supported on
that stable Mesa branch yet.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 6/8] st/va: remove fence handling

2015-12-15 Thread Julien Isorce
And the attachment :)

On 15 December 2015 at 09:06, Julien Isorce  wrote:

> Hi Christian,
>
> I tried your v2.
>
> I had to apply attached change on top of your patch. (the one in buffer.c
> to avoid crashing, the one postproc.c otherwise same behavior as the v1 of
> this patch). Note that I export the RGB-like surface (the one that vpp
> output), not the NV12 one that come from the decoder directly.
>
> Cheers
> Julien
>
> On 14 December 2015 at 10:11, Christian König 
> wrote:
>
>> Also note that in this pipeline, HW decoding is done with nouveau driver
>> and rendering is done with intel. dmabuf in between.
>>
>> Yeah, I already thought that somebody is using it like this. I'm not sure
>> if this is actually supposed to work because we don't have proper
>> synchronization between kernel drivers with DMA-buf jet.
>>
>> Maybe the idea of the patch is good but something is still wrong.
>>
>> While it is not the proper solution I would say let's keep the pipeline
>> draining during exporting the handle for now if that's really necessary for
>> your use case. Please test the attached patch.
>>
>> Coding the patch I've just noticed that there wasn't a pipe->flush()
>> before exporting the handle. Does it work as well if you just flush the
>> pipeline without waiting for the commands to be finished?
>>
>> Regards,
>> Christian.
>>
>>
>> On 14.12.2015 10:14, Julien Isorce wrote:
>>
>> Hi Christian,
>>
>> I have tested this patch but then the displayed video is garbage (mostly
>> white and sometimes just garbage). It also stall the nouveau driver which
>> requires to reboot but I guess this is another issue.
>> I tested with:
>> GST_GL_WINDOW=x11 GST_GL_PLATFORM=egl GST_GL_API=gles2 GST_DEBUG=2
>> LIBVA_DRIVER_NAME=gallium gst-launch-1.0 filesrc location=simpson.mp4 !
>> qtdemux ! vaapidecodebin ! glimagesink
>>
>> (to test that you need my gstreamer-vaapi and gstgl branches on my github
>> but I would not waste time to try them since they should be merged upstream
>> at some point)
>>
>> Also note that in this pipeline, HW decoding is done with nouveau driver
>> and rendering is done with intel. dmabuf in between.
>>
>> Maybe the idea of the patch is good but something is still wrong.
>> I can test any update if it helps.
>>
>> Cheers
>> Julien
>>
>>
>>
>>
>> On 11 December 2015 at 12:33, Christian König 
>> wrote:
>>
>>> From: Christian König 
>>>
>>> It's nonsense to drain the pipeline like this.
>>>
>>> Signed-off-by: Christian König 
>>> ---
>>>  src/gallium/state_trackers/va/buffer.c |  5 -
>>>  src/gallium/state_trackers/va/image.c  |  1 -
>>>  src/gallium/state_trackers/va/postproc.c   |  6 --
>>>  src/gallium/state_trackers/va/surface.c| 10 +-
>>>  src/gallium/state_trackers/va/va_private.h |  2 --
>>>  5 files changed, 1 insertion(+), 23 deletions(-)
>>>
>>> diff --git a/src/gallium/state_trackers/va/buffer.c
>>> b/src/gallium/state_trackers/va/buffer.c
>>> index 769305e..2ec187c 100644
>>> --- a/src/gallium/state_trackers/va/buffer.c
>>> +++ b/src/gallium/state_trackers/va/buffer.c
>>> @@ -257,11 +257,6 @@ vlVaAcquireBufferHandle(VADriverContextP ctx,
>>> VABufferID buf_id,
>>>
>>> screen = VL_VA_PSCREEN(ctx);
>>>
>>> -   if (buf->derived_surface.fence) {
>>> -  screen->fence_finish(screen, buf->derived_surface.fence,
>>> PIPE_TIMEOUT_INFINITE);
>>> -  screen->fence_reference(screen, >derived_surface.fence,
>>> NULL);
>>> -   }
>>> -
>>> if (buf->export_refcount > 0) {
>>>if (buf->export_state.mem_type != mem_type)
>>>   return VA_STATUS_ERROR_INVALID_PARAMETER;
>>> diff --git a/src/gallium/state_trackers/va/image.c
>>> b/src/gallium/state_trackers/va/image.c
>>> index ae07da8..58c9ff7 100644
>>> --- a/src/gallium/state_trackers/va/image.c
>>> +++ b/src/gallium/state_trackers/va/image.c
>>> @@ -266,7 +266,6 @@ vlVaDeriveImage(VADriverContextP ctx, VASurfaceID
>>> surface, VAImage *image)
>>> img_buf->type = VAImageBufferType;
>>> img_buf->size = image->data_size;
>>> img_buf->num_elements = 1;
>>> -   img_buf->derived_surface.fence = surf->fence;
>>>
>>> pipe_resource_reference(_buf->derived_surface.resource,
>>> surfaces[0]->texture);
>>>
>>> diff --git a/src/gallium/state_trackers/va/postproc.c
>>> b/src/gallium/state_trackers/va/postproc.c
>>> index 105f251..1ee3587 100644
>>> --- a/src/gallium/state_trackers/va/postproc.c
>>> +++ b/src/gallium/state_trackers/va/postproc.c
>>> @@ -54,7 +54,6 @@ vlVaHandleVAProcPipelineParameterBufferType(vlVaDriver
>>> *drv, vlVaContext *contex
>>> vlVaSurface *src_surface;
>>> VAProcPipelineParameterBuffer *pipeline_param;
>>> struct pipe_surface **surfaces;
>>> -   struct pipe_screen *screen;
>>> struct pipe_surface *psurf;
>>>
>>> if (!drv || !context)
>>> @@ -77,8 +76,6 @@ vlVaHandleVAProcPipelineParameterBufferType(vlVaDriver
>>> *drv, 

Re: [Mesa-dev] [PATCH 5/8] st/va: handle default post process regions

2015-12-15 Thread Emil Velikov
On 11 December 2015 at 12:33, Christian König  wrote:
> From: Christian König 
>
> Avoid referencing NULL pointers.
>
Lacking any prior knowledge of the sequential patches, I'm afraid this
commit message doesn't make any sense. How about "Will be used in the
follow up patches" or anything alike ?

> Signed-off-by: Christian König 
> ---
>  src/gallium/state_trackers/va/postproc.c | 36 
> +---
>  1 file changed, 28 insertions(+), 8 deletions(-)
>
> diff --git a/src/gallium/state_trackers/va/postproc.c 
> b/src/gallium/state_trackers/va/postproc.c
> index 2d17694..105f251 100644
> --- a/src/gallium/state_trackers/va/postproc.c
> +++ b/src/gallium/state_trackers/va/postproc.c
> @@ -29,9 +29,26 @@
>
>  #include "va_private.h"
>
> +static const VARectangle *
> +vlVaRegionDefault(const VARectangle *region, struct pipe_video_buffer *buf,
> + VARectangle *def)
> +{
> +   if (region)
> +  return region;
> +
> +   def->x = 0;
> +   def->y = 0;
> +   def->width = buf->width;
> +   def->height = buf->height;
> +
> +   return def;
> +}
> +
>  VAStatus
>  vlVaHandleVAProcPipelineParameterBufferType(vlVaDriver *drv, vlVaContext 
> *context, vlVaBuffer *buf)
>  {
> +   VARectangle def_src_region, def_dst_region;
> +   const VARectangle *src_region, *dst_region;
> struct u_rect src_rect;
> struct u_rect dst_rect;
> vlVaSurface *src_surface;
> @@ -64,15 +81,18 @@ vlVaHandleVAProcPipelineParameterBufferType(vlVaDriver 
> *drv, vlVaContext *contex
>
> psurf = surfaces[0];
>
> -   src_rect.x0 = pipeline_param->surface_region->x;
> -   src_rect.y0 = pipeline_param->surface_region->y;
> -   src_rect.x1 = pipeline_param->surface_region->x + 
> pipeline_param->surface_region->width;
> -   src_rect.y1 = pipeline_param->surface_region->y + 
> pipeline_param->surface_region->height;
> +   src_region = vlVaRegionDefault(pipeline_param->surface_region, 
> src_surface->buffer, _src_region);
> +   dst_region = vlVaRegionDefault(pipeline_param->output_region, 
> context->target, _dst_region);
> +
Mind moving this a couple of lines down - alongside the users of dst_rect ?

Thanks
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/8] st/va: fix unused variable warning

2015-12-15 Thread Christian König

On 15.12.2015 11:08, Emil Velikov wrote:

On 11 December 2015 at 12:33, Christian König  wrote:

From: Christian König 

Signed-off-by: Christian König 
---
  src/gallium/state_trackers/va/picture.c | 1 -
  1 file changed, 1 deletion(-)

diff --git a/src/gallium/state_trackers/va/picture.c 
b/src/gallium/state_trackers/va/picture.c
index 8623139..7b30bf8 100644
--- a/src/gallium/state_trackers/va/picture.c
+++ b/src/gallium/state_trackers/va/picture.c
@@ -92,7 +92,6 @@ vlVaGetReferenceFrame(vlVaDriver *drv, VASurfaceID surface_id,
  static VAStatus
  handlePictureParameterBuffer(vlVaDriver *drv, vlVaContext *context, 
vlVaBuffer *buf)
  {
-   unsigned int i;
 VAStatus vaStatus = VA_STATUS_SUCCESS;


Can I bribe you to also remove the "set once, used once" variable vaStatus ?


Unfortunately I already pushed this one yesterday after Julien gave me 
his rb. But going to keep that in mind when I touch the function the 
next time.


Regards,
Christian.



Either way
Reviewed-by: Emil Velikov 

-Emil


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/3] mesa: Add helper to check if the active fragment shader has shader storage

2015-12-15 Thread Iago Toral Quiroga
Some drivers can disable the FS unit if there is nothing in the shader code
that writes to an output (i.e. color, depth, etc). For drivers that check
for these things, this helper function is useful to avoid that optimization
in the case that the shader has shader storage space assigned (since it could
be writing to it).
---
 src/mesa/main/mtypes.h | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 48309bf..acacae0 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -4544,6 +4544,13 @@ _mesa_active_fragment_shader_has_atomic_ops(const struct 
gl_context *ctx)
   
ctx->Shader._CurrentFragmentProgram->_LinkedShaders[MESA_SHADER_FRAGMENT]->NumAtomicBuffers
 > 0;
 }
 
+static inline bool
+_mesa_active_fragment_shader_has_shader_storage(const struct gl_context *ctx)
+{
+   return ctx->Shader._CurrentFragmentProgram != NULL &&
+  
ctx->Shader._CurrentFragmentProgram->_LinkedShaders[MESA_SHADER_FRAGMENT]->NumShaderStorageBlocks
 > 0;
+}
+
 #ifdef __cplusplus
 }
 #endif
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2] mesa: fix interface matching done in validate_io

2015-12-15 Thread Tapani Pälli

On 12/15/2015 10:56 AM, Timothy Arceri wrote:

On Tue, 2015-12-15 at 07:58 +0200, Tapani Pälli wrote:

On 12/15/2015 03:31 AM, Timothy Arceri wrote:

On Mon, 2015-12-14 at 10:29 +0200, Tapani Pälli wrote:

Patch makes following changes for interface matching:

 - do not try to match builtin variables
 - handle swizzle in input name, as example 'a.z' should
   match with 'a'
 - check that amount of inputs and outputs matches

These changes make interface matching tests to work in:
 ES31-CTS.sepshaderobjs.StateInteraction

Test does not still pass completely due to errors in rendering
output. IMO this is unrelated to interface matching.

v2: add spec reference, return true on desktop since we do not
  have failing cases for it, inputs and outputs amount do not
  need to match on desktop.

Signed-off-by: Tapani Pälli 

Hi Tapani,

Just a general comment first.

I think we should first move _mesa_validate_pipeline_io() and
   validate_io() to src/mesa/main/pipelineobj.c I don't think it
belongs
here right?

Sure, it can be done now. Original intention was to use program
resources and that is why it ended up being here.


Ah but it uses ir_variable so it may be painful to move. Would it be OK 
to still have it in shader_query.cpp?



---
   src/mesa/main/shader_query.cpp | 54
++
   1 file changed, 50 insertions(+), 4 deletions(-)

diff --git a/src/mesa/main/shader_query.cpp
b/src/mesa/main/shader_query.cpp
index ced10a9..bc01b97 100644
--- a/src/mesa/main/shader_query.cpp
+++ b/src/mesa/main/shader_query.cpp
@@ -1377,19 +1377,38 @@ validate_io(const struct gl_shader
*input_stage,
   const struct gl_shader *output_stage, bool isES)
   {
  assert(input_stage && output_stage);
+   unsigned inputs = 0, outputs = 0;
+
+   /* Currently no matching done for desktop. */

I think the spec quote should be moved here as it applies to all
the
rules in the function then you can also have the comment explaining
why
validation for desktop it not done.

OK


I've also filed a spec bug for desktop for the reasons I outlined
in
irc previously. It would be great if you could quote the bug here
also.
Something like:

/* FIXME: Update once Khronos spec bug #15331 is resolved. */

Sure, will add.


+   if (!isES)
+  return true;
   
  /* For each output in a, find input in b and do any required

checks. */
  foreach_in_list(ir_instruction, out, input_stage->ir) {
 ir_variable *out_var = out->as_variable();

It's existing code but it would also be nice to have a patch that
renames input_stage/output_stage to producer_stage/consumer_stage
this
it what they are called in the linker code. Maybe its just me but
getting the outputs from input_stage just looks wrong.

OK, can change this.


-  if (!out_var || out_var->data.mode != ir_var_shader_out)
+  if (!out_var || out_var->data.mode != ir_var_shader_out ||
+  is_gl_identifier(out_var->name))
continue;
   
+  outputs++;

+
+  inputs = 0;
 foreach_in_list(ir_instruction, in, output_stage->ir) {

Two comments here:

1. Take a look at cross_validate_outputs_to_inputs() in
link_varyings.cpp for a way to avoid the nested loop? Although it
may
cause even more overhaed using the symbol table not sure.

I don't know if symbol table can be trusted as variables that get
optimized away or changed in some way are still there. Only way to be
sure is to iterate IR or use resource list. Also, symbol table gets
destroyed after linking. My first implementation was using a hash but
that was also bad idea because variables names do not necessarily
match
exactly.


The code in in cross_validate_outputs_to_inputs() doesn't use *the*
symbol table it use *a* which it builds from iterating over the
producers IR. But I guess it will have the same problem as the hash
table. I wonder why the linking code uses it rather than a plain hash
table.



2. Take a look at the same function for matching via explicit
location.
Does the CTS not test for mismatched explicit locations? Maybe we
should create a piglit test for this as your existing code doesn't
take
into account explicit locations.

No, I haven't seen this test using explicit locations. This patch
also
makes the interface matching pass.

Right but it would break any varyings with explicit locations that
don't have a matching names which is legal.

"An output variable is considered to match an input variable in the
subsequent shader if:

   –the two variables match in name, type, and qualification; or
   –the two variables are declared with the same location qualifier and
match in type and qualification."



I was going to suggest sharing the code between here and the linker
however I'm about to add a bunch of rules for matching the
component
qualifier for enhanced layouts so not entirely sure if we should do
this what do you think?

Linker will need to do much more so maybe do separately, 

Re: [Mesa-dev] [PATCH 1/3] mesa: Add helper to check if the active fragment shader has shader storage

2015-12-15 Thread Tapani Pälli

Yep, I remember when and why this was done for atomic counters.

Patches 1 and 2 are
Reviewed-by: Tapani Pälli 

On 12/15/2015 01:51 PM, Iago Toral Quiroga wrote:

Some drivers can disable the FS unit if there is nothing in the shader code
that writes to an output (i.e. color, depth, etc). For drivers that check
for these things, this helper function is useful to avoid that optimization
in the case that the shader has shader storage space assigned (since it could
be writing to it).
---
  src/mesa/main/mtypes.h | 7 +++
  1 file changed, 7 insertions(+)

diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 48309bf..acacae0 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -4544,6 +4544,13 @@ _mesa_active_fragment_shader_has_atomic_ops(const struct 
gl_context *ctx)

ctx->Shader._CurrentFragmentProgram->_LinkedShaders[MESA_SHADER_FRAGMENT]->NumAtomicBuffers
 > 0;
  }
  
+static inline bool

+_mesa_active_fragment_shader_has_shader_storage(const struct gl_context *ctx)
+{
+   return ctx->Shader._CurrentFragmentProgram != NULL &&
+  
ctx->Shader._CurrentFragmentProgram->_LinkedShaders[MESA_SHADER_FRAGMENT]->NumShaderStorageBlocks
 > 0;
+}
+
  #ifdef __cplusplus
  }
  #endif


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965/gen8/cs: fix constant push buffer

2015-12-15 Thread Iago Toral Quiroga
Page 502 of the Command Reference Broadwell PRM says that CURBE Total
Data Length must be 64-bit aligned.

Fixes the following CTS tests:
ES31-CTS.shader_storage_buffer_object.basic-atomic-case1-cs
ES31-CTS.shader_storage_buffer_object.basic-operations-case1-cs
ES31-CTS.shader_storage_buffer_object.basic-operations-case2-cs
ES31-CTS.shader_storage_buffer_object.basic-stdLayout_UBO_SSBO-case2-cs
ES31-CTS.shader_storage_buffer_object.advanced-write-fragment-cs
ES31-CTS.shader_storage_buffer_object.advanced-indirectAddressing-case2-cs
ES31-CTS.shader_storage_buffer_object.advanced-matrix-cs
---
 src/mesa/drivers/dri/i965/gen7_cs_state.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen7_cs_state.c 
b/src/mesa/drivers/dri/i965/gen7_cs_state.c
index 1fde69c..dbd1967 100644
--- a/src/mesa/drivers/dri/i965/gen7_cs_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_cs_state.c
@@ -77,7 +77,8 @@ brw_upload_cs_state(struct brw_context *brw)
 
unsigned push_constant_data_size =
   (prog_data->nr_params + local_id_dwords) * sizeof(gl_constant_value);
-   unsigned reg_aligned_constant_size = ALIGN(push_constant_data_size, 32);
+   unsigned reg_aligned_constant_size =
+  ALIGN(push_constant_data_size, brw->gen < 8 ? 32 : 64);
unsigned push_constant_regs = reg_aligned_constant_size / 32;
unsigned threads = get_cs_thread_count(cs_prog_data);
 
@@ -241,7 +242,8 @@ brw_upload_cs_push_constants(struct brw_context *brw,
 
   const unsigned push_constant_data_size =
  (local_id_dwords + prog_data->nr_params) * sizeof(gl_constant_value);
-  const unsigned reg_aligned_constant_size = 
ALIGN(push_constant_data_size, 32);
+  const unsigned reg_aligned_constant_size =
+ ALIGN(push_constant_data_size, brw->gen < 8 ? 32 : 64);
   const unsigned param_aligned_count =
  reg_aligned_constant_size / sizeof(*param);
 
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [OT] some contribution statistics

2015-12-15 Thread Giuseppe Bilotta
On Tue, Dec 15, 2015 at 10:22 PM, Kenneth Graunke  wrote:
> On Tuesday, December 15, 2015 02:23:07 PM Giuseppe Bilotta wrote:
>> The only problem with these numbers is actually the lack of a .mailmap
>> to normalize contributor name/emails, which obviously skews the
>> results a little bit towards the lower end. I don't suppose someone
>> has a .mailmap for Mesa contributors, or is interested in creating
>> one?
>
> I actually have one of those!
>
> http://cgit.freedesktop.org/~kwg/mesa/commit/?h=gitdm

Doh, now I wish you would have replied earlier 8-)

In the mean time I prepared a .mailmap myself … A merge might be
needed. I think I'll try sending mine to the list.

-- 
Giuseppe "Oblomov" Bilotta
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [OT] some contribution statistics

2015-12-15 Thread Nicolai Hähnle

On 15.12.2015 16:22, Kenneth Graunke wrote:

On Tuesday, December 15, 2015 02:23:07 PM Giuseppe Bilotta wrote:

The only problem with these numbers is actually the lack of a .mailmap
to normalize contributor name/emails, which obviously skews the
results a little bit towards the lower end. I don't suppose someone
has a .mailmap for Mesa contributors, or is interested in creating
one?


I actually have one of those!

http://cgit.freedesktop.org/~kwg/mesa/commit/?h=gitdm


Do you take patches?

Nicolai



--Ken



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/7] mesa: Add core mesa support for GL_ARB_shader_draw_parameters

2015-12-15 Thread Ian Romanick
On 12/15/2015 12:28 AM, Kristian Høgsberg Kristensen wrote:
> ---
>  src/glsl/builtin_variables.cpp  |  5 +
>  src/glsl/glsl_parser_extras.cpp |  1 +
>  src/glsl/glsl_parser_extras.h   |  2 ++
>  src/glsl/nir/nir.c  |  8 
>  src/glsl/nir/nir_intrinsics.h   |  2 ++
>  src/glsl/nir/shader_enums.h | 20 
>  src/glsl/standalone_scaffolding.cpp |  1 +
>  src/mesa/main/extensions_table.h|  1 +
>  src/mesa/main/mtypes.h  |  1 +
>  9 files changed, 41 insertions(+)
> 
> diff --git a/src/glsl/builtin_variables.cpp b/src/glsl/builtin_variables.cpp
> index e8eab80..e82c99e 100644
> --- a/src/glsl/builtin_variables.cpp
> +++ b/src/glsl/builtin_variables.cpp
> @@ -951,6 +951,11 @@ builtin_variable_generator::generate_vs_special_vars()
>add_system_value(SYSTEM_VALUE_INSTANCE_ID, int_t, "gl_InstanceIDARB");
> if (state->ARB_draw_instanced_enable || state->is_version(140, 300))
>add_system_value(SYSTEM_VALUE_INSTANCE_ID, int_t, "gl_InstanceID");
> +   if (state->ARB_shader_draw_parameters_enable) {
> +  add_system_value(SYSTEM_VALUE_BASE_VERTEX, int_t, "gl_BaseVertexARB");
> +  add_system_value(SYSTEM_VALUE_BASE_INSTANCE, int_t, 
> "gl_BaseInstanceARB");
> +  add_system_value(SYSTEM_VALUE_DRAW_ID, int_t, "gl_DrawIDARB");
> +   }
> if (state->AMD_vertex_shader_layer_enable) {
>var = add_output(VARYING_SLOT_LAYER, int_t, "gl_Layer");
>var->data.interpolation = INTERP_QUALIFIER_FLAT;
> diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp
> index 29cf0c6..8c46f14 100644
> --- a/src/glsl/glsl_parser_extras.cpp
> +++ b/src/glsl/glsl_parser_extras.cpp
> @@ -608,6 +608,7 @@ static const _mesa_glsl_extension 
> _mesa_glsl_supported_extensions[] = {
> EXT(ARB_shader_atomic_counters,   true,  false, 
> ARB_shader_atomic_counters),
> EXT(ARB_shader_bit_encoding,  true,  false, 
> ARB_shader_bit_encoding),
> EXT(ARB_shader_clock, true,  false, ARB_shader_clock),
> +   EXT(ARB_shader_draw_parameters,   true,  false, 
> ARB_shader_draw_parameters),
> EXT(ARB_shader_image_load_store,  true,  false, 
> ARB_shader_image_load_store),
> EXT(ARB_shader_image_size,true,  false, 
> ARB_shader_image_size),
> EXT(ARB_shader_precision, true,  false, 
> ARB_shader_precision),
> diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h
> index a4bda77..afb99af 100644
> --- a/src/glsl/glsl_parser_extras.h
> +++ b/src/glsl/glsl_parser_extras.h
> @@ -536,6 +536,8 @@ struct _mesa_glsl_parse_state {
> bool ARB_shader_bit_encoding_warn;
> bool ARB_shader_clock_enable;
> bool ARB_shader_clock_warn;
> +   bool ARB_shader_draw_parameters_enable;
> +   bool ARB_shader_draw_parameters_warn;
> bool ARB_shader_image_load_store_enable;
> bool ARB_shader_image_load_store_warn;
> bool ARB_shader_image_size_enable;
> diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c
> index 35fc1de..4b70e7c 100644
> --- a/src/glsl/nir/nir.c
> +++ b/src/glsl/nir/nir.c
> @@ -1588,6 +1588,10 @@ nir_intrinsic_from_system_value(gl_system_value val)
>return nir_intrinsic_load_vertex_id;
> case SYSTEM_VALUE_INSTANCE_ID:
>return nir_intrinsic_load_instance_id;
> +   case SYSTEM_VALUE_DRAW_ID:
> +  return nir_intrinsic_load_draw_id;
> +   case SYSTEM_VALUE_BASE_INSTANCE:
> +  return nir_intrinsic_load_base_instance;
> case SYSTEM_VALUE_VERTEX_ID_ZERO_BASE:
>return nir_intrinsic_load_vertex_id_zero_base;
> case SYSTEM_VALUE_BASE_VERTEX:
> @@ -1633,6 +1637,10 @@ nir_system_value_from_intrinsic(nir_intrinsic_op 
> intrin)
>return SYSTEM_VALUE_VERTEX_ID;
> case nir_intrinsic_load_instance_id:
>return SYSTEM_VALUE_INSTANCE_ID;
> +   case nir_intrinsic_load_draw_id:
> +  return SYSTEM_VALUE_DRAW_ID;
> +   case nir_intrinsic_load_base_instance:
> +  return SYSTEM_VALUE_BASE_INSTANCE;
> case nir_intrinsic_load_vertex_id_zero_base:
>return SYSTEM_VALUE_VERTEX_ID_ZERO_BASE;
> case nir_intrinsic_load_base_vertex:
> diff --git a/src/glsl/nir/nir_intrinsics.h b/src/glsl/nir/nir_intrinsics.h
> index 9811fb3..917c805 100644
> --- a/src/glsl/nir/nir_intrinsics.h
> +++ b/src/glsl/nir/nir_intrinsics.h
> @@ -239,6 +239,8 @@ SYSTEM_VALUE(vertex_id, 1, 0)
>  SYSTEM_VALUE(vertex_id_zero_base, 1, 0)
>  SYSTEM_VALUE(base_vertex, 1, 0)
>  SYSTEM_VALUE(instance_id, 1, 0)
> +SYSTEM_VALUE(base_instance, 1, 0)
> +SYSTEM_VALUE(draw_id, 1, 0)
>  SYSTEM_VALUE(sample_id, 1, 0)
>  SYSTEM_VALUE(sample_pos, 2, 0)
>  SYSTEM_VALUE(sample_mask_in, 1, 0)
> diff --git a/src/glsl/nir/shader_enums.h b/src/glsl/nir/shader_enums.h
> index dd0e0ba..0be217c 100644
> --- a/src/glsl/nir/shader_enums.h
> +++ b/src/glsl/nir/shader_enums.h
> @@ -379,6 +379,26 @@ typedef enum
>  * \sa SYSTEM_VALUE_VERTEX_ID, 

Re: [Mesa-dev] [PATCH 3/7] i965: Assert that SYSTEM_VALUE_VERTEX_ID gets lowered

2015-12-15 Thread Ian Romanick
On 12/15/2015 12:28 AM, Kristian Høgsberg Kristensen wrote:
> fs_visitor::emit_vs_system_value() looks like it's trying to handle
> SYSTEM_VALUE_VERTEX_ID, but we should never see that value in the
> backend.
> ---
>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> index 68f2548..d5193a9 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> @@ -46,6 +46,7 @@ fs_visitor::emit_vs_system_value(int location)
>vs_prog_data->uses_vertexid = true;
>break;
> case SYSTEM_VALUE_VERTEX_ID:
> +  unreachable("should have been lowered");
> case SYSTEM_VALUE_VERTEX_ID_ZERO_BASE:
>reg->reg_offset = 2;
>vs_prog_data->uses_vertexid = true;
> 

There was some reason that Ken and I decided to do this like this, but I
don't remember what it was.  I *think* this is probably a good change,
but I'd like Ken to weigh in.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/7] i965: Add support for gl_BaseVertexARB and gl_BaseInstanceARB

2015-12-15 Thread Ian Romanick
On 12/15/2015 11:48 AM, Anuj Phogat wrote:
> On Tue, Dec 15, 2015 at 12:28 AM, Kristian Høgsberg Kristensen
>  wrote:
>> We already have gl_BaseVertexARB in the .x component of the SGVS vec4
>> and plug gl_BaseInstanceARB into the last free component (.y).
>> ---
>>  src/mesa/drivers/dri/i965/brw_compiler.h  |  2 ++
>>  src/mesa/drivers/dri/i965/brw_context.h   |  9 --
>>  src/mesa/drivers/dri/i965/brw_draw.c  | 12 ++--
>>  src/mesa/drivers/dri/i965/brw_draw_upload.c   | 35 
>> ++-
>>  src/mesa/drivers/dri/i965/brw_fs.cpp  |  3 +-
>>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp  | 10 ++-
>>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp  |  6 +++-
>>  src/mesa/drivers/dri/i965/brw_vec4.cpp| 12 ++--
>>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp| 10 ++-
>>  src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp |  6 +++-
>>  src/mesa/drivers/dri/i965/gen8_draw_upload.c  | 35 
>> ++-
>>  11 files changed, 102 insertions(+), 38 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_compiler.h 
>> b/src/mesa/drivers/dri/i965/brw_compiler.h
>> index 218d9c7..58ee966 100644
>> --- a/src/mesa/drivers/dri/i965/brw_compiler.h
>> +++ b/src/mesa/drivers/dri/i965/brw_compiler.h
>> @@ -547,6 +547,8 @@ struct brw_vs_prog_data {
>>
>> bool uses_vertexid;
>> bool uses_instanceid;
>> +   bool uses_basevertex;
>> +   bool uses_baseinstance;
> Missed bool uses_drawid ?

It looks like there may be some rebase or patch splitting issues.  These
are added in the next patch, but they are already used in this patch.  I
think it will compile after patch 5, but I don't think it will compile
after patch 4.

>>  };
>>
>>  struct brw_tcs_prog_data
>> diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
>> b/src/mesa/drivers/dri/i965/brw_context.h
>> index a845541..1378402 100644
>> --- a/src/mesa/drivers/dri/i965/brw_context.h
>> +++ b/src/mesa/drivers/dri/i965/brw_context.h
>> @@ -905,8 +905,13 @@ struct brw_context
>> uint32_t pma_stall_bits;
>>
>> struct {
>> -  /** The value of gl_BaseVertex for the current _mesa_prim. */
>> -  int gl_basevertex;
>> +  struct {
>> + /** The value of gl_BaseVertex for the current _mesa_prim. */
>> + int gl_basevertex;
>> +
>> + /** The value of gl_BaseInstance for the current _mesa_prim. */
>> + int gl_baseinstance;
>> +  } params;
> Missed gl_drawid and gl_drawid_bo ?
>>
>>/**
>> * Buffer and offset used for GL_ARB_shader_draw_parameters
>> diff --git a/src/mesa/drivers/dri/i965/brw_draw.c 
>> b/src/mesa/drivers/dri/i965/brw_draw.c
>> index 8398471..298ac06 100644
>> --- a/src/mesa/drivers/dri/i965/brw_draw.c
>> +++ b/src/mesa/drivers/dri/i965/brw_draw.c
>> @@ -491,9 +491,9 @@ brw_try_draw_prims(struct gl_context *ctx,
>>   }
>>}
>>
>> -  brw->draw.gl_basevertex =
>> +  brw->draw.params.gl_basevertex =
>>   prims[i].indexed ? prims[i].basevertex : prims[i].start;
>> -
>> +  brw->draw.params.gl_baseinstance = prims[i].base_instance;
>>drm_intel_bo_unreference(brw->draw.draw_params_bo);
>>
>>if (prims[i].is_indirect) {
>> @@ -511,6 +511,14 @@ brw_try_draw_prims(struct gl_context *ctx,
>>   brw->draw.draw_params_offset = 0;
>>}
>>
>> +  /* gl_DrawID always needs its own vertex buffer since it's not part of
>> +   * the indirect parameter buffer. */
>> +  if (brw->vs.prog_data->uses_drawid) {
>> + brw->draw.gl_drawid = prims[i].drawid;
> brw->draw.gl_drawid = prims[i].draw_id;
>> + drm_intel_bo_unreference(brw->draw.draw_id_bo);
>> + brw->ctx.NewDriverState |= BRW_NEW_VERTICES;
>> +  }
>> +
>>if (brw->gen < 6)
>>  brw_set_prim(brw, [i]);
>>else
>> diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c 
>> b/src/mesa/drivers/dri/i965/brw_draw_upload.c
>> index ea0f6f2..ccf963c 100644
>> --- a/src/mesa/drivers/dri/i965/brw_draw_upload.c
>> +++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c
>> @@ -592,8 +592,10 @@ void
>>  brw_prepare_shader_draw_parameters(struct brw_context *brw)
>>  {
>> /* For non-indirect draws, upload gl_BaseVertex. */
>> -   if (brw->vs.prog_data->uses_vertexid && brw->draw.draw_params_bo == 
>> NULL) {
>> -  intel_upload_data(brw, >draw.gl_basevertex, 4, 4,
>> +   if ((brw->vs.prog_data->uses_basevertex ||
>> +brw->vs.prog_data->uses_baseinstance) &&
>> +   brw->draw.draw_params_bo == NULL) {
>> +  intel_upload_data(brw, >draw.params, sizeof(brw->draw.params), 4,
>> >draw.draw_params_bo,
>>  >draw.draw_params_offset);
>> }
>> @@ -658,7 +660,8 @@ brw_emit_vertices(struct brw_context *brw)
>> brw_emit_query_begin(brw);
>>
>> unsigned nr_elements = brw->vb.nr_enabled;
>> -   if (brw->vs.prog_data->uses_vertexid || 
>> 

Re: [Mesa-dev] [PATCH 6/7] nir: Teach nir_opt_algebraic about adding and subtracting the same thing

2015-12-15 Thread Ian Romanick
On 12/15/2015 12:28 AM, Kristian Høgsberg Kristensen wrote:
> This optimizes a + b - b to just a. Modest shader-db results (BDW):
> 
>   total instructions in shared programs: 7842452 -> 7841862 (-0.01%)
>   instructions in affected programs: 61938 -> 61348 (-0.95%)
>   total loops in shared programs:2131 -> 2131 (0.00%)
>   helped:263
>   HURT:  0
>   GAINED:0
>   LOST:  0
> 
> but the optimization turns
> 
>   gl_VertexID - gl_BaseVertexARB
> 
> into just a reference to SYSTEM_VALUE_VERTEX_ID_ZERO_BASE, which the
> i965 hardware supports natively. That means we can avoid using the
> internal vertex buffer for gl_BaseVertexARB in this case.

Removing that extra state should be a bigger real win than removing the
instructions.  This patch is

Reviewed-by: Ian Romanick 

> ---
>  src/glsl/nir/nir_opt_algebraic.py | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/src/glsl/nir/nir_opt_algebraic.py 
> b/src/glsl/nir/nir_opt_algebraic.py
> index cb715c0..1fdad3d 100644
> --- a/src/glsl/nir/nir_opt_algebraic.py
> +++ b/src/glsl/nir/nir_opt_algebraic.py
> @@ -62,6 +62,10 @@ optimizations = [
> (('iadd', ('imul', a, b), ('imul', a, c)), ('imul', a, ('iadd', b, c))),
> (('fadd', ('fneg', a), a), 0.0),
> (('iadd', ('ineg', a), a), 0),
> +   (('iadd', ('ineg', a), ('iadd', a, b)), b),
> +   (('iadd', a, ('iadd', ('ineg', a), b)), b),
> +   (('fadd', ('fneg', a), ('fadd', a, b)), b),
> +   (('fadd', a, ('fadd', ('fneg', a), b)), b),
> (('fmul', a, 0.0), 0.0),
> (('imul', a, 0), 0),
> (('umul_unorm_4x8', a, 0), 0),
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: remove validation of shaders that should be done elsewhere

2015-12-15 Thread Timothy Arceri
On Tue, 2015-12-15 at 14:32 +0200, Tapani Pälli wrote:
> On 12/15/2015 01:25 AM, Timothy Arceri wrote:
> > On Wed, 2015-12-09 at 00:17 +1100, Timothy Arceri wrote:
> > > In core profile even if re-linking fails rendering shouldn't fail
> > > as
> > > the
> > > previous succesfully linked program will still be available. It
> > > also
> > > shouldn't be possible to have an unlinked program as part of the
> > > current rendering state.
> > Hey guys,
> > 
> > Any thoughts on this change?
> > 
> > Thinking about this some more we should probably rework the compat
> > code
> > also and only do the check for link status if there is an assembly
> > shader right?
> 
> I wanted to hear from others first since for me it feels this change 
> seems specific to separate shader programs (I had a patch on list
> that 
> skipped the check for those programs that were not in use by current 
> pipeline).
> The reason is that with regular programs I can't see a way to
> continue 
> if relinking fails (because program is now in bad state). I think
> user 
> should detach the malfunctioning stage and link again. However with
> SSO 
> relink to a unused stage may fail but we can still have a complete 
> working program with stages marked as used.


Hi Tapani,

I don't see anything that says this is specific to separate shader programs. 
For full programs you still need to call UseProgram to install the executable 
code as part of the rendering state just like with SSO.

From Section 7.3 (Program Objects) of the OpenGL 4.5 spec under UseProgram:

"This will install executable code as part of the current rendering state for 
each shader stage present when the program was last successfully linked."

...

"If LinkProgram or ProgramBinary successfully re-links a program object that is 
active for any shader stage, then the newly generated executable code will be 
installed as part of the current rendering state for all shader stages where 
the program is active."

...

"If a program object that is active for any shader stage is re-linked 
unsuccess-fully, the link status will be set to FALSE, but any existing 
executables and associ-ated state will remain part of the current rendering 
state until a subsequent call to UseProgram, UseProgramStages, or 
BindProgramPipeline removes them from use."

...

"An unsuc-cessfully linked program may not be made part of the current 
rendering state by UseProgram or added to program pipeline objects by 
UseProgramStages until it is successfully re-linked."

As far as I can tell it should not be possible to have an unsuccessfully linked 
program as part of the current rendering state, which is why this patch removes 
the LinkStatus check completely.

I can add all of this to the commit message.

Tim

> 
> 
> > Thanks,
> > Tim
> 
> // Tapani
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] ttn: Use the new nir_load_system_value helper

2015-12-15 Thread Jason Ekstrand
Only compile-tested.

Cc: Eric Anholt 
---
 src/gallium/auxiliary/nir/tgsi_to_nir.c | 10 +-
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/src/gallium/auxiliary/nir/tgsi_to_nir.c 
b/src/gallium/auxiliary/nir/tgsi_to_nir.c
index 5def6d3..122e87b 100644
--- a/src/gallium/auxiliary/nir/tgsi_to_nir.c
+++ b/src/gallium/auxiliary/nir/tgsi_to_nir.c
@@ -544,9 +544,7 @@ ttn_src_for_file_and_index(struct ttn_compile *c, unsigned 
file, unsigned index,
   break;
 
case TGSI_FILE_SYSTEM_VALUE: {
-  nir_intrinsic_instr *load;
   nir_intrinsic_op op;
-  unsigned ncomp = 1;
 
   assert(!indirect);
   assert(!dim);
@@ -568,13 +566,7 @@ ttn_src_for_file_and_index(struct ttn_compile *c, unsigned 
file, unsigned index,
  unreachable("bad system value");
   }
 
-  load = nir_intrinsic_instr_create(b->shader, op);
-  load->num_components = ncomp;
-
-  nir_ssa_dest_init(>instr, >dest, ncomp, NULL);
-  nir_builder_instr_insert(b, >instr);
-
-  src = nir_src_for_ssa(>dest.ssa);
+  src = nir_src_for_ssa(nir_load_system_value(b, op, 0));
   break;
}
 
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/3] nir/lower_system_values: Stop supporting non-SSA

2015-12-15 Thread Eric Anholt
Jason Ekstrand  writes:

> The one user of this (i965) only ever calls it while in SSA form.

This series is:

Reviewed-by: Eric Anholt 


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] Add .mailmap

2015-12-15 Thread Giuseppe Bilotta
This adds a first tentative .mailmap file, to canonicize contributor
name/emails in shortlogs and other statistical endeavours.

There's a couple of root and richard entries which I don't know who
they belong to, and hopefully not too many overeager merges.

Signed-off-by: Giuseppe Bilotta 
---
 .mailmap | 460 +++
 1 file changed, 460 insertions(+)
 create mode 100644 .mailmap

diff --git a/.mailmap b/.mailmap
new file mode 100644
index 000..bf8b4d9
--- /dev/null
+++ b/.mailmap
@@ -0,0 +1,460 @@
+Aapo Tahkola  
+
+Adam Jackson  
+Adam Jackson  
+
+Adrian Marius Negreanu  Adrian Negreanu 

+Adrian Marius Negreanu  Negreanu Marius Adrian 

+
+Dave Airlie  
+Dave Airlie  airlied 
+Dave Airlie  
+Dave Airlie  
+Dave Airlie  
+Dave Airlie  
+Dave Airlie  
+Dave Airlie  
+Dave Airlie  
+
+Alan Coopersmith  
+
+Alan Hourihane  
+Alan Hourihane  
+Alan Hourihane  
+
+Alexander Monakov  
+
+Alexander von Gluck IV  Alexander von Gluck 

+
+Alex Corscadden  
+Alex Corscadden  
+
+Alex Deucher  
+Alex Deucher  
+Alex Deucher  
+Alex Deucher  
+Alex Deucher  
+Alex Deucher  
+
+Andreas Fänger  
+
+Andreas Hartmetz  
+
+Andre Heider 
+Andreas Heider 
+
+Andreas Pokorny  

+
+Andrew Randrianasulu  
+Andrew Randrianasulu  
+
+Arthur Huillet  Arthur HUILLET 
+
+Benjamin Franzke  ben 

+
+Ben Skeggs  
+Ben Skeggs  
+Ben Skeggs  
+Ben Skeggs  
+Ben Skeggs  
+Ben Skeggs  
+Ben Skeggs  
+
+Ben Widawsky  Ben Widawsky 
+
+Blair Sadewitz  Blair Sadewitz 

+
+bma 
+
+Brian Paul  Brian 
+Brian Paul  
+Brian Paul  
+Brian Paul  
+Brian Paul  brian 
+Brian Paul  Brian 
+Brian Paul  Brian 
+Brian Paul  Brian 
+Brian Paul  Brian 
+Brian Paul  Brian 
+Brian Paul  Brian 
+Brian Paul  root 
+
+Bruce Merry  
+
+caner 
+
+Carl-Philip Hänsch  Carl-Philip Haensch 

+Carl-Philip Hänsch  Carl-Philip Haensch 

+Carl-Philip Hänsch  Carl-Philip Haensch 

+
+Chad Versace  
+Chad Versace  
+Chad Versace  

Re: [Mesa-dev] [PATCH 1/3] nir/lower_system_values: Stop supporting non-SSA

2015-12-15 Thread Eric Anholt
Jason Ekstrand  writes:

> On Tue, Dec 15, 2015 at 12:26 PM, Eric Anholt  wrote:
>> Jason Ekstrand  writes:
>>
>>> The one user of this (i965) only ever calls it while in SSA form.
>>
>> This series is:
>>
>> Reviewed-by: Eric Anholt 
>
> Thanks!
>
> Did you happen to run it on something that actually uses clip plane
> lowering?  I'd like to not break things.

I hadn't, just reviewed.  I checked now, and piglit's user-clip does pass.


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/7] i965: Add support for gl_BaseVertexARB and gl_BaseInstanceARB

2015-12-15 Thread Ian Romanick
This patch is really doing two different things.  It changes the
existing SYSTEM_VALUE_BASE_VERTEX to be independent from
SYSTEM_VALUE_VERTEX_ID_ZERO.  It also adds SYSTEM_VALUE_BASE_INSTANCE
support.

I was going to let that go, but because the two things happened in one
patch, I overlooked the extra gl_DrawID related cruft that should have
been in the next patch.  Thankfully Anuj caught it.

On 12/15/2015 12:28 AM, Kristian Høgsberg Kristensen wrote:
> We already have gl_BaseVertexARB in the .x component of the SGVS vec4
> and plug gl_BaseInstanceARB into the last free component (.y).
> ---
>  src/mesa/drivers/dri/i965/brw_compiler.h  |  2 ++
>  src/mesa/drivers/dri/i965/brw_context.h   |  9 --
>  src/mesa/drivers/dri/i965/brw_draw.c  | 12 ++--
>  src/mesa/drivers/dri/i965/brw_draw_upload.c   | 35 
> ++-
>  src/mesa/drivers/dri/i965/brw_fs.cpp  |  3 +-
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp  | 10 ++-
>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp  |  6 +++-
>  src/mesa/drivers/dri/i965/brw_vec4.cpp| 12 ++--
>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp| 10 ++-
>  src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp |  6 +++-
>  src/mesa/drivers/dri/i965/gen8_draw_upload.c  | 35 
> ++-
>  11 files changed, 102 insertions(+), 38 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_compiler.h 
> b/src/mesa/drivers/dri/i965/brw_compiler.h
> index 218d9c7..58ee966 100644
> --- a/src/mesa/drivers/dri/i965/brw_compiler.h
> +++ b/src/mesa/drivers/dri/i965/brw_compiler.h
> @@ -547,6 +547,8 @@ struct brw_vs_prog_data {
>  
> bool uses_vertexid;
> bool uses_instanceid;
> +   bool uses_basevertex;
> +   bool uses_baseinstance;
>  };
>  
>  struct brw_tcs_prog_data
> diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
> b/src/mesa/drivers/dri/i965/brw_context.h
> index a845541..1378402 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.h
> +++ b/src/mesa/drivers/dri/i965/brw_context.h
> @@ -905,8 +905,13 @@ struct brw_context
> uint32_t pma_stall_bits;
>  
> struct {
> -  /** The value of gl_BaseVertex for the current _mesa_prim. */
> -  int gl_basevertex;
> +  struct {
> + /** The value of gl_BaseVertex for the current _mesa_prim. */
> + int gl_basevertex;
> +
> + /** The value of gl_BaseInstance for the current _mesa_prim. */
> + int gl_baseinstance;
> +  } params;
>  
>/**
> * Buffer and offset used for GL_ARB_shader_draw_parameters
> diff --git a/src/mesa/drivers/dri/i965/brw_draw.c 
> b/src/mesa/drivers/dri/i965/brw_draw.c
> index 8398471..298ac06 100644
> --- a/src/mesa/drivers/dri/i965/brw_draw.c
> +++ b/src/mesa/drivers/dri/i965/brw_draw.c
> @@ -491,9 +491,9 @@ brw_try_draw_prims(struct gl_context *ctx,
>   }
>}
>  
> -  brw->draw.gl_basevertex =
> +  brw->draw.params.gl_basevertex =
>   prims[i].indexed ? prims[i].basevertex : prims[i].start;
> -
> +  brw->draw.params.gl_baseinstance = prims[i].base_instance;
>drm_intel_bo_unreference(brw->draw.draw_params_bo);
>  
>if (prims[i].is_indirect) {
> @@ -511,6 +511,14 @@ brw_try_draw_prims(struct gl_context *ctx,
>   brw->draw.draw_params_offset = 0;
>}
>  
> +  /* gl_DrawID always needs its own vertex buffer since it's not part of
> +   * the indirect parameter buffer. */
> +  if (brw->vs.prog_data->uses_drawid) {
> + brw->draw.gl_drawid = prims[i].drawid;
> + drm_intel_bo_unreference(brw->draw.draw_id_bo);
> + brw->ctx.NewDriverState |= BRW_NEW_VERTICES;
> +  }
> +
>if (brw->gen < 6)
>brw_set_prim(brw, [i]);
>else
> diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c 
> b/src/mesa/drivers/dri/i965/brw_draw_upload.c
> index ea0f6f2..ccf963c 100644
> --- a/src/mesa/drivers/dri/i965/brw_draw_upload.c
> +++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c
> @@ -592,8 +592,10 @@ void
>  brw_prepare_shader_draw_parameters(struct brw_context *brw)
>  {
> /* For non-indirect draws, upload gl_BaseVertex. */
> -   if (brw->vs.prog_data->uses_vertexid && brw->draw.draw_params_bo == NULL) 
> {
> -  intel_upload_data(brw, >draw.gl_basevertex, 4, 4,
> +   if ((brw->vs.prog_data->uses_basevertex ||
> +brw->vs.prog_data->uses_baseinstance) &&
> +   brw->draw.draw_params_bo == NULL) {
> +  intel_upload_data(brw, >draw.params, sizeof(brw->draw.params), 4,
>   >draw.draw_params_bo,
>  >draw.draw_params_offset);
> }
> @@ -658,7 +660,8 @@ brw_emit_vertices(struct brw_context *brw)
> brw_emit_query_begin(brw);
>  
> unsigned nr_elements = brw->vb.nr_enabled;
> -   if (brw->vs.prog_data->uses_vertexid || 
> brw->vs.prog_data->uses_instanceid)
> +   if (brw->vs.prog_data->uses_vertexid || 
> brw->vs.prog_data->uses_instanceid ||
> +   

Re: [Mesa-dev] [PATCH 5/7] i965: Add support for gl_DrawIDARB and enable extension

2015-12-15 Thread Ian Romanick
On 12/15/2015 12:28 AM, Kristian Høgsberg Kristensen wrote:
> We have to break open a new vec4 for gl_DrawIDARB. We've used up all
> space in the vec4 we use for SGVS and gl_DrawIDARB has to come from its
> own separate vertex buffer anyway.  This is because we point the vb for
> base vertex and base instance into the draw parameter BO for indirect
> draw calls, but the draw id is generated by mesa in a different buffer.
> ---
>  src/mesa/drivers/dri/i965/brw_compiler.h  |  1 +
>  src/mesa/drivers/dri/i965/brw_context.h   |  9 +
>  src/mesa/drivers/dri/i965/brw_draw.c  |  8 ++--
>  src/mesa/drivers/dri/i965/brw_draw_upload.c   | 45 
> ++-
>  src/mesa/drivers/dri/i965/brw_fs.cpp  |  2 +
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp  | 10 -
>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp  | 10 +
>  src/mesa/drivers/dri/i965/brw_vec4.cpp|  8 +++-
>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp| 10 -
>  src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp |  5 +++
>  src/mesa/drivers/dri/i965/gen8_draw_upload.c  | 34 -
>  src/mesa/drivers/dri/i965/intel_extensions.c  |  1 +
>  12 files changed, 132 insertions(+), 11 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_compiler.h 
> b/src/mesa/drivers/dri/i965/brw_compiler.h
> index 58ee966..2333f4a 100644
> --- a/src/mesa/drivers/dri/i965/brw_compiler.h
> +++ b/src/mesa/drivers/dri/i965/brw_compiler.h
> @@ -549,6 +549,7 @@ struct brw_vs_prog_data {
> bool uses_instanceid;
> bool uses_basevertex;
> bool uses_baseinstance;
> +   bool uses_drawid;
>  };
>  
>  struct brw_tcs_prog_data
> diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
> b/src/mesa/drivers/dri/i965/brw_context.h
> index 1378402..97ebf06 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.h
> +++ b/src/mesa/drivers/dri/i965/brw_context.h
> @@ -919,6 +919,15 @@ struct brw_context
> */
>drm_intel_bo *draw_params_bo;
>uint32_t draw_params_offset;
> +
> +  /**
> +   * The value of gl_DrawID for the current _mesa_prim. This always comes
> +   * in from it's own vertex buffer since it's not part of the indirect
> +   * draw parameters.
> +   */
> +  int gl_drawid;
> +  drm_intel_bo *draw_id_bo;
> +  uint32_t draw_id_offset;
> } draw;
>  
> struct {
> diff --git a/src/mesa/drivers/dri/i965/brw_draw.c 
> b/src/mesa/drivers/dri/i965/brw_draw.c
> index 298ac06..b0710c67 100644
> --- a/src/mesa/drivers/dri/i965/brw_draw.c
> +++ b/src/mesa/drivers/dri/i965/brw_draw.c
> @@ -513,11 +513,9 @@ brw_try_draw_prims(struct gl_context *ctx,
>  
>/* gl_DrawID always needs its own vertex buffer since it's not part of
> * the indirect parameter buffer. */

The */ goes on its own line.

> -  if (brw->vs.prog_data->uses_drawid) {
> - brw->draw.gl_drawid = prims[i].drawid;
> - drm_intel_bo_unreference(brw->draw.draw_id_bo);
> - brw->ctx.NewDriverState |= BRW_NEW_VERTICES;
> -  }
> +  brw->draw.gl_drawid = prims[i].draw_id;
> +  drm_intel_bo_unreference(brw->draw.draw_id_bo);
> +  brw->ctx.NewDriverState |= BRW_NEW_VERTICES;

The previous patch (incorrectly) added this block, and it seems like
this should be conditional on uses_drawid.

>  
>if (brw->gen < 6)
>brw_set_prim(brw, [i]);
> diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c 
> b/src/mesa/drivers/dri/i965/brw_draw_upload.c
> index ccf963c..e601190 100644
> --- a/src/mesa/drivers/dri/i965/brw_draw_upload.c
> +++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c
> @@ -599,6 +599,12 @@ brw_prepare_shader_draw_parameters(struct brw_context 
> *brw)
>   >draw.draw_params_bo,
>  >draw.draw_params_offset);
> }
> +
> +   if (brw->vs.prog_data->uses_drawid) {
> +  intel_upload_data(brw, >draw.gl_drawid, 
> sizeof(brw->draw.gl_drawid), 4,
> + >draw.draw_id_bo,
> +>draw.draw_id_offset);
> +   }
>  }
>  
>  /**
> @@ -663,6 +669,8 @@ brw_emit_vertices(struct brw_context *brw)
> if (brw->vs.prog_data->uses_vertexid || 
> brw->vs.prog_data->uses_instanceid ||
> brw->vs.prog_data->uses_basevertex || 
> brw->vs.prog_data->uses_baseinstance)
>++nr_elements;
> +   if (brw->vs.prog_data->uses_drawid)
> +  nr_elements++;
>  
> /* If the VS doesn't read any inputs (calculating vertex position from
>  * a state variable for some reason, for example), emit a single pad
> @@ -699,7 +707,8 @@ brw_emit_vertices(struct brw_context *brw)
> const bool uses_draw_params =
>brw->vs.prog_data->uses_basevertex ||
>brw->vs.prog_data->uses_baseinstance;
> -   const unsigned nr_buffers = brw->vb.nr_buffers + uses_draw_params;
> +   const unsigned nr_buffers = brw->vb.nr_buffers +
> +  uses_draw_params + brw->vs.prog_data->uses_drawid;
>  
> if (nr_buffers) 

[Mesa-dev] [PATCH 0/5] i965: Non-overridden OpenGLES 3.1 context on Gen8+

2015-12-15 Thread Jordan Justen
git://people.freedesktop.org/~jljusten/mesa es31-gen8-v1

With this series, gen8+ should be able to create an OpenGLES 3.1
context without any environment variable overrides.

Jordan Justen (5):
  main: Add MESA_VERBOSE=api for LinkProgram & UseProgram
  main: Allow compute shaders to be compiled with OpenGLES 3.1
  main/version: Don't require ARB_compute_shader for OpenGLES 3.1
  i965: Enable compute shaders in more cases for OpenGLES 3.1
  i965/screen: Allow OpenGLES 3.1 for gen8+

 src/mesa/drivers/dri/i965/brw_context.c  | 5 -
 src/mesa/drivers/dri/i965/intel_screen.c | 5 +
 src/mesa/main/shaderapi.c| 7 ++-
 src/mesa/main/version.c  | 9 ++---
 4 files changed, 21 insertions(+), 5 deletions(-)

-- 
2.6.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 5/5] i965/screen: Allow OpenGLES 3.1 for gen8+

2015-12-15 Thread Jordan Justen
OpenGLES 3.1 cannot be enabled for gen 7 (Ivy Bridge, Haswell) since
they are still missing ARB_stencil_texturing.

Signed-off-by: Jordan Justen 
Cc: Ian Romanick 
Cc: Marta Lofstedt 
---
 src/mesa/drivers/dri/i965/intel_screen.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
b/src/mesa/drivers/dri/i965/intel_screen.c
index 825a7c1..13498f4 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -1338,6 +1338,11 @@ set_max_gl_versions(struct intel_screen *screen)
switch (screen->devinfo->gen) {
case 9:
case 8:
+  psp->max_gl_core_version = 33;
+  psp->max_gl_compat_version = 30;
+  psp->max_gl_es1_version = 11;
+  psp->max_gl_es2_version = 31;
+  break;
case 7:
case 6:
   psp->max_gl_core_version = 33;
-- 
2.6.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/5] main: Allow compute shaders to be compiled with OpenGLES 3.1

2015-12-15 Thread Jordan Justen
Previous OpenGLES 3.1 testing had been done when ARB_compute_shader
was overridden to enabled.

Signed-off-by: Jordan Justen 
Cc: Marta Lofstedt 
---
 src/mesa/main/shaderapi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
index a732d83..e258ad9 100644
--- a/src/mesa/main/shaderapi.c
+++ b/src/mesa/main/shaderapi.c
@@ -208,7 +208,7 @@ _mesa_validate_shader_target(const struct gl_context *ctx, 
GLenum type)
case GL_TESS_EVALUATION_SHADER:
   return ctx == NULL || _mesa_has_tessellation(ctx);
case GL_COMPUTE_SHADER:
-  return ctx == NULL || ctx->Extensions.ARB_compute_shader;
+  return ctx == NULL || _mesa_has_compute_shaders(ctx);
default:
   return false;
}
-- 
2.6.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 4/5] i965: Enable compute shaders in more cases for OpenGLES 3.1

2015-12-15 Thread Jordan Justen
Previously we were checking the desktop OpenGL ARB_compute_shader
requirements, but for OpenGLES 3.1, the requirements are lower.

Signed-off-by: Jordan Justen 
Cc: Marta Lofstedt 
---
 src/mesa/drivers/dri/i965/brw_context.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index 0abe601..5105625 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -377,7 +377,10 @@ brw_initialize_context_constants(struct brw_context *brw)
   [MESA_SHADER_GEOMETRY] = brw->gen >= 6,
   [MESA_SHADER_FRAGMENT] = true,
   [MESA_SHADER_COMPUTE] =
- (ctx->Const.MaxComputeWorkGroupSize[0] >= 1024) ||
+ (ctx->API == API_OPENGL_CORE &&
+  ctx->Const.MaxComputeWorkGroupSize[0] >= 1024) ||
+ (ctx->API == API_OPENGLES2 &&
+  ctx->Const.MaxComputeWorkGroupSize[0] >= 128) ||
  _mesa_extension_override_enables.ARB_compute_shader,
};
 
-- 
2.6.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/5] main/version: Don't require ARB_compute_shader for OpenGLES 3.1

2015-12-15 Thread Jordan Justen
The OpenGL ARB_compute_shader extension specfication requires at least
1024 for GL_MAX_COMPUTE_WORK_GROUP_INVOCATIONS, whereas OpenGLES 3.1
only required 128.

Signed-off-by: Jordan Justen 
Cc: Ian Romanick 
Cc: Marta Lofstedt 
---
 src/mesa/main/version.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/src/mesa/main/version.c b/src/mesa/main/version.c
index e92bb11..112a73d 100644
--- a/src/mesa/main/version.c
+++ b/src/mesa/main/version.c
@@ -433,7 +433,8 @@ compute_version_es1(const struct gl_extensions *extensions)
 }
 
 static GLuint
-compute_version_es2(const struct gl_extensions *extensions)
+compute_version_es2(const struct gl_extensions *extensions,
+const struct gl_constants *consts)
 {
/* OpenGL ES 2.0 is derived from OpenGL 2.0 */
const bool ver_2_0 = (extensions->ARB_texture_cube_map &&
@@ -464,9 +465,11 @@ compute_version_es2(const struct gl_extensions *extensions)
  extensions->EXT_texture_snorm &&
  extensions->NV_primitive_restart &&
  extensions->OES_depth_texture_cube_map);
+   const bool es31_compute_shader =
+  consts->MaxComputeWorkGroupInvocations >= 128;
const bool ver_3_1 = (ver_3_0 &&
  extensions->ARB_arrays_of_arrays &&
- extensions->ARB_compute_shader &&
+ es31_compute_shader &&
  extensions->ARB_draw_indirect &&
  extensions->ARB_explicit_uniform_location &&
  extensions->ARB_framebuffer_no_attachments &&
@@ -508,7 +511,7 @@ _mesa_get_version(const struct gl_extensions *extensions,
case API_OPENGLES:
   return compute_version_es1(extensions);
case API_OPENGLES2:
-  return compute_version_es2(extensions);
+  return compute_version_es2(extensions, consts);
}
return 0;
 }
-- 
2.6.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/5] main: Add MESA_VERBOSE=api for LinkProgram & UseProgram

2015-12-15 Thread Jordan Justen
Signed-off-by: Jordan Justen 
---
 src/mesa/main/shaderapi.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
index ac40891..a732d83 100644
--- a/src/mesa/main/shaderapi.c
+++ b/src/mesa/main/shaderapi.c
@@ -1514,6 +1514,8 @@ void GLAPIENTRY
 _mesa_LinkProgram(GLhandleARB programObj)
 {
GET_CURRENT_CONTEXT(ctx);
+   if (MESA_VERBOSE & VERBOSE_API)
+  _mesa_debug(ctx, "glLinkProgram %u\n", programObj);
link_program(ctx, programObj);
 }
 
@@ -1731,6 +1733,9 @@ _mesa_UseProgram(GLhandleARB program)
GET_CURRENT_CONTEXT(ctx);
struct gl_shader_program *shProg;
 
+   if (MESA_VERBOSE & VERBOSE_API)
+  _mesa_debug(ctx, "glUseProgram %u\n", program);
+
if (_mesa_is_xfb_active_and_unpaused(ctx)) {
   _mesa_error(ctx, GL_INVALID_OPERATION,
   "glUseProgram(transform feedback active)");
-- 
2.6.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] svga: don't use debug code in update_state() in release builds

2015-12-15 Thread Brian Paul
---
 src/gallium/drivers/svga/svga_state.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/gallium/drivers/svga/svga_state.c 
b/src/gallium/drivers/svga/svga_state.c
index 722b369..4479a27 100644
--- a/src/gallium/drivers/svga/svga_state.c
+++ b/src/gallium/drivers/svga/svga_state.c
@@ -129,7 +129,11 @@ update_state(struct svga_context *svga,
  const struct svga_tracked_state *atoms[],
  unsigned *state)
 {
+#ifdef DEBUG
boolean debug = TRUE;
+#else
+   boolean debug = FALSE;
+#endif
enum pipe_error ret = PIPE_OK;
unsigned i;
 
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] st/osmesa: add OSMesaCreateContextAttribs() function

2015-12-15 Thread Brian Paul
As with the previous commit, except for gallium.
---
 src/gallium/state_trackers/osmesa/osmesa.c | 96 +-
 1 file changed, 93 insertions(+), 3 deletions(-)

diff --git a/src/gallium/state_trackers/osmesa/osmesa.c 
b/src/gallium/state_trackers/osmesa/osmesa.c
index 0f27ba8..ee78910 100644
--- a/src/gallium/state_trackers/osmesa/osmesa.c
+++ b/src/gallium/state_trackers/osmesa/osmesa.c
@@ -544,11 +544,39 @@ GLAPI OSMesaContext GLAPIENTRY
 OSMesaCreateContextExt(GLenum format, GLint depthBits, GLint stencilBits,
GLint accumBits, OSMesaContext sharelist)
 {
+   int attribs[100], n = 0;
+
+   attribs[n++] = OSMESA_FORMAT;
+   attribs[n++] = format;
+   attribs[n++] = OSMESA_DEPTH_BITS;
+   attribs[n++] = depthBits;
+   attribs[n++] = OSMESA_STENCIL_BITS;
+   attribs[n++] = stencilBits;
+   attribs[n++] = OSMESA_ACCUM_BITS;
+   attribs[n++] = accumBits;
+   attribs[n++] = 0;
+
+   return OSMesaCreateContextAttribs(attribs, sharelist);
+}
+
+
+/**
+ * New in Mesa 11.2
+ *
+ * Create context with attribute list.
+ */
+GLAPI OSMesaContext GLAPIENTRY
+OSMesaCreateContextAttribs(const int *attribList, OSMesaContext sharelist)
+{
OSMesaContext osmesa;
struct st_context_iface *st_shared;
enum st_context_error st_error = 0;
struct st_context_attribs attribs;
struct st_api *stapi = get_st_api();
+   GLenum format = GL_RGBA;
+   int depthBits = 0, stencilBits = 0, accumBits = 0;
+   int profile = OSMESA_COMPAT_PROFILE, version_major = 1, version_minor = 0;
+   int i;
 
if (sharelist) {
   st_shared = sharelist->stctx;
@@ -561,6 +589,64 @@ OSMesaCreateContextExt(GLenum format, GLint depthBits, 
GLint stencilBits,
if (!osmesa)
   return NULL;
 
+   for (i = 0; attribList[i]; i += 2) {
+  switch (attribList[i]) {
+  case OSMESA_FORMAT:
+ format = attribList[i+1];
+ switch (format) {
+ case OSMESA_COLOR_INDEX:
+ case OSMESA_RGBA:
+ case OSMESA_BGRA:
+ case OSMESA_ARGB:
+ case OSMESA_RGB:
+ case OSMESA_BGR:
+ case OSMESA_RGB_565:
+/* legal */
+break;
+ default:
+return NULL;
+ }
+ break;
+  case OSMESA_DEPTH_BITS:
+ depthBits = attribList[i+1];
+ if (depthBits < 0)
+return NULL;
+ break;
+  case OSMESA_STENCIL_BITS:
+ stencilBits = attribList[i+1];
+ if (stencilBits < 0)
+return NULL;
+ break;
+  case OSMESA_ACCUM_BITS:
+ accumBits = attribList[i+1];
+ if (accumBits < 0)
+return NULL;
+ break;
+  case OSMESA_PROFILE:
+ profile = attribList[i+1];
+ if (profile != OSMESA_CORE_PROFILE &&
+ profile != OSMESA_COMPAT_PROFILE)
+return NULL;
+ break;
+  case OSMESA_CONTEXT_MAJOR_VERSION:
+ version_major = attribList[i+1];
+ if (version_major < 1)
+return NULL;
+ break;
+  case OSMESA_CONTEXT_MINOR_VERSION:
+ version_minor = attribList[i+1];
+ if (version_minor < 0)
+return NULL;
+ break;
+  case 0:
+ /* end of list */
+ break;
+  default:
+ fprintf(stderr, "Bad attribute in OSMesaCreateContextAttribs()\n");
+ return NULL;
+  }
+   }
+
/* Choose depth/stencil/accum buffer formats */
if (accumBits > 0) {
   osmesa->accum_format = PIPE_FORMAT_R16G16B16A16_SNORM;
@@ -581,9 +667,11 @@ OSMesaCreateContextExt(GLenum format, GLint depthBits, 
GLint stencilBits,
/*
 * Create the rendering context
 */
-   attribs.profile = ST_PROFILE_DEFAULT;
-   attribs.major = 2;
-   attribs.minor = 1;
+   memset(, 0, sizeof(attribs));
+   attribs.profile = (profile == OSMESA_CORE_PROFILE)
+  ? ST_PROFILE_OPENGL_CORE : ST_PROFILE_DEFAULT;
+   attribs.major = version_major;
+   attribs.minor = version_minor;
attribs.flags = 0;  /* ST_CONTEXT_FLAG_x */
attribs.options.force_glsl_extensions_warn = FALSE;
attribs.options.disable_blend_func_extended = FALSE;
@@ -614,6 +702,7 @@ OSMesaCreateContextExt(GLenum format, GLint depthBits, 
GLint stencilBits,
 }
 
 
+
 /**
  * Destroy an Off-Screen Mesa rendering context.
  *
@@ -883,6 +972,7 @@ struct name_function
 static struct name_function functions[] = {
{ "OSMesaCreateContext", (OSMESAproc) OSMesaCreateContext },
{ "OSMesaCreateContextExt", (OSMESAproc) OSMesaCreateContextExt },
+   { "OSMesaCreateContextAttribs", (OSMESAproc) OSMesaCreateContextAttribs },
{ "OSMesaDestroyContext", (OSMESAproc) OSMesaDestroyContext },
{ "OSMesaMakeCurrent", (OSMESAproc) OSMesaMakeCurrent },
{ "OSMesaGetCurrentContext", (OSMESAproc) OSMesaGetCurrentContext },
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 03/11] st/mesa: implement GL_ATI_fragment_shader

2015-12-15 Thread Ilia Mirkin
Hardly a complete review, but a handful of comments:

On Tue, Dec 15, 2015 at 6:05 PM, Miklós Máté  wrote:
> ---
>  src/mesa/Makefile.sources |   1 +
>  src/mesa/state_tracker/st_atifs_to_tgsi.c | 798 
> ++
>  src/mesa/state_tracker/st_atifs_to_tgsi.h |  49 ++
>  src/mesa/state_tracker/st_atom_constbuf.c |  14 +
>  src/mesa/state_tracker/st_cb_drawpixels.c |   1 +
>  src/mesa/state_tracker/st_cb_program.c|  35 +-
>  src/mesa/state_tracker/st_program.c   |  22 +
>  src/mesa/state_tracker/st_program.h   |   1 +
>  8 files changed, 920 insertions(+), 1 deletion(-)
>  create mode 100644 src/mesa/state_tracker/st_atifs_to_tgsi.c
>  create mode 100644 src/mesa/state_tracker/st_atifs_to_tgsi.h
>
> +static struct ureg_src prepare_argument(struct st_translate *t, const 
> unsigned argId,
> +  const struct atifragshader_src_register *srcReg)
> +{
> +   struct ureg_src src = get_source(t, srcReg->Index);
> +   struct ureg_dst arg = get_temp(t, MAX_NUM_FRAGMENT_REGISTERS_ATI+argId);
> +
> +   switch (srcReg->argRep) {
> +  case GL_NONE:
> + break;
> +  case GL_RED:
> + src = ureg_swizzle(src,
> +   TGSI_SWIZZLE_X, TGSI_SWIZZLE_X, TGSI_SWIZZLE_X, 
> TGSI_SWIZZLE_X);
> + break;
> +  case GL_GREEN:
> + src = ureg_swizzle(src,
> +   TGSI_SWIZZLE_Y, TGSI_SWIZZLE_Y, TGSI_SWIZZLE_Y, 
> TGSI_SWIZZLE_Y);
> + break;
> +  case GL_BLUE:
> + src = ureg_swizzle(src,
> +   TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z, 
> TGSI_SWIZZLE_Z);
> + break;
> +  case GL_ALPHA:
> + src = ureg_swizzle(src,
> +   TGSI_SWIZZLE_W, TGSI_SWIZZLE_W, TGSI_SWIZZLE_W, 
> TGSI_SWIZZLE_W);
> + break;
> +   }
> +   emit_insn(t, TGSI_OPCODE_MOV, , 1, , 1);
> +
> +   if (srcReg->argMod & GL_COMP_BIT_ATI) {
> +  struct ureg_src modsrc[2];
> +  modsrc[0] = ureg_imm1f(t->ureg, 1.0);
> +  modsrc[1] = ureg_src(arg);
> +
> +  emit_insn(t, TGSI_OPCODE_SUB, , 1, modsrc, 2);
> +   }
> +   if (srcReg->argMod & GL_BIAS_BIT_ATI) {
> +  struct ureg_src modsrc[2];
> +  modsrc[0] = ureg_src(arg);
> +  modsrc[1] = ureg_imm1f(t->ureg, 0.5);
> +
> +  emit_insn(t, TGSI_OPCODE_SUB, , 1, modsrc, 2);
> +   }
> +   if (srcReg->argMod & GL_2X_BIT_ATI) {
> +  struct ureg_src modsrc[2];
> +  modsrc[0] = ureg_src(arg);
> +  modsrc[1] = ureg_imm1f(t->ureg, 2.0);
> +
> +  emit_insn(t, TGSI_OPCODE_MUL, , 1, modsrc, 2);

aka ADD arg, arg, arg

> +   }
> +   if (srcReg->argMod & GL_NEGATE_BIT_ATI) {
> +  struct ureg_src modsrc[2];
> +  modsrc[0] = ureg_src(arg);
> +  modsrc[1] = ureg_imm1f(t->ureg, -1.0);
> +
> +  emit_insn(t, TGSI_OPCODE_MUL, , 1, modsrc, 2);

aka NEG arg, arg

> +   }
> +   return  ureg_src(arg);
> +}
> +
> +/* These instructions have no direct equivalent in TGSI */
> +static void emit_special_inst(struct st_translate *t, struct 
> instruction_desc *desc,
> +  struct ureg_dst *dst, struct ureg_src *args, unsigned argcount)
> +{
> +   struct ureg_dst tmp[1];
> +   struct ureg_src src[3];
> +
> +   if(desc->special == 1) {
> +  tmp[0] = get_temp(t, MAX_NUM_FRAGMENT_REGISTERS_ATI+2); // re-purpose 
> a3
> +  src[0] = ureg_imm1f(t->ureg, 0.5f);
> +  src[1] = args[2];
> +  emit_insn(t, TGSI_OPCODE_SLT, tmp, 1, src, 2);
> +  src[0] = ureg_src(tmp[0]);
> +  src[1] = args[0];
> +  src[2] = args[1];
> +  emit_insn(t, TGSI_OPCODE_LRP, dst, 1, src, 3);
> +   } else if (desc->special == 2) {
> +  tmp[0] = get_temp(t, MAX_NUM_FRAGMENT_REGISTERS_ATI+2); // re-purpose 
> a3
> +  src[0] = args[2];
> +  src[1] = ureg_imm1f(t->ureg, 0.0f);
> +  emit_insn(t, TGSI_OPCODE_SGE, tmp, 1, src, 2);
> +  src[0] = ureg_src(tmp[0]);
> +  src[1] = args[0];
> +  src[2] = args[1];
> +  emit_insn(t, TGSI_OPCODE_LRP, dst, 1, src, 3);

Isn't this the CMP instruction? Just flip the args.

http://gallium.readthedocs.org/en/latest/tgsi.html#opcode-CMP

The other one should be expressible as CMP as well I think.

> +   } else if (desc->special == 3) {
> +  src[0] = args[0];
> +  src[1] = args[1];
> +  src[2] = ureg_swizzle(args[2],
> +TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z);
> +  emit_insn(t, TGSI_OPCODE_DP2A, dst, 1, src, 3);
> +   }
> +}
> +
> +static void emit_arith_inst(struct st_translate *t,
> +  struct instruction_desc *desc,
> +  struct ureg_dst *dst, struct ureg_src *args, unsigned argcount)
> +{
> +   if (desc->special) {
> +  return emit_special_inst(t, desc, dst, args, argcount);
> +   }
> +
> +   emit_insn(t, desc->TGSI_opcode, dst, 1, args, argcount);
> +}
> +
> +static void emit_dstmod(struct st_translate *t,
> +  struct ureg_dst dst, GLuint dstMod)
> +{
> +   float imm = 0.0;

1.0 right? (if you just have the saturate bit)

> +   struct ureg_src src[3];
> +
> +   if 

Re: [Mesa-dev] [PATCH 03/11] st/mesa: implement GL_ATI_fragment_shader

2015-12-15 Thread Ian Romanick
On 12/15/2015 04:40 PM, Ilia Mirkin wrote:
> Hardly a complete review, but a handful of comments:
> 
> On Tue, Dec 15, 2015 at 6:05 PM, Miklós Máté  wrote:
>> ---
>>  src/mesa/Makefile.sources |   1 +
>>  src/mesa/state_tracker/st_atifs_to_tgsi.c | 798 
>> ++
>>  src/mesa/state_tracker/st_atifs_to_tgsi.h |  49 ++
>>  src/mesa/state_tracker/st_atom_constbuf.c |  14 +
>>  src/mesa/state_tracker/st_cb_drawpixels.c |   1 +
>>  src/mesa/state_tracker/st_cb_program.c|  35 +-
>>  src/mesa/state_tracker/st_program.c   |  22 +
>>  src/mesa/state_tracker/st_program.h   |   1 +
>>  8 files changed, 920 insertions(+), 1 deletion(-)
>>  create mode 100644 src/mesa/state_tracker/st_atifs_to_tgsi.c
>>  create mode 100644 src/mesa/state_tracker/st_atifs_to_tgsi.h
>>
>> +static struct ureg_src prepare_argument(struct st_translate *t, const 
>> unsigned argId,
>> +  const struct atifragshader_src_register *srcReg)
>> +{
>> +   struct ureg_src src = get_source(t, srcReg->Index);
>> +   struct ureg_dst arg = get_temp(t, MAX_NUM_FRAGMENT_REGISTERS_ATI+argId);
>> +
>> +   switch (srcReg->argRep) {
>> +  case GL_NONE:
>> + break;
>> +  case GL_RED:
>> + src = ureg_swizzle(src,
>> +   TGSI_SWIZZLE_X, TGSI_SWIZZLE_X, TGSI_SWIZZLE_X, 
>> TGSI_SWIZZLE_X);
>> + break;
>> +  case GL_GREEN:
>> + src = ureg_swizzle(src,
>> +   TGSI_SWIZZLE_Y, TGSI_SWIZZLE_Y, TGSI_SWIZZLE_Y, 
>> TGSI_SWIZZLE_Y);
>> + break;
>> +  case GL_BLUE:
>> + src = ureg_swizzle(src,
>> +   TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z, 
>> TGSI_SWIZZLE_Z);
>> + break;
>> +  case GL_ALPHA:
>> + src = ureg_swizzle(src,
>> +   TGSI_SWIZZLE_W, TGSI_SWIZZLE_W, TGSI_SWIZZLE_W, 
>> TGSI_SWIZZLE_W);
>> + break;
>> +   }
>> +   emit_insn(t, TGSI_OPCODE_MOV, , 1, , 1);
>> +
>> +   if (srcReg->argMod & GL_COMP_BIT_ATI) {
>> +  struct ureg_src modsrc[2];
>> +  modsrc[0] = ureg_imm1f(t->ureg, 1.0);
>> +  modsrc[1] = ureg_src(arg);
>> +
>> +  emit_insn(t, TGSI_OPCODE_SUB, , 1, modsrc, 2);
>> +   }
>> +   if (srcReg->argMod & GL_BIAS_BIT_ATI) {
>> +  struct ureg_src modsrc[2];
>> +  modsrc[0] = ureg_src(arg);
>> +  modsrc[1] = ureg_imm1f(t->ureg, 0.5);
>> +
>> +  emit_insn(t, TGSI_OPCODE_SUB, , 1, modsrc, 2);
>> +   }
>> +   if (srcReg->argMod & GL_2X_BIT_ATI) {
>> +  struct ureg_src modsrc[2];
>> +  modsrc[0] = ureg_src(arg);
>> +  modsrc[1] = ureg_imm1f(t->ureg, 2.0);
>> +
>> +  emit_insn(t, TGSI_OPCODE_MUL, , 1, modsrc, 2);
> 
> aka ADD arg, arg, arg
> 
>> +   }
>> +   if (srcReg->argMod & GL_NEGATE_BIT_ATI) {
>> +  struct ureg_src modsrc[2];
>> +  modsrc[0] = ureg_src(arg);
>> +  modsrc[1] = ureg_imm1f(t->ureg, -1.0);
>> +
>> +  emit_insn(t, TGSI_OPCODE_MUL, , 1, modsrc, 2);
> 
> aka NEG arg, arg
> 
>> +   }
>> +   return  ureg_src(arg);
>> +}
>> +
>> +/* These instructions have no direct equivalent in TGSI */
>> +static void emit_special_inst(struct st_translate *t, struct 
>> instruction_desc *desc,
>> +  struct ureg_dst *dst, struct ureg_src *args, unsigned argcount)
>> +{
>> +   struct ureg_dst tmp[1];
>> +   struct ureg_src src[3];
>> +
>> +   if(desc->special == 1) {
>> +  tmp[0] = get_temp(t, MAX_NUM_FRAGMENT_REGISTERS_ATI+2); // re-purpose 
>> a3
>> +  src[0] = ureg_imm1f(t->ureg, 0.5f);
>> +  src[1] = args[2];
>> +  emit_insn(t, TGSI_OPCODE_SLT, tmp, 1, src, 2);
>> +  src[0] = ureg_src(tmp[0]);
>> +  src[1] = args[0];
>> +  src[2] = args[1];
>> +  emit_insn(t, TGSI_OPCODE_LRP, dst, 1, src, 3);
>> +   } else if (desc->special == 2) {
>> +  tmp[0] = get_temp(t, MAX_NUM_FRAGMENT_REGISTERS_ATI+2); // re-purpose 
>> a3
>> +  src[0] = args[2];
>> +  src[1] = ureg_imm1f(t->ureg, 0.0f);
>> +  emit_insn(t, TGSI_OPCODE_SGE, tmp, 1, src, 2);
>> +  src[0] = ureg_src(tmp[0]);
>> +  src[1] = args[0];
>> +  src[2] = args[1];
>> +  emit_insn(t, TGSI_OPCODE_LRP, dst, 1, src, 3);
> 
> Isn't this the CMP instruction? Just flip the args.
> 
> http://gallium.readthedocs.org/en/latest/tgsi.html#opcode-CMP
> 
> The other one should be expressible as CMP as well I think.
> 
>> +   } else if (desc->special == 3) {
>> +  src[0] = args[0];
>> +  src[1] = args[1];
>> +  src[2] = ureg_swizzle(args[2],
>> +TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z);
>> +  emit_insn(t, TGSI_OPCODE_DP2A, dst, 1, src, 3);
>> +   }
>> +}
>> +
>> +static void emit_arith_inst(struct st_translate *t,
>> +  struct instruction_desc *desc,
>> +  struct ureg_dst *dst, struct ureg_src *args, unsigned argcount)
>> +{
>> +   if (desc->special) {
>> +  return emit_special_inst(t, desc, dst, args, argcount);
>> +   }
>> +
>> +   emit_insn(t, desc->TGSI_opcode, dst, 1, args, argcount);
>> +}
>> +
>> +static void 

Re: [Mesa-dev] [PATCH 4/5] i965: Enable compute shaders in more cases for OpenGLES 3.1

2015-12-15 Thread Ian Romanick
Doesn't this make patch 3 irrelevant?  FWIW, I like this better.

On 12/15/2015 04:08 PM, Jordan Justen wrote:
> Previously we were checking the desktop OpenGL ARB_compute_shader
> requirements, but for OpenGLES 3.1, the requirements are lower.
> 
> Signed-off-by: Jordan Justen 
> Cc: Marta Lofstedt 
> ---
>  src/mesa/drivers/dri/i965/brw_context.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
> b/src/mesa/drivers/dri/i965/brw_context.c
> index 0abe601..5105625 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.c
> +++ b/src/mesa/drivers/dri/i965/brw_context.c
> @@ -377,7 +377,10 @@ brw_initialize_context_constants(struct brw_context *brw)
>[MESA_SHADER_GEOMETRY] = brw->gen >= 6,
>[MESA_SHADER_FRAGMENT] = true,
>[MESA_SHADER_COMPUTE] =
> - (ctx->Const.MaxComputeWorkGroupSize[0] >= 1024) ||
> + (ctx->API == API_OPENGL_CORE &&
> +  ctx->Const.MaxComputeWorkGroupSize[0] >= 1024) ||
> + (ctx->API == API_OPENGLES2 &&
> +  ctx->Const.MaxComputeWorkGroupSize[0] >= 128) ||
>   _mesa_extension_override_enables.ARB_compute_shader,
> };
>  
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 7/7] i965: Reduce vertex state reemission

2015-12-15 Thread Ian Romanick
On 12/15/2015 12:28 AM, Kristian Høgsberg Kristensen wrote:
> We can inspect VS prog_data for iterations i > 0, and only flag
> BRW_NEW_VERTICES when one of our system values change.
> 
> This change also flags BRW_NEW_VERTICES in one case we were missing
> before: if we're doing an indirect draw, prims[i].basevertex is always 0
> and the real base vertex value is in the indirect parameter
> buffer. Thus, if a program uses base vertex or base instance, and the
> draw call is indirect, flag BRW_NEW_VERTICES.  A new piglit test,
> spec/ARB_shader_draw_parameters/drawid-indirect-vertexid tests this.
> ---
>  src/mesa/drivers/dri/i965/brw_draw.c | 44 
> 
>  1 file changed, 40 insertions(+), 4 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_draw.c 
> b/src/mesa/drivers/dri/i965/brw_draw.c
> index b0710c67..9e400ca 100644
> --- a/src/mesa/drivers/dri/i965/brw_draw.c
> +++ b/src/mesa/drivers/dri/i965/brw_draw.c
> @@ -491,9 +491,44 @@ brw_try_draw_prims(struct gl_context *ctx,
>   }
>}
>  
> -  brw->draw.params.gl_basevertex =
> +  /* Determine if we need to flag BRW_NEW_VERTICES for updating the
> +   * gl_BaseVertexARB, gl_BaseInstanceARB or gl_DrawIDARB values. As
> +   * above, we don't need to check first iteration, since the flag is set
> +   * before the loop. We also can't rely on vs prog_data in the first
> +   * iteration, but after drawing once, we've uploaded the programs and
> +   * can look at prog_data.
> +   *
> +   * Despite the prims[] name, eache iteration correspond to a draw call
  eachcorresponds

> +   * from a glMulti* style draw call. We need to re-upload vertex state 
> if
> +   *
> +   *  1) the program uses gl_DrawIDARB (changes every iteration),
> +   *
> +   *  2) the program uses gl_BaseVertexARB or gl_BaseInstanceARB and the
> +   * draw call is indirect (meaning we can't check if the value 
> change
> +   * or not), or
> +   *
> +   *  3) the program uses gl_BaseVertexARB or gl_BaseInstanceARB and the
> +   *  value changed
> +   */
> +  const int new_basevertex =
>   prims[i].indexed ? prims[i].basevertex : prims[i].start;
> -  brw->draw.params.gl_baseinstance = prims[i].base_instance;
> +  const int new_baseinstance = prims[i].base_instance;
> +  if (i > 0) {
> + const bool uses_draw_parameters =
> +brw->vs.prog_data->uses_basevertex ||
> +brw->vs.prog_data->uses_baseinstance;
> +
> + if (brw->vs.prog_data->uses_drawid ||
> + (uses_draw_parameters && prims[i].is_indirect) ||
> + (brw->vs.prog_data->uses_basevertex &&
> +  brw->draw.params.gl_basevertex != new_basevertex) ||
> + (brw->vs.prog_data->uses_baseinstance &&
> +  brw->draw.params.gl_baseinstance != new_baseinstance))
> +brw->ctx.NewDriverState |= BRW_NEW_VERTICES;
> +  }
> +
> +  brw->draw.params.gl_basevertex = new_basevertex;
> +  brw->draw.params.gl_baseinstance = new_baseinstance;
>drm_intel_bo_unreference(brw->draw.draw_params_bo);
>  
>if (prims[i].is_indirect) {
> @@ -512,10 +547,11 @@ brw_try_draw_prims(struct gl_context *ctx,
>}
>  
>/* gl_DrawID always needs its own vertex buffer since it's not part of
> -   * the indirect parameter buffer. */
> +   * the indirect parameter buffer.
> +   */

Lol

>brw->draw.gl_drawid = prims[i].draw_id;
>drm_intel_bo_unreference(brw->draw.draw_id_bo);
> -  brw->ctx.NewDriverState |= BRW_NEW_VERTICES;
> +  brw->draw.draw_id_bo = NULL;

It seems odd that this change is in this patch.  Should it have always
been after the drm_intel_bo_unreference call?

>  
>if (brw->gen < 6)
>brw_set_prim(brw, [i]);
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] osmesa: add new OSMesaCreateContextAttribs function

2015-12-15 Thread Brian Paul
This allows specifying a GL profile and version so one can get a core-
profile context.
---
 docs/relnotes/11.2.0.html|   2 +
 include/GL/osmesa.h  |  45 -
 src/mesa/drivers/osmesa/osmesa.c | 104 ++-
 3 files changed, 148 insertions(+), 3 deletions(-)

diff --git a/docs/relnotes/11.2.0.html b/docs/relnotes/11.2.0.html
index 12e0f07..e382856 100644
--- a/docs/relnotes/11.2.0.html
+++ b/docs/relnotes/11.2.0.html
@@ -56,6 +56,8 @@ Note: some of the new features are only available with 
certain drivers.
 GL_ARB_vertex_type_10f_11f_11f_rev on freedreno/a4xx
 GL_KHR_texture_compression_astc_ldr on freedreno/a4xx
 GL_AMD_performance_monitor on radeonsi (CIK+ only)
+New OSMesaCreateContextAttribs() function (for creating core profile
+contexts)
 
 
 Bug fixes
diff --git a/include/GL/osmesa.h b/include/GL/osmesa.h
index ca0d167..39cd54e 100644
--- a/include/GL/osmesa.h
+++ b/include/GL/osmesa.h
@@ -58,8 +58,8 @@ extern "C" {
 #include 
 
 
-#define OSMESA_MAJOR_VERSION 10
-#define OSMESA_MINOR_VERSION 0
+#define OSMESA_MAJOR_VERSION 11
+#define OSMESA_MINOR_VERSION 2
 #define OSMESA_PATCH_VERSION 0
 
 
@@ -95,6 +95,18 @@ extern "C" {
 #define OSMESA_MAX_WIDTH   0x24  /* new in 4.0 */
 #define OSMESA_MAX_HEIGHT  0x25  /* new in 4.0 */
 
+/*
+ * Accepted in OSMesaCreateContextAttrib's attribute list.
+ */
+#define OSMESA_DEPTH_BITS0x30
+#define OSMESA_STENCIL_BITS  0x31
+#define OSMESA_ACCUM_BITS0x32
+#define OSMESA_PROFILE   0x33
+#define OSMESA_CORE_PROFILE  0x34
+#define OSMESA_COMPAT_PROFILE0x35
+#define OSMESA_CONTEXT_MAJOR_VERSION 0x36
+#define OSMESA_CONTEXT_MINOR_VERSION 0x37
+
 
 typedef struct osmesa_context *OSMesaContext;
 
@@ -128,6 +140,35 @@ OSMesaCreateContextExt( GLenum format, GLint depthBits, 
GLint stencilBits,
 
 
 /*
+ * Create an Off-Screen Mesa rendering context with attribute list.
+ * The list is composed of (attribute, value) pairs and terminated with
+ * attribute==0.  Supported Attributes:
+ *
+ * AttributesValues
+ * --
+ * OSMESA_FORMAT OSMESA_RGBA*, OSMESA_BGRA, OSMESA_ARGB, etc.
+ * OSMESA_DEPTH_BITS 0*, 16, 24, 32
+ * OSMESA_STENCIL_BITS   0*, 8
+ * OSMESA_ACCUM_BITS 0*, 16
+ * OSMESA_PROFILEOSMESA_COMPAT_PROFILE*, OSMESA_CORE_PROFILE
+ * OSMESA_CONTEXT_MAJOR_VERSION  1*, 2, 3
+ * OSMESA_CONTEXT_MINOR_VERSION  0+
+ *
+ * Note: * = default value
+ *
+ * We return a context version >= what's specified by OSMESA_CONTEXT_MAJOR/
+ * MINOR_VERSION for the given profile.  For example, if you request a GL 1.4
+ * compat profile, you might get a GL 3.0 compat profile.
+ * Otherwise, null is returned if the version/profile is not supported.
+ *
+ * New in Mesa 11.2
+ */
+GLAPI OSMesaContext GLAPIENTRY
+OSMesaCreateContextAttribs( const int *attribList, OSMesaContext sharelist );
+
+
+
+/*
  * Destroy an Off-Screen Mesa rendering context.
  *
  * Input:  ctx - the context to destroy
diff --git a/src/mesa/drivers/osmesa/osmesa.c b/src/mesa/drivers/osmesa/osmesa.c
index 5c7dcac..8f14dfd 100644
--- a/src/mesa/drivers/osmesa/osmesa.c
+++ b/src/mesa/drivers/osmesa/osmesa.c
@@ -645,10 +645,104 @@ GLAPI OSMesaContext GLAPIENTRY
 OSMesaCreateContextExt( GLenum format, GLint depthBits, GLint stencilBits,
 GLint accumBits, OSMesaContext sharelist )
 {
+   int attribs[100], n = 0;
+
+   attribs[n++] = OSMESA_FORMAT;
+   attribs[n++] = format;
+   attribs[n++] = OSMESA_DEPTH_BITS;
+   attribs[n++] = depthBits;
+   attribs[n++] = OSMESA_STENCIL_BITS;
+   attribs[n++] = stencilBits;
+   attribs[n++] = OSMESA_ACCUM_BITS;
+   attribs[n++] = accumBits;
+   attribs[n++] = 0;
+
+   return OSMesaCreateContextAttribs(attribs, sharelist);
+}
+
+
+/**
+ * New in Mesa 11.2
+ *
+ * Create context with attribute list.
+ */
+GLAPI OSMesaContext GLAPIENTRY
+OSMesaCreateContextAttribs(const int *attribList, OSMesaContext sharelist)
+{
OSMesaContext osmesa;
struct dd_function_table functions;
GLint rind, gind, bind, aind;
GLint redBits = 0, greenBits = 0, blueBits = 0, alphaBits =0;
+   GLenum format = OSMESA_RGBA;
+   GLint depthBits = 0, stencilBits = 0, accumBits = 0;
+   int profile = OSMESA_COMPAT_PROFILE, version_major = 1, version_minor = 0;
+   gl_api api_profile = API_OPENGL_COMPAT;
+   int i;
+
+   osmesa = (OSMesaContext) CALLOC_STRUCT(osmesa_context);
+   if (!osmesa)
+  return NULL;
+
+   for (i = 0; attribList[i]; i += 2) {
+  switch (attribList[i]) {
+  case OSMESA_FORMAT:
+ format = attribList[i+1];
+ switch (format) {
+ case OSMESA_COLOR_INDEX:
+ case OSMESA_RGBA:
+ case OSMESA_BGRA:
+ case OSMESA_ARGB:
+ case OSMESA_RGB:
+ case OSMESA_BGR:
+ case OSMESA_RGB_565:
+/* legal */
+  

[Mesa-dev] [PATCH 03/11] st/mesa: implement GL_ATI_fragment_shader

2015-12-15 Thread Miklós Máté
---
 src/mesa/Makefile.sources |   1 +
 src/mesa/state_tracker/st_atifs_to_tgsi.c | 798 ++
 src/mesa/state_tracker/st_atifs_to_tgsi.h |  49 ++
 src/mesa/state_tracker/st_atom_constbuf.c |  14 +
 src/mesa/state_tracker/st_cb_drawpixels.c |   1 +
 src/mesa/state_tracker/st_cb_program.c|  35 +-
 src/mesa/state_tracker/st_program.c   |  22 +
 src/mesa/state_tracker/st_program.h   |   1 +
 8 files changed, 920 insertions(+), 1 deletion(-)
 create mode 100644 src/mesa/state_tracker/st_atifs_to_tgsi.c
 create mode 100644 src/mesa/state_tracker/st_atifs_to_tgsi.h

diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources
index ed9848c..a8e645d 100644
--- a/src/mesa/Makefile.sources
+++ b/src/mesa/Makefile.sources
@@ -390,6 +390,7 @@ VBO_FILES = \
vbo/vbo_split_inplace.c
 
 STATETRACKER_FILES = \
+   state_tracker/st_atifs_to_tgsi.c \
state_tracker/st_atom_array.c \
state_tracker/st_atom_blend.c \
state_tracker/st_atom.c \
diff --git a/src/mesa/state_tracker/st_atifs_to_tgsi.c 
b/src/mesa/state_tracker/st_atifs_to_tgsi.c
new file mode 100644
index 000..1d704cb
--- /dev/null
+++ b/src/mesa/state_tracker/st_atifs_to_tgsi.c
@@ -0,0 +1,798 @@
+
+#include "main/mtypes.h"
+#include "main/atifragshader.h"
+#include "main/texobj.h"
+#include "main/errors.h"
+#include "program/prog_parameter.h"
+
+#include "tgsi/tgsi_ureg.h"
+#include "util/u_math.h"
+#include "util/u_memory.h"
+
+#include "st_program.h"
+#include "st_atifs_to_tgsi.h"
+
+/**
+ * Intermediate state used during shader translation.
+ */
+struct st_translate {
+   struct ureg_program *ureg;
+   struct gl_context *ctx;
+   struct ati_fragment_shader *atifs;
+
+   struct ureg_dst temps[MAX_PROGRAM_TEMPS];
+   struct ureg_src *constants;
+   struct ureg_dst outputs[PIPE_MAX_SHADER_OUTPUTS];
+   struct ureg_src inputs[PIPE_MAX_SHADER_INPUTS];
+   struct ureg_dst address[1];
+   struct ureg_src samplers[PIPE_MAX_SAMPLERS];
+   struct ureg_src systemValues[SYSTEM_VALUE_MAX];
+
+   const GLuint *inputMapping;
+   const GLuint *outputMapping;
+
+   /* Keep a record of the tgsi instruction number that each mesa
+* instruction starts at, will be used to fix up labels after
+* translation.
+*/
+   unsigned *insn;
+   unsigned insn_size;
+   unsigned insn_count;
+
+   unsigned current_pass;
+
+   bool regs_written[MAX_NUM_PASSES_ATI][MAX_NUM_FRAGMENT_REGISTERS_ATI];
+
+   boolean error;
+};
+
+struct instruction_desc {
+   unsigned TGSI_opcode;
+   const char *name;
+   unsigned char arg_count;
+   unsigned char special; /* no 1:1 corresponding TGSI instruction */
+};
+
+/* index this array as inst_desc[ATI_opcode-GL_MOV_ATI] */
+static struct instruction_desc inst_desc[] = {
+   {TGSI_OPCODE_MOV, "MOV", 1, 0},
+   {TGSI_OPCODE_NOP, "UND", 0, 0}, /* unused */
+   {TGSI_OPCODE_ADD, "ADD", 2, 0},
+   {TGSI_OPCODE_MUL, "MUL", 2, 0},
+   {TGSI_OPCODE_SUB, "SUB", 2, 0},
+   {TGSI_OPCODE_DP3, "DOT3", 2, 0},
+   {TGSI_OPCODE_DP4, "DOT4", 2, 0},
+   {TGSI_OPCODE_MAD, "MAD", 3, 0},
+   {TGSI_OPCODE_LRP, "LERP", 3, 0},
+   {TGSI_OPCODE_NOP, "CND", 3, 1},
+   {TGSI_OPCODE_NOP, "CND0", 3, 2},
+   {TGSI_OPCODE_NOP, "DOT2_ADD", 3, 3}
+};
+
+/**
+ * Called prior to emitting the TGSI code for each Mesa instruction.
+ * Allocate additional space for instructions if needed.
+ * Update the insn[] array so the next Mesa instruction points to
+ * the next TGSI instruction.
+ * Copied from st_mesa_to_tgsi.c
+ */
+static void set_insn_start(struct st_translate *t,
+  unsigned start)
+{
+   if (t->insn_count + 1 >= t->insn_size) {
+  t->insn_size = 1 << (util_logbase2(t->insn_size) + 1);
+  t->insn = realloc(t->insn, t->insn_size * sizeof t->insn[0]);
+  if (t->insn == NULL) {
+ t->error = TRUE;
+ return;
+  }
+   }
+
+   t->insn[t->insn_count++] = start;
+}
+
+static void emit_insn(struct st_translate *t,
+  unsigned opcode,
+  const struct ureg_dst *dst,
+  unsigned nr_dst,
+  const struct ureg_src *src,
+  unsigned nr_src)
+{
+   set_insn_start(t, ureg_get_instruction_number(t->ureg));
+   ureg_insn(t->ureg, opcode, dst, nr_dst, src, nr_src);
+}
+
+static struct ureg_dst get_temp(struct st_translate *t, unsigned index)
+{
+   if (ureg_dst_is_undef(t->temps[index]))
+  t->temps[index] = ureg_DECL_temporary(t->ureg);
+   return t->temps[index];
+}
+
+static struct ureg_src apply_swizzle(struct st_translate *t,
+  struct ureg_src src, GLuint swizzle)
+{
+   if (swizzle == GL_SWIZZLE_STR_ATI) {
+  return src;
+   } else if (swizzle == GL_SWIZZLE_STQ_ATI) {
+  return ureg_swizzle(src,
+TGSI_SWIZZLE_X, TGSI_SWIZZLE_Y, TGSI_SWIZZLE_W, TGSI_SWIZZLE_Z);
+   } else {
+  struct ureg_dst tmp[2];
+  struct ureg_src imm[3];
+
+  tmp[0] = get_temp(t, MAX_NUM_FRAGMENT_REGISTERS_ATI);
+  tmp[1] = get_temp(t, MAX_NUM_FRAGMENT_REGISTERS_ATI+1);
+  imm[0] = src;
+  imm[1] 

[Mesa-dev] [PATCH 07/11] program: fix comment about the fog formula

2015-12-15 Thread Miklós Máté
---
 src/mesa/program/prog_statevars.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/program/prog_statevars.c 
b/src/mesa/program/prog_statevars.c
index bdb335e..12490d0 100644
--- a/src/mesa/program/prog_statevars.c
+++ b/src/mesa/program/prog_statevars.c
@@ -474,7 +474,7 @@ _mesa_fetch_state(struct gl_context *ctx, const 
gl_state_index state[],
   * single MAD.
   * linear: fogcoord * -1/(end-start) + end/(end-start)
   * exp: 2^-(density/ln(2) * fogcoord)
-  * exp2: 2^-((density/(ln(2)^2) * fogcoord)^2)
+  * exp2: 2^-((density/(sqrt(ln(2))) * fogcoord)^2)
   */
  value[0] = (ctx->Fog.End == ctx->Fog.Start)
 ? 1.0f : (GLfloat)(-1.0F / (ctx->Fog.End - ctx->Fog.Start));
-- 
2.6.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 10/11] [RFC] mesa: optimize out the realloc from glCopyTexImagexD()

2015-12-15 Thread Miklós Máté
Apitrace showed this call to be 5ms (9 times per frame),
but in reality it's about 500us. This shortcut makes it 20us.
---
 src/mesa/main/teximage.c | 29 +
 1 file changed, 29 insertions(+)

diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c
index ab60a2f..ba13720 100644
--- a/src/mesa/main/teximage.c
+++ b/src/mesa/main/teximage.c
@@ -3393,6 +3393,21 @@ formats_differ_in_component_sizes(mesa_format f1, 
mesa_format f2)
return GL_FALSE;
 }
 
+static GLboolean
+canAvoidRealloc(struct gl_texture_image *texImage, GLenum internalFormat,
+  GLint x, GLint y, GLsizei width, GLsizei height, GLint border)
+{
+   if (texImage->InternalFormat != internalFormat)
+  return false;
+   if (texImage->Border != border)
+  return false;
+   if (texImage->Width2 != width)
+  return false;
+   if (texImage->Height2 != height)
+  return false;
+   return true;
+}
+
 /**
  * Implement the glCopyTexImage1/2D() functions.
  */
@@ -3433,6 +3448,20 @@ copyteximage(struct gl_context *ctx, GLuint dims,
texObj = _mesa_get_current_tex_object(ctx, target);
assert(texObj);
 
+   _mesa_lock_texture(ctx, texObj);
+   {
+  texImage = _mesa_select_tex_image(texObj, target, level);
+  if (texImage && canAvoidRealloc(texImage, internalFormat,
+   x, y, width, height, border)) {
+ _mesa_unlock_texture(ctx, texObj);
+ //_mesa_debug(0, "using shortcut\n");
+ return _mesa_copy_texture_sub_image(ctx, dims, texObj, target, level,
+   0, 0, 0, x, y, width, height, "CopyTexImage");
+  }
+  //_mesa_debug(0, "can't shortcut %p, %dx%d\n", texImage, width, height);
+   }
+   _mesa_unlock_texture(ctx, texObj);
+
texFormat = _mesa_choose_texture_format(ctx, texObj, target, level,
internalFormat, GL_NONE, GL_NONE);
 
-- 
2.6.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 09/11] swrast: move two global defines to the only place where they are used

2015-12-15 Thread Miklós Máté
---
 src/mesa/main/mtypes.h| 2 --
 src/mesa/swrast/s_atifragshader.c | 2 ++
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 5c71ac4..99e7912 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2278,8 +2278,6 @@ struct gl_compute_program_state
 /**
  * ATI_fragment_shader runtime state
  */
-#define ATI_FS_INPUT_PRIMARY 0
-#define ATI_FS_INPUT_SECONDARY 1
 
 struct atifs_instruction;
 struct atifs_setupinst;
diff --git a/src/mesa/swrast/s_atifragshader.c 
b/src/mesa/swrast/s_atifragshader.c
index 2974dee..414a414 100644
--- a/src/mesa/swrast/s_atifragshader.c
+++ b/src/mesa/swrast/s_atifragshader.c
@@ -26,6 +26,8 @@
 #include "swrast/s_atifragshader.h"
 #include "swrast/s_context.h"
 
+#define ATI_FS_INPUT_PRIMARY 0
+#define ATI_FS_INPUT_SECONDARY 1
 
 /**
  * State for executing ATI fragment shader.
-- 
2.6.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 00/11] GL_ATI_fragment_shader support for Gallium

2015-12-15 Thread Miklós Máté
Hi,

This series aims to improve the looks of Star Wars: Knights of the Old Republic 
(via Wine), but features some additional cleanup as well. The main component of 
the series is the implementation of GL_ATI_fragment_shader for all Gallium 
drivers (though I could only test it with radeonsi, llvmpipe, and softpipe). If 
this extension is available, the game uses it quite extensively: perhaps the 
most notable effect is the animated water ripples, but it also fixes the grass, 
improves the specular on wet characters (e.g. the Selkath) and it is used for 
regular texturing almost everywhere. The game has two optional post-process 
effects that also depend on this extension: framebuffer effects (light bloom, 
distortion), and soft shadows. Patches 5&6 are needed to fix crashing with 
post-processing. With current fglrx the grass is wrong, and post-process 
crashes, but my previous Radeon cards ran this game perfectly on Windows.

One other game that can use GL_ATI_fragment_shader is Doom 3, if 
r_renderer="r200" instead of "best" (which means "arb2", if 
GL_ARB_fragment_program is available). By default image_useNormalCompression=0, 
which results in wrong lighting and makes the specular overbright with r200. 
Setting it to 1 fixes r200, but messes up arb2, setting it to 2 fixes both. The 
light interaction is the same in r200 and arb2, but r200 doesn't have the 
heathaze shader. Later idTech4 games don't support r200 anymore: in Quake 4 
everything is green, in Prey the organic walls are black, and ETQW has a 
completely revised renderer. I verified these with fglrx.

The series is based on the 11.0 branch of Mesa. Patches 1-4 implement 
GL_ATI_fragment_shader, 5-6 fix crashing in post-process of KotOR, 7-11 are 
various cleanups. There are a few TODO comments where I wasn't entirely sure, 
and the two RFC patches are more like ideas than solutions, but most of the 
code should be fine.

After this series the following issues remain in KotOR that I've been unable to 
fix:

1. Enabling soft shadows makes all characters disappear. When drawing the 
post-process effects the game switches between scratch framebuffers and the 
real one with glXMakeContextCurrent() several times. The scratch buffers have 
no depth, and after switching back to the real one the depth buffer is lost, so 
all subsequent depth tests fail.

2. Enabling MSAA results in black screen when post-process is enabled, only the 
light bloom is visible. I don't know how to debug this.

3. Post-process filters are extremely slow. Normally the game runs around 80fps 
(cpu-bound), but drops to 15fps with framebuffer effects, 20fps with soft 
shadows, 9fps with both. I've tried to profile this with apitrace, and found a 
bottleneck (see patch 10) that cost 5ms per call, but it turned out that it's 
not the real bottleneck. Both capturing and replaying are very slow compared to 
the game (15fps), so the profiler basically measures its own latency. I've 
tried to find the real culprit by adding time measurement to the calls made 
when drawing the post-process effects, but haven't found anything yet.

Screenshot gallery:

Dantooine
Fixed-function: http://postimg.org/image/5de014vd5/
With grass: http://postimg.org/image/u7xhv7g7d/
With ATIfs: http://postimg.org/image/jijt2y4eh/

Kashyyyk Shadowlands
Fixed-function: http://postimg.org/image/mchb7drbv/
ATIfs without fog: http://postimg.org/image/dk0cjp66z/
ATIfs with apply_fog(): http://postimg.org/image/rcerfbwyj/

Manaan
Fixed-function: http://postimg.org/image/4l13f6mjf/
ATIfs: http://postimg.org/image/nat2vxfa3/
Framebuffer effects: http://postimg.org/image/vhl2ni5cr/

Stealth Mission
Without framebuffer effects: http://postimg.org/image/xcy12i7v3/
With framebuffer effects: http://postimg.org/image/75wu6jplb/

Shadows
Hard: http://postimg.org/image/ycjqkgxn3/
Soft: http://postimg.org/image/lmk3l4f2n/


Miklós Máté (11):
  mesa: Don't leak ATIfs instructions in DeleteFragmentShader
  mesa: optionally associate a gl_program to ati_fragment_shader
  st/mesa: implement GL_ATI_fragment_shader
  st/mesa: enable GL_ATI_fragment_shader
  [RFC] mesa: allow binding framebuffer without depth
  st/mesa: fix handling the fallback texture
  program: fix comment about the fog formula
  mesa: improve debug log in atifragshader
  swrast: move two global defines to the only place where they are used
  [RFC] mesa: optimize out the realloc from glCopyTexImagexD()
  program: Remove extra reference_program()

 src/mesa/Makefile.sources |   1 +
 src/mesa/drivers/common/driverfuncs.c |   3 +
 src/mesa/main/atifragshader.c |  18 +-
 src/mesa/main/context.c   |  10 +-
 src/mesa/main/dd.h|   6 +-
 src/mesa/main/mtypes.h|   3 +-
 src/mesa/main/state.c |  14 +-
 src/mesa/main/teximage.c  |  29 ++
 src/mesa/program/ir_to_mesa.cpp   |   2 -
 src/mesa/program/prog_statevars.c |   2 +-
 

[Mesa-dev] [PATCH 11/11] program: Remove extra reference_program()

2015-12-15 Thread Miklós Máté
It was already done in get_mesa_program()
---
 src/mesa/program/ir_to_mesa.cpp | 2 --
 1 file changed, 2 deletions(-)

diff --git a/src/mesa/program/ir_to_mesa.cpp b/src/mesa/program/ir_to_mesa.cpp
index 8f58f3e..a28cf97 100644
--- a/src/mesa/program/ir_to_mesa.cpp
+++ b/src/mesa/program/ir_to_mesa.cpp
@@ -2938,8 +2938,6 @@ _mesa_ir_link_shader(struct gl_context *ctx, struct 
gl_shader_program *prog)
   if (linked_prog) {
  _mesa_copy_linked_program_data((gl_shader_stage) i, prog, 
linked_prog);
 
-_mesa_reference_program(ctx, >_LinkedShaders[i]->Program,
-linked_prog);
  if (!ctx->Driver.ProgramStringNotify(ctx,
   _mesa_shader_stage_to_program(i),
   linked_prog)) {
-- 
2.6.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 02/11] mesa: optionally associate a gl_program to ati_fragment_shader

2015-12-15 Thread Miklós Máté
the state tracker will use it
---
 src/mesa/drivers/common/driverfuncs.c |  3 +++
 src/mesa/main/atifragshader.c | 13 -
 src/mesa/main/dd.h|  6 +-
 src/mesa/main/mtypes.h|  1 +
 src/mesa/main/state.c | 14 +-
 5 files changed, 34 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/common/driverfuncs.c 
b/src/mesa/drivers/common/driverfuncs.c
index 6fe42b1..36e9281 100644
--- a/src/mesa/drivers/common/driverfuncs.c
+++ b/src/mesa/drivers/common/driverfuncs.c
@@ -118,6 +118,9 @@ _mesa_init_driver_functions(struct dd_function_table 
*driver)
driver->NewProgram = _mesa_new_program;
driver->DeleteProgram = _mesa_delete_program;
 
+   /* ATI_fragment_shader */
+   driver->NewATIfs = NULL;
+
/* simple state commands */
driver->AlphaFunc = NULL;
driver->BlendColor = NULL;
diff --git a/src/mesa/main/atifragshader.c b/src/mesa/main/atifragshader.c
index 3ddc51d..d1c07c5 100644
--- a/src/mesa/main/atifragshader.c
+++ b/src/mesa/main/atifragshader.c
@@ -30,6 +30,7 @@
 #include "main/mtypes.h"
 #include "main/dispatch.h"
 #include "main/atifragshader.h"
+#include "program/program.h"
 
 #define MESA_DEBUG_ATI_FS 0
 
@@ -63,6 +64,7 @@ _mesa_delete_ati_fragment_shader(struct gl_context *ctx, 
struct ati_fragment_sha
   free(s->Instructions[i]);
   free(s->SetupInst[i]);
}
+   _mesa_reference_program(ctx, >Program, NULL);
free(s);
 }
 
@@ -321,6 +323,8 @@ _mesa_BeginFragmentShaderATI(void)
  free(ctx->ATIFragmentShader.Current->SetupInst[i]);
}
 
+   _mesa_reference_program(ctx, >ATIFragmentShader.Current->Program, 
NULL);
+
/* malloc the instructions here - not sure if the best place but its
   a start */
for (i = 0; i < MAX_NUM_PASSES_ATI; i++) {
@@ -402,7 +406,14 @@ _mesa_EndFragmentShaderATI(void)
}
 #endif
 
-   if (!ctx->Driver.ProgramStringNotify(ctx, GL_FRAGMENT_SHADER_ATI, NULL)) {
+   if (ctx->Driver.NewATIfs) {
+  struct gl_program *prog = ctx->Driver.NewATIfs(ctx,
+ctx->ATIFragmentShader.Current->Id);
+  _mesa_reference_program(ctx, >ATIFragmentShader.Current->Program, 
prog);
+   }
+
+   if (!ctx->Driver.ProgramStringNotify(ctx, GL_FRAGMENT_SHADER_ATI,
+curProg->Program)) {
   ctx->ATIFragmentShader.Current->isValid = GL_FALSE;
   /* XXX is this the right error? */
   _mesa_error(ctx, GL_INVALID_OPERATION,
diff --git a/src/mesa/main/dd.h b/src/mesa/main/dd.h
index 87eb63e..9d24279 100644
--- a/src/mesa/main/dd.h
+++ b/src/mesa/main/dd.h
@@ -471,7 +471,11 @@ struct dd_function_table {
struct gl_program * (*NewProgram)(struct gl_context *ctx, GLenum target,
  GLuint id);
/** Delete a program */
-   void (*DeleteProgram)(struct gl_context *ctx, struct gl_program *prog);   
+   void (*DeleteProgram)(struct gl_context *ctx, struct gl_program *prog);
+   /**
+* Allocate a program to associate with the new ATI fragment shader 
(optional)
+*/
+   struct gl_program * (*NewATIfs)(struct gl_context *ctx, GLuint id);
/**
 * Notify driver that a program string (and GPU code) has been specified
 * or modified.  Return GL_TRUE or GL_FALSE to indicate if the program is
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index cc8f350..5c71ac4 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2303,6 +2303,7 @@ struct ati_fragment_shader
GLboolean interpinp1;
GLboolean isValid;
GLuint swizzlerq;
+   struct gl_program *Program;
 };
 
 /**
diff --git a/src/mesa/main/state.c b/src/mesa/main/state.c
index d3b1c72..cabba1b 100644
--- a/src/mesa/main/state.c
+++ b/src/mesa/main/state.c
@@ -124,7 +124,8 @@ update_program(struct gl_context *ctx)
 * follows:
 *   1. OpenGL 2.0/ARB vertex/fragment shaders
 *   2. ARB/NV vertex/fragment programs
-*   3. Programs derived from fixed-function state.
+*   3. ATI fragment shader
+*   4. Programs derived from fixed-function state.
 *
 * Note: it's possible for a vertex shader to get used with a fragment
 * program (and vice versa) here, but in practice that shouldn't ever
@@ -152,6 +153,17 @@ update_program(struct gl_context *ctx)
   _mesa_reference_fragprog(ctx, >FragmentProgram._TexEnvProgram,
   NULL);
}
+   else if (ctx->ATIFragmentShader._Enabled
+&& ctx->ATIFragmentShader.Current->Program) {
+   /* Use the enabled ATI fragment shader's associated program */
+  _mesa_reference_shader_program(ctx,
+ >_Shader->_CurrentFragmentProgram,
+NULL);
+  _mesa_reference_fragprog(ctx, >FragmentProgram._Current,
+   
gl_fragment_program(ctx->ATIFragmentShader.Current->Program));
+  _mesa_reference_fragprog(ctx, >FragmentProgram._TexEnvProgram,
+  NULL);
+   }
else if 

[Mesa-dev] [PATCH 01/11] mesa: Don't leak ATIfs instructions in DeleteFragmentShader

2015-12-15 Thread Miklós Máté
---
 src/mesa/main/atifragshader.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/main/atifragshader.c b/src/mesa/main/atifragshader.c
index 935ba05..3ddc51d 100644
--- a/src/mesa/main/atifragshader.c
+++ b/src/mesa/main/atifragshader.c
@@ -293,7 +293,7 @@ _mesa_DeleteFragmentShaderATI(GLuint id)
 prog->RefCount--;
 if (prog->RefCount <= 0) {
assert(prog != );
-   free(prog);
+_mesa_delete_ati_fragment_shader(ctx, prog);
 }
   }
}
-- 
2.6.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 08/11] mesa: improve debug log in atifragshader

2015-12-15 Thread Miklós Máté
---
 src/mesa/main/atifragshader.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/mesa/main/atifragshader.c b/src/mesa/main/atifragshader.c
index d1c07c5..8b19a35 100644
--- a/src/mesa/main/atifragshader.c
+++ b/src/mesa/main/atifragshader.c
@@ -349,6 +349,9 @@ _mesa_BeginFragmentShaderATI(void)
ctx->ATIFragmentShader.Current->isValid = GL_FALSE;
ctx->ATIFragmentShader.Current->swizzlerq = 0;
ctx->ATIFragmentShader.Compiling = 1;
+#if MESA_DEBUG_ATI_FS
+   _mesa_debug(ctx, "%s %u\n", __func__, ctx->ATIFragmentShader.Current->Id);
+#endif
 }
 
 void GLAPIENTRY
-- 
2.6.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 04/11] st/mesa: enable GL_ATI_fragment_shader

2015-12-15 Thread Miklós Máté
---
 src/mesa/state_tracker/st_extensions.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/state_tracker/st_extensions.c 
b/src/mesa/state_tracker/st_extensions.c
index d97dfde..45ceae1 100644
--- a/src/mesa/state_tracker/st_extensions.c
+++ b/src/mesa/state_tracker/st_extensions.c
@@ -652,6 +652,7 @@ void st_init_extensions(struct pipe_screen *screen,
extensions->EXT_texture_env_dot3 = GL_TRUE;
extensions->EXT_vertex_array_bgra = GL_TRUE;
 
+   extensions->ATI_fragment_shader = GL_TRUE;
extensions->ATI_texture_env_combine3 = GL_TRUE;
 
extensions->MESA_pack_invert = GL_TRUE;
-- 
2.6.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 06/11] st/mesa: fix handling the fallback texture

2015-12-15 Thread Miklós Máté
---
 src/mesa/state_tracker/st_atom_sampler.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/mesa/state_tracker/st_atom_sampler.c 
b/src/mesa/state_tracker/st_atom_sampler.c
index 4252c27..7d3d8e7 100644
--- a/src/mesa/state_tracker/st_atom_sampler.c
+++ b/src/mesa/state_tracker/st_atom_sampler.c
@@ -131,7 +131,7 @@ convert_sampler(struct st_context *st,
 struct pipe_sampler_state *sampler,
 GLuint texUnit)
 {
-   const struct gl_texture_object *texobj;
+   struct gl_texture_object *texobj;
struct gl_context *ctx = st->ctx;
struct gl_sampler_object *msamp;
GLenum texBaseFormat;
@@ -144,6 +144,10 @@ convert_sampler(struct st_context *st,
texBaseFormat = _mesa_texture_base_format(texobj);
 
msamp = _mesa_get_samplerobj(ctx, texUnit);
+   if (!msamp) {
+  /* handle the fallback texture */
+  msamp = >Sampler;
+   }
 
memset(sampler, 0, sizeof(*sampler));
sampler->wrap_s = gl_wrap_xlate(msamp->WrapS);
-- 
2.6.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 05/11] [RFC] mesa: allow binding framebuffer without depth

2015-12-15 Thread Miklós Máté
this works with radeonsi, but crashes with llvmpipe
---
 src/mesa/main/context.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
index 888c461..dcaf524 100644
--- a/src/mesa/main/context.c
+++ b/src/mesa/main/context.c
@@ -1550,10 +1550,10 @@ check_compatible(const struct gl_context *ctx,
   return GL_FALSE;
if (ctxvis->haveAccumBuffer && !bufvis->haveAccumBuffer)
   return GL_FALSE;
-   if (ctxvis->haveDepthBuffer && !bufvis->haveDepthBuffer)
-  return GL_FALSE;
+   /*if (ctxvis->haveDepthBuffer && !bufvis->haveDepthBuffer)
+ return GL_FALSE;
if (ctxvis->haveStencilBuffer && !bufvis->haveStencilBuffer)
-  return GL_FALSE;
+  return GL_FALSE;*/
if (ctxvis->redMask && ctxvis->redMask != bufvis->redMask)
   return GL_FALSE;
if (ctxvis->greenMask && ctxvis->greenMask != bufvis->greenMask)
@@ -1565,8 +1565,8 @@ check_compatible(const struct gl_context *ctx,
if (ctxvis->depthBits && ctxvis->depthBits != bufvis->depthBits)
   return GL_FALSE;
 #endif
-   if (ctxvis->stencilBits && ctxvis->stencilBits != bufvis->stencilBits)
-  return GL_FALSE;
+   /*if (ctxvis->stencilBits && ctxvis->stencilBits != bufvis->stencilBits)
+  return GL_FALSE;*/
 
return GL_TRUE;
 }
-- 
2.6.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/5] main/version: Don't require ARB_compute_shader for OpenGLES 3.1

2015-12-15 Thread Ian Romanick
On 12/15/2015 04:08 PM, Jordan Justen wrote:
> The OpenGL ARB_compute_shader extension specfication requires at least
> 1024 for GL_MAX_COMPUTE_WORK_GROUP_INVOCATIONS, whereas OpenGLES 3.1
> only required 128.

Does this mean that extensions->ARB_compute_shader is not set?  I'm a
little bit nervous about that.  Are we sure that we check for compute
shader support correctly everywhere (i.e., don't just check the
extension bit that isn't set)?

> Signed-off-by: Jordan Justen 
> Cc: Ian Romanick 
> Cc: Marta Lofstedt 
> ---
>  src/mesa/main/version.c | 9 ++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/src/mesa/main/version.c b/src/mesa/main/version.c
> index e92bb11..112a73d 100644
> --- a/src/mesa/main/version.c
> +++ b/src/mesa/main/version.c
> @@ -433,7 +433,8 @@ compute_version_es1(const struct gl_extensions 
> *extensions)
>  }
>  
>  static GLuint
> -compute_version_es2(const struct gl_extensions *extensions)
> +compute_version_es2(const struct gl_extensions *extensions,
> +const struct gl_constants *consts)
>  {
> /* OpenGL ES 2.0 is derived from OpenGL 2.0 */
> const bool ver_2_0 = (extensions->ARB_texture_cube_map &&
> @@ -464,9 +465,11 @@ compute_version_es2(const struct gl_extensions 
> *extensions)
>   extensions->EXT_texture_snorm &&
>   extensions->NV_primitive_restart &&
>   extensions->OES_depth_texture_cube_map);
> +   const bool es31_compute_shader =
> +  consts->MaxComputeWorkGroupInvocations >= 128;
> const bool ver_3_1 = (ver_3_0 &&
>   extensions->ARB_arrays_of_arrays &&
> - extensions->ARB_compute_shader &&
> + es31_compute_shader &&
>   extensions->ARB_draw_indirect &&
>   extensions->ARB_explicit_uniform_location &&
>   extensions->ARB_framebuffer_no_attachments &&
> @@ -508,7 +511,7 @@ _mesa_get_version(const struct gl_extensions *extensions,
> case API_OPENGLES:
>return compute_version_es1(extensions);
> case API_OPENGLES2:
> -  return compute_version_es2(extensions);
> +  return compute_version_es2(extensions, consts);
> }
> return 0;
>  }
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/5] main/version: Don't require ARB_compute_shader for OpenGLES 3.1

2015-12-15 Thread Jordan Justen
On 2015-12-15 16:50:39, Ian Romanick wrote:
> On 12/15/2015 04:08 PM, Jordan Justen wrote:
> > The OpenGL ARB_compute_shader extension specfication requires at least
> > 1024 for GL_MAX_COMPUTE_WORK_GROUP_INVOCATIONS, whereas OpenGLES 3.1
> > only required 128.
> 
> Does this mean that extensions->ARB_compute_shader is not set?

Yes. I think we can't set this in some cases due to desktop GL
requirements, but we should still be able to support CS on ES 3.1.

> I'm a little bit nervous about that. Are we sure that we check for
> compute shader support correctly everywhere (i.e., don't just check
> the extension bit that isn't set)?

I think we have it pretty well covered. The ES 3.1 CTS seems pretty
happy with what we have.

That said, patch 2 was yet another fix to use
_mesa_has_compute_shaders, and I wouldn't be surprised if we ended up
finding some more. (I did try to grep to find anything we might have
missed.)

-Jordan

> > Signed-off-by: Jordan Justen 
> > Cc: Ian Romanick 
> > Cc: Marta Lofstedt 
> > ---
> >  src/mesa/main/version.c | 9 ++---
> >  1 file changed, 6 insertions(+), 3 deletions(-)
> > 
> > diff --git a/src/mesa/main/version.c b/src/mesa/main/version.c
> > index e92bb11..112a73d 100644
> > --- a/src/mesa/main/version.c
> > +++ b/src/mesa/main/version.c
> > @@ -433,7 +433,8 @@ compute_version_es1(const struct gl_extensions 
> > *extensions)
> >  }
> >  
> >  static GLuint
> > -compute_version_es2(const struct gl_extensions *extensions)
> > +compute_version_es2(const struct gl_extensions *extensions,
> > +const struct gl_constants *consts)
> >  {
> > /* OpenGL ES 2.0 is derived from OpenGL 2.0 */
> > const bool ver_2_0 = (extensions->ARB_texture_cube_map &&
> > @@ -464,9 +465,11 @@ compute_version_es2(const struct gl_extensions 
> > *extensions)
> >   extensions->EXT_texture_snorm &&
> >   extensions->NV_primitive_restart &&
> >   extensions->OES_depth_texture_cube_map);
> > +   const bool es31_compute_shader =
> > +  consts->MaxComputeWorkGroupInvocations >= 128;
> > const bool ver_3_1 = (ver_3_0 &&
> >   extensions->ARB_arrays_of_arrays &&
> > - extensions->ARB_compute_shader &&
> > + es31_compute_shader &&
> >   extensions->ARB_draw_indirect &&
> >   extensions->ARB_explicit_uniform_location &&
> >   extensions->ARB_framebuffer_no_attachments &&
> > @@ -508,7 +511,7 @@ _mesa_get_version(const struct gl_extensions 
> > *extensions,
> > case API_OPENGLES:
> >return compute_version_es1(extensions);
> > case API_OPENGLES2:
> > -  return compute_version_es2(extensions);
> > +  return compute_version_es2(extensions, consts);
> > }
> > return 0;
> >  }
> > 
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/3] i965/fs: do not disable the FS unit in the presence of shader storage

2015-12-15 Thread Jason Ekstrand
On Dec 15, 2015 3:52 AM, "Iago Toral Quiroga"  wrote:
>
> We want to make sure that the driver does not disable the FS unit if
> the shader code only has SSBO writes (i.e. no color or depth output).
>
> We could go a step further and check if the shader storage is actually
> used for writing, but does not seem worth the trouble. Also, we do the
> same thing for atomic buffers.
>
> Fixes the following CTS test:
> ES31-CTS.shader_storage_buffer_object.advanced-usage-sync-vsfs
> ---
>  src/mesa/drivers/dri/i965/gen7_wm_state.c | 3 ++-
>  src/mesa/drivers/dri/i965/gen8_ps_state.c | 1 +
>  2 files changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/gen7_wm_state.c
b/src/mesa/drivers/dri/i965/gen7_wm_state.c
> index 06d5e65..d292b13 100644
> --- a/src/mesa/drivers/dri/i965/gen7_wm_state.c
> +++ b/src/mesa/drivers/dri/i965/gen7_wm_state.c
> @@ -77,7 +77,8 @@ upload_wm_state(struct brw_context *brw)
>dw1 |= GEN7_WM_KILL_ENABLE;
> }
>
> -   if (_mesa_active_fragment_shader_has_atomic_ops(>ctx)) {
> +   if (_mesa_active_fragment_shader_has_atomic_ops(>ctx ) ||
> +   _mesa_active_fragment_shader_has_shader_storage(>ctx)) {

Ugh... We also need to be checking for images.

How about we change it to active_fragment_shader_has_side_effects and make
it check all three?

>dw1 |= GEN7_WM_DISPATCH_ENABLE;
> }
>
> diff --git a/src/mesa/drivers/dri/i965/gen8_ps_state.c
b/src/mesa/drivers/dri/i965/gen8_ps_state.c
> index 945f710..8769269 100644
> --- a/src/mesa/drivers/dri/i965/gen8_ps_state.c
> +++ b/src/mesa/drivers/dri/i965/gen8_ps_state.c
> @@ -91,6 +91,7 @@ gen8_upload_ps_extra(struct brw_context *brw,
>  * BRW_NEW_FS_PROG_DATA | BRW_NEW_FRAGMENT_PROGRAM | _NEW_BUFFERS |
_NEW_COLOR
>  */
> if ((_mesa_active_fragment_shader_has_atomic_ops(>ctx) ||
> +_mesa_active_fragment_shader_has_shader_storage(>ctx) ||
>  prog_data->base.nr_image_params) &&
> !brw_color_buffer_write_enabled(brw))
>dw1 |= GEN8_PSX_SHADER_HAS_UAV;
> --
> 1.9.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 00/11] GL_ATI_fragment_shader support for Gallium

2015-12-15 Thread Roland Scheidegger
Am 16.12.2015 um 00:05 schrieb Miklós Máté:
> Hi,
> 
> This series aims to improve the looks of Star Wars: Knights of the
> Old Republic (via Wine), but features some additional cleanup as
> well. The main component of the series is the implementation of
> GL_ATI_fragment_shader for all Gallium drivers (though I could only
> test it with radeonsi, llvmpipe, and softpipe). If this extension is
> available, the game uses it quite extensively: perhaps the most
> notable effect is the animated water ripples, but it also fixes the
> grass, improves the specular on wet characters (e.g. the Selkath) and
> it is used for regular texturing almost everywhere. The game has two
> optional post-process effects that also depend on this extension:
> framebuffer effects (light bloom, distortion), and soft shadows.
> Patches 5&6 are needed to fix crashing with post-processing. With
> current fglrx the grass is wrong, and post-process crashes, but my
> previous Radeon cards ran this game perfectly on Windows.
> 
> One other game that can use GL_ATI_fragment_shader is Doom 3, if
> r_renderer="r200" instead of "best" (which means "arb2", if
> GL_ARB_fragment_program is available). By default
> image_useNormalCompression=0, which results in wrong lighting and
> makes the specular overbright with r200. Setting it to 1 fixes r200,
> but messes up arb2, setting it to 2 fixes both. The light interaction
> is the same in r200 and arb2, but r200 doesn't have the heathaze
> shader. Later idTech4 games don't support r200 anymore: in Quake 4
> everything is green, in Prey the organic walls are black, and ETQW
> has a completely revised renderer. I verified these with fglrx.

I think the reason why noone was interested in making ATI_fs supported
so far on anything other than r200 was that there just wasn't really
anything depending on it. As doom3 could use arb_fs just fine...
But I guess if wine can use it there's some more apps probably...

FWIW I think quake4 should work fine. Back when I implemented this for
r200, it was indeed broken and I traced that back to something broken in
the main shader (can't remember what, something trivial like wrong tex
unit used in an instruction). I reported that and got told it was
already fixed in the game - however there was never a new demo released
thus if you just have the demo it's still broken.

Roland
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 03/11] st/mesa: implement GL_ATI_fragment_shader

2015-12-15 Thread Ilia Mirkin
On Dec 15, 2015 8:59 PM, "Ian Romanick"  wrote:
>
> On 12/15/2015 05:08 PM, Ilia Mirkin wrote:
> > On Tue, Dec 15, 2015 at 7:59 PM, Ian Romanick 
wrote:
> >> On 12/15/2015 04:40 PM, Ilia Mirkin wrote:
> >>> Hardly a complete review, but a handful of comments:
> >>>
> >>> On Tue, Dec 15, 2015 at 6:05 PM, Miklós Máté  wrote:
>  ---
>   src/mesa/Makefile.sources |   1 +
>   src/mesa/state_tracker/st_atifs_to_tgsi.c | 798
++
>   src/mesa/state_tracker/st_atifs_to_tgsi.h |  49 ++
>   src/mesa/state_tracker/st_atom_constbuf.c |  14 +
>   src/mesa/state_tracker/st_cb_drawpixels.c |   1 +
>   src/mesa/state_tracker/st_cb_program.c|  35 +-
>   src/mesa/state_tracker/st_program.c   |  22 +
>   src/mesa/state_tracker/st_program.h   |   1 +
>   8 files changed, 920 insertions(+), 1 deletion(-)
>   create mode 100644 src/mesa/state_tracker/st_atifs_to_tgsi.c
>   create mode 100644 src/mesa/state_tracker/st_atifs_to_tgsi.h
> 
>  +static struct ureg_src prepare_argument(struct st_translate *t,
const unsigned argId,
>  +  const struct atifragshader_src_register *srcReg)
>  +{
>  +   struct ureg_src src = get_source(t, srcReg->Index);
>  +   struct ureg_dst arg = get_temp(t,
MAX_NUM_FRAGMENT_REGISTERS_ATI+argId);
>  +
>  +   switch (srcReg->argRep) {
>  +  case GL_NONE:
>  + break;
>  +  case GL_RED:
>  + src = ureg_swizzle(src,
>  +   TGSI_SWIZZLE_X, TGSI_SWIZZLE_X, TGSI_SWIZZLE_X,
TGSI_SWIZZLE_X);
>  + break;
>  +  case GL_GREEN:
>  + src = ureg_swizzle(src,
>  +   TGSI_SWIZZLE_Y, TGSI_SWIZZLE_Y, TGSI_SWIZZLE_Y,
TGSI_SWIZZLE_Y);
>  + break;
>  +  case GL_BLUE:
>  + src = ureg_swizzle(src,
>  +   TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z,
TGSI_SWIZZLE_Z);
>  + break;
>  +  case GL_ALPHA:
>  + src = ureg_swizzle(src,
>  +   TGSI_SWIZZLE_W, TGSI_SWIZZLE_W, TGSI_SWIZZLE_W,
TGSI_SWIZZLE_W);
>  + break;
>  +   }
>  +   emit_insn(t, TGSI_OPCODE_MOV, , 1, , 1);
>  +
>  +   if (srcReg->argMod & GL_COMP_BIT_ATI) {
>  +  struct ureg_src modsrc[2];
>  +  modsrc[0] = ureg_imm1f(t->ureg, 1.0);
>  +  modsrc[1] = ureg_src(arg);
>  +
>  +  emit_insn(t, TGSI_OPCODE_SUB, , 1, modsrc, 2);
>  +   }
>  +   if (srcReg->argMod & GL_BIAS_BIT_ATI) {
>  +  struct ureg_src modsrc[2];
>  +  modsrc[0] = ureg_src(arg);
>  +  modsrc[1] = ureg_imm1f(t->ureg, 0.5);
>  +
>  +  emit_insn(t, TGSI_OPCODE_SUB, , 1, modsrc, 2);
>  +   }
>  +   if (srcReg->argMod & GL_2X_BIT_ATI) {
>  +  struct ureg_src modsrc[2];
>  +  modsrc[0] = ureg_src(arg);
>  +  modsrc[1] = ureg_imm1f(t->ureg, 2.0);
>  +
>  +  emit_insn(t, TGSI_OPCODE_MUL, , 1, modsrc, 2);
> >>>
> >>> aka ADD arg, arg, arg
> >>>
>  +   }
>  +   if (srcReg->argMod & GL_NEGATE_BIT_ATI) {
>  +  struct ureg_src modsrc[2];
>  +  modsrc[0] = ureg_src(arg);
>  +  modsrc[1] = ureg_imm1f(t->ureg, -1.0);
>  +
>  +  emit_insn(t, TGSI_OPCODE_MUL, , 1, modsrc, 2);
> >>>
> >>> aka NEG arg, arg
> >>>
>  +   }
>  +   return  ureg_src(arg);
>  +}
>  +
>  +/* These instructions have no direct equivalent in TGSI */
>  +static void emit_special_inst(struct st_translate *t, struct
instruction_desc *desc,
>  +  struct ureg_dst *dst, struct ureg_src *args, unsigned
argcount)
>  +{
>  +   struct ureg_dst tmp[1];
>  +   struct ureg_src src[3];
>  +
>  +   if(desc->special == 1) {
>  +  tmp[0] = get_temp(t, MAX_NUM_FRAGMENT_REGISTERS_ATI+2); //
re-purpose a3
>  +  src[0] = ureg_imm1f(t->ureg, 0.5f);
>  +  src[1] = args[2];
>  +  emit_insn(t, TGSI_OPCODE_SLT, tmp, 1, src, 2);
>  +  src[0] = ureg_src(tmp[0]);
>  +  src[1] = args[0];
>  +  src[2] = args[1];
>  +  emit_insn(t, TGSI_OPCODE_LRP, dst, 1, src, 3);
>  +   } else if (desc->special == 2) {
>  +  tmp[0] = get_temp(t, MAX_NUM_FRAGMENT_REGISTERS_ATI+2); //
re-purpose a3
>  +  src[0] = args[2];
>  +  src[1] = ureg_imm1f(t->ureg, 0.0f);
>  +  emit_insn(t, TGSI_OPCODE_SGE, tmp, 1, src, 2);
>  +  src[0] = ureg_src(tmp[0]);
>  +  src[1] = args[0];
>  +  src[2] = args[1];
>  +  emit_insn(t, TGSI_OPCODE_LRP, dst, 1, src, 3);
> >>>
> >>> Isn't this the CMP instruction? Just flip the args.
> >>>
> >>> http://gallium.readthedocs.org/en/latest/tgsi.html#opcode-CMP
> >>>
> >>> The other one should be expressible as CMP as well I think.
> >>>
>  +   } else if (desc->special == 3) {
>  +  src[0] = 

Re: [Mesa-dev] [PATCH] Add .mailmap

2015-12-15 Thread Michel Dänzer
On 16.12.2015 06:40, Giuseppe Bilotta wrote:
> This adds a first tentative .mailmap file, to canonicize contributor
> name/emails in shortlogs and other statistical endeavours.
> 
> There's a couple of root and richard entries which I don't know who
> they belong to, and hopefully not too many overeager merges.
> 
> Signed-off-by: Giuseppe Bilotta 

[...]

> diff --git a/.mailmap b/.mailmap
> new file mode 100644
> index 000..bf8b4d9
> --- /dev/null
> +++ b/.mailmap
> @@ -0,0 +1,460 @@
> +
> +Adam Jackson  
> +Adam Jackson  

In Adam's case, you put a personal e-mail address first and his
employer's address last.


> +Michel Dänzer  Michel Daenzer 
> 
> +Michel Dänzer  Michel Daenzer 
> 
> +Michel Dänzer  
> +Michel Dänzer  
> +Michel Dänzer  

In my case, you put my current employer's address first and my personal
and former employers' addresses last.

What's the (intended) meaning of this mapping? If it means that all my
contributions will be accounted to AMD, I'm afraid that's not very accurate.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 13/15] i965/fs: Add support for MOV_INDIRECT on pre-Broadwell hardware

2015-12-15 Thread Jason Ekstrand
On Dec 15, 2015 12:30 AM, "Abdiel Janulgue" 
wrote:
>
>
>
> On 12/10/2015 06:23 AM, Jason Ekstrand wrote:
> > While we're at it, we also add support for the possibility that the
> > indirect is, in fact, a constant.  This shouldn't happen in the common
case
> > (if it does, that means NIR failed to constant-fold something), but it's
> > possible so we should handle it.
>
> Perhaps this should re-ordered before patch 3?

We could, but it really doesn't matter. No MOV_INDIRECT ever hits the
generator pre-BDW prior to patch 15. They get lowered away to pull constant
loads.
--Jason

> > ---
> >  src/mesa/drivers/dri/i965/brw_fs.cpp   |  4 ++
> >  src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 51
+++---
> >  2 files changed, 42 insertions(+), 13 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
b/src/mesa/drivers/dri/i965/brw_fs.cpp
> > index 9eaf8d0..a2ec03e 100644
> > --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> > @@ -4424,6 +4424,10 @@ get_lowered_simd_width(const struct
brw_device_info *devinfo,
> > case SHADER_OPCODE_TYPED_SURFACE_WRITE_LOGICAL:
> >return 8;
> >
> > +   case SHADER_OPCODE_MOV_INDIRECT:
> > +  /* Prior to Broadwell, we only have 8 address subregisters */
> > +  return devinfo->gen < 8 ? 8 : inst->exec_size;
> > +
> > default:
> >return inst->exec_size;
> > }
> > diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> > index d86eee1..7fa6d84 100644
> > --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> > @@ -351,22 +351,47 @@ fs_generator::generate_mov_indirect(fs_inst *inst,
> >
> > unsigned imm_byte_offset = reg.nr * REG_SIZE + reg.subnr;
> >
> > -   /* We use VxH indirect addressing, clobbering a0.0 through a0.7. */
> > -   struct brw_reg addr = vec8(brw_address_reg(0));
> > +   if (indirect_byte_offset.file == BRW_IMMEDIATE_VALUE) {
> > +  imm_byte_offset += indirect_byte_offset.ud;
> >
> > -   /* The destination stride of an instruction (in bytes) must be
greater
> > -* than or equal to the size of the rest of the instruction.  Since
the
> > -* address register is of type UW, we can't use a D-type
instruction.
> > -* In order to get around this, re re-type to UW and use a stride.
> > -*/
> > -   indirect_byte_offset =
> > -  retype(spread(indirect_byte_offset, 2), BRW_REGISTER_TYPE_UW);
> > +  reg.nr = imm_byte_offset / REG_SIZE;
> > +  reg.subnr = imm_byte_offset % REG_SIZE;
> > +  brw_MOV(p, dst, reg);
> > +   } else {
> > +  /* Prior to Broadwell, there are only 8 address registers. */
> > +  assert(inst->exec_size == 8 || devinfo->gen >= 8);
> >
> > -   /* Prior to Broadwell, there are only 8 address registers. */
> > -   assert(inst->exec_size == 8 || devinfo->gen >= 8);
> > +  /* We use VxH indirect addressing, clobbering a0.0 through a0.7.
*/
> > +  struct brw_reg addr = vec8(brw_address_reg(0));
> >
> > -   brw_MOV(p, addr, indirect_byte_offset);
> > -   brw_MOV(p, dst, retype(brw_VxH_indirect(0, imm_byte_offset),
dst.type));
> > +  /* The destination stride of an instruction (in bytes) must be
greater
> > +   * than or equal to the size of the rest of the instruction.
Since the
> > +   * address register is of type UW, we can't use a D-type
instruction.
> > +   * In order to get around this, re re-type to UW and use a
stride.
> > +   */
> > +  indirect_byte_offset =
> > + retype(spread(indirect_byte_offset, 2), BRW_REGISTER_TYPE_UW);
> > +
> > +  if (devinfo->gen < 8) {
> > + /* Prior to broadwell, we have a restriction that the bottom
5 bits
> > +  * of the base offset and the bottom 5 bits of the indirect
must add
> > +  * to less than 32.  In other words, the hardware needs to be
able to
> > +  * add the bottom five bits of the two to get the subnumber
and add
> > +  * the next 7 bits of each to get the actual register
number.  Since
> > +  * the indirect may cause us to cross a register boundary,
this makes
> > +  * it almost useless.  We could try and do something clever
where we
> > +  * use a actual base offset if base_offset % 32 == 0 but that
would
> > +  * mean we were generating different code depending on the
base
> > +  * offset.  Instead, for the sake of consistency, we'll just
do the
> > +  * add ourselves.
> > +  */
> > + brw_ADD(p, addr, indirect_byte_offset,
brw_imm_uw(imm_byte_offset));
> > + brw_MOV(p, dst, retype(brw_VxH_indirect(0, 0), dst.type));
> > +  } else {
> > + brw_MOV(p, addr, indirect_byte_offset);
> > + brw_MOV(p, dst, retype(brw_VxH_indirect(0, imm_byte_offset),
dst.type));
> > +  }
> > +   }
> >  }
> >
> >  void
> >

[Mesa-dev] [PATCH] gallivm: add a horrible hack for stencil texturing with border

2015-12-15 Thread sroland
From: Roland Scheidegger 

mesa/st doesn't give us a useful swizzle when stencil texturing. Moreover,
it's not even obvious what the swizzle actually should be - the channel which
is used for the fetch (Y) is not the same as the one which must be used for
the border component (X), which is due to a mismatch between GL and gallium
interface. (On top of that, I have no idea what GL expects in YZW channels in
the end.)
So add some special case for stencil texturing with border, to fetch the right
border component. Though it seems there has to be some better solution...
This fixes piglit texwrap GL_ARB_texture_stencil8 bordercolor (only the fixed
version).
---
 src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c | 28 +--
 1 file changed, 26 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c 
b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
index e21933f..efba5a8 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
@@ -187,9 +187,33 @@ lp_build_sample_texel_soa(struct lp_build_sample_context 
*bld,
   border_type.length = 4;
   /*
* Only replace channels which are actually present. The others should
-   * get optimized away eventually by sampler_view swizzle anyway but it's
-   * easier too.
+   * get optimized away eventually by sampler_view swizzle in most cases...
+   * If not, for "ordinary" color textures, fetch will have placed the
+   * correct default values there, since missing channels must use default
+   * values regardless of border.
+   * We do, however, some horrendous hack for stencil textures. We won't
+   * get a useful swizzle, and furthermore the channel to fetch (Y) doesn't
+   * match the channel for the border color (X).
*/
+  if (util_format_has_stencil(format_desc) &&
+!util_format_has_depth(format_desc)) {
+ LLVMValueRef zero = lp_build_const_int32(bld->gallivm, 0);
+ LLVMValueRef border_col;
+ border_col = lp_build_extract_broadcast(bld->gallivm,
+ border_type,
+ bld->texel_type,
+ bld->border_color_clamped,
+ zero);
+ /*
+  * Replace first 3 chans (match what fetch did).
+  */
+ for (chan = 0; chan < 3; chan++) {
+texel_out[chan] = lp_build_select(>texel_bld, use_border,
+  border_col, texel_out[chan]);
+ }
+ return;
+  }
+
   for (chan = 0; chan < 4; chan++) {
  unsigned chan_s;
  /* reverse-map channel... */
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965/gen8/cs: fix constant push buffer

2015-12-15 Thread Lofstedt, Marta
Thanks Iago!

This patch does not only fix the ssbo test mentioned below, but a lot of other 
GLES 3.1 CTS tests.

> -Original Message-
> From: Iago Toral Quiroga [mailto:ito...@igalia.com]
> Sent: Tuesday, December 15, 2015 12:55 PM
> To: mesa-dev@lists.freedesktop.org
> Cc: Lofstedt, Marta; Justen, Jordan L; Palli, Tapani; Iago Toral Quiroga
> Subject: [PATCH] i965/gen8/cs: fix constant push buffer
> 
> Page 502 of the Command Reference Broadwell PRM says that CURBE Total
> Data Length must be 64-bit aligned.
> 
> Fixes the following CTS tests:
> ES31-CTS.shader_storage_buffer_object.basic-atomic-case1-cs
> ES31-CTS.shader_storage_buffer_object.basic-operations-case1-cs
> ES31-CTS.shader_storage_buffer_object.basic-operations-case2-cs
> ES31-CTS.shader_storage_buffer_object.basic-stdLayout_UBO_SSBO-case2-
> cs
> ES31-CTS.shader_storage_buffer_object.advanced-write-fragment-cs
> ES31-CTS.shader_storage_buffer_object.advanced-indirectAddressing-
> case2-cs
> ES31-CTS.shader_storage_buffer_object.advanced-matrix-cs
> ---
>  src/mesa/drivers/dri/i965/gen7_cs_state.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/gen7_cs_state.c
> b/src/mesa/drivers/dri/i965/gen7_cs_state.c
> index 1fde69c..dbd1967 100644
> --- a/src/mesa/drivers/dri/i965/gen7_cs_state.c
> +++ b/src/mesa/drivers/dri/i965/gen7_cs_state.c
> @@ -77,7 +77,8 @@ brw_upload_cs_state(struct brw_context *brw)
> 
> unsigned push_constant_data_size =
>(prog_data->nr_params + local_id_dwords) * sizeof(gl_constant_value);
> -   unsigned reg_aligned_constant_size = ALIGN(push_constant_data_size,
> 32);
> +   unsigned reg_aligned_constant_size =
> +  ALIGN(push_constant_data_size, brw->gen < 8 ? 32 : 64);
> unsigned push_constant_regs = reg_aligned_constant_size / 32;
> unsigned threads = get_cs_thread_count(cs_prog_data);
> 
> @@ -241,7 +242,8 @@ brw_upload_cs_push_constants(struct brw_context
> *brw,
> 
>const unsigned push_constant_data_size =
>   (local_id_dwords + prog_data->nr_params) *
> sizeof(gl_constant_value);
> -  const unsigned reg_aligned_constant_size =
> ALIGN(push_constant_data_size, 32);
> +  const unsigned reg_aligned_constant_size =
> + ALIGN(push_constant_data_size, brw->gen < 8 ? 32 : 64);
>const unsigned param_aligned_count =
>   reg_aligned_constant_size / sizeof(*param);
> 
> --
> 1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: remove validation of shaders that should be done elsewhere

2015-12-15 Thread Tapani Pälli

On 12/15/2015 01:25 AM, Timothy Arceri wrote:

On Wed, 2015-12-09 at 00:17 +1100, Timothy Arceri wrote:

In core profile even if re-linking fails rendering shouldn't fail as
the
previous succesfully linked program will still be available. It also
shouldn't be possible to have an unlinked program as part of the
current rendering state.

Hey guys,

Any thoughts on this change?

Thinking about this some more we should probably rework the compat code
also and only do the check for link status if there is an assembly
shader right?


I wanted to hear from others first since for me it feels this change 
seems specific to separate shader programs (I had a patch on list that 
skipped the check for those programs that were not in use by current 
pipeline).


The reason is that with regular programs I can't see a way to continue 
if relinking fails (because program is now in bad state). I think user 
should detach the malfunctioning stage and link again. However with SSO 
relink to a unused stage may fail but we can still have a complete 
working program with stages marked as used.




Thanks,
Tim


// Tapani

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [OT] some contribution statistics

2015-12-15 Thread Giuseppe Bilotta
Hello all,

when Steam first announced they'd give all present and future games
free to all Mesa contributors with at least 25 commits[1], I was
curious to see how many people would be affected by this choice, so I
ran some statistics on the number of committers (and contributions by
committer) on Mesa at the time, with the following results (for April
9, 2015):

# count: 684
# min: 1
# max: 14020
# mid: 7010.5
# range: 14019

# mean: 101.35818713450293
# stddev: 652.7501707733724

# mode(s): 1

# median: 2
# quartiles: 1 2 10
# IQR: 9

Having come across an old discussion about these stats, I decided to
rerun the stats now, and the results are:

# count: 736
# min: 1
# max: 14310
# mid: 7155.5
# range: 14309

# mean: 102.19701086956522
# stddev: 651.6642244733528

# mode(s): 1

# median: 3
# quartiles: 1 3 12
# IQR: 11

And I would say that this counts as an improvement: the mean number of
contributions per developer has gone up, but most importantly the
_median_ contribution has gone up (from 2 to 3 contributions), and
ditto for the upper quartile (from 10 to 12). In some sense, the 'long
tail' of contributors in Mesa has _shortened_ in these 8 months, even
though the number of contribution has increased!

The only problem with these numbers is actually the lack of a .mailmap
to normalize contributor name/emails, which obviously skews the
results a little bit towards the lower end. I don't suppose someone
has a .mailmap for Mesa contributors, or is interested in creating
one?

[1]: http://lists.freedesktop.org/archives/dri-devel/2015-April/081045.html


-- 
Giuseppe "Oblomov" Bilotta
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/3] nir/lower_system_values: Stop supporting non-SSA

2015-12-15 Thread Jason Ekstrand
On Tue, Dec 15, 2015 at 12:26 PM, Eric Anholt  wrote:
> Jason Ekstrand  writes:
>
>> The one user of this (i965) only ever calls it while in SSA form.
>
> This series is:
>
> Reviewed-by: Eric Anholt 

Thanks!

Did you happen to run it on something that actually uses clip plane
lowering?  I'd like to not break things.
--Jason
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [OT] some contribution statistics

2015-12-15 Thread Kenneth Graunke
On Tuesday, December 15, 2015 02:23:07 PM Giuseppe Bilotta wrote:
> The only problem with these numbers is actually the lack of a .mailmap
> to normalize contributor name/emails, which obviously skews the
> results a little bit towards the lower end. I don't suppose someone
> has a .mailmap for Mesa contributors, or is interested in creating
> one?

I actually have one of those!

http://cgit.freedesktop.org/~kwg/mesa/commit/?h=gitdm

--Ken


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 01/11] mesa: Don't leak ATIfs instructions in DeleteFragmentShader

2015-12-15 Thread Ian Romanick
This patch is

Reviewed-by: Ian Romanick 
Cc: "11.0 11.1" 

Assuming there are no objections, I'll push this in 24 hours.

On 12/15/2015 03:05 PM, Miklós Máté wrote:
> ---
>  src/mesa/main/atifragshader.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/mesa/main/atifragshader.c b/src/mesa/main/atifragshader.c
> index 935ba05..3ddc51d 100644
> --- a/src/mesa/main/atifragshader.c
> +++ b/src/mesa/main/atifragshader.c
> @@ -293,7 +293,7 @@ _mesa_DeleteFragmentShaderATI(GLuint id)
>prog->RefCount--;
>if (prog->RefCount <= 0) {
>   assert(prog != );
> - free(prog);
> +_mesa_delete_ati_fragment_shader(ctx, prog);
>}
>}
> }
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 03/11] st/mesa: implement GL_ATI_fragment_shader

2015-12-15 Thread Jason Ekstrand
On Dec 15, 2015 6:19 PM, "Ilia Mirkin"  wrote:
>
>
> On Dec 15, 2015 8:59 PM, "Ian Romanick"  wrote:
> >
> > On 12/15/2015 05:08 PM, Ilia Mirkin wrote:
> > > On Tue, Dec 15, 2015 at 7:59 PM, Ian Romanick 
wrote:
> > >> On 12/15/2015 04:40 PM, Ilia Mirkin wrote:
> > >>> Hardly a complete review, but a handful of comments:
> > >>>
> > >>> On Tue, Dec 15, 2015 at 6:05 PM, Miklós Máté 
wrote:
> >  ---
> >   src/mesa/Makefile.sources |   1 +
> >   src/mesa/state_tracker/st_atifs_to_tgsi.c | 798
++
> >   src/mesa/state_tracker/st_atifs_to_tgsi.h |  49 ++
> >   src/mesa/state_tracker/st_atom_constbuf.c |  14 +
> >   src/mesa/state_tracker/st_cb_drawpixels.c |   1 +
> >   src/mesa/state_tracker/st_cb_program.c|  35 +-
> >   src/mesa/state_tracker/st_program.c   |  22 +
> >   src/mesa/state_tracker/st_program.h   |   1 +
> >   8 files changed, 920 insertions(+), 1 deletion(-)
> >   create mode 100644 src/mesa/state_tracker/st_atifs_to_tgsi.c
> >   create mode 100644 src/mesa/state_tracker/st_atifs_to_tgsi.h
> > 
> >  +static struct ureg_src prepare_argument(struct st_translate *t,
const unsigned argId,
> >  +  const struct atifragshader_src_register *srcReg)
> >  +{
> >  +   struct ureg_src src = get_source(t, srcReg->Index);
> >  +   struct ureg_dst arg = get_temp(t,
MAX_NUM_FRAGMENT_REGISTERS_ATI+argId);
> >  +
> >  +   switch (srcReg->argRep) {
> >  +  case GL_NONE:
> >  + break;
> >  +  case GL_RED:
> >  + src = ureg_swizzle(src,
> >  +   TGSI_SWIZZLE_X, TGSI_SWIZZLE_X, TGSI_SWIZZLE_X,
TGSI_SWIZZLE_X);
> >  + break;
> >  +  case GL_GREEN:
> >  + src = ureg_swizzle(src,
> >  +   TGSI_SWIZZLE_Y, TGSI_SWIZZLE_Y, TGSI_SWIZZLE_Y,
TGSI_SWIZZLE_Y);
> >  + break;
> >  +  case GL_BLUE:
> >  + src = ureg_swizzle(src,
> >  +   TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z,
TGSI_SWIZZLE_Z);
> >  + break;
> >  +  case GL_ALPHA:
> >  + src = ureg_swizzle(src,
> >  +   TGSI_SWIZZLE_W, TGSI_SWIZZLE_W, TGSI_SWIZZLE_W,
TGSI_SWIZZLE_W);
> >  + break;
> >  +   }
> >  +   emit_insn(t, TGSI_OPCODE_MOV, , 1, , 1);
> >  +
> >  +   if (srcReg->argMod & GL_COMP_BIT_ATI) {
> >  +  struct ureg_src modsrc[2];
> >  +  modsrc[0] = ureg_imm1f(t->ureg, 1.0);
> >  +  modsrc[1] = ureg_src(arg);
> >  +
> >  +  emit_insn(t, TGSI_OPCODE_SUB, , 1, modsrc, 2);
> >  +   }
> >  +   if (srcReg->argMod & GL_BIAS_BIT_ATI) {
> >  +  struct ureg_src modsrc[2];
> >  +  modsrc[0] = ureg_src(arg);
> >  +  modsrc[1] = ureg_imm1f(t->ureg, 0.5);
> >  +
> >  +  emit_insn(t, TGSI_OPCODE_SUB, , 1, modsrc, 2);
> >  +   }
> >  +   if (srcReg->argMod & GL_2X_BIT_ATI) {
> >  +  struct ureg_src modsrc[2];
> >  +  modsrc[0] = ureg_src(arg);
> >  +  modsrc[1] = ureg_imm1f(t->ureg, 2.0);
> >  +
> >  +  emit_insn(t, TGSI_OPCODE_MUL, , 1, modsrc, 2);
> > >>>
> > >>> aka ADD arg, arg, arg
> > >>>
> >  +   }
> >  +   if (srcReg->argMod & GL_NEGATE_BIT_ATI) {
> >  +  struct ureg_src modsrc[2];
> >  +  modsrc[0] = ureg_src(arg);
> >  +  modsrc[1] = ureg_imm1f(t->ureg, -1.0);
> >  +
> >  +  emit_insn(t, TGSI_OPCODE_MUL, , 1, modsrc, 2);
> > >>>
> > >>> aka NEG arg, arg
> > >>>
> >  +   }
> >  +   return  ureg_src(arg);
> >  +}
> >  +
> >  +/* These instructions have no direct equivalent in TGSI */
> >  +static void emit_special_inst(struct st_translate *t, struct
instruction_desc *desc,
> >  +  struct ureg_dst *dst, struct ureg_src *args, unsigned
argcount)
> >  +{
> >  +   struct ureg_dst tmp[1];
> >  +   struct ureg_src src[3];
> >  +
> >  +   if(desc->special == 1) {
> >  +  tmp[0] = get_temp(t, MAX_NUM_FRAGMENT_REGISTERS_ATI+2); //
re-purpose a3
> >  +  src[0] = ureg_imm1f(t->ureg, 0.5f);
> >  +  src[1] = args[2];
> >  +  emit_insn(t, TGSI_OPCODE_SLT, tmp, 1, src, 2);
> >  +  src[0] = ureg_src(tmp[0]);
> >  +  src[1] = args[0];
> >  +  src[2] = args[1];
> >  +  emit_insn(t, TGSI_OPCODE_LRP, dst, 1, src, 3);
> >  +   } else if (desc->special == 2) {
> >  +  tmp[0] = get_temp(t, MAX_NUM_FRAGMENT_REGISTERS_ATI+2); //
re-purpose a3
> >  +  src[0] = args[2];
> >  +  src[1] = ureg_imm1f(t->ureg, 0.0f);
> >  +  emit_insn(t, TGSI_OPCODE_SGE, tmp, 1, src, 2);
> >  +  src[0] = ureg_src(tmp[0]);
> >  +  src[1] = args[0];
> >  +  src[2] = args[1];
> >  +  emit_insn(t, TGSI_OPCODE_LRP, dst, 

Re: [Mesa-dev] [PATCH] ir_to_mesa: Skip useless comparison instructions.

2015-12-15 Thread Matt Turner
On Mon, Dec 7, 2015 at 10:50 AM, Matt Turner  wrote:
> ---
> With this, we generate the same number of Mesa IR instructions before
> and after my series. all() is the same as well.

Maybe Ian could have a look?
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 03/11] st/mesa: implement GL_ATI_fragment_shader

2015-12-15 Thread Ilia Mirkin
On Tue, Dec 15, 2015 at 7:59 PM, Ian Romanick  wrote:
> On 12/15/2015 04:40 PM, Ilia Mirkin wrote:
>> Hardly a complete review, but a handful of comments:
>>
>> On Tue, Dec 15, 2015 at 6:05 PM, Miklós Máté  wrote:
>>> ---
>>>  src/mesa/Makefile.sources |   1 +
>>>  src/mesa/state_tracker/st_atifs_to_tgsi.c | 798 
>>> ++
>>>  src/mesa/state_tracker/st_atifs_to_tgsi.h |  49 ++
>>>  src/mesa/state_tracker/st_atom_constbuf.c |  14 +
>>>  src/mesa/state_tracker/st_cb_drawpixels.c |   1 +
>>>  src/mesa/state_tracker/st_cb_program.c|  35 +-
>>>  src/mesa/state_tracker/st_program.c   |  22 +
>>>  src/mesa/state_tracker/st_program.h   |   1 +
>>>  8 files changed, 920 insertions(+), 1 deletion(-)
>>>  create mode 100644 src/mesa/state_tracker/st_atifs_to_tgsi.c
>>>  create mode 100644 src/mesa/state_tracker/st_atifs_to_tgsi.h
>>>
>>> +static struct ureg_src prepare_argument(struct st_translate *t, const 
>>> unsigned argId,
>>> +  const struct atifragshader_src_register *srcReg)
>>> +{
>>> +   struct ureg_src src = get_source(t, srcReg->Index);
>>> +   struct ureg_dst arg = get_temp(t, MAX_NUM_FRAGMENT_REGISTERS_ATI+argId);
>>> +
>>> +   switch (srcReg->argRep) {
>>> +  case GL_NONE:
>>> + break;
>>> +  case GL_RED:
>>> + src = ureg_swizzle(src,
>>> +   TGSI_SWIZZLE_X, TGSI_SWIZZLE_X, TGSI_SWIZZLE_X, 
>>> TGSI_SWIZZLE_X);
>>> + break;
>>> +  case GL_GREEN:
>>> + src = ureg_swizzle(src,
>>> +   TGSI_SWIZZLE_Y, TGSI_SWIZZLE_Y, TGSI_SWIZZLE_Y, 
>>> TGSI_SWIZZLE_Y);
>>> + break;
>>> +  case GL_BLUE:
>>> + src = ureg_swizzle(src,
>>> +   TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z, 
>>> TGSI_SWIZZLE_Z);
>>> + break;
>>> +  case GL_ALPHA:
>>> + src = ureg_swizzle(src,
>>> +   TGSI_SWIZZLE_W, TGSI_SWIZZLE_W, TGSI_SWIZZLE_W, 
>>> TGSI_SWIZZLE_W);
>>> + break;
>>> +   }
>>> +   emit_insn(t, TGSI_OPCODE_MOV, , 1, , 1);
>>> +
>>> +   if (srcReg->argMod & GL_COMP_BIT_ATI) {
>>> +  struct ureg_src modsrc[2];
>>> +  modsrc[0] = ureg_imm1f(t->ureg, 1.0);
>>> +  modsrc[1] = ureg_src(arg);
>>> +
>>> +  emit_insn(t, TGSI_OPCODE_SUB, , 1, modsrc, 2);
>>> +   }
>>> +   if (srcReg->argMod & GL_BIAS_BIT_ATI) {
>>> +  struct ureg_src modsrc[2];
>>> +  modsrc[0] = ureg_src(arg);
>>> +  modsrc[1] = ureg_imm1f(t->ureg, 0.5);
>>> +
>>> +  emit_insn(t, TGSI_OPCODE_SUB, , 1, modsrc, 2);
>>> +   }
>>> +   if (srcReg->argMod & GL_2X_BIT_ATI) {
>>> +  struct ureg_src modsrc[2];
>>> +  modsrc[0] = ureg_src(arg);
>>> +  modsrc[1] = ureg_imm1f(t->ureg, 2.0);
>>> +
>>> +  emit_insn(t, TGSI_OPCODE_MUL, , 1, modsrc, 2);
>>
>> aka ADD arg, arg, arg
>>
>>> +   }
>>> +   if (srcReg->argMod & GL_NEGATE_BIT_ATI) {
>>> +  struct ureg_src modsrc[2];
>>> +  modsrc[0] = ureg_src(arg);
>>> +  modsrc[1] = ureg_imm1f(t->ureg, -1.0);
>>> +
>>> +  emit_insn(t, TGSI_OPCODE_MUL, , 1, modsrc, 2);
>>
>> aka NEG arg, arg
>>
>>> +   }
>>> +   return  ureg_src(arg);
>>> +}
>>> +
>>> +/* These instructions have no direct equivalent in TGSI */
>>> +static void emit_special_inst(struct st_translate *t, struct 
>>> instruction_desc *desc,
>>> +  struct ureg_dst *dst, struct ureg_src *args, unsigned argcount)
>>> +{
>>> +   struct ureg_dst tmp[1];
>>> +   struct ureg_src src[3];
>>> +
>>> +   if(desc->special == 1) {
>>> +  tmp[0] = get_temp(t, MAX_NUM_FRAGMENT_REGISTERS_ATI+2); // 
>>> re-purpose a3
>>> +  src[0] = ureg_imm1f(t->ureg, 0.5f);
>>> +  src[1] = args[2];
>>> +  emit_insn(t, TGSI_OPCODE_SLT, tmp, 1, src, 2);
>>> +  src[0] = ureg_src(tmp[0]);
>>> +  src[1] = args[0];
>>> +  src[2] = args[1];
>>> +  emit_insn(t, TGSI_OPCODE_LRP, dst, 1, src, 3);
>>> +   } else if (desc->special == 2) {
>>> +  tmp[0] = get_temp(t, MAX_NUM_FRAGMENT_REGISTERS_ATI+2); // 
>>> re-purpose a3
>>> +  src[0] = args[2];
>>> +  src[1] = ureg_imm1f(t->ureg, 0.0f);
>>> +  emit_insn(t, TGSI_OPCODE_SGE, tmp, 1, src, 2);
>>> +  src[0] = ureg_src(tmp[0]);
>>> +  src[1] = args[0];
>>> +  src[2] = args[1];
>>> +  emit_insn(t, TGSI_OPCODE_LRP, dst, 1, src, 3);
>>
>> Isn't this the CMP instruction? Just flip the args.
>>
>> http://gallium.readthedocs.org/en/latest/tgsi.html#opcode-CMP
>>
>> The other one should be expressible as CMP as well I think.
>>
>>> +   } else if (desc->special == 3) {
>>> +  src[0] = args[0];
>>> +  src[1] = args[1];
>>> +  src[2] = ureg_swizzle(args[2],
>>> +TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z, 
>>> TGSI_SWIZZLE_Z);
>>> +  emit_insn(t, TGSI_OPCODE_DP2A, dst, 1, src, 3);
>>> +   }
>>> +}
>>> +
>>> +static void emit_arith_inst(struct st_translate *t,
>>> +  struct instruction_desc *desc,
>>> +  struct ureg_dst *dst, struct ureg_src *args, unsigned argcount)

Re: [Mesa-dev] [PATCH 4/5] i965: Enable compute shaders in more cases for OpenGLES 3.1

2015-12-15 Thread Jordan Justen
On 2015-12-15 17:00:55, Ian Romanick wrote:
> Doesn't this make patch 3 irrelevant?  FWIW, I like this better.

This change only updates the way we program some constants. It is for
a local stage_exists array, which we then use later in the same
function when programming context constants.

For example, without this change, I don't think image_load_store has
any images to work with for the compute stage.

-Jordan

> 
> On 12/15/2015 04:08 PM, Jordan Justen wrote:
> > Previously we were checking the desktop OpenGL ARB_compute_shader
> > requirements, but for OpenGLES 3.1, the requirements are lower.
> > 
> > Signed-off-by: Jordan Justen 
> > Cc: Marta Lofstedt 
> > ---
> >  src/mesa/drivers/dri/i965/brw_context.c | 5 -
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
> > b/src/mesa/drivers/dri/i965/brw_context.c
> > index 0abe601..5105625 100644
> > --- a/src/mesa/drivers/dri/i965/brw_context.c
> > +++ b/src/mesa/drivers/dri/i965/brw_context.c
> > @@ -377,7 +377,10 @@ brw_initialize_context_constants(struct brw_context 
> > *brw)
> >[MESA_SHADER_GEOMETRY] = brw->gen >= 6,
> >[MESA_SHADER_FRAGMENT] = true,
> >[MESA_SHADER_COMPUTE] =
> > - (ctx->Const.MaxComputeWorkGroupSize[0] >= 1024) ||
> > + (ctx->API == API_OPENGL_CORE &&
> > +  ctx->Const.MaxComputeWorkGroupSize[0] >= 1024) ||
> > + (ctx->API == API_OPENGLES2 &&
> > +  ctx->Const.MaxComputeWorkGroupSize[0] >= 128) ||
> >   _mesa_extension_override_enables.ARB_compute_shader,
> > };
> >  
> > 
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 07/11] program: fix comment about the fog formula

2015-12-15 Thread Ian Romanick
Yes... that matches the GL_ARB_fragment_program spec.  This patch is

Reviewed-by: Ian Romanick 

Assuming there are no objections, I'll push this in 24 hours.

On 12/15/2015 03:05 PM, Miklós Máté wrote:
> ---
>  src/mesa/program/prog_statevars.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/mesa/program/prog_statevars.c 
> b/src/mesa/program/prog_statevars.c
> index bdb335e..12490d0 100644
> --- a/src/mesa/program/prog_statevars.c
> +++ b/src/mesa/program/prog_statevars.c
> @@ -474,7 +474,7 @@ _mesa_fetch_state(struct gl_context *ctx, const 
> gl_state_index state[],
>* single MAD.
>* linear: fogcoord * -1/(end-start) + end/(end-start)
>* exp: 2^-(density/ln(2) * fogcoord)
> -  * exp2: 2^-((density/(ln(2)^2) * fogcoord)^2)
> +  * exp2: 2^-((density/(sqrt(ln(2))) * fogcoord)^2)
>*/
>   value[0] = (ctx->Fog.End == ctx->Fog.Start)
>  ? 1.0f : (GLfloat)(-1.0F / (ctx->Fog.End - ctx->Fog.Start));
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/5] main/version: Don't require ARB_compute_shader for OpenGLES 3.1

2015-12-15 Thread Ian Romanick
On 12/15/2015 05:01 PM, Jordan Justen wrote:
> On 2015-12-15 16:50:39, Ian Romanick wrote:
>> On 12/15/2015 04:08 PM, Jordan Justen wrote:
>>> The OpenGL ARB_compute_shader extension specfication requires at least
>>> 1024 for GL_MAX_COMPUTE_WORK_GROUP_INVOCATIONS, whereas OpenGLES 3.1
>>> only required 128.
>>
>> Does this mean that extensions->ARB_compute_shader is not set?
> 
> Yes. I think we can't set this in some cases due to desktop GL
> requirements, but we should still be able to support CS on ES 3.1.
> 
>> I'm a little bit nervous about that. Are we sure that we check for
>> compute shader support correctly everywhere (i.e., don't just check
>> the extension bit that isn't set)?
> 
> I think we have it pretty well covered. The ES 3.1 CTS seems pretty
> happy with what we have.
> 
> That said, patch 2 was yet another fix to use
> _mesa_has_compute_shaders, and I wouldn't be surprised if we ended up
> finding some more. (I did try to grep to find anything we might have
> missed.)

I just did that too.  I didn't see anything that looked problematic except:

src/mesa/main/get.c:/* HACK: remove when ARB_compute_shader is actually
supported */

This patch is

Reviewed-by: Ian Romanick 

> -Jordan
> 
>>> Signed-off-by: Jordan Justen 
>>> Cc: Ian Romanick 
>>> Cc: Marta Lofstedt 
>>> ---
>>>  src/mesa/main/version.c | 9 ++---
>>>  1 file changed, 6 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/src/mesa/main/version.c b/src/mesa/main/version.c
>>> index e92bb11..112a73d 100644
>>> --- a/src/mesa/main/version.c
>>> +++ b/src/mesa/main/version.c
>>> @@ -433,7 +433,8 @@ compute_version_es1(const struct gl_extensions 
>>> *extensions)
>>>  }
>>>  
>>>  static GLuint
>>> -compute_version_es2(const struct gl_extensions *extensions)
>>> +compute_version_es2(const struct gl_extensions *extensions,
>>> +const struct gl_constants *consts)
>>>  {
>>> /* OpenGL ES 2.0 is derived from OpenGL 2.0 */
>>> const bool ver_2_0 = (extensions->ARB_texture_cube_map &&
>>> @@ -464,9 +465,11 @@ compute_version_es2(const struct gl_extensions 
>>> *extensions)
>>>   extensions->EXT_texture_snorm &&
>>>   extensions->NV_primitive_restart &&
>>>   extensions->OES_depth_texture_cube_map);
>>> +   const bool es31_compute_shader =
>>> +  consts->MaxComputeWorkGroupInvocations >= 128;
>>> const bool ver_3_1 = (ver_3_0 &&
>>>   extensions->ARB_arrays_of_arrays &&
>>> - extensions->ARB_compute_shader &&
>>> + es31_compute_shader &&
>>>   extensions->ARB_draw_indirect &&
>>>   extensions->ARB_explicit_uniform_location &&
>>>   extensions->ARB_framebuffer_no_attachments &&
>>> @@ -508,7 +511,7 @@ _mesa_get_version(const struct gl_extensions 
>>> *extensions,
>>> case API_OPENGLES:
>>>return compute_version_es1(extensions);
>>> case API_OPENGLES2:
>>> -  return compute_version_es2(extensions);
>>> +  return compute_version_es2(extensions, consts);
>>> }
>>> return 0;
>>>  }
>>>

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v4 1/1] i965: add opportunistic behaviour to opt_vector_float()

2015-12-15 Thread Juan A. Suarez Romero
opt_vector_float() transforms several scalar MOV operations to a single
vectorial MOV.

This is done when those MOV covers all the components of the destination
register. So something like:

mov vgrf3.0.xy:D, 0D
mov vgrf3.0.w:D, 1065353216D
mov vgrf3.0.z:D, 0D

is transformed in:

mov vgrf3.0:F, [0F, 0F, 0F, 1F]

But there are cases where not all the components are written. For
example, in:

mov vgrf2.0.x:D, 1073741824D
mov vgrf3.0.xy:D, 0D
mov vgrf3.0.w:D, 1065353216D
mov vgrf4.0.xy:D, 1065353216D
mov vgrf4.0.w:D, 0D
mov vgrf6.0:UD, u4.xyzw:UD

Nor vgrf3 nor vgrf4 .z components are written, so the optimization is
not applied.

But it could be applied anyway with the components covered, using a
writemask to select the ones written. So we could transform it in:

mov vgrf2.0.x:D, 1073741824D
mov vgrf3.0.xyw:F, [0F, 0F, 0F, 1F]
mov vgrf4.0.xyw:F, [1F, 1F, 0F, 0F]
mov vgrf6.0:UD, u4.xyzw:UD

This commit does precisely that: opportunistically apply
opt_vector_float() when possible.

The improvement obtained regarding current upstream
(11.1-branchpoint-654-gc51c09c) is:

total instructions in shared programs: 6846435 -> 6838649 (-0.11%)
instructions in affected programs: 393820 -> 386034 (-1.98%)
total loops in shared programs:1971 -> 1971 (0.00%)
helped:3980
HURT:  0
GAINED:0
LOST:  0

v2: change vectorize_mov() signature (Matt).
v3: take in account predicates (Juan).

Signed-off-by: Juan A. Suarez Romero 
---
 src/mesa/drivers/dri/i965/brw_vec4.cpp | 62 ++
 src/mesa/drivers/dri/i965/brw_vec4.h   |  4 +++
 2 files changed, 44 insertions(+), 22 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index a697bdf..ffbbf1a 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -309,6 +309,29 @@ src_reg::equals(const src_reg ) const
 }
 
 bool
+vec4_visitor::vectorize_mov(bblock_t *block, vec4_instruction *inst, uint8_t 
imm[4],
+vec4_instruction *imm_inst[4], int inst_count,
+unsigned writemask)
+{
+   if (inst_count < 2) {
+  return false;
+   }
+
+   unsigned vf;
+   memcpy(, imm, sizeof(vf));
+   vec4_instruction *mov = MOV(imm_inst[0]->dst, brw_imm_vf(vf));
+   mov->dst.type = BRW_REGISTER_TYPE_F;
+   mov->dst.writemask = writemask;
+   inst->insert_before(block, mov);
+
+   for (int i = 0; i < inst_count; i++) {
+  imm_inst[i]->remove(block);
+   }
+
+   return true;}
+
+
+bool
 vec4_visitor::opt_vector_float()
 {
bool progress = false;
@@ -316,27 +339,37 @@ vec4_visitor::opt_vector_float()
int last_reg = -1, last_reg_offset = -1;
enum brw_reg_file last_reg_file = BAD_FILE;
 
-   int remaining_channels = 0;
-   uint8_t imm[4];
+   uint8_t imm[4] = { 0 };
int inst_count = 0;
vec4_instruction *imm_inst[4];
+   unsigned writemask = 0;
 
foreach_block_and_inst_safe(block, vec4_instruction, inst, cfg) {
   if (last_reg != inst->dst.nr ||
   last_reg_offset != inst->dst.reg_offset ||
   last_reg_file != inst->dst.file) {
+
+ progress |= vectorize_mov(block, inst, imm, imm_inst, inst_count, 
writemask);
+
+ inst_count = 0;
+ writemask = 0;
  last_reg = inst->dst.nr;
  last_reg_offset = inst->dst.reg_offset;
  last_reg_file = inst->dst.file;
- remaining_channels = WRITEMASK_XYZW;
-
- inst_count = 0;
+ for (int i = 0; i < 4; i++) {
+imm[i] = 0;
+ }
   }
 
   if (inst->opcode != BRW_OPCODE_MOV ||
   inst->dst.writemask == WRITEMASK_XYZW ||
-  inst->src[0].file != IMM)
+  inst->src[0].file != IMM ||
+  inst->predicate != BRW_PREDICATE_NONE) {
+ progress |= vectorize_mov(block, inst, imm, imm_inst, inst_count, 
writemask);
+ inst_count = 0;
+ last_reg = -1;
  continue;
+  }
 
   int vf = brw_float_to_vf(inst->src[0].f);
   if (vf == -1)
@@ -351,23 +384,8 @@ vec4_visitor::opt_vector_float()
   if ((inst->dst.writemask & WRITEMASK_W) != 0)
  imm[3] = vf;
 
+  writemask |= inst->dst.writemask;
   imm_inst[inst_count++] = inst;
-
-  remaining_channels &= ~inst->dst.writemask;
-  if (remaining_channels == 0) {
- unsigned vf;
- memcpy(, imm, sizeof(vf));
- vec4_instruction *mov = MOV(inst->dst, brw_imm_vf(vf));
- mov->dst.type = BRW_REGISTER_TYPE_F;
- mov->dst.writemask = WRITEMASK_XYZW;
- inst->insert_after(block, mov);
- last_reg = -1;
-
- for (int i = 0; i < inst_count; i++) {
-imm_inst[i]->remove(block);
- }
- progress = true;
-  }
}
 
if (progress)
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 

[Mesa-dev] [PATCH v4 0/1] i965: add opportunistic behaviour to opt_vector_float()

2015-12-15 Thread Juan A. Suarez Romero
While working on related issue, found out that previous patch (and original
version) were applying incorrectly opt_vector_float in some cases.

Specifically, for this piece of code:

cmp.nz.f0.0 null:F, vgrf6.xyzz:F, vgrf17.xyzz:F
mov vgrf2.0.x:D, 0D
(+f0.0.any4h) mov vgrf2.0.x:D, -1D
mov vgrf2.0.yzw:D, 0D

opt_vector_float was generating:

cmp.nz.f0.0 null:F, vgrf6.xyzz:F, vgrf17.xyzz:F
(+f0.0.any4h) mov vgrf2.0.x:D, -1D
mov vgrf2.0:F, [0F, 0F, 0F, 0F]
cmp.nz.f0.0 null:D, vgrf2.xyzw:D, 0D

As can be notice, in the former code vgrf2.x could be 0 or -1, depending on the
predicate, while in the result it is always 0. Problem is that when applying the
optimization, it was ignoring the predicate.

The next patch updates the previous version to fix this problem.


*** BLURB HERE ***

Juan A. Suarez Romero (1):
  i965: add opportunistic behaviour to opt_vector_float()

 src/mesa/drivers/dri/i965/brw_vec4.cpp | 62 ++
 src/mesa/drivers/dri/i965/brw_vec4.h   |  4 +++
 2 files changed, 44 insertions(+), 22 deletions(-)

-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 8/9] nir: move to compiler

2015-12-15 Thread Jason Ekstrand
On Sat, Nov 28, 2015 at 8:45 AM, Emil Velikov  wrote:
> On 27 November 2015 at 20:45, Jason Ekstrand  wrote:
>> On Nov 27, 2015 11:26 AM, "Matt Turner"  wrote:
>>> On Fri, Nov 27, 2015 at 6:50 AM, Emil Velikov 
>>> wrote:
>>> > On 25 November 2015 at 22:01, Matt Turner  wrote:
>>> >> On Wed, Nov 25, 2015 at 1:32 PM, Emil Velikov
>>> >>  wrote:
>>> >
>>> >>> --- a/src/Makefile.am
>>> >>> +++ b/src/Makefile.am
>>> >>> @@ -23,6 +23,7 @@ SUBDIRS = . gtest util mapi/glapi/gen mapi
>>> >>>
>>> >>>  # XXX: conditionally include
>>> >>>  SUBDIRS += compiler
>>> >>> +SUBDIRS += compiler/nir
>>> >>
>>> >> We have a non-recursive build in src/glsl today. I don't want to go
>>> >> backwards.
>>> > Not sure I fully get that can you elaborate ? Are you concerned that
>>> > things won't build in parallel, increasing the compilation times ?
>>> >
>>> > On my dual core system running with -j2 results in approx 15 seconds
>>> > increase. I'm willing to take that trade off for the improved
>>> > readability. What is the difference on your system ?
>>>
>>> src/glsl has single Makefile that builds libglcpp, glcpp, libglsl,
>>> glsl_compiler, glsl_test, libnir, and various test programs, allowing
>>> all of these things to happen in parallel. The Makefile is perfectly
>>> maintainable as it is and there's no advantage of splitting it,
>>> especially when the work has been done to get things to this state
>>> (commits 86d30dea, efd201ca) and NIR was added without an additional
>>> Makefile.
>>
>> I would tend to agree.  Making things hierarchical is nice but,
>> unfortunately, autotools makes this and parallelization mutually exclusive.
>
> Actually I have some ancient work where we benefit from both. Namely
> have a single top level Makefile.am, which directly includes the
> subdirectory Automake.mk files, resulting in one big Makefile at the
> very end.
>
> That aside can we get some quantitative representation of the penalty
> you guys see. On my (old) machine the difference is negligible  ~15
> sec of a ~11 minute `make all' and ~16 minute `make distcheck'.

What happened to this?  (Yes, "Busy doing a release" is a valid answer)
--Jason
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 08/11] mesa: improve debug log in atifragshader

2015-12-15 Thread Ian Romanick
This patch is

Reviewed-by: Ian Romanick 

Assuming there are no objections, I'll push this in 24 hours.

On 12/15/2015 03:05 PM, Miklós Máté wrote:
> ---
>  src/mesa/main/atifragshader.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/src/mesa/main/atifragshader.c b/src/mesa/main/atifragshader.c
> index d1c07c5..8b19a35 100644
> --- a/src/mesa/main/atifragshader.c
> +++ b/src/mesa/main/atifragshader.c
> @@ -349,6 +349,9 @@ _mesa_BeginFragmentShaderATI(void)
> ctx->ATIFragmentShader.Current->isValid = GL_FALSE;
> ctx->ATIFragmentShader.Current->swizzlerq = 0;
> ctx->ATIFragmentShader.Compiling = 1;
> +#if MESA_DEBUG_ATI_FS
> +   _mesa_debug(ctx, "%s %u\n", __func__, ctx->ATIFragmentShader.Current->Id);
> +#endif
>  }
>  
>  void GLAPIENTRY
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 09/11] swrast: move two global defines to the only place where they are used

2015-12-15 Thread Ian Romanick
This patch is

Reviewed-by: Ian Romanick 

Assuming there are no objections, I'll push this in 24 hours.

On 12/15/2015 03:05 PM, Miklós Máté wrote:
> ---
>  src/mesa/main/mtypes.h| 2 --
>  src/mesa/swrast/s_atifragshader.c | 2 ++
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
> index 5c71ac4..99e7912 100644
> --- a/src/mesa/main/mtypes.h
> +++ b/src/mesa/main/mtypes.h
> @@ -2278,8 +2278,6 @@ struct gl_compute_program_state
>  /**
>   * ATI_fragment_shader runtime state
>   */
> -#define ATI_FS_INPUT_PRIMARY 0
> -#define ATI_FS_INPUT_SECONDARY 1
>  
>  struct atifs_instruction;
>  struct atifs_setupinst;
> diff --git a/src/mesa/swrast/s_atifragshader.c 
> b/src/mesa/swrast/s_atifragshader.c
> index 2974dee..414a414 100644
> --- a/src/mesa/swrast/s_atifragshader.c
> +++ b/src/mesa/swrast/s_atifragshader.c
> @@ -26,6 +26,8 @@
>  #include "swrast/s_atifragshader.h"
>  #include "swrast/s_context.h"
>  
> +#define ATI_FS_INPUT_PRIMARY 0
> +#define ATI_FS_INPUT_SECONDARY 1
>  
>  /**
>   * State for executing ATI fragment shader.
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 03/11] st/mesa: implement GL_ATI_fragment_shader

2015-12-15 Thread Ian Romanick
On 12/15/2015 05:08 PM, Ilia Mirkin wrote:
> On Tue, Dec 15, 2015 at 7:59 PM, Ian Romanick  wrote:
>> On 12/15/2015 04:40 PM, Ilia Mirkin wrote:
>>> Hardly a complete review, but a handful of comments:
>>>
>>> On Tue, Dec 15, 2015 at 6:05 PM, Miklós Máté  wrote:
 ---
  src/mesa/Makefile.sources |   1 +
  src/mesa/state_tracker/st_atifs_to_tgsi.c | 798 
 ++
  src/mesa/state_tracker/st_atifs_to_tgsi.h |  49 ++
  src/mesa/state_tracker/st_atom_constbuf.c |  14 +
  src/mesa/state_tracker/st_cb_drawpixels.c |   1 +
  src/mesa/state_tracker/st_cb_program.c|  35 +-
  src/mesa/state_tracker/st_program.c   |  22 +
  src/mesa/state_tracker/st_program.h   |   1 +
  8 files changed, 920 insertions(+), 1 deletion(-)
  create mode 100644 src/mesa/state_tracker/st_atifs_to_tgsi.c
  create mode 100644 src/mesa/state_tracker/st_atifs_to_tgsi.h

 +static struct ureg_src prepare_argument(struct st_translate *t, const 
 unsigned argId,
 +  const struct atifragshader_src_register *srcReg)
 +{
 +   struct ureg_src src = get_source(t, srcReg->Index);
 +   struct ureg_dst arg = get_temp(t, 
 MAX_NUM_FRAGMENT_REGISTERS_ATI+argId);
 +
 +   switch (srcReg->argRep) {
 +  case GL_NONE:
 + break;
 +  case GL_RED:
 + src = ureg_swizzle(src,
 +   TGSI_SWIZZLE_X, TGSI_SWIZZLE_X, TGSI_SWIZZLE_X, 
 TGSI_SWIZZLE_X);
 + break;
 +  case GL_GREEN:
 + src = ureg_swizzle(src,
 +   TGSI_SWIZZLE_Y, TGSI_SWIZZLE_Y, TGSI_SWIZZLE_Y, 
 TGSI_SWIZZLE_Y);
 + break;
 +  case GL_BLUE:
 + src = ureg_swizzle(src,
 +   TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z, 
 TGSI_SWIZZLE_Z);
 + break;
 +  case GL_ALPHA:
 + src = ureg_swizzle(src,
 +   TGSI_SWIZZLE_W, TGSI_SWIZZLE_W, TGSI_SWIZZLE_W, 
 TGSI_SWIZZLE_W);
 + break;
 +   }
 +   emit_insn(t, TGSI_OPCODE_MOV, , 1, , 1);
 +
 +   if (srcReg->argMod & GL_COMP_BIT_ATI) {
 +  struct ureg_src modsrc[2];
 +  modsrc[0] = ureg_imm1f(t->ureg, 1.0);
 +  modsrc[1] = ureg_src(arg);
 +
 +  emit_insn(t, TGSI_OPCODE_SUB, , 1, modsrc, 2);
 +   }
 +   if (srcReg->argMod & GL_BIAS_BIT_ATI) {
 +  struct ureg_src modsrc[2];
 +  modsrc[0] = ureg_src(arg);
 +  modsrc[1] = ureg_imm1f(t->ureg, 0.5);
 +
 +  emit_insn(t, TGSI_OPCODE_SUB, , 1, modsrc, 2);
 +   }
 +   if (srcReg->argMod & GL_2X_BIT_ATI) {
 +  struct ureg_src modsrc[2];
 +  modsrc[0] = ureg_src(arg);
 +  modsrc[1] = ureg_imm1f(t->ureg, 2.0);
 +
 +  emit_insn(t, TGSI_OPCODE_MUL, , 1, modsrc, 2);
>>>
>>> aka ADD arg, arg, arg
>>>
 +   }
 +   if (srcReg->argMod & GL_NEGATE_BIT_ATI) {
 +  struct ureg_src modsrc[2];
 +  modsrc[0] = ureg_src(arg);
 +  modsrc[1] = ureg_imm1f(t->ureg, -1.0);
 +
 +  emit_insn(t, TGSI_OPCODE_MUL, , 1, modsrc, 2);
>>>
>>> aka NEG arg, arg
>>>
 +   }
 +   return  ureg_src(arg);
 +}
 +
 +/* These instructions have no direct equivalent in TGSI */
 +static void emit_special_inst(struct st_translate *t, struct 
 instruction_desc *desc,
 +  struct ureg_dst *dst, struct ureg_src *args, unsigned argcount)
 +{
 +   struct ureg_dst tmp[1];
 +   struct ureg_src src[3];
 +
 +   if(desc->special == 1) {
 +  tmp[0] = get_temp(t, MAX_NUM_FRAGMENT_REGISTERS_ATI+2); // 
 re-purpose a3
 +  src[0] = ureg_imm1f(t->ureg, 0.5f);
 +  src[1] = args[2];
 +  emit_insn(t, TGSI_OPCODE_SLT, tmp, 1, src, 2);
 +  src[0] = ureg_src(tmp[0]);
 +  src[1] = args[0];
 +  src[2] = args[1];
 +  emit_insn(t, TGSI_OPCODE_LRP, dst, 1, src, 3);
 +   } else if (desc->special == 2) {
 +  tmp[0] = get_temp(t, MAX_NUM_FRAGMENT_REGISTERS_ATI+2); // 
 re-purpose a3
 +  src[0] = args[2];
 +  src[1] = ureg_imm1f(t->ureg, 0.0f);
 +  emit_insn(t, TGSI_OPCODE_SGE, tmp, 1, src, 2);
 +  src[0] = ureg_src(tmp[0]);
 +  src[1] = args[0];
 +  src[2] = args[1];
 +  emit_insn(t, TGSI_OPCODE_LRP, dst, 1, src, 3);
>>>
>>> Isn't this the CMP instruction? Just flip the args.
>>>
>>> http://gallium.readthedocs.org/en/latest/tgsi.html#opcode-CMP
>>>
>>> The other one should be expressible as CMP as well I think.
>>>
 +   } else if (desc->special == 3) {
 +  src[0] = args[0];
 +  src[1] = args[1];
 +  src[2] = ureg_swizzle(args[2],
 +TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z, 
 TGSI_SWIZZLE_Z);
 +  emit_insn(t, TGSI_OPCODE_DP2A, dst, 1, src, 3);
 +   }
 

[Mesa-dev] stencil texturing trouble

2015-12-15 Thread Roland Scheidegger
Hi,

looking at some piglit failures, I was wondering what is actually the
correct thing to do with stencil texturing. What do you put in the
missing channels?
The GL spec seems to say depth texture mode is only applicable to depth
textures, so what it is then? It looks like nvidia is returning the same
value in all channels, but from all possibilities I can think of what
should be returned (honor depth texture mode, treat it like a GL_RED
texture, ...) this seems to be about the least likely to actually be
correct.

Any pointers? Or maybe it just doesn't matter? (Oh and before I forget,
the piglit texwrap test is quite busted wrt stencil textures, so don't
trust it...)

There's actually another problem related to gallium, right now mesa/st
uses (in contrast to depth textures) XYZW swizzle (albeit if it's a
depth/stencil texture in stencil sampling mode, it would use the swizzle
according to depth texture mode). That's quite problematic for several
reasons, not least because in gallium stencil textures essentially use
the "Y" component for sampling stencil, not X. Not to mention that the
other components do not map to anything (as opposed to "ordinary" color
formats).
There's also some interface mismatch there too which looks like it can't
be solved easily - we sample the Y component, however we still need to
use the X component from the border color. Hmm.

Roland
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


  1   2   >