Re: [Mesa-dev] [PATCH v2 22/34] i965/surface_state: Rename brw_update to gen4_update

2016-06-27 Thread Pohjolainen, Topi
On Thu, Jun 23, 2016 at 02:00:21PM -0700, Jason Ekstrand wrote:
> We're about to add generic versions which work across gens and those should
> have the brw name.
> ---
>  src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 22 +++---
>  1 file changed, 11 insertions(+), 11 deletions(-)

Reviewed-by: Topi Pohjolainen 

> 
> diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
> b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> index d2bc3ba..c6bd26c 100644
> --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> @@ -371,11 +371,11 @@ brw_update_buffer_texture_surface(struct gl_context 
> *ctx,
>  }
>  
>  static void
> -brw_update_texture_surface(struct gl_context *ctx,
> -   unsigned unit,
> -   uint32_t *surf_offset,
> -   bool for_gather,
> -   uint32_t plane)
> +gen4_update_texture_surface(struct gl_context *ctx,
> +unsigned unit,
> +uint32_t *surf_offset,
> +bool for_gather,
> +uint32_t plane)
>  {
> struct brw_context *brw = brw_context(ctx);
> struct gl_texture_object *tObj = ctx->Texture.Unit[unit]._Current;
> @@ -721,10 +721,10 @@ brw_emit_null_surface_state(struct brw_context *brw,
>   * usable for further buffers when doing ARB_draw_buffer support.
>   */
>  static uint32_t
> -brw_update_renderbuffer_surface(struct brw_context *brw,
> -struct gl_renderbuffer *rb,
> -bool layered, unsigned unit,
> -uint32_t surf_index)
> +gen4_update_renderbuffer_surface(struct brw_context *brw,
> + struct gl_renderbuffer *rb,
> + bool layered, unsigned unit,
> + uint32_t surf_index)
>  {
> struct gl_context *ctx = >ctx;
> struct intel_renderbuffer *irb = intel_renderbuffer(rb);
> @@ -1500,8 +1500,8 @@ const struct brw_tracked_state brw_wm_image_surfaces = {
>  void
>  gen4_init_vtable_surface_functions(struct brw_context *brw)
>  {
> -   brw->vtbl.update_texture_surface = brw_update_texture_surface;
> -   brw->vtbl.update_renderbuffer_surface = brw_update_renderbuffer_surface;
> +   brw->vtbl.update_texture_surface = gen4_update_texture_surface;
> +   brw->vtbl.update_renderbuffer_surface = gen4_update_renderbuffer_surface;
> brw->vtbl.emit_null_surface_state = brw_emit_null_surface_state;
> brw->vtbl.emit_buffer_surface_state = gen4_emit_buffer_surface_state;
>  }
> -- 
> 2.5.0.400.gff86faf
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] gm107/ir: make use of FADD32I for all immediates

2016-06-27 Thread Ilia Mirkin
On Mon, Jun 27, 2016 at 6:08 PM, Samuel Pitoiset
 wrote:
>
>
> On 06/28/2016 12:06 AM, Ilia Mirkin wrote:
>>
>> On Mon, Jun 27, 2016 at 6:05 PM, Ilia Mirkin  wrote:
>>>
>>> On Mon, Jun 27, 2016 at 6:04 PM, Samuel Pitoiset
>>>  wrote:



 On 06/28/2016 12:02 AM, Ilia Mirkin wrote:
>
>
> This loses you saturation. Does the target account for this?



 No saturate flag for FADD32I.
>>>
>>>
>>> That's not what I asked.
>>
>>
>> Specifically look at this code:
>>
>> bool
>> TargetNVC0::isSatSupported(const Instruction *insn) const
>> {
>>if (insn->op == OP_CVT)
>>   return true;
>>if (!(opInfo[insn->op].dstMods & NV50_IR_MOD_SAT))
>>   return false;
>>
>>if (insn->dType == TYPE_U32)
>>   return (insn->op == OP_ADD) || (insn->op == OP_MAD);
>>
>>// add f32 LIMM cannot saturate
>>if (insn->op == OP_ADD && insn->sType == TYPE_F32) {
>>   if (insn->getSrc(1)->asImm() &&
>>   insn->getSrc(1)->reg.data.u32 & 0xfff)
>>  return false;
>>}
>>
>> Note how it will say that sat is supported for SIMMs with FADD? So the
>> compiler will generate those ops, but then the emitter won't be able
>> to handle it.
>>
>
> Okay, I get it.

By the way, instead of trying to fight the longIMMD, you should just fix it -

/*0008*/   @P0 FADD R0, R0, 1.NEG;  /*
0x3858203f8000 */

which corresponds nicely to

  emitNEG(0x2d, insn->src(1));

The issue is that emitIMMD does

   if (len == 19) {
...
  emitField( 56,   1, (val & 0x8) >> 19);
  emitField(pos, len, (val & 0x7));

So the problem is that the 56 isn't as fixed as the emission code had
hoped. I suspect that adjusting it will fix all these silly cases.

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC] Coding style scripts (Was Re: [PATCH 1/2] gallium: replace [0-9]*.f with [0-9]*.0f)

2016-06-27 Thread Michel Dänzer
On 27.06.2016 19:52, Jose Fonseca wrote:
>
> BTW, I've been using http://editorconfig.org/ on several projects. It's
> widely supported by many editors including Emacs.
> 
> There's even Python based tools to check editorconfig (
> https://github.com/editorconfig/editorconfig/wiki/FAQ#my-files-are-not-automatically-reformatted-the-editorconfig-plugin-is-not-working
> ).  We already require Python, downloading a Python package is trivial
> to do nowadays with pip, so we could easily standardize on .editorconfig
> files everywhere.

FWIW, +1 for adding .editorconfig files corresponding to the existing
.dir-locals.el files as a first step, so users of other editors can also
get the benefit of their code getting formatted correctly by default.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] Make single-buffered GLES representation internally consistent

2016-06-27 Thread Ilia Mirkin
On Mon, Jun 27, 2016 at 6:30 PM, Gurchetan Singh
 wrote:
> Hi Ilia,
>
> The changes for get.c where prompted by the es3fIntegerStateQueryTests (see
> modules/gles3/functional/es3fIntegerStateQueryTests.cpp in the dEQP tree).
> Specifically, these few lines:
>
>>> const GLint validInitialValues[] = {GL_BACK, GL_NONE};
>>> m_verifier->verifyIntegerAnyOf(m_testCtx, GL_READ_BUFFER,
>>> validInitialValues, DE_LENGTH_OF_ARRAY(validInitialValues));
>>> expectError(GL_NO_ERROR);
>
> We initially set ColorReadBuffer to GL_FRONT in
> _mesa_initialize_window_framebuffer for single-buffered configs.

So ... could we initialize it to GL_BACK for GLES and avoid this pain?
Unfortunately I have no idea what the implications of that would be.

>
> We could also make sure the context is single-buffered in get.c to further
> avoid bugs.  Let me know if that works for you and I'll send a modified
> patch.
>
> I do agree it is a bit hacky ... I'd definitely be interested in alternative
> solutions.

If you're flipping the value in the getter, you might as well set that
to be the value from the very beginning. However I don't know what the
effects of that are.

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] gm107/ir: make sure that flagsDef is set when emitting setcond

2016-06-27 Thread Ilia Mirkin
Reviewed-by: Ilia Mirkin 

Moderately sure this should work...

On Mon, Jun 27, 2016 at 6:59 PM, Samuel Pitoiset
 wrote:
> Rely on the existence of a second destination when emitting a setcond
> flag is dangerous, because this doesn't mean that the flag has been
> correctly set. Instead rely on flagsDef like what emitX() does
> for flagsSrc.
>
> Signed-off-by: Samuel Pitoiset 
> Cc: 
> ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> index 40afbce..2c5e8f6 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> @@ -420,7 +420,7 @@ CodeEmitterGM107::emitSAT(int pos)
>  void
>  CodeEmitterGM107::emitCC(int pos)
>  {
> -   emitField(pos, 1, insn->defExists(1));
> +   emitField(pos, 1, insn->flagsDef >= 0);
>  }
>
>  void
> --
> 2.8.3
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa/st: Include nir.h for nir_shader symbol.

2016-06-27 Thread Matt Turner
On Mon, Jun 27, 2016 at 6:45 PM, Vinson Lee  wrote:
> Fix this build error with GCC 4.4.
>
>   CC state_tracker/st_nir_lower_builtin.lo
> In file included from state_tracker/st_nir_lower_builtin.c:61:
> state_tracker/st_nir.h:34: error: redefinition of typedef ‘nir_shader’
> ../../src/compiler/nir/nir.h:1830: note: previous declaration of ‘nir_shader’ 
> was here

This error seems to imply that nir.h is already being included somehow.

Does just removing the typedef solve the problem? Can we figure out
how nir.h is already being included and remove that?
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/7] glsl: pass symbols to find_matching_signature() rather than shader

2016-06-27 Thread Timothy Arceri
On Tue, 2016-06-28 at 11:52 +1000, Timothy Arceri wrote:
> This will allow us to later split gl_shader into two structs.
> ---
>  src/compiler/glsl/link_functions.cpp | 47 +-
> --
>  1 file changed, 22 insertions(+), 25 deletions(-)
> 
> diff --git a/src/compiler/glsl/link_functions.cpp
> b/src/compiler/glsl/link_functions.cpp
> index 4e10287..c9dacc1 100644
> --- a/src/compiler/glsl/link_functions.cpp
> +++ b/src/compiler/glsl/link_functions.cpp
> @@ -31,8 +31,7 @@
>  
>  static ir_function_signature *
>  find_matching_signature(const char *name, const exec_list
> *actual_parameters,
> - gl_shader **shader_list, unsigned
> num_shaders,
> - bool use_builtin);
> +glsl_symbol_table *symbols, bool
> use_builtin);
>  
>  namespace {
>  
> @@ -78,8 +77,8 @@ public:
> * final linked shader.  If it does, use it as the target of
> the call.
> */
>    ir_function_signature *sig =
> -  find_matching_signature(name, >parameters, ,
> 1,
> -  ir->use_builtin);
> + find_matching_signature(name, >parameters, linked-
> >symbols,
> + ir->use_builtin);
>    if (sig != NULL) {
>    ir->callee = sig;
>    return visit_continue;
> @@ -88,8 +87,14 @@ public:
>    /* Try to find the signature in one of the other shaders that
> is being
> * linked.  If it's not found there, return an error.
> */
> -  sig = find_matching_signature(name, >actual_parameters,
> shader_list,
> - num_shaders, ir->use_builtin);
> +  for (unsigned i = 0; i < num_shaders; i++) {
> + sig = find_matching_signature(name, >actual_parameters,
> +   shader_list[i]->symbols,
> +   ir->use_builtin);
> + if (sig)
> +break;
> +  }
> +
>    if (sig == NULL) {
>    /* FINISHME: Log the full signature of unresolved function.
>     */
> @@ -307,30 +312,22 @@ private:
>   */
>  ir_function_signature *
>  find_matching_signature(const char *name, const exec_list
> *actual_parameters,
> - gl_shader **shader_list, unsigned
> num_shaders,
> - bool use_builtin)
> +glsl_symbol_table *symbols, bool
> use_builtin)
>  {
> -   for (unsigned i = 0; i < num_shaders; i++) {
> -  ir_function *const f = shader_list[i]->symbols-
> >get_function(name);
> -
> -  if (f == NULL)
> -  continue;
> +   ir_function *const f = symbols->get_function(name);
>  
> +   if (f) {
>    ir_function_signature *sig =
>   f->matching_signature(NULL, actual_parameters,
> use_builtin);
>  
> -  if ((sig == NULL) ||
> -  (!sig->is_defined && !sig->is_intrinsic))
> -  continue;
> -
> -  /* If this function expects to bind to a built-in function and
> the
> -   * signature that we found isn't a built-in, keep
> looking.  Also keep
> -   * looking if we expect a non-built-in but found a built-in.
> -   */
> -  if (use_builtin != sig->is_builtin())
> - continue;
> -
> -  return sig;
> +  if (sig && (sig->is_defined || sig->is_intrinsic)) {
> + /* If this function expects to bind to a built-in function
> and the
> +  * signature that we found isn't a built-in, keep
> looking.  Also keep
> +  * looking if we expect a non-built-in but found a built-
> in.
> +  */
> + if (use_builtin != sig->is_builtin())

This should be:
   if (use_builtin == sig->is_builtin())

Looks like I squashed the fix into the next patch rather than this one.
I've fixed this locally.

> +return sig;
> +  }
> }
>  
> return NULL;
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 7/7] glsl/main: remove unused params and make function static

2016-06-27 Thread Timothy Arceri
---
 src/compiler/glsl/builtin_functions.cpp  | 2 +-
 src/compiler/glsl/standalone_scaffolding.cpp | 4 +---
 src/compiler/glsl/standalone_scaffolding.h   | 2 +-
 src/mesa/drivers/common/meta.c   | 2 +-
 src/mesa/main/ff_fragment_shader.cpp | 2 +-
 src/mesa/main/shaderapi.c| 2 +-
 src/mesa/main/shaderobj.c| 8 
 src/mesa/main/shaderobj.h| 6 +-
 8 files changed, 11 insertions(+), 17 deletions(-)

diff --git a/src/compiler/glsl/builtin_functions.cpp 
b/src/compiler/glsl/builtin_functions.cpp
index ae4e8f2..941ea12 100644
--- a/src/compiler/glsl/builtin_functions.cpp
+++ b/src/compiler/glsl/builtin_functions.cpp
@@ -956,7 +956,7 @@ builtin_builder::create_shader()
 * GLSL utility code that could be linked against any stage, so just
 * arbitrarily pick GL_VERTEX_SHADER.
 */
-   shader = _mesa_new_shader(NULL, 0, MESA_SHADER_VERTEX);
+   shader = _mesa_new_shader(0, MESA_SHADER_VERTEX);
shader->symbols = new(mem_ctx) glsl_symbol_table;
 
gl_ModelViewProjectionMatrix =
diff --git a/src/compiler/glsl/standalone_scaffolding.cpp 
b/src/compiler/glsl/standalone_scaffolding.cpp
index b5dc523..53729af 100644
--- a/src/compiler/glsl/standalone_scaffolding.cpp
+++ b/src/compiler/glsl/standalone_scaffolding.cpp
@@ -68,12 +68,10 @@ _mesa_shader_debug(struct gl_context *, GLenum, GLuint *,
 }
 
 struct gl_shader *
-_mesa_new_shader(struct gl_context *ctx, GLuint name, gl_shader_stage stage)
+_mesa_new_shader(GLuint name, gl_shader_stage stage)
 {
struct gl_shader *shader;
 
-   (void) ctx;
-
assert(stage == MESA_SHADER_FRAGMENT || stage == MESA_SHADER_VERTEX);
shader = rzalloc(NULL, struct gl_shader);
if (shader) {
diff --git a/src/compiler/glsl/standalone_scaffolding.h 
b/src/compiler/glsl/standalone_scaffolding.h
index cc9..a19eae2 100644
--- a/src/compiler/glsl/standalone_scaffolding.h
+++ b/src/compiler/glsl/standalone_scaffolding.h
@@ -42,7 +42,7 @@ _mesa_reference_shader(struct gl_context *ctx, struct 
gl_shader **ptr,
struct gl_shader *sh);
 
 extern "C" struct gl_shader *
-_mesa_new_shader(struct gl_context *ctx, GLuint name, gl_shader_stage stage);
+_mesa_new_shader(GLuint name, gl_shader_stage stage);
 
 extern "C" struct gl_linked_shader *
 _mesa_new_linked_shader(gl_shader_stage stage);
diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
index df57c87..fdc4748 100644
--- a/src/mesa/drivers/common/meta.c
+++ b/src/mesa/drivers/common/meta.c
@@ -129,7 +129,7 @@ meta_compile_shader_with_debug(struct gl_context *ctx, 
gl_shader_stage stage,
const GLuint name = ~0;
struct gl_shader *sh;
 
-   sh = _mesa_new_shader(ctx, name, stage);
+   sh = _mesa_new_shader(name, stage);
sh->Source = strdup(source);
sh->CompileStatus = false;
_mesa_compile_shader(ctx, sh);
diff --git a/src/mesa/main/ff_fragment_shader.cpp 
b/src/mesa/main/ff_fragment_shader.cpp
index f90d31a..83881e9 100644
--- a/src/mesa/main/ff_fragment_shader.cpp
+++ b/src/mesa/main/ff_fragment_shader.cpp
@@ -1203,7 +1203,7 @@ create_new_program(struct gl_context *ctx, struct 
state_key *key)
_mesa_glsl_parse_state *state;
 
p.mem_ctx = ralloc_context(NULL);
-   p.shader = _mesa_new_shader(ctx, 0, MESA_SHADER_FRAGMENT);
+   p.shader = _mesa_new_shader(0, MESA_SHADER_FRAGMENT);
p.shader->ir = new(p.shader) exec_list;
state = new(p.shader) _mesa_glsl_parse_state(ctx, MESA_SHADER_FRAGMENT,
p.shader);
diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
index 962b42e..fdfe1e7 100644
--- a/src/mesa/main/shaderapi.c
+++ b/src/mesa/main/shaderapi.c
@@ -334,7 +334,7 @@ create_shader(struct gl_context *ctx, GLenum type)
 
_mesa_HashLockMutex(ctx->Shared->ShaderObjects);
name = _mesa_HashFindFreeKeyBlock(ctx->Shared->ShaderObjects, 1);
-   sh = _mesa_new_shader(ctx, name, _mesa_shader_enum_to_shader_stage(type));
+   sh = _mesa_new_shader(name, _mesa_shader_enum_to_shader_stage(type));
sh->Type = type;
_mesa_HashInsertLocked(ctx->Shared->ShaderObjects, name, sh);
_mesa_HashUnlockMutex(ctx->Shared->ShaderObjects);
diff --git a/src/mesa/main/shaderobj.c b/src/mesa/main/shaderobj.c
index 93fdc66..131d682 100644
--- a/src/mesa/main/shaderobj.c
+++ b/src/mesa/main/shaderobj.c
@@ -88,8 +88,8 @@ _mesa_reference_shader(struct gl_context *ctx, struct 
gl_shader **ptr,
}
 }
 
-void
-_mesa_init_shader(struct gl_context *ctx, struct gl_shader *shader)
+static void
+_mesa_init_shader(struct gl_shader *shader)
 {
shader->RefCount = 1;
shader->Geom.VerticesOut = -1;
@@ -101,14 +101,14 @@ _mesa_init_shader(struct gl_context *ctx, struct 
gl_shader *shader)
  * Allocate a new gl_shader object, initialize it.
  */
 struct gl_shader *
-_mesa_new_shader(struct gl_context *ctx, GLuint name, gl_shader_stage stage)
+_mesa_new_shader(GLuint name, gl_shader_stage stage)
 {
struct 

[Mesa-dev] [PATCH 4/7] glsl: pass symbols to find_matching_signature() rather than shader

2016-06-27 Thread Timothy Arceri
This will allow us to later split gl_shader into two structs.
---
 src/compiler/glsl/link_functions.cpp | 47 +---
 1 file changed, 22 insertions(+), 25 deletions(-)

diff --git a/src/compiler/glsl/link_functions.cpp 
b/src/compiler/glsl/link_functions.cpp
index 4e10287..c9dacc1 100644
--- a/src/compiler/glsl/link_functions.cpp
+++ b/src/compiler/glsl/link_functions.cpp
@@ -31,8 +31,7 @@
 
 static ir_function_signature *
 find_matching_signature(const char *name, const exec_list *actual_parameters,
-   gl_shader **shader_list, unsigned num_shaders,
-   bool use_builtin);
+glsl_symbol_table *symbols, bool use_builtin);
 
 namespace {
 
@@ -78,8 +77,8 @@ public:
* final linked shader.  If it does, use it as the target of the call.
*/
   ir_function_signature *sig =
-find_matching_signature(name, >parameters, , 1,
-ir->use_builtin);
+ find_matching_signature(name, >parameters, linked->symbols,
+ ir->use_builtin);
   if (sig != NULL) {
 ir->callee = sig;
 return visit_continue;
@@ -88,8 +87,14 @@ public:
   /* Try to find the signature in one of the other shaders that is being
* linked.  If it's not found there, return an error.
*/
-  sig = find_matching_signature(name, >actual_parameters, shader_list,
-   num_shaders, ir->use_builtin);
+  for (unsigned i = 0; i < num_shaders; i++) {
+ sig = find_matching_signature(name, >actual_parameters,
+   shader_list[i]->symbols,
+   ir->use_builtin);
+ if (sig)
+break;
+  }
+
   if (sig == NULL) {
 /* FINISHME: Log the full signature of unresolved function.
  */
@@ -307,30 +312,22 @@ private:
  */
 ir_function_signature *
 find_matching_signature(const char *name, const exec_list *actual_parameters,
-   gl_shader **shader_list, unsigned num_shaders,
-   bool use_builtin)
+glsl_symbol_table *symbols, bool use_builtin)
 {
-   for (unsigned i = 0; i < num_shaders; i++) {
-  ir_function *const f = shader_list[i]->symbols->get_function(name);
-
-  if (f == NULL)
-continue;
+   ir_function *const f = symbols->get_function(name);
 
+   if (f) {
   ir_function_signature *sig =
  f->matching_signature(NULL, actual_parameters, use_builtin);
 
-  if ((sig == NULL) ||
-  (!sig->is_defined && !sig->is_intrinsic))
-continue;
-
-  /* If this function expects to bind to a built-in function and the
-   * signature that we found isn't a built-in, keep looking.  Also keep
-   * looking if we expect a non-built-in but found a built-in.
-   */
-  if (use_builtin != sig->is_builtin())
-   continue;
-
-  return sig;
+  if (sig && (sig->is_defined || sig->is_intrinsic)) {
+ /* If this function expects to bind to a built-in function and the
+  * signature that we found isn't a built-in, keep looking.  Also keep
+  * looking if we expect a non-built-in but found a built-in.
+  */
+ if (use_builtin != sig->is_builtin())
+return sig;
+  }
}
 
return NULL;
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 5/7] glsl/mesa: split gl_shader in two

2016-06-27 Thread Timothy Arceri
There are two distinctly different uses of this struct. The first
is to store GL shader objects. The second is to store information
about a shader stage thats been linked.

The two uses actually share few fields and there is clearly confusion
about their use. For example the linked shaders map one to one with
a program so can simply be destroyed along with the program. However
previously we were calling reference counting on the linked shaders.

We were also creating linked shaders with a name even though it
is always 0 and called the driver version of the _mesa_new_shader()
function unnecessarily for GL shader objects.
---
 src/compiler/glsl/glsl_to_nir.cpp  |   2 +-
 src/compiler/glsl/ir_optimization.h|  23 ++--
 src/compiler/glsl/link_atomics.cpp |   2 +-
 src/compiler/glsl/link_functions.cpp   |   8 +-
 src/compiler/glsl/link_interface_blocks.cpp|  10 +-
 src/compiler/glsl/link_uniform_blocks.cpp  |   2 +-
 src/compiler/glsl/link_uniform_initializers.cpp|   6 +-
 src/compiler/glsl/link_uniforms.cpp|   8 +-
 src/compiler/glsl/link_varyings.cpp|  17 ++-
 src/compiler/glsl/link_varyings.h  |  17 ++-
 src/compiler/glsl/linker.cpp   | 106 +++
 src/compiler/glsl/linker.h |  10 +-
 src/compiler/glsl/lower_distance.cpp   |   3 +-
 src/compiler/glsl/lower_named_interface_blocks.cpp |   2 +-
 src/compiler/glsl/lower_packed_varyings.cpp|   6 +-
 src/compiler/glsl/lower_shared_reference.cpp   |   6 +-
 src/compiler/glsl/lower_tess_level.cpp |   2 +-
 src/compiler/glsl/lower_ubo_reference.cpp  |   6 +-
 src/compiler/glsl/lower_vector_derefs.cpp  |   2 +-
 src/compiler/glsl/lower_vertex_id.cpp  |   2 +-
 src/compiler/glsl/opt_dead_builtin_varyings.cpp|  11 +-
 src/compiler/glsl/standalone.cpp   |   4 +-
 src/compiler/glsl/standalone_scaffolding.cpp   |  20 +++
 src/compiler/glsl/standalone_scaffolding.h |   7 +
 src/compiler/glsl/test_optpass.cpp |   2 +-
 src/mesa/drivers/common/meta.c |   2 +-
 src/mesa/drivers/dri/i965/brw_context.c|   5 +-
 src/mesa/drivers/dri/i965/brw_context.h|   8 +-
 src/mesa/drivers/dri/i965/brw_gs.c |   5 +-
 src/mesa/drivers/dri/i965/brw_link.cpp |  24 ++--
 src/mesa/drivers/dri/i965/brw_program.c|   2 +-
 src/mesa/drivers/dri/i965/brw_program.h|   2 +-
 src/mesa/drivers/dri/i965/brw_shader.cpp   |   4 +-
 src/mesa/drivers/dri/i965/brw_shader.h |   3 +-
 src/mesa/drivers/dri/i965/brw_tcs.c|   2 +-
 src/mesa/drivers/dri/i965/brw_tes.c|   3 +-
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c   |   6 +-
 src/mesa/main/api_validate.c   |   6 +-
 src/mesa/main/dd.h |   3 +-
 src/mesa/main/ff_fragment_shader.cpp   |   2 +-
 src/mesa/main/mtypes.h | 143 ++---
 src/mesa/main/shader_query.cpp |   2 +-
 src/mesa/main/shaderapi.c  |  19 +--
 src/mesa/main/shaderobj.c  |  32 -
 src/mesa/main/shaderobj.h  |   7 +
 src/mesa/main/uniform_query.cpp|   4 +-
 src/mesa/main/uniforms.c   |   2 +-
 src/mesa/program/ir_to_mesa.cpp|   4 +-
 src/mesa/program/ir_to_mesa.h  |   2 +-
 src/mesa/program/prog_print.c  |  14 +-
 src/mesa/program/prog_print.h  |   2 +-
 src/mesa/state_tracker/st_atom_constbuf.c  |   2 +-
 src/mesa/state_tracker/st_atom_image.c |   2 +-
 src/mesa/state_tracker/st_atom_storagebuf.c|   2 +-
 src/mesa/state_tracker/st_glsl_to_nir.cpp  |   2 +-
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp |   6 +-
 src/mesa/state_tracker/st_nir.h|   2 +-
 src/mesa/state_tracker/st_program.c|   7 -
 58 files changed, 388 insertions(+), 227 deletions(-)

diff --git a/src/compiler/glsl/glsl_to_nir.cpp 
b/src/compiler/glsl/glsl_to_nir.cpp
index a22fd5b..0ad3be1 100644
--- a/src/compiler/glsl/glsl_to_nir.cpp
+++ b/src/compiler/glsl/glsl_to_nir.cpp
@@ -134,7 +134,7 @@ glsl_to_nir(const struct gl_shader_program *shader_prog,
 gl_shader_stage stage,
 const nir_shader_compiler_options *options)
 {
-   struct gl_shader *sh = shader_prog->_LinkedShaders[stage];
+   struct gl_linked_shader *sh = shader_prog->_LinkedShaders[stage];
 
nir_shader *shader = nir_shader_create(NULL, stage, options);
 
diff --git a/src/compiler/glsl/ir_optimization.h 
b/src/compiler/glsl/ir_optimization.h
index ba14e34..ba79076 100644
--- a/src/compiler/glsl/ir_optimization.h

[Mesa-dev] [PATCH 6/7] glsl: simplify link_uniform_blocks()

2016-06-27 Thread Timothy Arceri
There is only ever one shader so simplify the input params.
---
 src/compiler/glsl/link_uniform_blocks.cpp | 7 ++-
 src/compiler/glsl/linker.cpp  | 5 ++---
 src/compiler/glsl/linker.h| 3 +--
 3 files changed, 5 insertions(+), 10 deletions(-)

diff --git a/src/compiler/glsl/link_uniform_blocks.cpp 
b/src/compiler/glsl/link_uniform_blocks.cpp
index 4b51d40..1ccd0df 100644
--- a/src/compiler/glsl/link_uniform_blocks.cpp
+++ b/src/compiler/glsl/link_uniform_blocks.cpp
@@ -391,8 +391,7 @@ void
 link_uniform_blocks(void *mem_ctx,
 struct gl_context *ctx,
 struct gl_shader_program *prog,
-struct gl_linked_shader **shader_list,
-unsigned num_shaders,
+struct gl_linked_shader *shader,
 struct gl_uniform_block **ubo_blocks,
 unsigned *num_ubo_blocks,
 struct gl_uniform_block **ssbo_blocks,
@@ -415,9 +414,7 @@ link_uniform_blocks(void *mem_ctx,
/* Determine which uniform blocks are active.
 */
link_uniform_block_active_visitor v(mem_ctx, block_hash, prog);
-   for (unsigned i = 0; i < num_shaders; i++) {
-  visit_list_elements(, shader_list[i]->ir);
-   }
+   visit_list_elements(, shader->ir);
 
/* Count the number of active uniform blocks.  Count the total number of
 * active slots in those uniform blocks.
diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index 926520e..9826368 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -2312,9 +2312,8 @@ link_intrastage_shaders(void *mem_ctx,
v.fixup_unnamed_interface_types();
 
/* Link up uniform blocks defined within this stage. */
-   link_uniform_blocks(mem_ctx, ctx, prog, , 1,
-   _blocks, _ubo_blocks, _blocks,
-   _ssbo_blocks);
+   link_uniform_blocks(mem_ctx, ctx, prog, linked, _blocks,
+   _ubo_blocks, _blocks, _ssbo_blocks);
 
if (!prog->LinkStatus) {
   _mesa_delete_linked_shader(ctx, linked);
diff --git a/src/compiler/glsl/linker.h b/src/compiler/glsl/linker.h
index 19b14d5..0126bcb 100644
--- a/src/compiler/glsl/linker.h
+++ b/src/compiler/glsl/linker.h
@@ -57,8 +57,7 @@ extern void
 link_uniform_blocks(void *mem_ctx,
 struct gl_context *ctx,
 struct gl_shader_program *prog,
-struct gl_linked_shader **shader_list,
-unsigned num_shaders,
+struct gl_linked_shader *shader,
 struct gl_uniform_block **ubo_blocks,
 unsigned *num_ubo_blocks,
 struct gl_uniform_block **ssbo_blocks,
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/7] glsl: pass symbols rather than shader to _mesa_get_main_function_signature()

2016-06-27 Thread Timothy Arceri
This will allow us to split gl_shader into two different structs, one for
shader objects and one for linked shaders.
---
 src/compiler/glsl/builtin_functions.cpp | 4 ++--
 src/compiler/glsl/builtin_variables.cpp | 2 +-
 src/compiler/glsl/ir.h  | 2 +-
 src/compiler/glsl/linker.cpp| 4 ++--
 src/compiler/glsl/lower_vertex_id.cpp   | 2 +-
 5 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/src/compiler/glsl/builtin_functions.cpp 
b/src/compiler/glsl/builtin_functions.cpp
index 018e4064..ae4e8f2 100644
--- a/src/compiler/glsl/builtin_functions.cpp
+++ b/src/compiler/glsl/builtin_functions.cpp
@@ -5658,9 +5658,9 @@ _mesa_glsl_get_builtin_function_shader()
  * Get the function signature for main from a shader
  */
 ir_function_signature *
-_mesa_get_main_function_signature(gl_shader *sh)
+_mesa_get_main_function_signature(glsl_symbol_table *symbols)
 {
-   ir_function *const f = sh->symbols->get_function("main");
+   ir_function *const f = symbols->get_function("main");
if (f != NULL) {
   exec_list void_parameters;
 
diff --git a/src/compiler/glsl/builtin_variables.cpp 
b/src/compiler/glsl/builtin_variables.cpp
index e1a95e3..a311047 100644
--- a/src/compiler/glsl/builtin_variables.cpp
+++ b/src/compiler/glsl/builtin_variables.cpp
@@ -1456,7 +1456,7 @@ _mesa_glsl_initialize_derived_variables(struct gl_context 
*ctx,
if (shader->Stage == MESA_SHADER_COMPUTE &&
ctx->Const.LowerCsDerivedVariables) {
   ir_function_signature *const main_sig =
- _mesa_get_main_function_signature(shader);
+ _mesa_get_main_function_signature(shader->symbols);
 
   if (main_sig != NULL)
  initialize_cs_derived_variables(shader, main_sig);
diff --git a/src/compiler/glsl/ir.h b/src/compiler/glsl/ir.h
index cd17f69..cd56a2a 100644
--- a/src/compiler/glsl/ir.h
+++ b/src/compiler/glsl/ir.h
@@ -2593,7 +2593,7 @@ extern gl_shader *
 _mesa_glsl_get_builtin_function_shader(void);
 
 extern ir_function_signature *
-_mesa_get_main_function_signature(gl_shader *sh);
+_mesa_get_main_function_signature(glsl_symbol_table *symbols);
 
 extern void
 _mesa_glsl_release_functions(void);
diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index 90d7831..411d2fe 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -2216,7 +2216,7 @@ link_intrastage_shaders(void *mem_ctx,
 */
gl_shader *main = NULL;
for (unsigned i = 0; i < num_shaders; i++) {
-  if (_mesa_get_main_function_signature(shader_list[i]) != NULL) {
+  if (!_mesa_get_main_function_signature(shader_list[i]->symbols)) {
 main = shader_list[i];
 break;
   }
@@ -2246,7 +2246,7 @@ link_intrastage_shaders(void *mem_ctx,
 * copy of the original shader that contained the main function).
 */
ir_function_signature *const main_sig =
-  _mesa_get_main_function_signature(linked);
+  _mesa_get_main_function_signature(linked->symbols);
 
/* Move any instructions other than variable declarations or function
 * declarations into main.
diff --git a/src/compiler/glsl/lower_vertex_id.cpp 
b/src/compiler/glsl/lower_vertex_id.cpp
index 6f46945..ee69d94 100644
--- a/src/compiler/glsl/lower_vertex_id.cpp
+++ b/src/compiler/glsl/lower_vertex_id.cpp
@@ -130,7 +130,7 @@ lower_vertex_id(gl_shader *shader)
   return false;
 
ir_function_signature *const main_sig =
-  _mesa_get_main_function_signature(shader);
+  _mesa_get_main_function_signature(shader->symbols);
if (main_sig == NULL) {
   assert(main_sig != NULL);
   return false;
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] Split gl_shader in two and clean-ups

2016-06-27 Thread Timothy Arceri
There are two distinctly different uses of this struct. The first
is to store GL shader objects. The second is to store information
about a shader stage thats been linked.

The only place the new structs overlap is the shader layout fields and
I intend to split that out into a third struct once this series lands.

Having two well defined structs helps code readability and allows the removal
of some unreachable code paths that were the result of confusion between
the two uses.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/7] mesa: don't use drivers NewShader function when creating shader objects

2016-06-27 Thread Timothy Arceri
The drivers function only needs to be used when creating a struct for
linked shaders.
---
 src/mesa/main/shaderapi.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
index 07250cd..142e750 100644
--- a/src/mesa/main/shaderapi.c
+++ b/src/mesa/main/shaderapi.c
@@ -334,8 +334,7 @@ create_shader(struct gl_context *ctx, GLenum type)
 
_mesa_HashLockMutex(ctx->Shared->ShaderObjects);
name = _mesa_HashFindFreeKeyBlock(ctx->Shared->ShaderObjects, 1);
-   sh = ctx->Driver.NewShader(ctx, name,
-  _mesa_shader_enum_to_shader_stage(type));
+   sh = _mesa_new_shader(ctx, name, _mesa_shader_enum_to_shader_stage(type));
sh->Type = type;
_mesa_HashInsertLocked(ctx->Shared->ShaderObjects, name, sh);
_mesa_HashUnlockMutex(ctx->Shared->ShaderObjects);
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/7] glsl: make cross_validate_globals() more generic

2016-06-27 Thread Timothy Arceri
Rather than passing in gl_shader we now pass in the IR. This will
allow us to later split gl_shader into two structs. One for use
as a linked per stage shader struct and one for use as a GL shader
object.
---
 src/compiler/glsl/linker.cpp | 413 ++-
 1 file changed, 207 insertions(+), 206 deletions(-)

diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index 3bcb907..90d7831 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -974,236 +974,225 @@ validate_intrastage_arrays(struct gl_shader_program 
*prog,
  */
 void
 cross_validate_globals(struct gl_shader_program *prog,
-  struct gl_shader **shader_list,
-  unsigned num_shaders,
-  bool uniforms_only)
+   struct exec_list *ir, glsl_symbol_table *variables,
+   bool uniforms_only)
 {
-   /* Examine all of the uniforms in all of the shaders and cross validate
-* them.
-*/
-   glsl_symbol_table variables;
-   for (unsigned i = 0; i < num_shaders; i++) {
-  if (shader_list[i] == NULL)
-continue;
-
-  foreach_in_list(ir_instruction, node, shader_list[i]->ir) {
-ir_variable *const var = node->as_variable();
+   foreach_in_list(ir_instruction, node, ir) {
+  ir_variable *const var = node->as_variable();
 
-if (var == NULL)
-   continue;
+  if (var == NULL)
+ continue;
 
-if (uniforms_only && (var->data.mode != ir_var_uniform && 
var->data.mode != ir_var_shader_storage))
-   continue;
+  if (uniforms_only && (var->data.mode != ir_var_uniform && var->data.mode 
!= ir_var_shader_storage))
+ continue;
 
- /* don't cross validate subroutine uniforms */
- if (var->type->contains_subroutine())
-continue;
+  /* don't cross validate subroutine uniforms */
+  if (var->type->contains_subroutine())
+ continue;
 
-/* Don't cross validate temporaries that are at global scope.  These
- * will eventually get pulled into the shaders 'main'.
- */
-if (var->data.mode == ir_var_temporary)
-   continue;
+  /* Don't cross validate temporaries that are at global scope.  These
+   * will eventually get pulled into the shaders 'main'.
+   */
+  if (var->data.mode == ir_var_temporary)
+ continue;
 
-/* If a global with this name has already been seen, verify that the
- * new instance has the same type.  In addition, if the globals have
- * initializers, the values of the initializers must be the same.
- */
-ir_variable *const existing = variables.get_variable(var->name);
-if (existing != NULL) {
-/* Check if types match. Interface blocks have some special
- * rules so we handle those elsewhere.
- */
-   if (var->type != existing->type &&
-!var->is_interface_instance()) {
-  if (!validate_intrastage_arrays(prog, var, existing)) {
-  if (var->type->is_record() && existing->type->is_record()
-  && existing->type->record_compare(var->type)) {
- existing->type = var->type;
-  } else {
- /* If it is an unsized array in a Shader Storage Block,
-  * two different shaders can access to different elements.
-  * Because of that, they might be converted to different
-  * sized arrays, then check that they are compatible but
-  * ignore the array size.
-  */
- if (!(var->data.mode == ir_var_shader_storage &&
-   var->data.from_ssbo_unsized_array &&
-   existing->data.mode == ir_var_shader_storage &&
-   existing->data.from_ssbo_unsized_array &&
-   var->type->gl_type == existing->type->gl_type)) {
-linker_error(prog, "%s `%s' declared as type "
-"`%s' and type `%s'\n",
-mode_string(var),
-var->name, var->type->name,
-existing->type->name);
-return;
- }
+  /* If a global with this name has already been seen, verify that the
+   * new instance has the same type.  In addition, if the globals have
+   * initializers, the values of the initializers must be the same.
+   */
+  ir_variable *const existing = variables->get_variable(var->name);
+  if (existing != NULL) {
+ /* Check if types match. Interface blocks have some special
+  * rules so we handle those elsewhere.
+  */
+ if (var->type != existing->type &&
+ 

[Mesa-dev] [PATCH] mesa/st: Include nir.h for nir_shader symbol.

2016-06-27 Thread Vinson Lee
Fix this build error with GCC 4.4.

  CC state_tracker/st_nir_lower_builtin.lo
In file included from state_tracker/st_nir_lower_builtin.c:61:
state_tracker/st_nir.h:34: error: redefinition of typedef ‘nir_shader’
../../src/compiler/nir/nir.h:1830: note: previous declaration of ‘nir_shader’ 
was here

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96235
Signed-off-by: Vinson Lee 
---
 src/mesa/state_tracker/st_nir.h |3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/src/mesa/state_tracker/st_nir.h b/src/mesa/state_tracker/st_nir.h
index 49ba573..d3b4704 100644
--- a/src/mesa/state_tracker/st_nir.h
+++ b/src/mesa/state_tracker/st_nir.h
@@ -26,13 +26,12 @@
 
 #include "st_context.h"
 #include "compiler/shader_enums.h"
+#include "compiler/nir/nir.h"
 
 #ifdef __cplusplus
 extern "C" {
 #endif
 
-typedef struct nir_shader nir_shader;
-
 void st_nir_lower_builtin(nir_shader *shader);
 nir_shader * st_glsl_to_nir(struct st_context *st, struct gl_program *prog,
 struct gl_shader_program *shader_program,
-- 
1.7.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/3] st/omx: add support for nouveau / interlaced

2016-06-27 Thread Liu, Leo
Hi Julien and Christian,


I got a patch attached to fix the "fillout" problem, and please review.


But we still need to fix transcoding issue with interlaced as true. Our 
transcode support tunneling, basic the decode buffer will be used directly for 
encode.


Thanks,

Leo




From: Julien Isorce 
Sent: June 27, 2016 4:54:07 PM
To: Liu, Leo
Cc: ML mesa-dev; Gurkirpal Singh; Koenig, Christian
Subject: Re: [Mesa-dev] [PATCH 2/3] st/omx: add support for nouveau / interlaced

Hi Leo,

Sorry for the inconvenience, could you let me know how to reproduce the problem 
?
I have been playing with some gst pipelines and they all work but I can only 
test with nouveau driver.

Cheers
Julien


On 27 June 2016 at 21:35, Leo Liu > 
wrote:
This patch break omx decode to file, it got seg fault. Will take look further.

Regards,
Leo



On 06/27/2016 04:16 AM, Julien Isorce wrote:
Signed-off-by: Julien Isorce >
---
  src/gallium/state_trackers/omx/vid_dec.c | 51 
  1 file changed, 26 insertions(+), 25 deletions(-)

diff --git a/src/gallium/state_trackers/omx/vid_dec.c 
b/src/gallium/state_trackers/omx/vid_dec.c
index 564ca2f..85ffb88 100644
--- a/src/gallium/state_trackers/omx/vid_dec.c
+++ b/src/gallium/state_trackers/omx/vid_dec.c
@@ -48,6 +48,7 @@
  #include "pipe/p_video_codec.h"
  #include "util/u_memory.h"
  #include "util/u_surface.h"
+#include "vl/vl_video_buffer.h"
  #include "vl/vl_vlc.h"
#include "entrypoint.h"
@@ -515,34 +516,34 @@ static void vid_dec_FillOutput(vid_dec_PrivateType *priv, 
struct pipe_video_buff
 OMX_VIDEO_PORTDEFINITIONTYPE *def = >sPortParam.format.video;
   struct pipe_sampler_view **views;
-   struct pipe_transfer *transfer;
-   struct pipe_box box = { };
-   uint8_t *src, *dst;
+   unsigned i, j;
+   unsigned width, height;
   views = buf->get_sampler_view_planes(buf);
  -   dst = output->pBuffer;
-
-   box.width = def->nFrameWidth;
-   box.height = def->nFrameHeight;
-   box.depth = 1;
-
-   src = priv->pipe->transfer_map(priv->pipe, views[0]->texture, 0,
-  PIPE_TRANSFER_READ, , );
-   util_copy_rect(dst, views[0]->texture->format, def->nStride, 0, 0,
-  box.width, box.height, src, transfer->stride, 0, 0);
-   pipe_transfer_unmap(priv->pipe, transfer);
-
-   dst = ((uint8_t*)output->pBuffer) + (def->nStride * box.height);
-
-   box.width = def->nFrameWidth / 2;
-   box.height = def->nFrameHeight / 2;
-
-   src = priv->pipe->transfer_map(priv->pipe, views[1]->texture, 0,
-  PIPE_TRANSFER_READ, , );
-   util_copy_rect(dst, views[1]->texture->format, def->nStride, 0, 0,
-  box.width, box.height, src, transfer->stride, 0, 0);
-   pipe_transfer_unmap(priv->pipe, transfer);
+   for (i = 0; i < 2 /* NV12 */; i++) {
+  if (!views[i]) continue;
+  width = buf->width;
+  height = buf->height;
+  vl_video_buffer_adjust_size(, , i, buf->interlaced, 
buf->chroma_format);
+  for (j = 0; j < views[i]->texture->array_size; ++j) {
+ struct pipe_box box = {0, 0, j, width, height, 1};
+ struct pipe_transfer *transfer;
+ uint8_t *map, *dst;
+ map = priv->pipe->transfer_map(priv->pipe, views[i]->texture, 0,
+  PIPE_TRANSFER_READ, , );
+ if (!map)
+return;
+
+ dst = ((uint8_t*)output->pBuffer + output->nOffset) + j * 
def->nStride + i * buf->width * buf->height;
+ util_copy_rect(dst,
+views[i]->texture->format,
+def->nStride * views[i]->texture->array_size, 0, 0,
+box.width, box.height, map, transfer->stride, 0, 0);
+
+ pipe_transfer_unmap(priv->pipe, transfer);
+  }
+   }
  }
static void vid_dec_FrameDecoded(OMX_COMPONENTTYPE *comp, 
OMX_BUFFERHEADERTYPE* input,


From 8c06114545d63cd435834b1f7664e68bc0189a77 Mon Sep 17 00:00:00 2001
From: Leo Liu 
Date: Mon, 27 Jun 2016 20:40:30 -0400
Subject: [PATCH] st/omx: fix decoder fillout for the OMX result buffer

The call for vl_video_buffer_adjust_size is with wrong order of
arguments, apparently it will have problem when interlaced false;

The size of OMX result buffer depends on real size of clips, vl buffer
dimension is aligned with 16, so 1080p(1920*1080) video will overflow
the OMX buffer

Signed-off-by: Leo Liu 
---
 src/gallium/state_trackers/omx/vid_dec.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/src/gallium/state_trackers/omx/vid_dec.c b/src/gallium/state_trackers/omx/vid_dec.c
index 85ffb88..a989c10 100644
--- a/src/gallium/state_trackers/omx/vid_dec.c
+++ b/src/gallium/state_trackers/omx/vid_dec.c
@@ -523,9 +523,9 @@ static void vid_dec_FillOutput(vid_dec_PrivateType *priv, struct pipe_video_buff
 
for (i = 0; i < 2 /* NV12 */; i++) {
   

Re: [Mesa-dev] [PATCH 3/4] glx: Fix indirect multi-texture GL_DOUBLE coordinate arrays.

2016-06-27 Thread Ian Romanick
This patch is

Reviewed-by: Ian Romanick 

On 06/23/2016 11:15 AM, Matt Turner wrote:
> From: Colin McDonald 
> 
> There is no draw arrays protocol support for multi-texture coordinate
> arrays, so it is implemented by sending batches of immediate mode
> commands from emit_element_none in indirect_vertex_array.c.  This sends
> the target texture unit (which has been previously setup in the
> array_state header field), followed by the texture coordinates.  But for
> GL_DOUBLE coordinates the texture unit must be sent *after* the texture
> coordinates. This is documented in the glx protocol description, and can
> also be seen in the indirect.c immediate mode commands generated from
> gl_API.xml. Sending the target texture unit in the wrong place can crash
> the remote X server.
> 
> To fix this required some more extensive changes to
> indirect_vertex_array.c and indirect_vertex_array_priv.h, in order to
> remove the texture unit value out of the array_state "header" field, and
> send it separately.
> 
> Reviewed-by: Matt Turner 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61907
> ---
>  src/glx/indirect_vertex_array.c  | 68 
> +---
>  src/glx/indirect_vertex_array_priv.h | 12 ++-
>  2 files changed, 50 insertions(+), 30 deletions(-)
> 
> diff --git a/src/glx/indirect_vertex_array.c b/src/glx/indirect_vertex_array.c
> index 01fab33..0374093 100644
> --- a/src/glx/indirect_vertex_array.c
> +++ b/src/glx/indirect_vertex_array.c
> @@ -240,8 +240,6 @@ __glXInitVertexArrayState(struct glx_context * gc)
>  
>arrays->arrays[4 + i].old_DrawArrays_possible = (i == 0);
>arrays->arrays[4 + i].index = i;
> -
> -  arrays->arrays[4 + i].header[1] = i + GL_TEXTURE0;
> }
>  
> i = 4 + texture_units;
> @@ -274,8 +272,6 @@ __glXInitVertexArrayState(struct glx_context * gc)
>  
>arrays->arrays[idx + i].old_DrawArrays_possible = 0;
>arrays->arrays[idx + i].index = idx;
> -
> -  arrays->arrays[idx + i].header[1] = idx;
> }
>  
> i += vertex_program_attribs;
> @@ -325,7 +321,7 @@ calculate_single_vertex_size_none(const struct 
> array_state_vector *arrays)
>  
> for (i = 0; i < arrays->num_arrays; i++) {
>if (arrays->arrays[i].enabled) {
> - single_vertex_size += ((uint16_t *) arrays->arrays[i].header)[0];
> + single_vertex_size += arrays->arrays[i].header[0];
>}
> }
>  
> @@ -353,17 +349,45 @@ emit_element_none(GLubyte * dst,
>* protocol is for a 4Nus.  Since the sizes are small, the
>* performance impact on modern processors should be negligible.
>*/
> - (void) memset(dst, 0, ((uint16_t *) arrays->arrays[i].header)[0]);
> -
> - (void) memcpy(dst, arrays->arrays[i].header,
> -   arrays->arrays[i].header_size);
> -
> - dst += arrays->arrays[i].header_size;
> -
> - (void) memcpy(dst, ((GLubyte *) arrays->arrays[i].data) + offset,
> -   arrays->arrays[i].element_size);
> -
> - dst += __GLX_PAD(arrays->arrays[i].element_size);
> + (void) memset(dst, 0, arrays->arrays[i].header[0]);
> +
> + (void) memcpy(dst, arrays->arrays[i].header, 4);
> +
> + dst += 4;
> +
> + if (arrays->arrays[i].key == GL_TEXTURE_COORD_ARRAY &&
> + arrays->arrays[i].index > 0) {
> +/* Multi-texture coordinate arrays require the texture target
> + * to be sent.  For doubles it is after the data, for everything
> + * else it is before.
> + */
> +GLenum texture = arrays->arrays[i].index + GL_TEXTURE0;
> +if (arrays->arrays[i].data_type == GL_DOUBLE) {
> +   (void) memcpy(dst, ((GLubyte *) arrays->arrays[i].data) + 
> offset,
> + arrays->arrays[i].element_size);
> +   dst += arrays->arrays[i].element_size;
> +   (void) memcpy(dst, , 4);
> +   dst += 4;
> +} else {
> +   (void) memcpy(dst, , 4);
> +   dst += 4;
> +   (void) memcpy(dst, ((GLubyte *) arrays->arrays[i].data) + 
> offset,
> + arrays->arrays[i].element_size);
> +   dst += __GLX_PAD(arrays->arrays[i].element_size);
> +}
> + } else if (arrays->arrays[i].key == GL_VERTEX_ATTRIB_ARRAY_POINTER) 
> {
> +/* Vertex attribute data requires the index sent first.
> + */
> +(void) memcpy(dst, >arrays[i].index, 4);
> +dst += 4;
> +(void) memcpy(dst, ((GLubyte *) arrays->arrays[i].data) + offset,
> +  arrays->arrays[i].element_size);
> +dst += __GLX_PAD(arrays->arrays[i].element_size);
> + } else {
> +(void) memcpy(dst, ((GLubyte *) arrays->arrays[i].data) + offset,
> + 

Re: [Mesa-dev] [PATCH 1/4] glx: Call __glXInitVertexArrayState() with a usable gc.

2016-06-27 Thread Ian Romanick
On 06/23/2016 11:15 AM, Matt Turner wrote:
> From: Colin McDonald 
> 
> For each indirect context the indirect vertex array state must be initialised
> by __glXInitVertexArrayState in indirect_vertex_array.c.  As noted in the
> routine header it requires that the glx context has been setup prior to the
> call, in order to test the server version and extensions.
> 
> Currently __glXInitVertexArrayState is called from indirect_bind_context in
> indirect_glx.c, as follows:
> 
>state = gc->client_state_private;
>if (state->array_state == NULL) {
>   glGetString(GL_EXTENSIONS);
>   glGetString(GL_VERSION);
>   __glXInitVertexArrayState(gc);
>}
> 
> But, the gc context is not yet usable at this stage, so the server queries
> fail, and __glXInitVertexArrayState is called without the server version and
> extension information it needs.  This breaks multi-texturing as
> glXInitVertexArrayState doesn't get GL_MAX_TEXTURE_UNITS.  It probably also
> breaks setup of other arrays: fog, secondary colour, vertex attributes.
> 
> To fix this I have moved the call to __glXInitVertexArrayState to the end of
> MakeContextCurrent in glxcurrent.c, where the glx context is usable.
> 
> Fixes a regression caused by commit 4fbdde889c. Fixes ARB_vertex_program
> usage in the arbvparray Mesa demo when run with indirect GLX and also
> the tex-skipped-unit piglit test when run with indirect GLX.
> 
> Reviewed-by: Matt Turner 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61907
> ---
>  src/glx/glxcurrent.c   | 12 
>  src/glx/indirect_glx.c |  8 
>  2 files changed, 12 insertions(+), 8 deletions(-)
> 
> diff --git a/src/glx/glxcurrent.c b/src/glx/glxcurrent.c
> index f2e3865..d119326 100644
> --- a/src/glx/glxcurrent.c
> +++ b/src/glx/glxcurrent.c
> @@ -252,6 +252,18 @@ MakeContextCurrent(Display * dpy, GLXDrawable draw,
>  
> __glXUnlock();
>  
> +   /* The indirect vertex array state must to be initialised after we
> +* have setup the context, as it needs to query server attributes.
> +*/
> +   if (gc && !gc->isDirect) {
> +  __GLXattribute *state = gc->client_state_private;
> +  if (state && state->array_state == NULL) {
> + glGetString(GL_EXTENSIONS);
> + glGetString(GL_VERSION);
> + __glXInitVertexArrayState(gc);
> +  }
> +   }
> +

This is a pretty ugly layering violation, but it may be the best way.
What's the precise problem?  Is it that __glXSetCurrentContext(gc)
hasn't happened yet or is it that __glXUnlock() hasn't happened yet?

> return GL_TRUE;
>  }
>  
> diff --git a/src/glx/indirect_glx.c b/src/glx/indirect_glx.c
> index bb121f8..26825fb 100644
> --- a/src/glx/indirect_glx.c
> +++ b/src/glx/indirect_glx.c
> @@ -131,7 +131,6 @@ indirect_bind_context(struct glx_context *gc, struct 
> glx_context *old,
> GLXDrawable draw, GLXDrawable read)
>  {
> GLXContextTag tag;
> -   __GLXattribute *state;
> Display *dpy = gc->psc->dpy;
> int opcode = __glXSetupForCommand(dpy);
> Bool sent;
> @@ -150,13 +149,6 @@ indirect_bind_context(struct glx_context *gc, struct 
> glx_context *old,
>IndirectAPI = __glXNewIndirectAPI();
> _glapi_set_dispatch(IndirectAPI);
>  
> -   state = gc->client_state_private;
> -   if (state->array_state == NULL) {
> -  glGetString(GL_EXTENSIONS);
> -  glGetString(GL_VERSION);
> -  __glXInitVertexArrayState(gc);
> -   }
> -
> return !sent;
>  }
>  
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/4] glx: Undo memory allocation checking damage.

2016-06-27 Thread Ian Romanick
On 06/23/2016 11:15 AM, Matt Turner wrote:
> This partially reverts commit d41f5396f3cb619729021390c273f838d92f11fb.
> 
> That untested commit broke the tex-skipped-unit piglit test and the
> arbvparray Mesa demo when run with indirect GLX.
> 
> state->array_state is used during initialization, so its assignment cannot be
> moved to the end of the function.
> 
> The backtrace looked like:
> 
> Program received signal SIGSEGV, Segmentation fault.
> 0x777c7a5c in __glXGetActiveTextureUnit (state=0x6270e0) at 
> indirect_vertex_array.c:1952
> 1952   return state->array_state->active_texture_unit;
> (gdb) bt
> 0  0x777c7a5c in __glXGetActiveTextureUnit (state=0x6270e0) at 
> indirect_vertex_array.c:1952
> 1  0x777cbf62 in get_client_data (gc=0x626f50, cap=34018, 
> data=0x7fffd7a0) at single2.c:159
> 2  0x777cce51 in __indirect_glGetIntegerv (val=34018, 
> i=0x7fffd830) at single2.c:498
> 3  0x777c4340 in __glXInitVertexArrayState (gc=0x626f50) at 
> indirect_vertex_array.c:193
> ---
>  src/glx/indirect_vertex_array.c | 6 +-
>  1 file changed, 1 insertion(+), 5 deletions(-)
> 
> diff --git a/src/glx/indirect_vertex_array.c b/src/glx/indirect_vertex_array.c
> index 0374093..a707343 100644
> --- a/src/glx/indirect_vertex_array.c
> +++ b/src/glx/indirect_vertex_array.c
> @@ -151,6 +151,7 @@ __glXInitVertexArrayState(struct glx_context * gc)
>  
>  
> arrays = calloc(1, sizeof(struct array_state_vector));
> +   state->array_state = arrays;
>  

Later this function will free arrays when there is a different error.
Shouldn't that place also NULL out state->array_state to prevent double
frees or other problems?

> if (arrays == NULL) {
>__glXSetError(gc, GL_OUT_OF_MEMORY);
> @@ -299,11 +300,6 @@ __glXInitVertexArrayState(struct glx_context * gc)
>__glXSetError(gc, GL_OUT_OF_MEMORY);
>return;
> }
> -
> -   /* Everything went ok so we put vertex array state in place
> -* in context.
> -*/
> -   state->array_state = arrays;
>  }
>  
>  
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/4] glx: Correct opcode typos in __indirect_glTexCoordPointer.

2016-06-27 Thread Ian Romanick
On 06/23/2016 11:15 AM, Matt Turner wrote:
> From: Colin McDonald 
> 
> At the same time, replace opcode numbers with names in
> __indirect_glVertexAttribPointer.
> 
> Reviewed-by: Matt Turner 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61907
> ---
>  src/glx/indirect_vertex_array.c | 19 ++-
>  1 file changed, 14 insertions(+), 5 deletions(-)
> 
> diff --git a/src/glx/indirect_vertex_array.c b/src/glx/indirect_vertex_array.c
> index 2bf2ff1..01fab33 100644
> --- a/src/glx/indirect_vertex_array.c
> +++ b/src/glx/indirect_vertex_array.c
> @@ -1386,7 +1386,7 @@ __indirect_glTexCoordPointer(GLint size, GLenum type, 
> GLsizei stride,
>X_GLrop_TexCoord4iv
> };
> static const uint16_t float_ops[5] = {
> -  0, X_GLrop_TexCoord1dv, X_GLrop_TexCoord2fv, X_GLrop_TexCoord3fv,
> +  0, X_GLrop_TexCoord1fv, X_GLrop_TexCoord2fv, X_GLrop_TexCoord3fv,

Holy crap.  That typo bug has existed since

commit fdb07636f2e6324c5250cd5ee97778b7f5933bea
Author: Ian Romanick 
Date:   Tue Feb 22 22:36:31 2005 +

Added __glExtensionBiIsEnabled and __GLXcontext::gl_extension_bits.  This
enables libGL to query which extension are exported to applications.

Refactored array-query functionality (from glGet*v) in 
src/glx/x11/single2.c.

Massive re-write of indirect vertex array support.  The most noticable
effect is that glDrawElements now generates DrawArrays protocol.  The
side-effects (and the main reasons for the re-work) are that it is much
easier to add support for new arrays (e.g., GL_VERTEX_ATTRIB_ARRAY,
GL_WEIGHT_ARRAY_ARB, etc.) and it is much easier to add support for the new
DrawArrays protocol (required to support ARB_vertex_buffer_object).

These changes were primarilly tested with progs/demos/isosurf.

Colin wins.

This patch is

Reviewed-by: Ian Romanick 

I think this patch can get as much stable tagging as Emil will apply. :)

>X_GLrop_TexCoord4fv
> };
> static const uint16_t double_ops[5] = {
> @@ -1403,7 +1403,7 @@ __indirect_glTexCoordPointer(GLint size, GLenum type, 
> GLsizei stride,
>X_GLrop_MultiTexCoord3ivARB, X_GLrop_MultiTexCoord4ivARB
> };
> static const uint16_t mfloat_ops[5] = {
> -  0, X_GLrop_MultiTexCoord1dvARB, X_GLrop_MultiTexCoord2fvARB,
> +  0, X_GLrop_MultiTexCoord1fvARB, X_GLrop_MultiTexCoord2fvARB,
>X_GLrop_MultiTexCoord3fvARB, X_GLrop_MultiTexCoord4fvARB
> };
> static const uint16_t mdouble_ops[5] = {
> @@ -1587,9 +1587,18 @@ __indirect_glVertexAttribPointer(GLuint index, GLint 
> size,
>  GLenum type, GLboolean normalized,
>  GLsizei stride, const GLvoid * pointer)
>  {
> -   static const uint16_t short_ops[5] = { 0, 4189, 4190, 4191, 4192 };
> -   static const uint16_t float_ops[5] = { 0, 4193, 4194, 4195, 4196 };
> -   static const uint16_t double_ops[5] = { 0, 4197, 4198, 4199, 4200 };
> +   static const uint16_t short_ops[5] = {
> +0, X_GLrop_VertexAttrib1svARB, X_GLrop_VertexAttrib2svARB,
> +X_GLrop_VertexAttrib3svARB, X_GLrop_VertexAttrib4svARB
> +   };
> +   static const uint16_t float_ops[5] = {
> +0, X_GLrop_VertexAttrib1fvARB, X_GLrop_VertexAttrib2fvARB,
> +X_GLrop_VertexAttrib3fvARB, X_GLrop_VertexAttrib4fvARB
> +   };
> +   static const uint16_t double_ops[5] = {
> +0, X_GLrop_VertexAttrib1dvARB, X_GLrop_VertexAttrib2dvARB,
> +X_GLrop_VertexAttrib3dvARB, X_GLrop_VertexAttrib4dvARB
> +   };
>  
> uint16_t opcode;
> struct glx_context *gc = __glXGetCurrentContext();
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: Close fp on error path.

2016-06-27 Thread Ian Romanick
It looks like there's another premature return around line 10087.

On 06/27/2016 04:31 PM, Matt Turner wrote:
> ---
>  src/mesa/main/dlist.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/main/dlist.c b/src/mesa/main/dlist.c
> index 3845d2e..1cf814b 100644
> --- a/src/mesa/main/dlist.c
> +++ b/src/mesa/main/dlist.c
> @@ -10366,7 +10366,7 @@ print_list(struct gl_context *ctx, GLuint list, const 
> char *fname)
> printf
>("ERROR IN DISPLAY LIST: opcode = %d, address = %p\n",
> opcode, (void *) n);
> -   return;
> +   goto out;
>  }
>  else {
> fprintf(f, "command %d, %u operands\n", opcode,
> @@ -10380,6 +10380,7 @@ print_list(struct gl_context *ctx, GLuint list, const 
> char *fname)
>}
> }
>  
> + out:
> fflush(f);
> if (fname)
>fclose(f);
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] intel: Removing PCI IDs that are no longer listed as Kabylake.

2016-06-27 Thread Rodrigo Vivi
This is unusual. Usually IDs listed on early stages of platform
definition are kept there as reserved for later use.

However these IDs here are not listed anymore in any of steppings
and devices IDs tables for Kabylake on configurations overview
section of BSpec.

So it is better removing them before they become used in any
other future platform.

v2: Rebase.

Cc: Dhinakaran Pandiyan 
Signed-off-by: Rodrigo Vivi 
---
 intel/intel_chipset.h | 16 +++-
 1 file changed, 3 insertions(+), 13 deletions(-)

diff --git a/intel/intel_chipset.h b/intel/intel_chipset.h
index 6b8d4e9..514f659 100644
--- a/intel/intel_chipset.h
+++ b/intel/intel_chipset.h
@@ -204,18 +204,13 @@
 #define PCI_CHIP_KABYLAKE_DT_GT2   0x5912
 #define PCI_CHIP_KABYLAKE_DT_GT1_5 0x5917
 #define PCI_CHIP_KABYLAKE_DT_GT1   0x5902
-#define PCI_CHIP_KABYLAKE_DT_GT4   0x5932
 #define PCI_CHIP_KABYLAKE_HALO_GT2 0x591B
 #define PCI_CHIP_KABYLAKE_HALO_GT4 0x593B
-#define PCI_CHIP_KABYLAKE_HALO_GT3 0x592B
 #define PCI_CHIP_KABYLAKE_HALO_GT1_0   0x5908
 #define PCI_CHIP_KABYLAKE_HALO_GT1_1   0x590B
 #define PCI_CHIP_KABYLAKE_SRV_GT2  0x591A
-#define PCI_CHIP_KABYLAKE_SRV_GT3  0x592A
 #define PCI_CHIP_KABYLAKE_SRV_GT1  0x590A
-#define PCI_CHIP_KABYLAKE_SRV_GT4  0x593A
 #define PCI_CHIP_KABYLAKE_WKS_GT2  0x591D
-#define PCI_CHIP_KABYLAKE_WKS_GT4  0x593D
 
 #define PCI_CHIP_BROXTON_0 0x0A84
 #define PCI_CHIP_BROXTON_1 0x1A84
@@ -431,14 +426,9 @@
 
 #define IS_KBL_GT3(devid)  ((devid) == PCI_CHIP_KABYLAKE_ULT_GT3_0 || \
 (devid) == PCI_CHIP_KABYLAKE_ULT_GT3_1 || \
-(devid) == PCI_CHIP_KABYLAKE_ULT_GT3_2 || \
-(devid) == PCI_CHIP_KABYLAKE_HALO_GT3  || \
-(devid) == PCI_CHIP_KABYLAKE_SRV_GT3)
-
-#define IS_KBL_GT4(devid)  ((devid) == PCI_CHIP_KABYLAKE_DT_GT4|| \
-(devid) == PCI_CHIP_KABYLAKE_HALO_GT4  || \
-(devid) == PCI_CHIP_KABYLAKE_SRV_GT4   || \
-(devid) == PCI_CHIP_KABYLAKE_WKS_GT4)
+(devid) == PCI_CHIP_KABYLAKE_ULT_GT3_2)
+
+#define IS_KBL_GT4(devid)  ((devid) == PCI_CHIP_KABYLAKE_HALO_GT4)
 
 #define IS_KABYLAKE(devid) (IS_KBL_GT1(devid) || \
 IS_KBL_GT2(devid) || \
-- 
2.5.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] intel: Add more Kabylake PCI IDs.

2016-06-27 Thread Rodrigo Vivi
The spec has been updated adding new PCI IDs.

v2: Avoid using "H" instead of HALO to keep names uniform - DK.

Cc: Dhinakaran Pandiyan 
Signed-off-by: Rodrigo Vivi 
---
 intel/intel_chipset.h | 14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/intel/intel_chipset.h b/intel/intel_chipset.h
index e2554c3..6b8d4e9 100644
--- a/intel/intel_chipset.h
+++ b/intel/intel_chipset.h
@@ -194,7 +194,9 @@
 #define PCI_CHIP_KABYLAKE_ULT_GT2  0x5916
 #define PCI_CHIP_KABYLAKE_ULT_GT1_50x5913
 #define PCI_CHIP_KABYLAKE_ULT_GT1  0x5906
-#define PCI_CHIP_KABYLAKE_ULT_GT3  0x5926
+#define PCI_CHIP_KABYLAKE_ULT_GT3_00x5923
+#define PCI_CHIP_KABYLAKE_ULT_GT3_10x5926
+#define PCI_CHIP_KABYLAKE_ULT_GT3_20x5927
 #define PCI_CHIP_KABYLAKE_ULT_GT2F 0x5921
 #define PCI_CHIP_KABYLAKE_ULX_GT1_50x5915
 #define PCI_CHIP_KABYLAKE_ULX_GT1  0x590E
@@ -206,7 +208,8 @@
 #define PCI_CHIP_KABYLAKE_HALO_GT2 0x591B
 #define PCI_CHIP_KABYLAKE_HALO_GT4 0x593B
 #define PCI_CHIP_KABYLAKE_HALO_GT3 0x592B
-#define PCI_CHIP_KABYLAKE_HALO_GT1 0x590B
+#define PCI_CHIP_KABYLAKE_HALO_GT1_0   0x5908
+#define PCI_CHIP_KABYLAKE_HALO_GT1_1   0x590B
 #define PCI_CHIP_KABYLAKE_SRV_GT2  0x591A
 #define PCI_CHIP_KABYLAKE_SRV_GT3  0x592A
 #define PCI_CHIP_KABYLAKE_SRV_GT1  0x590A
@@ -414,7 +417,8 @@
 (devid) == PCI_CHIP_KABYLAKE_ULT_GT1   || \
 (devid) == PCI_CHIP_KABYLAKE_ULX_GT1   || \
 (devid) == PCI_CHIP_KABYLAKE_DT_GT1|| \
-(devid) == PCI_CHIP_KABYLAKE_HALO_GT1  || \
+(devid) == PCI_CHIP_KABYLAKE_HALO_GT1_0 || \
+(devid) == PCI_CHIP_KABYLAKE_HALO_GT1_1 || \
 (devid) == PCI_CHIP_KABYLAKE_SRV_GT1)
 
 #define IS_KBL_GT2(devid)  ((devid) == PCI_CHIP_KABYLAKE_ULT_GT2   || \
@@ -425,7 +429,9 @@
 (devid) == PCI_CHIP_KABYLAKE_SRV_GT2   || \
 (devid) == PCI_CHIP_KABYLAKE_WKS_GT2)
 
-#define IS_KBL_GT3(devid)  ((devid) == PCI_CHIP_KABYLAKE_ULT_GT3   || \
+#define IS_KBL_GT3(devid)  ((devid) == PCI_CHIP_KABYLAKE_ULT_GT3_0 || \
+(devid) == PCI_CHIP_KABYLAKE_ULT_GT3_1 || \
+(devid) == PCI_CHIP_KABYLAKE_ULT_GT3_2 || \
 (devid) == PCI_CHIP_KABYLAKE_HALO_GT3  || \
 (devid) == PCI_CHIP_KABYLAKE_SRV_GT3)
 
-- 
2.5.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/4] isl: Fix isl_tiling_is_any_y()

2016-06-27 Thread Jason Ekstrand
Reviewed-by: Jason Ekstrand 

On Mon, Jun 27, 2016 at 4:35 PM, Nanley Chery  wrote:

> Signed-off-by: Nanley Chery 
> ---
>  src/intel/isl/isl.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/intel/isl/isl.h b/src/intel/isl/isl.h
> index ef86228..64aced8 100644
> --- a/src/intel/isl/isl.h
> +++ b/src/intel/isl/isl.h
> @@ -989,7 +989,7 @@ isl_has_matching_typed_storage_image_format(const
> struct brw_device_info *devinf
>  static inline bool
>  isl_tiling_is_any_y(enum isl_tiling tiling)
>  {
> -   return (1u << tiling) & ISL_TILING_ANY_MASK;
> +   return (1u << tiling) & ISL_TILING_ANY_Y_MASK;
>  }
>
>  static inline bool
> --
> 2.9.0
>
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/4] anv/blit2d: Copy with stencil sources when needed

2016-06-27 Thread Nanley Chery
In the next patch, ISL will unconditionally perform verification of a
surface's tiling and usage. Since it will require that w-tiled images
be stencil buffers, create a stencil surface to copy from a
w-tiled/stencil surface.

Signed-off-by: Nanley Chery 
---
 src/intel/vulkan/anv_meta_blit2d.c | 17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/src/intel/vulkan/anv_meta_blit2d.c 
b/src/intel/vulkan/anv_meta_blit2d.c
index 06e1043..267c08b 100644
--- a/src/intel/vulkan/anv_meta_blit2d.c
+++ b/src/intel/vulkan/anv_meta_blit2d.c
@@ -105,7 +105,9 @@ create_iview(struct anv_cmd_buffer *cmd_buffer,
const VkImageCreateInfo image_info = {
   .sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO,
   .imageType = VK_IMAGE_TYPE_2D,
-  .format = vk_format_for_size(surf->bs),
+  /* W-tiled images must be stencil-formatted. */
+  .format = surf->tiling == ISL_TILING_W ?
+VK_FORMAT_S8_UINT : vk_format_for_size(surf->bs),
   .extent = {
  .width = width,
  .height = height,
@@ -140,7 +142,7 @@ create_iview(struct anv_cmd_buffer *cmd_buffer,
   .viewType = VK_IMAGE_VIEW_TYPE_2D,
   .format = image_info.format,
   .subresourceRange = {
- .aspectMask = VK_IMAGE_ASPECT_COLOR_BIT,
+ .aspectMask = 
anv_image_from_handle(*img)->aspects,
  .baseMipLevel = 0,
  .levelCount = 1,
  .baseArrayLayer = 0,
@@ -177,7 +179,16 @@ blit2d_bind_src(struct anv_cmd_buffer *cmd_buffer,
  rect->src_x, rect->src_y,
  , >src_x, >src_y);
 
-  create_iview(cmd_buffer, src, offset, VK_IMAGE_USAGE_SAMPLED_BIT,
+  VkImageUsageFlags usage = VK_IMAGE_USAGE_SAMPLED_BIT;
+
+  /* W-tiled images must be stencil-formatted. Outside of meta,
+   * a stencil image has this usage bit set. Adding it here
+   * ensures the ISL surface is created correctly.
+   */
+  if (src->tiling == ISL_TILING_W)
+ usage |= VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT;
+
+  create_iview(cmd_buffer, src, offset, usage,
rect->src_x + rect->width, rect->src_y + rect->height,
>image, >iview);
 
-- 
2.9.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/4] anv/image: Simplify initialization of the isl_tiling

2016-06-27 Thread Nanley Chery
Signed-off-by: Nanley Chery 
---
 src/intel/vulkan/anv_image.c | 10 --
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
index 77d9931..b3f5f5c 100644
--- a/src/intel/vulkan/anv_image.c
+++ b/src/intel/vulkan/anv_image.c
@@ -120,10 +120,6 @@ make_surface(const struct anv_device *dev,
   [VK_IMAGE_TYPE_3D] = ISL_SURF_DIM_3D,
};
 
-   isl_tiling_flags_t tiling_flags = anv_info->isl_tiling_flags;
-   if (vk_info->tiling == VK_IMAGE_TILING_LINEAR)
-  tiling_flags = ISL_TILING_LINEAR_BIT;
-
struct anv_surface *anv_surf = get_surface(image, aspect);
 
image->extent = anv_sanitize_image_extent(vk_info->imageType,
@@ -142,7 +138,7 @@ make_surface(const struct anv_device *dev,
   .min_alignment = 0,
   .min_pitch = anv_info->stride,
   .usage = choose_isl_surf_usage(image->usage, aspect),
-  .tiling_flags = tiling_flags);
+  .tiling_flags = anv_info->isl_tiling_flags);
 
/* isl_surf_init() will fail only if provided invalid input. Invalid input
 * is illegal in Vulkan.
@@ -260,7 +256,9 @@ anv_CreateImage(VkDevice device,
return anv_image_create(device,
   &(struct anv_image_create_info) {
  .vk_info = pCreateInfo,
- .isl_tiling_flags = ISL_TILING_ANY_MASK,
+ .isl_tiling_flags = (pCreateInfo->tiling == VK_IMAGE_TILING_LINEAR) ?
+ ISL_TILING_LINEAR_BIT : ISL_TILING_ANY_MASK,
+
   },
   pAllocator,
   pImage);
-- 
2.9.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 4/4] Revert "isl: Don't filter tiling flags if a specific tiling bit is set"

2016-06-27 Thread Nanley Chery
This reverts commit 091f1da902c71ac8d3d27b325a118e2f683f1ae5.

Although a user may specify a specfic tiling bit, ISL should still
prevent incompatible tiling/surface combinations.

Signed-off-by: Nanley Chery 
---

Prior to patch https://patchwork.freedesktop.org/patch/95338/ ,
this change made crucible tests which attempted to make a linear
depth image assert-fail.

 src/intel/isl/isl.c | 13 +
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
index 77b570d..f09863e4 100644
--- a/src/intel/isl/isl.c
+++ b/src/intel/isl/isl.c
@@ -187,14 +187,11 @@ isl_surf_choose_tiling(const struct isl_device *dev,
 {
isl_tiling_flags_t tiling_flags = info->tiling_flags;
 
-   /* Filter if multiple tiling options are given */
-   if (!isl_is_pow2(tiling_flags)) {
-  if (ISL_DEV_GEN(dev) >= 7) {
- gen7_filter_tiling(dev, info, _flags);
-  } else {
- isl_finishme("%s: gen%u", __func__, ISL_DEV_GEN(dev));
- gen7_filter_tiling(dev, info, _flags);
-  }
+   if (ISL_DEV_GEN(dev) >= 7) {
+  gen7_filter_tiling(dev, info, _flags);
+   } else {
+  isl_finishme("%s: gen%u", __func__, ISL_DEV_GEN(dev));
+  gen7_filter_tiling(dev, info, _flags);
}
 
#define CHOOSE(__tiling) \
-- 
2.9.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/4] isl: Fix isl_tiling_is_any_y()

2016-06-27 Thread Nanley Chery
Signed-off-by: Nanley Chery 
---
 src/intel/isl/isl.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/isl/isl.h b/src/intel/isl/isl.h
index ef86228..64aced8 100644
--- a/src/intel/isl/isl.h
+++ b/src/intel/isl/isl.h
@@ -989,7 +989,7 @@ isl_has_matching_typed_storage_image_format(const struct 
brw_device_info *devinf
 static inline bool
 isl_tiling_is_any_y(enum isl_tiling tiling)
 {
-   return (1u << tiling) & ISL_TILING_ANY_MASK;
+   return (1u << tiling) & ISL_TILING_ANY_Y_MASK;
 }
 
 static inline bool
-- 
2.9.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] mesa: Close fp on error path.

2016-06-27 Thread Matt Turner
---
 src/mesa/main/dlist.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/dlist.c b/src/mesa/main/dlist.c
index 3845d2e..1cf814b 100644
--- a/src/mesa/main/dlist.c
+++ b/src/mesa/main/dlist.c
@@ -10366,7 +10366,7 @@ print_list(struct gl_context *ctx, GLuint list, const 
char *fname)
printf
   ("ERROR IN DISPLAY LIST: opcode = %d, address = %p\n",
opcode, (void *) n);
-   return;
+   goto out;
 }
 else {
fprintf(f, "command %d, %u operands\n", opcode,
@@ -10380,6 +10380,7 @@ print_list(struct gl_context *ctx, GLuint list, const 
char *fname)
   }
}
 
+ out:
fflush(f);
if (fname)
   fclose(f);
-- 
2.7.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] gm107/ir: make sure that flagsDef is set when emitting setcond

2016-06-27 Thread Samuel Pitoiset
Rely on the existence of a second destination when emitting a setcond
flag is dangerous, because this doesn't mean that the flag has been
correctly set. Instead rely on flagsDef like what emitX() does
for flagsSrc.

Signed-off-by: Samuel Pitoiset 
Cc: 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
index 40afbce..2c5e8f6 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
@@ -420,7 +420,7 @@ CodeEmitterGM107::emitSAT(int pos)
 void
 CodeEmitterGM107::emitCC(int pos)
 {
-   emitField(pos, 1, insn->defExists(1));
+   emitField(pos, 1, insn->flagsDef >= 0);
 }
 
 void
-- 
2.8.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] doc: improve INTEL_DEBUG documentation

2016-06-27 Thread Grazvydas Ignotas
Remove 'reg' option that does not actually exist, elaborate more about
'sync' and add the missing options.

Signed-off-by: Grazvydas Ignotas 
---
 no commit access, if this is ok please somebody push

 docs/envvars.html | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/docs/envvars.html b/docs/envvars.html
index ed957bd..2d9a289 100644
--- a/docs/envvars.html
+++ b/docs/envvars.html
@@ -144,11 +144,10 @@ See the Xlib software driver 
page for details.
bat - emit batch information
pix - emit messages about pixel operations
buf - emit messages about buffer objects
-   reg - emit messages about regions
fbo - emit messages about framebuffers
fs - dump shader assembly for fragment shaders
gs - dump shader assembly for geometry shaders
-   sync - emit messages about synchronization
+   sync - after sending each batch, emit a message and wait for that batch 
to finish rendering
prim - emit messages about drawing primitives
vert - emit messages about vertex assembly
dri - emit messages about the DRI interface
@@ -163,9 +162,18 @@ See the Xlib software driver 
page for details.
blorp - emit messages about the blorp operations (blits  
clears)
nodualobj - suppress generation of dual-object geometry shader code
optimizer - dump shader assembly to files at each optimization pass and 
iteration that make progress
+   ann - annotate IR in assembly dumps
+   no8 - don't generate SIMD8 fragment shader
vec4 - force vec4 mode in vertex shader
spill_fs - force spilling of all registers in the scalar backend 
(useful to debug spilling code)
spill_vec4 - force spilling of all registers in the vec4 backend 
(useful to debug spilling code)
+   cs - dump shader assembly for compute shaders
+   hex - print instruction hex dump with the disassembly
+   nocompact - disable instruction compaction
+   tcs - dump shader assembly for tessellation control shaders
+   tes - dump shader assembly for tessellation evaluation shaders
+   l3 - emit messages about the new L3 state during transitions
+   do32 - generate compute shader SIMD32 programs even if workgroup size 
doesn't exceed the SIMD16 limit
norbc - disable single sampled render buffer compression
 
 
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] Make single-buffered GLES representation internally consistent

2016-06-27 Thread Gurchetan Singh
Hi Ilia,

The changes for get.c where prompted by the es3fIntegerStateQueryTests (see
modules/gles3/functional/es3fIntegerStateQueryTests.cpp in the dEQP tree).
Specifically, these few lines:

>> const GLint validInitialValues[] = {GL_BACK, GL_NONE};
>> m_verifier->verifyIntegerAnyOf(m_testCtx, GL_READ_BUFFER,
validInitialValues, DE_LENGTH_OF_ARRAY(validInitialValues));
>> expectError(GL_NO_ERROR);

We initially set ColorReadBuffer to GL_FRONT in
_mesa_initialize_window_framebuffer for single-buffered configs.

We could also make sure the context is single-buffered in get.c to further
avoid bugs.  Let me know if that works for you and I'll send a modified
patch.

I do agree it is a bit hacky ... I'd definitely be interested in
alternative solutions.

On Mon, Jun 27, 2016 at 1:45 PM, Ilia Mirkin  wrote:

> On Mon, Jun 27, 2016 at 4:17 PM, Gurchetan Singh
>  wrote:
> > There a few places in the code where clearing and reading are done on
> incorrect
> > buffers for GLES contexts.  See comments for details.  This fixes 75
> GLES3
> > dEQP tests on the surfaceless platform with no regressions.
> >
> > v2:  Corrected unclear comment
> > ---
> >  src/mesa/main/buffers.c | 14 --
> >  src/mesa/main/clear.c   |  8 
> >  src/mesa/main/get.c |  9 +
> >  3 files changed, 29 insertions(+), 2 deletions(-)
> >
> > diff --git a/src/mesa/main/buffers.c b/src/mesa/main/buffers.c
> > index e8aedde..86696b8 100644
> > --- a/src/mesa/main/buffers.c
> > +++ b/src/mesa/main/buffers.c
> > @@ -173,12 +173,22 @@ draw_buffer_enum_to_bitmask(const struct
> gl_context *ctx, GLenum buffer)
> >   * return -1 for an invalid buffer.
> >   */
> >  static gl_buffer_index
> > -read_buffer_enum_to_index(GLenum buffer)
> > +read_buffer_enum_to_index(const struct gl_context *ctx, GLenum buffer)
> >  {
> > switch (buffer) {
> >case GL_FRONT:
> >   return BUFFER_FRONT_LEFT;
> >case GL_BACK:
> > + if (_mesa_is_gles(ctx)) {
> > +/* In draw_buffer_enum_to_bitmask, when GLES contexts draw
> to
> > + * GL_BACK with a single-buffered configuration, we
> actually end
> > + * up drawing to the sole front buffer in our internal
> > + * representation.  For consistency, we must read from that
> > + * front left buffer too.
> > + */
> > +if (!ctx->DrawBuffer->Visual.doubleBufferMode)
> > +   return BUFFER_FRONT_LEFT;
> > + }
> >   return BUFFER_BACK_LEFT;
> >case GL_RIGHT:
> >   return BUFFER_FRONT_RIGHT;
> > @@ -724,7 +734,7 @@ read_buffer(struct gl_context *ctx, struct
> gl_framebuffer *fb,
> >if (_mesa_is_gles3(ctx) && !is_legal_es3_readbuffer_enum(buffer))
> >   srcBuffer = -1;
> >else
> > - srcBuffer = read_buffer_enum_to_index(buffer);
> > + srcBuffer = read_buffer_enum_to_index(ctx, buffer);
> >
> >if (srcBuffer == -1) {
> >   _mesa_error(ctx, GL_INVALID_ENUM,
> > diff --git a/src/mesa/main/clear.c b/src/mesa/main/clear.c
> > index 35b912c..a1bb36e 100644
> > --- a/src/mesa/main/clear.c
> > +++ b/src/mesa/main/clear.c
> > @@ -267,6 +267,14 @@ make_color_buffer_mask(struct gl_context *ctx,
> GLint drawbuffer)
> >   mask |= BUFFER_BIT_FRONT_RIGHT;
> >break;
> > case GL_BACK:
> > +  /* For GLES contexts with a single buffered configuration, we
> actually
> > +   * only have a front renderbuffer, so any clear calls to GL_BACK
> should
> > +   * affect that buffer. See draw_buffer_enum_to_bitmask for
> details.
> > +   */
> > +  if (_mesa_is_gles(ctx))
> > + if (!ctx->DrawBuffer->Visual.doubleBufferMode)
> > +if (att[BUFFER_FRONT_LEFT].Renderbuffer)
> > +   mask |= BUFFER_BIT_FRONT_LEFT;
> >if (att[BUFFER_BACK_LEFT].Renderbuffer)
> >   mask |= BUFFER_BIT_BACK_LEFT;
> >if (att[BUFFER_BACK_RIGHT].Renderbuffer)
> > diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
> > index 6ffa99c..ec6bc3e 100644
> > --- a/src/mesa/main/get.c
> > +++ b/src/mesa/main/get.c
> > @@ -627,6 +627,15 @@ find_custom_value(struct gl_context *ctx, const
> struct value_desc *d, union valu
> >break;
> >
> > case GL_READ_BUFFER:
> > +  /* In _mesa_initialize_window_framebuffer, for single-buffered
> visuals,
> > +   * the ColorReadBuffer is set to be GL_FRONT, even with GLES
> contexts.
> > +   * When calling read_buffer, we verify we are reading from
> GL_BACK in
> > +   * is_legal_es3_readbuffer_enum.  But the default is incorrect,
> and
> > +   * certain dEQP tests check this.  So fix it here.
> > +   */
> > +  if (_mesa_is_gles(ctx))
> > + if (ctx->ReadBuffer->ColorReadBuffer == GL_FRONT)
> > +ctx->ReadBuffer->ColorReadBuffer = GL_BACK;
>
> Why is this OK to do? Shouldn't these just get not get set that way in
> the first 

Re: [Mesa-dev] [PATCH 2/7] glsl: Avoid aliasing violations.

2016-06-27 Thread Matt Turner
On Mon, Jun 27, 2016 at 3:27 PM, Brian Paul  wrote:
> On 06/27/2016 03:42 PM, Matt Turner wrote:
>>
>> ---
>>   src/compiler/glsl/ir_constant_expression.cpp| 11 +++
>>   src/compiler/glsl/link_uniform_initializers.cpp |  3 +--
>>   2 files changed, 4 insertions(+), 10 deletions(-)
>>
>> diff --git a/src/compiler/glsl/ir_constant_expression.cpp
>> b/src/compiler/glsl/ir_constant_expression.cpp
>> index e79d5ef..7c6a17d 100644
>> --- a/src/compiler/glsl/ir_constant_expression.cpp
>> +++ b/src/compiler/glsl/ir_constant_expression.cpp
>> @@ -1573,18 +1573,13 @@ ir_expression::constant_expression_value(struct
>> hash_table *variable_context)
>>data.f[c] = CLAMP(op[0]->value.f[c], 0.0f, 1.0f);
>> }
>> break;
>> -   case ir_unop_pack_double_2x32: {
>> +   case ir_unop_pack_double_2x32:
>> /* XXX needs to be checked on big-endian */
>> -  uint64_t temp;
>> -  temp = (uint64_t)op[0]->value.u[0] | ((uint64_t)op[0]->value.u[1]
>> << 32);
>> -  data.d[0] = *(double *)
>> -
>> +  memcpy([0], [0]->value.u[0], sizeof(double));
>> break;
>> -   }
>>  case ir_unop_unpack_double_2x32:
>> /* XXX needs to be checked on big-endian */
>> -  data.u[0] = *(uint32_t *)[0]->value.d[0];
>> -  data.u[1] = *((uint32_t *)[0]->value.d[0] + 1);
>> +  memcpy([0], [0]->value.d[0], sizeof(double));
>> break;
>>
>>  case ir_triop_bitfield_extract: {
>> diff --git a/src/compiler/glsl/link_uniform_initializers.cpp
>> b/src/compiler/glsl/link_uniform_initializers.cpp
>> index acf8222..98a2397 100644
>> --- a/src/compiler/glsl/link_uniform_initializers.cpp
>> +++ b/src/compiler/glsl/link_uniform_initializers.cpp
>> @@ -65,8 +65,7 @@ copy_constant_to_storage(union gl_constant_value
>> *storage,
>>  break;
>> case GLSL_TYPE_DOUBLE:
>>/* XXX need to check on big-endian */
>> - storage[i * 2].u = *(uint32_t *)>value.d[i];
>> - storage[i * 2 + 1].u = *(((uint32_t *)>value.d[i]) + 1);
>> + memcpy([i * 2].u, >value.d[i], sizeof(double));
>>break;
>> case GLSL_TYPE_BOOL:
>>  storage[i].b = val->value.b[i] ? boolean_true : 0;
>>
>
> For 2-7, Reviewed-by: Brian Paul 

Thank you.

> Though, in patch 6, as long as you're there, I wonder if we should use
> parenthesized (offset) in the macro bodies, just to be safe.

Oh! Sure. Thanks for the suggestion.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/7] glsl: Avoid aliasing violations.

2016-06-27 Thread Brian Paul

On 06/27/2016 03:42 PM, Matt Turner wrote:

---
  src/compiler/glsl/ir_constant_expression.cpp| 11 +++
  src/compiler/glsl/link_uniform_initializers.cpp |  3 +--
  2 files changed, 4 insertions(+), 10 deletions(-)

diff --git a/src/compiler/glsl/ir_constant_expression.cpp 
b/src/compiler/glsl/ir_constant_expression.cpp
index e79d5ef..7c6a17d 100644
--- a/src/compiler/glsl/ir_constant_expression.cpp
+++ b/src/compiler/glsl/ir_constant_expression.cpp
@@ -1573,18 +1573,13 @@ ir_expression::constant_expression_value(struct 
hash_table *variable_context)
   data.f[c] = CLAMP(op[0]->value.f[c], 0.0f, 1.0f);
}
break;
-   case ir_unop_pack_double_2x32: {
+   case ir_unop_pack_double_2x32:
/* XXX needs to be checked on big-endian */
-  uint64_t temp;
-  temp = (uint64_t)op[0]->value.u[0] | ((uint64_t)op[0]->value.u[1] << 32);
-  data.d[0] = *(double *)
-
+  memcpy([0], [0]->value.u[0], sizeof(double));
break;
-   }
 case ir_unop_unpack_double_2x32:
/* XXX needs to be checked on big-endian */
-  data.u[0] = *(uint32_t *)[0]->value.d[0];
-  data.u[1] = *((uint32_t *)[0]->value.d[0] + 1);
+  memcpy([0], [0]->value.d[0], sizeof(double));
break;

 case ir_triop_bitfield_extract: {
diff --git a/src/compiler/glsl/link_uniform_initializers.cpp 
b/src/compiler/glsl/link_uniform_initializers.cpp
index acf8222..98a2397 100644
--- a/src/compiler/glsl/link_uniform_initializers.cpp
+++ b/src/compiler/glsl/link_uniform_initializers.cpp
@@ -65,8 +65,7 @@ copy_constant_to_storage(union gl_constant_value *storage,
 break;
case GLSL_TYPE_DOUBLE:
   /* XXX need to check on big-endian */
- storage[i * 2].u = *(uint32_t *)>value.d[i];
- storage[i * 2 + 1].u = *(((uint32_t *)>value.d[i]) + 1);
+ memcpy([i * 2].u, >value.d[i], sizeof(double));
   break;
case GLSL_TYPE_BOOL:
 storage[i].b = val->value.b[i] ? boolean_true : 0;



For 2-7, Reviewed-by: Brian Paul 

Though, in patch 6, as long as you're there, I wonder if we should use 
parenthesized (offset) in the macro bodies, just to be safe.


-Brian

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2] gm107/ir: add missing setcond flags for LOP variants

2016-06-27 Thread Ilia Mirkin
Reviewed-by: Ilia Mirkin 

On Mon, Jun 27, 2016 at 6:13 PM, Samuel Pitoiset
 wrote:
> Signed-off-by: Samuel Pitoiset 
> Cc: 
> ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> index 25a9a52..334c078 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> @@ -1625,6 +1625,7 @@ CodeEmitterGM107::emitLOP()
>   break;
>}
>emitPRED (0x30);
> +  emitCC   (0x2f);
>emitX(0x2b);
>emitField(0x29, 2, lop);
>emitINV  (0x28, insn->src(1));
> @@ -1635,6 +1636,7 @@ CodeEmitterGM107::emitLOP()
>emitINV  (0x38, insn->src(1));
>emitINV  (0x37, insn->src(0));
>emitField(0x35, 2, lop);
> +  emitCC   (0x34);
>emitIMMD (0x14, 32, insn->src(1));
> }
>
> --
> 2.8.3
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2] gm107/ir: add missing setcond flags for LOP variants

2016-06-27 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset 
Cc: 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
index 25a9a52..334c078 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
@@ -1625,6 +1625,7 @@ CodeEmitterGM107::emitLOP()
  break;
   }
   emitPRED (0x30);
+  emitCC   (0x2f);
   emitX(0x2b);
   emitField(0x29, 2, lop);
   emitINV  (0x28, insn->src(1));
@@ -1635,6 +1636,7 @@ CodeEmitterGM107::emitLOP()
   emitINV  (0x38, insn->src(1));
   emitINV  (0x37, insn->src(0));
   emitField(0x35, 2, lop);
+  emitCC   (0x34);
   emitIMMD (0x14, 32, insn->src(1));
}
 
-- 
2.8.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] gm107/ir: make use of FADD32I for all immediates

2016-06-27 Thread Samuel Pitoiset



On 06/28/2016 12:06 AM, Ilia Mirkin wrote:

On Mon, Jun 27, 2016 at 6:05 PM, Ilia Mirkin  wrote:

On Mon, Jun 27, 2016 at 6:04 PM, Samuel Pitoiset
 wrote:



On 06/28/2016 12:02 AM, Ilia Mirkin wrote:


This loses you saturation. Does the target account for this?



No saturate flag for FADD32I.


That's not what I asked.


Specifically look at this code:

bool
TargetNVC0::isSatSupported(const Instruction *insn) const
{
   if (insn->op == OP_CVT)
  return true;
   if (!(opInfo[insn->op].dstMods & NV50_IR_MOD_SAT))
  return false;

   if (insn->dType == TYPE_U32)
  return (insn->op == OP_ADD) || (insn->op == OP_MAD);

   // add f32 LIMM cannot saturate
   if (insn->op == OP_ADD && insn->sType == TYPE_F32) {
  if (insn->getSrc(1)->asImm() &&
  insn->getSrc(1)->reg.data.u32 & 0xfff)
 return false;
   }

Note how it will say that sat is supported for SIMMs with FADD? So the
compiler will generate those ops, but then the emitter won't be able
to handle it.



Okay, I get it.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] gm107/ir: make use of FADD32I for all immediates

2016-06-27 Thread Ilia Mirkin
On Mon, Jun 27, 2016 at 6:05 PM, Ilia Mirkin  wrote:
> On Mon, Jun 27, 2016 at 6:04 PM, Samuel Pitoiset
>  wrote:
>>
>>
>> On 06/28/2016 12:02 AM, Ilia Mirkin wrote:
>>>
>>> This loses you saturation. Does the target account for this?
>>
>>
>> No saturate flag for FADD32I.
>
> That's not what I asked.

Specifically look at this code:

bool
TargetNVC0::isSatSupported(const Instruction *insn) const
{
   if (insn->op == OP_CVT)
  return true;
   if (!(opInfo[insn->op].dstMods & NV50_IR_MOD_SAT))
  return false;

   if (insn->dType == TYPE_U32)
  return (insn->op == OP_ADD) || (insn->op == OP_MAD);

   // add f32 LIMM cannot saturate
   if (insn->op == OP_ADD && insn->sType == TYPE_F32) {
  if (insn->getSrc(1)->asImm() &&
  insn->getSrc(1)->reg.data.u32 & 0xfff)
 return false;
   }

Note how it will say that sat is supported for SIMMs with FADD? So the
compiler will generate those ops, but then the emitter won't be able
to handle it.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] gm107/ir: make use of FADD32I for all immediates

2016-06-27 Thread Ilia Mirkin
On Mon, Jun 27, 2016 at 6:04 PM, Samuel Pitoiset
 wrote:
>
>
> On 06/28/2016 12:02 AM, Ilia Mirkin wrote:
>>
>> This loses you saturation. Does the target account for this?
>
>
> No saturate flag for FADD32I.

That's not what I asked.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] gm107/ir: make use of LOP32I for all immediates

2016-06-27 Thread Ilia Mirkin
Reviewed-by: Ilia Mirkin 

On Mon, Jun 27, 2016 at 5:55 PM, Samuel Pitoiset
 wrote:
> LOP only allows to emit 19-bits immediates.
>
> Signed-off-by: Samuel Pitoiset 
> Cc: 
> ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> index ccc169e..8dc0c22 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> @@ -1612,7 +1612,7 @@ CodeEmitterGM107::emitLOP()
>break;
> }
>
> -   if (!longIMMD(insn->src(1))) {
> +   if (insn->src(1).getFile() != FILE_IMMEDIATE) {
>switch (insn->src(1).getFile()) {
>case FILE_GPR:
>   emitInsn(0x5c40);
> --
> 2.8.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] gm107/ir: make use of FADD32I for all immediates

2016-06-27 Thread Samuel Pitoiset



On 06/28/2016 12:02 AM, Ilia Mirkin wrote:

This loses you saturation. Does the target account for this?


No saturate flag for FADD32I.



On Mon, Jun 27, 2016 at 5:55 PM, Samuel Pitoiset
 wrote:

FADD only allows to emit 19-bits immediates.

Signed-off-by: Samuel Pitoiset 
Cc: 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
index c7bd4e1..ccc169e 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
@@ -1215,7 +1215,7 @@ CodeEmitterGM107::emitDSETP()
 void
 CodeEmitterGM107::emitFADD()
 {
-   if (!longIMMD(insn->src(1))) {
+   if (insn->src(1).getFile() != FILE_IMMEDIATE) {
   switch (insn->src(1).getFile()) {
   case FILE_GPR:
  emitInsn(0x5c58);
--
2.8.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] gm107/ir: add missing setcond flags for LOP variants

2016-06-27 Thread Samuel Pitoiset



On 06/28/2016 12:01 AM, Ilia Mirkin wrote:

Hm, dangerous:

CodeEmitterGM107::emitCC(int pos)
{
   emitField(pos, 1, insn->defExists(1));
}

That should *probably* be insn->flagsDef >= 0. IIRC I fixed up
gf100/gk110 before. But that might have been nv50-specific, I forget
(which has somewhat different flags).


Yeah, right. But this can be fixed in a separate patch.



On Mon, Jun 27, 2016 at 5:55 PM, Samuel Pitoiset
 wrote:

Signed-off-by: Samuel Pitoiset 
Cc: 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
index 80761e2..c7bd4e1 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
@@ -1632,12 +1632,14 @@ CodeEmitterGM107::emitLOP()
   }
   emitPRED (0x30);
   emitX(0x2b);
+  emitCC   (0x2f);


The pattern might not be obvious, but these are sorted by decreasing
bitfield position. Please keep the order.


Mmmh, I would definitely prefer to sort by arguments... but whatever 
I'll fix that.





   emitField(0x29, 2, lop);
   emitINV  (0x28, insn->src(1));
   emitINV  (0x27, insn->src(0));
} else {
   emitInsn (0x0400);
   emitX(0x39);
+  emitCC   (0x34);
   emitINV  (0x38, insn->src(1));
   emitINV  (0x37, insn->src(0));
   emitField(0x35, 2, lop);
--
2.8.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] gm107/ir: make use of FADD32I for all immediates

2016-06-27 Thread Ilia Mirkin
This loses you saturation. Does the target account for this?

On Mon, Jun 27, 2016 at 5:55 PM, Samuel Pitoiset
 wrote:
> FADD only allows to emit 19-bits immediates.
>
> Signed-off-by: Samuel Pitoiset 
> Cc: 
> ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> index c7bd4e1..ccc169e 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> @@ -1215,7 +1215,7 @@ CodeEmitterGM107::emitDSETP()
>  void
>  CodeEmitterGM107::emitFADD()
>  {
> -   if (!longIMMD(insn->src(1))) {
> +   if (insn->src(1).getFile() != FILE_IMMEDIATE) {
>switch (insn->src(1).getFile()) {
>case FILE_GPR:
>   emitInsn(0x5c58);
> --
> 2.8.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] gm107/ir: add missing setcond flags for LOP variants

2016-06-27 Thread Ilia Mirkin
Hm, dangerous:

CodeEmitterGM107::emitCC(int pos)
{
   emitField(pos, 1, insn->defExists(1));
}

That should *probably* be insn->flagsDef >= 0. IIRC I fixed up
gf100/gk110 before. But that might have been nv50-specific, I forget
(which has somewhat different flags).

On Mon, Jun 27, 2016 at 5:55 PM, Samuel Pitoiset
 wrote:
> Signed-off-by: Samuel Pitoiset 
> Cc: 
> ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> index 80761e2..c7bd4e1 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> @@ -1632,12 +1632,14 @@ CodeEmitterGM107::emitLOP()
>}
>emitPRED (0x30);
>emitX(0x2b);
> +  emitCC   (0x2f);

The pattern might not be obvious, but these are sorted by decreasing
bitfield position. Please keep the order.

>emitField(0x29, 2, lop);
>emitINV  (0x28, insn->src(1));
>emitINV  (0x27, insn->src(0));
> } else {
>emitInsn (0x0400);
>emitX(0x39);
> +  emitCC   (0x34);
>emitINV  (0x38, insn->src(1));
>emitINV  (0x37, insn->src(0));
>emitField(0x35, 2, lop);
> --
> 2.8.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] gm107/ir: make use of LOP32I for all immediates

2016-06-27 Thread Samuel Pitoiset
LOP only allows to emit 19-bits immediates.

Signed-off-by: Samuel Pitoiset 
Cc: 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
index ccc169e..8dc0c22 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
@@ -1612,7 +1612,7 @@ CodeEmitterGM107::emitLOP()
   break;
}
 
-   if (!longIMMD(insn->src(1))) {
+   if (insn->src(1).getFile() != FILE_IMMEDIATE) {
   switch (insn->src(1).getFile()) {
   case FILE_GPR:
  emitInsn(0x5c40);
-- 
2.8.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] gm107/ir: make use of FADD32I for all immediates

2016-06-27 Thread Samuel Pitoiset
FADD only allows to emit 19-bits immediates.

Signed-off-by: Samuel Pitoiset 
Cc: 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
index c7bd4e1..ccc169e 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
@@ -1215,7 +1215,7 @@ CodeEmitterGM107::emitDSETP()
 void
 CodeEmitterGM107::emitFADD()
 {
-   if (!longIMMD(insn->src(1))) {
+   if (insn->src(1).getFile() != FILE_IMMEDIATE) {
   switch (insn->src(1).getFile()) {
   case FILE_GPR:
  emitInsn(0x5c58);
-- 
2.8.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] gm107/ir: add missing setcond flags for LOP variants

2016-06-27 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset 
Cc: 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
index 80761e2..c7bd4e1 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
@@ -1632,12 +1632,14 @@ CodeEmitterGM107::emitLOP()
   }
   emitPRED (0x30);
   emitX(0x2b);
+  emitCC   (0x2f);
   emitField(0x29, 2, lop);
   emitINV  (0x28, insn->src(1));
   emitINV  (0x27, insn->src(0));
} else {
   emitInsn (0x0400);
   emitX(0x39);
+  emitCC   (0x34);
   emitINV  (0x38, insn->src(1));
   emitINV  (0x37, insn->src(0));
   emitField(0x35, 2, lop);
-- 
2.8.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] miptree: Skip attempts to make unsupported images

2016-06-27 Thread Anuj Phogat
On Mon, Jun 27, 2016 at 11:30 AM, Nanley Chery  wrote:
> This causes tests that attempt to create linear depth buffers on
> Gen7+ (unsupported), to be skipped.
>
> Signed-off-by: Nanley Chery 
> ---
>  src/tests/func/miptree/miptree.c | 39 ++-
>  1 file changed, 34 insertions(+), 5 deletions(-)
>
> diff --git a/src/tests/func/miptree/miptree.c 
> b/src/tests/func/miptree/miptree.c
> index 0ced389..37002cb 100644
> --- a/src/tests/func/miptree/miptree.c
> +++ b/src/tests/func/miptree/miptree.c
> @@ -410,6 +410,19 @@ miptree_destroy(miptree_t *mt)
>  free(mt);
>  }
>
> +static void
> +can_create_image(VkImageType type, VkImageTiling tiling,
> + uint32_t usage, VkFormat format)
> +{
> +VkImageFormatProperties fmt_properties;
> +VkResult result =
> +  vkGetPhysicalDeviceImageFormatProperties(t_physical_dev, 
> format,
> +   type, tiling, usage,
> +   0, _properties);
> +if (result == VK_ERROR_FORMAT_NOT_SUPPORTED)
> +   t_end(TEST_RESULT_SKIP);
> +}
> +
>  static const miptree_t *
>  miptree_create(void)
>  {
> @@ -424,10 +437,19 @@ miptree_create(void)
>  const uint32_t depth = params->depth;
>  const uint32_t array_length = params->array_length;
>  const size_t buffer_size = miptree_calc_buffer_size();
> +const uint32_t usage_bits = VK_IMAGE_USAGE_TRANSFER_SRC_BIT |
> +VK_IMAGE_USAGE_TRANSFER_DST_BIT |
> +VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT |
> +VK_IMAGE_USAGE_SAMPLED_BIT;
> +VkImageType image_type = 
> image_type_from_image_view_type(params->view_type);
> +
> +// Determine if an image can be created with this combination
> +can_create_image(image_type, VK_IMAGE_TILING_OPTIMAL,
> + usage_bits, format);
>
>  // Create the image that will contain the real miptree.
>  VkImage image = qoCreateImage(t_device,
> -.imageType = image_type_from_image_view_type(params->view_type),
> +.imageType = image_type,
>  .format = format,
>  .mipLevels = levels,
>  .arrayLayers = array_length,
> @@ -437,10 +459,7 @@ miptree_create(void)
>  .depth = depth,
>  },
>  .tiling = VK_IMAGE_TILING_OPTIMAL,
> -.usage = VK_IMAGE_USAGE_TRANSFER_SRC_BIT |
> - VK_IMAGE_USAGE_TRANSFER_DST_BIT |
> - VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT |
> - VK_IMAGE_USAGE_SAMPLED_BIT);
> +.usage = usage_bits);
>  VkBuffer src_buffer = qoCreateBuffer(t_device,
>  .size = buffer_size,
>  .usage = VK_BUFFER_USAGE_TRANSFER_SRC_BIT);
> @@ -501,6 +520,11 @@ miptree_create(void)
>  break;
>  case MIPTREE_UPLOAD_METHOD_COPY_FROM_LINEAR_IMAGE:
>  case MIPTREE_UPLOAD_METHOD_COPY_WITH_DRAW:
> +
> +// Determine if an image can be created with this combination
> +can_create_image(VK_IMAGE_TYPE_2D, VK_IMAGE_TILING_LINEAR,
> + VK_IMAGE_USAGE_TRANSFER_SRC_BIT, format);
> +
>  src_vk_image = qoCreateImage(t_device,
>  .format = format,
>  .mipLevels = 1,
> @@ -523,6 +547,11 @@ miptree_create(void)
>  break;
>  case MIPTREE_DOWNLOAD_METHOD_COPY_TO_LINEAR_IMAGE:
>  case MIPTREE_DOWNLOAD_METHOD_COPY_WITH_DRAW:
> +
> +// Determine if an image can be created with this combination
> +can_create_image(VK_IMAGE_TYPE_2D, VK_IMAGE_TILING_LINEAR,
> + VK_IMAGE_USAGE_TRANSFER_DST_BIT, format);
> +
>  dest_vk_image = qoCreateImage(t_device,
>  .format = format,
>  .mipLevels = 1,
> --
> 2.9.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reviewed-by: Anuj Phogat 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 4/7] mesa: Avoid aliasing violation in FXT1.

2016-06-27 Thread Matt Turner
---
 src/mesa/main/texcompress_fxt1.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/main/texcompress_fxt1.c b/src/mesa/main/texcompress_fxt1.c
index ae339e1..c5646fb 100644
--- a/src/mesa/main/texcompress_fxt1.c
+++ b/src/mesa/main/texcompress_fxt1.c
@@ -177,8 +177,8 @@ _mesa_texstore_rgba_fxt1(TEXSTORE_PARAMS)
 #define LL_RMS_D 10 /* fault tolerance (maximum delta) */
 #define LL_RMS_E 255 /* fault tolerance (maximum error) */
 #define ALPHA_TS 2 /* alpha threshold: (255 - ALPHA_TS) deemed opaque */
-#define ISTBLACK(v) (*((GLuint *)(v)) == 0)
-
+static const GLuint zero = 0;
+#define ISTBLACK(v) (memcmp(&(v), , sizeof(zero)) == 0)
 
 /*
  * Define a 64-bit unsigned integer type and macros
-- 
2.7.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/7] swrast: Avoid aliasing violation.

2016-06-27 Thread Matt Turner
---
 src/mesa/swrast/s_masking.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/swrast/s_masking.c b/src/mesa/swrast/s_masking.c
index c95587b..c10bf1a 100644
--- a/src/mesa/swrast/s_masking.c
+++ b/src/mesa/swrast/s_masking.c
@@ -56,8 +56,8 @@ _swrast_mask_rgba_span(struct gl_context *ctx, struct 
gl_renderbuffer *rb,
 * Note that we're not using span->array->mask[] here.  We could...
 */
if (span->array->ChanType == GL_UNSIGNED_BYTE) {
-  /* treat 4xGLubyte as 1xGLuint */
-  const GLuint srcMask = *((GLuint *) ctx->Color.ColorMask[buf]);
+  GLuint srcMask;
+  memcpy(, ctx->Color.ColorMask[buf], sizeof(srcMask));
   const GLuint dstMask = ~srcMask;
   const GLuint *dst = (const GLuint *) rbPixels;
   GLuint *src = (GLuint *) span->array->rgba8;
-- 
2.7.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/7] glsl: Separate overlapping sentinel nodes in exec_list.

2016-06-27 Thread Matt Turner
I do appreciate the cleverness, but unfortunately it prevents a lot more
cleverness in the form of additional compiler optimizations brought on
by -fstrict-aliasing.

No difference in OglBatch7 (n=20).

Co-authored-by: Davin McCall 
---
 src/compiler/glsl/ast.h|   4 +-
 src/compiler/glsl/ast_function.cpp |  22 ++--
 src/compiler/glsl/ast_to_hir.cpp   |   6 +-
 src/compiler/glsl/ast_type.cpp |   2 +-
 src/compiler/glsl/glsl_parser_extras.cpp   |   6 +-
 src/compiler/glsl/ir.cpp   |   8 +-
 src/compiler/glsl/ir_clone.cpp |   2 +-
 src/compiler/glsl/ir_constant_expression.cpp   |   2 +-
 src/compiler/glsl/ir_function.cpp  |  14 +--
 src/compiler/glsl/ir_reader.cpp|   4 +-
 src/compiler/glsl/ir_validate.cpp  |   4 +-
 src/compiler/glsl/list.h   | 136 +
 src/compiler/glsl/lower_distance.cpp   |   4 +-
 src/compiler/glsl/lower_jumps.cpp  |   2 +-
 src/compiler/glsl/lower_packed_varyings.cpp|   8 +-
 src/compiler/glsl/lower_tess_level.cpp |   4 +-
 src/compiler/glsl/opt_conditional_discard.cpp  |   6 +-
 src/compiler/glsl/opt_dead_builtin_varyings.cpp|   2 +-
 src/compiler/glsl/opt_dead_code.cpp|   2 +-
 src/compiler/glsl/opt_flatten_nested_if_blocks.cpp |   2 +-
 src/compiler/nir/nir.h |   4 +-
 src/compiler/nir/nir_opt_gcm.c |   2 +-
 src/mesa/drivers/dri/i965/brw_cfg.h|   2 +-
 src/mesa/drivers/dri/i965/brw_fs_builder.h |   2 +-
 src/mesa/drivers/dri/i965/brw_vec4_builder.h   |   2 +-
 25 files changed, 116 insertions(+), 136 deletions(-)

diff --git a/src/compiler/glsl/ast.h b/src/compiler/glsl/ast.h
index 06c7b03..5029ba7 100644
--- a/src/compiler/glsl/ast.h
+++ b/src/compiler/glsl/ast.h
@@ -346,8 +346,8 @@ public:
 
bool is_single_dimension() const
{
-  return this->array_dimensions.tail_pred->prev != NULL &&
- this->array_dimensions.tail_pred->prev->is_head_sentinel();
+  return this->array_dimensions.tail_sentinel.prev->prev != NULL &&
+ 
this->array_dimensions.tail_sentinel.prev->prev->is_head_sentinel();
}
 
virtual void print(void) const;
diff --git a/src/compiler/glsl/ast_function.cpp 
b/src/compiler/glsl/ast_function.cpp
index f74394f..41b0765 100644
--- a/src/compiler/glsl/ast_function.cpp
+++ b/src/compiler/glsl/ast_function.cpp
@@ -186,8 +186,8 @@ verify_parameter_modes(_mesa_glsl_parse_state *state,
   exec_list _ir_parameters,
   exec_list _ast_parameters)
 {
-   exec_node *actual_ir_node  = actual_ir_parameters.head;
-   exec_node *actual_ast_node = actual_ast_parameters.head;
+   exec_node *actual_ir_node  = actual_ir_parameters.head_sentinel.next;
+   exec_node *actual_ast_node = actual_ast_parameters.head_sentinel.next;
 
foreach_in_list(const ir_variable, formal, >parameters) {
   /* The lists must be the same length. */
@@ -318,10 +318,12 @@ verify_parameter_modes(_mesa_glsl_parse_state *state,
const char *func_name = sig->function_name();
bool is_atomic = is_atomic_function(func_name);
if (is_atomic) {
-  const ir_rvalue *const actual = (ir_rvalue *) actual_ir_parameters.head;
+  const ir_rvalue *const actual =
+ (ir_rvalue *) actual_ir_parameters.head_sentinel.next;
 
   const ast_expression *const actual_ast =
- exec_node_data(ast_expression, actual_ast_parameters.head, link);
+ exec_node_data(ast_expression,
+actual_ast_parameters.head_sentinel.next, link);
   YYLTYPE loc = actual_ast->get_location();
 
   if (!verify_first_atomic_parameter(, state,
@@ -1176,7 +1178,7 @@ constant_record_constructor(const glsl_type 
*constructor_type,
 bool
 single_scalar_parameter(exec_list *parameters)
 {
-   const ir_rvalue *const p = (ir_rvalue *) parameters->head;
+   const ir_rvalue *const p = (ir_rvalue *) parameters->head_sentinel.next;
assert(((ir_rvalue *)p)->as_rvalue() != NULL);
 
return (p->type->is_scalar() && p->next->is_tail_sentinel());
@@ -1220,7 +1222,7 @@ emit_inline_vector_constructor(const glsl_type *type,
 */
const unsigned lhs_components = type->components();
if (single_scalar_parameter(parameters)) {
-  ir_rvalue *first_param = (ir_rvalue *)parameters->head;
+  ir_rvalue *first_param = (ir_rvalue *)parameters->head_sentinel.next;
   ir_rvalue *rhs = new(ctx) ir_swizzle(first_param, 0, 0, 0, 0,
   lhs_components);
   ir_dereference_variable *lhs = new(ctx) ir_dereference_variable(var);
@@ -1421,7 +1423,7 @@ emit_inline_matrix_constructor(const glsl_type *type,
 *to the upper left portion of the constructed matrix, and the remaining
 

[Mesa-dev] [PATCH 5/7] mesa: Avoid aliasing violation in uniform_query.cpp.

2016-06-27 Thread Matt Turner
---
 src/mesa/main/uniform_query.cpp | 45 -
 1 file changed, 31 insertions(+), 14 deletions(-)

diff --git a/src/mesa/main/uniform_query.cpp b/src/mesa/main/uniform_query.cpp
index 127f097..3e460b0 100644
--- a/src/mesa/main/uniform_query.cpp
+++ b/src/mesa/main/uniform_query.cpp
@@ -382,9 +382,12 @@ _mesa_get_uniform(struct gl_context *ctx, GLuint program, 
GLint location,
   case GLSL_TYPE_BOOL:
  dst[didx].f = src[sidx].i ? 1.0f : 0.0f;
  break;
-  case GLSL_TYPE_DOUBLE:
- dst[didx].f = *(double *)[sidx].f;
+   case GLSL_TYPE_DOUBLE: {
+  double tmp;
+  memcpy(, [sidx].f, sizeof(tmp));
+  dst[didx].f = tmp;
  break;
+   }
   default:
  assert(!"Should not get here.");
  break;
@@ -392,20 +395,28 @@ _mesa_get_uniform(struct gl_context *ctx, GLuint program, 
GLint location,
   break;
case GLSL_TYPE_DOUBLE:
   switch (uni->type->base_type) {
-  case GLSL_TYPE_UINT:
- *(double *)[didx].f = (double) src[sidx].u;
+   case GLSL_TYPE_UINT: {
+  double tmp = src[sidx].u;
+  memcpy([didx].f, , sizeof(tmp));
  break;
+   }
   case GLSL_TYPE_INT:
   case GLSL_TYPE_SAMPLER:
-  case GLSL_TYPE_IMAGE:
- *(double *)[didx].f = (double) src[sidx].i;
+   case GLSL_TYPE_IMAGE: {
+  double tmp = src[sidx].i;
+  memcpy([didx].f, , sizeof(tmp));
  break;
-  case GLSL_TYPE_BOOL:
- *(double *)[didx].f = src[sidx].i ? 1.0f : 0.0f;
+   }
+   case GLSL_TYPE_BOOL: {
+  double tmp = src[sidx].i ? 1.0 : 0.0;
+  memcpy([didx].f, , sizeof(tmp));
  break;
-  case GLSL_TYPE_FLOAT:
- *(double *)[didx].f = (double) src[sidx].f;
+   }
+   case GLSL_TYPE_FLOAT: {
+  double tmp = src[sidx].f;
+  memcpy([didx].f, , sizeof(tmp));
  break;
+   }
   default:
  assert(!"Should not get here.");
  break;
@@ -437,9 +448,12 @@ _mesa_get_uniform(struct gl_context *ctx, GLuint program, 
GLint location,
   case GLSL_TYPE_BOOL:
  dst[didx].i = src[sidx].i ? 1 : 0;
  break;
-  case GLSL_TYPE_DOUBLE:
- dst[didx].i = IROUNDD(*(double *)[sidx].f);
+   case GLSL_TYPE_DOUBLE: {
+  double tmp;
+  memcpy(, [sidx].f, sizeof(tmp));
+  dst[didx].i = IROUNDD(tmp);
  break;
+   }
   default:
  assert(!"Should not get here.");
  break;
@@ -486,9 +500,12 @@ log_uniform(const void *values, enum glsl_base_type 
basicType,
   case GLSL_TYPE_FLOAT:
 printf("%g ", v[i].f);
 break;
-  case GLSL_TYPE_DOUBLE:
- printf("%g ", *(double* )[i * 2].f);
+  case GLSL_TYPE_DOUBLE: {
+ double tmp;
+ memcpy(, [i * 2].f, sizeof(tmp));
+ printf("%g ", tmp);
  break;
+  }
   default:
 assert(!"Should not get here.");
 break;
-- 
2.7.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 6/7] glx: Avoid aliasing violations.

2016-06-27 Thread Matt Turner
Compilers are perfectly capable of generating efficient code for calls
like these to memcpy().
---
 src/glx/packrender.h | 34 --
 src/glx/packsingle.h | 15 +--
 2 files changed, 25 insertions(+), 24 deletions(-)

diff --git a/src/glx/packrender.h b/src/glx/packrender.h
index f8f38ca..58094e0 100644
--- a/src/glx/packrender.h
+++ b/src/glx/packrender.h
@@ -155,28 +155,34 @@
 
 /* Single item copy macros */
 #define __GLX_PUT_CHAR(offset,a)\
-   *((INT8 *) (pc + offset)) = a
+   do { \
+  int8_t __tmp = (a);   \
+  memcpy((pc + offset), &__tmp, 1); \
+   } while (0)
 
 #define __GLX_PUT_SHORT(offset,a)   \
-   *((INT16 *) (pc + offset)) = a
+   do { \
+  int16_t __tmp = (a);  \
+  memcpy((pc + offset), &__tmp, 2); \
+   } while (0)
 
 #define __GLX_PUT_LONG(offset,a)\
-   *((INT32 *) (pc + offset)) = a
+   do { \
+  int32_t __tmp = (a);  \
+  memcpy((pc + offset), &__tmp, 4); \
+   } while (0)
 
 #define __GLX_PUT_FLOAT(offset,a)   \
-   *((FLOAT32 *) (pc + offset)) = a
+   do { \
+  float __tmp = (a);\
+  memcpy((pc + offset), &__tmp, 4); \
+   } while (0)
 
-#ifdef __GLX_ALIGN64
-/*
-** This can certainly be done better for a particular machine
-** architecture!
-*/
-#define __GLX_PUT_DOUBLE(offset,a)  \
-   __GLX_MEM_COPY(pc + offset, , 8)
-#else
 #define __GLX_PUT_DOUBLE(offset,a)  \
-   *((FLOAT64 *) (pc + offset)) = a
-#endif
+   do { \
+  double __tmp = (a);   \
+  memcpy((pc + offset), &__tmp, 8); \
+   } while (0)
 
 #define __GLX_PUT_CHAR_ARRAY(offset,a,alen) \
__GLX_MEM_COPY(pc + offset, a, alen * __GLX_SIZE_INT8)
diff --git a/src/glx/packsingle.h b/src/glx/packsingle.h
index fddcbf1..e38e820 100644
--- a/src/glx/packsingle.h
+++ b/src/glx/packsingle.h
@@ -103,24 +103,19 @@
a = (GLint) reply.size
 
 #define __GLX_SINGLE_GET_CHAR(p)\
-   *p = *(GLbyte *)
+   memcpy(p, , 1);
 
 #define __GLX_SINGLE_GET_SHORT(p)   \
-   *p = *(GLshort *)
+   memcpy(p, , 2);
 
 #define __GLX_SINGLE_GET_LONG(p)\
-   *p = *(GLint *)
+   memcpy(p, , 4);
 
 #define __GLX_SINGLE_GET_FLOAT(p)   \
-   *p = *(GLfloat *)
+   memcpy(p, , 4);
 
-#ifdef __GLX_ALIGN64
 #define __GLX_SINGLE_GET_DOUBLE(p)  \
-   __GLX_MEM_COPY(p, , 8)
-#else
-#define __GLX_SINGLE_GET_DOUBLE(p)  \
-   *p = *(GLdouble *)
-#endif
+   memcpy(p, , 8);
 
 /* Get an array of typed data */
 #define __GLX_SINGLE_GET_VOID_ARRAY(a,alen) \
-- 
2.7.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/7] glsl: Avoid aliasing violations.

2016-06-27 Thread Matt Turner
---
 src/compiler/glsl/ir_constant_expression.cpp| 11 +++
 src/compiler/glsl/link_uniform_initializers.cpp |  3 +--
 2 files changed, 4 insertions(+), 10 deletions(-)

diff --git a/src/compiler/glsl/ir_constant_expression.cpp 
b/src/compiler/glsl/ir_constant_expression.cpp
index e79d5ef..7c6a17d 100644
--- a/src/compiler/glsl/ir_constant_expression.cpp
+++ b/src/compiler/glsl/ir_constant_expression.cpp
@@ -1573,18 +1573,13 @@ ir_expression::constant_expression_value(struct 
hash_table *variable_context)
  data.f[c] = CLAMP(op[0]->value.f[c], 0.0f, 1.0f);
   }
   break;
-   case ir_unop_pack_double_2x32: {
+   case ir_unop_pack_double_2x32:
   /* XXX needs to be checked on big-endian */
-  uint64_t temp;
-  temp = (uint64_t)op[0]->value.u[0] | ((uint64_t)op[0]->value.u[1] << 32);
-  data.d[0] = *(double *)
-
+  memcpy([0], [0]->value.u[0], sizeof(double));
   break;
-   }
case ir_unop_unpack_double_2x32:
   /* XXX needs to be checked on big-endian */
-  data.u[0] = *(uint32_t *)[0]->value.d[0];
-  data.u[1] = *((uint32_t *)[0]->value.d[0] + 1);
+  memcpy([0], [0]->value.d[0], sizeof(double));
   break;
 
case ir_triop_bitfield_extract: {
diff --git a/src/compiler/glsl/link_uniform_initializers.cpp 
b/src/compiler/glsl/link_uniform_initializers.cpp
index acf8222..98a2397 100644
--- a/src/compiler/glsl/link_uniform_initializers.cpp
+++ b/src/compiler/glsl/link_uniform_initializers.cpp
@@ -65,8 +65,7 @@ copy_constant_to_storage(union gl_constant_value *storage,
 break;
   case GLSL_TYPE_DOUBLE:
  /* XXX need to check on big-endian */
- storage[i * 2].u = *(uint32_t *)>value.d[i];
- storage[i * 2 + 1].u = *(((uint32_t *)>value.d[i]) + 1);
+ memcpy([i * 2].u, >value.d[i], sizeof(double));
  break;
   case GLSL_TYPE_BOOL:
 storage[i].b = val->value.b[i] ? boolean_true : 0;
-- 
2.7.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 0/7] mesa: Enable -fstrict-aliasing

2016-06-27 Thread Matt Turner
Based on work by Davin McCall  from last summer.

The biggest change is to exec_list. Previously, the head and tail sentinels
overlapped, saving the size of a pointer. Unfortunately this is not allowed by
the aliasing rules.

I have fixed all warnings GCC reports in my normal build. I have no attempted
to see what else needs to be fixed. I hope that the respective owners of the
rest of Mesa can look into the remaining warnings.

This series depends on my 4 patch series to glx, and the trivial "[PATCH] i965:
Simplify foreach_inst_in_block_safe() macro."

Discuss!

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 7/7] mesa: Drop -fno-strict-aliasing.

2016-06-27 Thread Matt Turner
Improves performance of OglBatch7 by 4.06851% +/- 1.17925% (n=169) on
Haswell, and cuts ~18k of .text:

   text data  bss  dec  hex  filename
5824627   28781629384  6141827   5db783  before/i965_dri.so
5806354   28781629384  6123554   5d7022  after/i965_dri.so
---
 configure.ac | 6 --
 1 file changed, 6 deletions(-)

diff --git a/configure.ac b/configure.ac
index cc9bc47..bd8842d 100644
--- a/configure.ac
+++ b/configure.ac
@@ -301,9 +301,6 @@ if test "x$GCC" = xyes; then
 # Restore CFLAGS; VISIBILITY_CFLAGS are added to it where needed.
 CFLAGS=$save_CFLAGS
 
-# Work around aliasing bugs - developers should comment this out
-CFLAGS="$CFLAGS -fno-strict-aliasing"
-
 # We don't want floating-point math functions to set errno or trap
 CFLAGS="$CFLAGS -fno-math-errno -fno-trapping-math"
 
@@ -347,9 +344,6 @@ if test "x$GXX" = xyes; then
 # Restore CXXFLAGS; VISIBILITY_CXXFLAGS are added to it where needed.
 CXXFLAGS=$save_CXXFLAGS
 
-# Work around aliasing bugs - developers should comment this out
-CXXFLAGS="$CXXFLAGS -fno-strict-aliasing"
-
 # gcc's builtin memcmp is slower than glibc's
 # http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052
 CXXFLAGS="$CXXFLAGS -fno-builtin-memcmp"
-- 
2.7.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] clover: Fix kernel metadata retrieval after clang r273425

2016-06-27 Thread Jan Vesely
On Wed, 2016-06-22 at 20:52 -0400, Jan Vesely wrote:
> Signed-off-by: Jan Vesely 

sorry for an early ping, but at least the first patch is needed to
unbreak clover with recent llvm (otherwise all kernel launches fail
with unknown kernel name)

thanks,
Jan

> ---
>  .../state_trackers/clover/llvm/invocation.cpp  | 35
> +++---
>  1 file changed, 31 insertions(+), 4 deletions(-)
> 
> diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp
> b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> index 57e..db748b4 100644
> --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
> +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> @@ -277,7 +277,16 @@ namespace {
> }
>  
> std::vector
> -   find_kernels(const llvm::Module *mod) {
> +   find_kernels(llvm::Module *mod) {
> +  std::vector kernels;
> +#if HAVE_LLVM >= 0x0309
> +  auto  = mod->getFunctionList();
> +  for_each(list.begin(), list.end(), [&](llvm::Function ){
> + if (f.getMetadata("kernel_arg_type"))
> +   kernels.push_back();
> +  });
> +  return kernels;
> +#endif
>    const llvm::NamedMDNode *kernel_node =
>   mod-
> >getNamedMetadata("opencl.kernels");
>    // This means there are no kernels in the program.  The spec
> does not
> @@ -287,7 +296,6 @@ namespace {
>   return std::vector();
>    }
>  
> -  std::vector kernels;
>    kernels.reserve(kernel_node->getNumOperands());
>    for (unsigned i = 0; i < kernel_node->getNumOperands(); ++i) {
>  #if HAVE_LLVM >= 0x0306
> @@ -373,8 +381,27 @@ namespace {
>    kernel_arg_md(llvm::StringRef type_name_, llvm::StringRef
> access_qual_):
>   type_name(type_name_), access_qual(access_qual_) {}
> };
> +#if HAVE_LLVM >= 0x0309
> +   std::vector
> +   get_kernel_arg_md(const llvm::Function *kernel_func) {
>  
> -#if HAVE_LLVM >= 0x0306
> +  size_t num_args = kernel_func->getArgumentList().size();
> +
> +  auto aq = kernel_func->getMetadata("kernel_arg_access_qual");
> +  auto ty = kernel_func->getMetadata("kernel_arg_type");
> +
> +  std::vector res;
> +  res.reserve(num_args);
> +  for (size_t i = 0; i < num_args; ++i) {
> + res.push_back(kernel_arg_md(
> +llvm::cast(ty->getOperand(i))-
> >getString(),
> +llvm::cast(aq->getOperand(i))-
> >getString()));
> +  }
> +
> +  return res;
> +   }
> +
> +#elif HAVE_LLVM >= 0x0306
>  
> const llvm::MDNode *
> get_kernel_metadata(const llvm::Function *kernel_func) {
> @@ -772,7 +799,7 @@ namespace {
>  
> module
> build_module_native(std::vector ,
> -   const llvm::Module *mod,
> +   llvm::Module *mod,
> const clang::LangAS::Map _spaces,
> std::string _log) {
>  
-- 

Jan Vesely 

signature.asc
Description: This is a digitally signed message part
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/3] st/omx: add support for nouveau / interlaced

2016-06-27 Thread Julien Isorce
Hi Leo,

Sorry for the inconvenience, could you let me know how to reproduce the
problem ?
I have been playing with some gst pipelines and they all work but I can
only test with nouveau driver.

Cheers
Julien


On 27 June 2016 at 21:35, Leo Liu  wrote:

> This patch break omx decode to file, it got seg fault. Will take look
> further.
>
> Regards,
> Leo
>
>
>
> On 06/27/2016 04:16 AM, Julien Isorce wrote:
>
>> Signed-off-by: Julien Isorce 
>> ---
>>   src/gallium/state_trackers/omx/vid_dec.c | 51
>> 
>>   1 file changed, 26 insertions(+), 25 deletions(-)
>>
>> diff --git a/src/gallium/state_trackers/omx/vid_dec.c
>> b/src/gallium/state_trackers/omx/vid_dec.c
>> index 564ca2f..85ffb88 100644
>> --- a/src/gallium/state_trackers/omx/vid_dec.c
>> +++ b/src/gallium/state_trackers/omx/vid_dec.c
>> @@ -48,6 +48,7 @@
>>   #include "pipe/p_video_codec.h"
>>   #include "util/u_memory.h"
>>   #include "util/u_surface.h"
>> +#include "vl/vl_video_buffer.h"
>>   #include "vl/vl_vlc.h"
>> #include "entrypoint.h"
>> @@ -515,34 +516,34 @@ static void vid_dec_FillOutput(vid_dec_PrivateType
>> *priv, struct pipe_video_buff
>>  OMX_VIDEO_PORTDEFINITIONTYPE *def = >sPortParam.format.video;
>>struct pipe_sampler_view **views;
>> -   struct pipe_transfer *transfer;
>> -   struct pipe_box box = { };
>> -   uint8_t *src, *dst;
>> +   unsigned i, j;
>> +   unsigned width, height;
>>views = buf->get_sampler_view_planes(buf);
>>   -   dst = output->pBuffer;
>> -
>> -   box.width = def->nFrameWidth;
>> -   box.height = def->nFrameHeight;
>> -   box.depth = 1;
>> -
>> -   src = priv->pipe->transfer_map(priv->pipe, views[0]->texture, 0,
>> -  PIPE_TRANSFER_READ, , );
>> -   util_copy_rect(dst, views[0]->texture->format, def->nStride, 0, 0,
>> -  box.width, box.height, src, transfer->stride, 0, 0);
>> -   pipe_transfer_unmap(priv->pipe, transfer);
>> -
>> -   dst = ((uint8_t*)output->pBuffer) + (def->nStride * box.height);
>> -
>> -   box.width = def->nFrameWidth / 2;
>> -   box.height = def->nFrameHeight / 2;
>> -
>> -   src = priv->pipe->transfer_map(priv->pipe, views[1]->texture, 0,
>> -  PIPE_TRANSFER_READ, , );
>> -   util_copy_rect(dst, views[1]->texture->format, def->nStride, 0, 0,
>> -  box.width, box.height, src, transfer->stride, 0, 0);
>> -   pipe_transfer_unmap(priv->pipe, transfer);
>> +   for (i = 0; i < 2 /* NV12 */; i++) {
>> +  if (!views[i]) continue;
>> +  width = buf->width;
>> +  height = buf->height;
>> +  vl_video_buffer_adjust_size(, , i, buf->interlaced,
>> buf->chroma_format);
>> +  for (j = 0; j < views[i]->texture->array_size; ++j) {
>> + struct pipe_box box = {0, 0, j, width, height, 1};
>> + struct pipe_transfer *transfer;
>> + uint8_t *map, *dst;
>> + map = priv->pipe->transfer_map(priv->pipe, views[i]->texture, 0,
>> +  PIPE_TRANSFER_READ, , );
>> + if (!map)
>> +return;
>> +
>> + dst = ((uint8_t*)output->pBuffer + output->nOffset) + j *
>> def->nStride + i * buf->width * buf->height;
>> + util_copy_rect(dst,
>> +views[i]->texture->format,
>> +def->nStride * views[i]->texture->array_size, 0, 0,
>> +box.width, box.height, map, transfer->stride, 0, 0);
>> +
>> + pipe_transfer_unmap(priv->pipe, transfer);
>> +  }
>> +   }
>>   }
>> static void vid_dec_FrameDecoded(OMX_COMPONENTTYPE *comp,
>> OMX_BUFFERHEADERTYPE* input,
>>
>
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] tgsi/exec: consolidate conversion from 64-bit code.

2016-06-27 Thread Dave Airlie
From: Dave Airlie 

These 3 functions could be collapsed into a single one,
passing in some control values.
---
 src/gallium/auxiliary/tgsi/tgsi_exec.c | 87 +++---
 1 file changed, 27 insertions(+), 60 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c 
b/src/gallium/auxiliary/tgsi/tgsi_exec.c
index 1457c06..1e07eba 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_exec.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_exec.c
@@ -3395,7 +3395,8 @@ exec_endswitch(struct tgsi_exec_machine *mach)
 
 typedef void (* micro_dop)(union tgsi_double_channel *dst,
const union tgsi_double_channel *src);
-
+typedef void (* micro_dop_s)(union tgsi_double_channel *dst,
+ const union tgsi_exec_channel *src);
 static void
 fetch_double_channel(struct tgsi_exec_machine *mach,
  union tgsi_double_channel *chan,
@@ -3548,25 +3549,6 @@ exec_double_trinary(struct tgsi_exec_machine *mach,
 }
 
 static void
-exec_f2d(struct tgsi_exec_machine *mach,
- const struct tgsi_full_instruction *inst)
-{
-   union tgsi_exec_channel src;
-   union tgsi_double_channel dst;
-
-   if ((inst->Dst[0].Register.WriteMask & TGSI_WRITEMASK_XY) == 
TGSI_WRITEMASK_XY) {
-  fetch_source(mach, , >Src[0], TGSI_CHAN_X, 
TGSI_EXEC_DATA_FLOAT);
-  micro_f2d(, );
-  store_double_channel(mach, , >Dst[0], inst, TGSI_CHAN_X, 
TGSI_CHAN_Y);
-   }
-   if ((inst->Dst[0].Register.WriteMask & TGSI_WRITEMASK_ZW) == 
TGSI_WRITEMASK_ZW) {
-  fetch_source(mach, , >Src[0], TGSI_CHAN_Y, 
TGSI_EXEC_DATA_FLOAT);
-  micro_f2d(, );
-  store_double_channel(mach, , >Dst[0], inst, TGSI_CHAN_Z, 
TGSI_CHAN_W);
-   }
-}
-
-static void
 exec_d2f(struct tgsi_exec_machine *mach,
  const struct tgsi_full_instruction *inst)
 {
@@ -3590,25 +3572,6 @@ exec_d2f(struct tgsi_exec_machine *mach,
 }
 
 static void
-exec_i2d(struct tgsi_exec_machine *mach,
- const struct tgsi_full_instruction *inst)
-{
-   union tgsi_exec_channel src;
-   union tgsi_double_channel dst;
-
-   if ((inst->Dst[0].Register.WriteMask & TGSI_WRITEMASK_XY) == 
TGSI_WRITEMASK_XY) {
-  fetch_source(mach, , >Src[0], TGSI_CHAN_X, TGSI_EXEC_DATA_INT);
-  micro_i2d(, );
-  store_double_channel(mach, , >Dst[0], inst, TGSI_CHAN_X, 
TGSI_CHAN_Y);
-   }
-   if ((inst->Dst[0].Register.WriteMask & TGSI_WRITEMASK_ZW) == 
TGSI_WRITEMASK_ZW) {
-  fetch_source(mach, , >Src[0], TGSI_CHAN_Y, TGSI_EXEC_DATA_INT);
-  micro_i2d(, );
-  store_double_channel(mach, , >Dst[0], inst, TGSI_CHAN_Z, 
TGSI_CHAN_W);
-   }
-}
-
-static void
 exec_d2i(struct tgsi_exec_machine *mach,
  const struct tgsi_full_instruction *inst)
 {
@@ -3630,24 +3593,6 @@ exec_d2i(struct tgsi_exec_machine *mach,
   }
}
 }
-static void
-exec_u2d(struct tgsi_exec_machine *mach,
- const struct tgsi_full_instruction *inst)
-{
-   union tgsi_exec_channel src;
-   union tgsi_double_channel dst;
-
-   if ((inst->Dst[0].Register.WriteMask & TGSI_WRITEMASK_XY) == 
TGSI_WRITEMASK_XY) {
-  fetch_source(mach, , >Src[0], TGSI_CHAN_X, 
TGSI_EXEC_DATA_UINT);
-  micro_u2d(, );
-  store_double_channel(mach, , >Dst[0], inst, TGSI_CHAN_X, 
TGSI_CHAN_Y);
-   }
-   if ((inst->Dst[0].Register.WriteMask & TGSI_WRITEMASK_ZW) == 
TGSI_WRITEMASK_ZW) {
-  fetch_source(mach, , >Src[0], TGSI_CHAN_Y, 
TGSI_EXEC_DATA_UINT);
-  micro_u2d(, );
-  store_double_channel(mach, , >Dst[0], inst, TGSI_CHAN_Z, 
TGSI_CHAN_W);
-   }
-}
 
 static void
 exec_d2u(struct tgsi_exec_machine *mach,
@@ -3672,6 +3617,28 @@ exec_d2u(struct tgsi_exec_machine *mach,
}
 }
 
+/* convert from 32-bit to 64-bit type generic */
+static void
+exec_t_2_64(struct tgsi_exec_machine *mach,
+  const struct tgsi_full_instruction *inst,
+  micro_dop_s op,
+  enum tgsi_exec_datatype src_datatype)
+{
+   union tgsi_exec_channel src;
+   union tgsi_double_channel dst;
+
+   if ((inst->Dst[0].Register.WriteMask & TGSI_WRITEMASK_XY) == 
TGSI_WRITEMASK_XY) {
+  fetch_source(mach, , >Src[0], TGSI_CHAN_X, src_datatype);
+  op(, );
+  store_double_channel(mach, , >Dst[0], inst, TGSI_CHAN_X, 
TGSI_CHAN_Y);
+   }
+   if ((inst->Dst[0].Register.WriteMask & TGSI_WRITEMASK_ZW) == 
TGSI_WRITEMASK_ZW) {
+  fetch_source(mach, , >Src[0], TGSI_CHAN_Y, src_datatype);
+  op(, );
+  store_double_channel(mach, , >Dst[0], inst, TGSI_CHAN_Z, 
TGSI_CHAN_W);
+   }
+}
+
 static void
 exec_dldexp(struct tgsi_exec_machine *mach,
 const struct tgsi_full_instruction *inst)
@@ -5681,7 +5648,7 @@ exec_instruction(
   break;
 
case TGSI_OPCODE_F2D:
-  exec_f2d(mach, inst);
+  exec_t_2_64(mach, inst, micro_f2d, TGSI_EXEC_DATA_FLOAT);
   break;
 
case TGSI_OPCODE_D2F:
@@ -5757,7 +5724,7 @@ exec_instruction(
   break;
 
case TGSI_OPCODE_I2D:
-  exec_i2d(mach, inst);
+  exec_t_2_64(mach, inst, micro_i2d, TGSI_EXEC_DATA_INT);
   

Re: [Mesa-dev] [PATCH] Make single-buffered GLES representation internally consistent

2016-06-27 Thread Ilia Mirkin
On Mon, Jun 27, 2016 at 4:17 PM, Gurchetan Singh
 wrote:
> There a few places in the code where clearing and reading are done on 
> incorrect
> buffers for GLES contexts.  See comments for details.  This fixes 75 GLES3
> dEQP tests on the surfaceless platform with no regressions.
>
> v2:  Corrected unclear comment
> ---
>  src/mesa/main/buffers.c | 14 --
>  src/mesa/main/clear.c   |  8 
>  src/mesa/main/get.c |  9 +
>  3 files changed, 29 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/main/buffers.c b/src/mesa/main/buffers.c
> index e8aedde..86696b8 100644
> --- a/src/mesa/main/buffers.c
> +++ b/src/mesa/main/buffers.c
> @@ -173,12 +173,22 @@ draw_buffer_enum_to_bitmask(const struct gl_context 
> *ctx, GLenum buffer)
>   * return -1 for an invalid buffer.
>   */
>  static gl_buffer_index
> -read_buffer_enum_to_index(GLenum buffer)
> +read_buffer_enum_to_index(const struct gl_context *ctx, GLenum buffer)
>  {
> switch (buffer) {
>case GL_FRONT:
>   return BUFFER_FRONT_LEFT;
>case GL_BACK:
> + if (_mesa_is_gles(ctx)) {
> +/* In draw_buffer_enum_to_bitmask, when GLES contexts draw to
> + * GL_BACK with a single-buffered configuration, we actually end
> + * up drawing to the sole front buffer in our internal
> + * representation.  For consistency, we must read from that
> + * front left buffer too.
> + */
> +if (!ctx->DrawBuffer->Visual.doubleBufferMode)
> +   return BUFFER_FRONT_LEFT;
> + }
>   return BUFFER_BACK_LEFT;
>case GL_RIGHT:
>   return BUFFER_FRONT_RIGHT;
> @@ -724,7 +734,7 @@ read_buffer(struct gl_context *ctx, struct gl_framebuffer 
> *fb,
>if (_mesa_is_gles3(ctx) && !is_legal_es3_readbuffer_enum(buffer))
>   srcBuffer = -1;
>else
> - srcBuffer = read_buffer_enum_to_index(buffer);
> + srcBuffer = read_buffer_enum_to_index(ctx, buffer);
>
>if (srcBuffer == -1) {
>   _mesa_error(ctx, GL_INVALID_ENUM,
> diff --git a/src/mesa/main/clear.c b/src/mesa/main/clear.c
> index 35b912c..a1bb36e 100644
> --- a/src/mesa/main/clear.c
> +++ b/src/mesa/main/clear.c
> @@ -267,6 +267,14 @@ make_color_buffer_mask(struct gl_context *ctx, GLint 
> drawbuffer)
>   mask |= BUFFER_BIT_FRONT_RIGHT;
>break;
> case GL_BACK:
> +  /* For GLES contexts with a single buffered configuration, we actually
> +   * only have a front renderbuffer, so any clear calls to GL_BACK should
> +   * affect that buffer. See draw_buffer_enum_to_bitmask for details.
> +   */
> +  if (_mesa_is_gles(ctx))
> + if (!ctx->DrawBuffer->Visual.doubleBufferMode)
> +if (att[BUFFER_FRONT_LEFT].Renderbuffer)
> +   mask |= BUFFER_BIT_FRONT_LEFT;
>if (att[BUFFER_BACK_LEFT].Renderbuffer)
>   mask |= BUFFER_BIT_BACK_LEFT;
>if (att[BUFFER_BACK_RIGHT].Renderbuffer)
> diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
> index 6ffa99c..ec6bc3e 100644
> --- a/src/mesa/main/get.c
> +++ b/src/mesa/main/get.c
> @@ -627,6 +627,15 @@ find_custom_value(struct gl_context *ctx, const struct 
> value_desc *d, union valu
>break;
>
> case GL_READ_BUFFER:
> +  /* In _mesa_initialize_window_framebuffer, for single-buffered visuals,
> +   * the ColorReadBuffer is set to be GL_FRONT, even with GLES contexts.
> +   * When calling read_buffer, we verify we are reading from GL_BACK in
> +   * is_legal_es3_readbuffer_enum.  But the default is incorrect, and
> +   * certain dEQP tests check this.  So fix it here.
> +   */
> +  if (_mesa_is_gles(ctx))
> + if (ctx->ReadBuffer->ColorReadBuffer == GL_FRONT)
> +ctx->ReadBuffer->ColorReadBuffer = GL_BACK;

Why is this OK to do? Shouldn't these just get not get set that way in
the first place? Writing values when doing a random glGet() seems like
a recipe for confusion and hard-to-reproduce bugs.

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 96698] [swrast] piglit glsl-array-bounds-05 regression

2016-06-27 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=96698

Bug ID: 96698
   Summary: [swrast] piglit glsl-array-bounds-05 regression
   Product: Mesa
   Version: git
  Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
  Keywords: bisected, regression
  Severity: normal
  Priority: medium
 Component: Mesa core
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: v...@freedesktop.org
QA Contact: mesa-dev@lists.freedesktop.org
CC: kenn...@whitecape.org, t_arc...@yahoo.com.au

mesa: 6e4cf937f8e742433a4320b1b847454a06dacf03 (12.1.0-devel)

$ ./bin/shader_runner tests/shaders/glsl-array-bounds-05.shader_test -auto
Probe color at (15,15)
  Expected: 0.00 1.00 0.00
  Observed: 1.00 0.00 0.00
PIGLIT: {"result": "fail" }


c264fdbc073a0dfc393f53a8be880f535fd4b988 is the first bad commit
commit c264fdbc073a0dfc393f53a8be880f535fd4b988
Author: Kenneth Graunke 
Date:   Mon Jun 20 11:20:51 2016 -0700

glsl: Split arrays even in the presence of whole-array copies.

Previously, we failed to split constant arrays.  Code such as

   int[2] numbers = int[](1, 2);

would generates a whole-array assignment:

  (assign () (var_ref numbers)
 (constant (array int 4) (constant int 1) (constant int 2)))

opt_array_splitting generally tried to visit ir_dereference_array nodes,
and avoid recursing into the inner ir_dereference_variable.  So if it
ever saw a ir_dereference_variable, it assumed this was a whole-array
read and bailed.  However, in the above case, there's no array deref,
and we can totally handle it - we just have to "unroll" the assignment,
creating assignments for each element.

This was mitigated by the fact that we constant propagate whole arrays,
so a dereference of a single component would usually get the desired
single value anyway.  However, I plan to stop doing that shortly;
early experiments with disabling constant propagation of arrays
revealed this shortcoming.

This patch causes some arrays in Gl32GSCloth's geometry shaders to be
split, which allows other optimizations to eliminate unused GS inputs.
The VS then doesn't have to write them, which eliminates the entire VS
(5 -> 2 instructions).  It still renders correctly.

No other change in shader-db.

v2: Drop !AOA check and improve a comment (feedback from Tim Arceri).

Cc: mesa-sta...@lists.freedesktop.org
Signed-off-by: Kenneth Graunke 
Reviewed-by: Timothy Arceri 

:04 04 5d96cd8ba322e2ac1857baf9e401b5d494a85ab2
2b920fe841296c89251cca630f51060de413cbb3 M  src
bisect run success

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/3] st/omx: add support for nouveau / interlaced

2016-06-27 Thread Leo Liu
This patch break omx decode to file, it got seg fault. Will take look 
further.


Regards,
Leo


On 06/27/2016 04:16 AM, Julien Isorce wrote:

Signed-off-by: Julien Isorce 
---
  src/gallium/state_trackers/omx/vid_dec.c | 51 
  1 file changed, 26 insertions(+), 25 deletions(-)

diff --git a/src/gallium/state_trackers/omx/vid_dec.c 
b/src/gallium/state_trackers/omx/vid_dec.c
index 564ca2f..85ffb88 100644
--- a/src/gallium/state_trackers/omx/vid_dec.c
+++ b/src/gallium/state_trackers/omx/vid_dec.c
@@ -48,6 +48,7 @@
  #include "pipe/p_video_codec.h"
  #include "util/u_memory.h"
  #include "util/u_surface.h"
+#include "vl/vl_video_buffer.h"
  #include "vl/vl_vlc.h"
  
  #include "entrypoint.h"

@@ -515,34 +516,34 @@ static void vid_dec_FillOutput(vid_dec_PrivateType *priv, 
struct pipe_video_buff
 OMX_VIDEO_PORTDEFINITIONTYPE *def = >sPortParam.format.video;
  
 struct pipe_sampler_view **views;

-   struct pipe_transfer *transfer;
-   struct pipe_box box = { };
-   uint8_t *src, *dst;
+   unsigned i, j;
+   unsigned width, height;
  
 views = buf->get_sampler_view_planes(buf);
  
-   dst = output->pBuffer;

-
-   box.width = def->nFrameWidth;
-   box.height = def->nFrameHeight;
-   box.depth = 1;
-
-   src = priv->pipe->transfer_map(priv->pipe, views[0]->texture, 0,
-  PIPE_TRANSFER_READ, , );
-   util_copy_rect(dst, views[0]->texture->format, def->nStride, 0, 0,
-  box.width, box.height, src, transfer->stride, 0, 0);
-   pipe_transfer_unmap(priv->pipe, transfer);
-
-   dst = ((uint8_t*)output->pBuffer) + (def->nStride * box.height);
-
-   box.width = def->nFrameWidth / 2;
-   box.height = def->nFrameHeight / 2;
-
-   src = priv->pipe->transfer_map(priv->pipe, views[1]->texture, 0,
-  PIPE_TRANSFER_READ, , );
-   util_copy_rect(dst, views[1]->texture->format, def->nStride, 0, 0,
-  box.width, box.height, src, transfer->stride, 0, 0);
-   pipe_transfer_unmap(priv->pipe, transfer);
+   for (i = 0; i < 2 /* NV12 */; i++) {
+  if (!views[i]) continue;
+  width = buf->width;
+  height = buf->height;
+  vl_video_buffer_adjust_size(, , i, buf->interlaced, 
buf->chroma_format);
+  for (j = 0; j < views[i]->texture->array_size; ++j) {
+ struct pipe_box box = {0, 0, j, width, height, 1};
+ struct pipe_transfer *transfer;
+ uint8_t *map, *dst;
+ map = priv->pipe->transfer_map(priv->pipe, views[i]->texture, 0,
+  PIPE_TRANSFER_READ, , );
+ if (!map)
+return;
+
+ dst = ((uint8_t*)output->pBuffer + output->nOffset) + j * def->nStride + i 
* buf->width * buf->height;
+ util_copy_rect(dst,
+views[i]->texture->format,
+def->nStride * views[i]->texture->array_size, 0, 0,
+box.width, box.height, map, transfer->stride, 0, 0);
+
+ pipe_transfer_unmap(priv->pipe, transfer);
+  }
+   }
  }
  
  static void vid_dec_FrameDecoded(OMX_COMPONENTTYPE *comp, OMX_BUFFERHEADERTYPE* input,


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/3] st/omx: retrieve preferred interlaced and buffer_formats

2016-06-27 Thread Leo Liu



On 06/27/2016 04:16 AM, Julien Isorce wrote:

Interlaced can be true for nouveau driver.

Signed-off-by: Julien Isorce 
---
  src/gallium/state_trackers/omx/vid_dec.c | 26 --
  1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/src/gallium/state_trackers/omx/vid_dec.c 
b/src/gallium/state_trackers/omx/vid_dec.c
index 108a460..564ca2f 100644
--- a/src/gallium/state_trackers/omx/vid_dec.c
+++ b/src/gallium/state_trackers/omx/vid_dec.c
@@ -385,16 +385,38 @@ static OMX_ERRORTYPE 
vid_dec_MessageHandler(OMX_COMPONENTTYPE* comp, internalReq
  void vid_dec_NeedTarget(vid_dec_PrivateType *priv)
  {
 struct pipe_video_buffer templat = {};
+   struct vl_screen *omx_screen;
+   struct pipe_screen *pscreen;
 omx_base_video_PortType *port;
  
+   omx_screen = priv->screen;

 port = (omx_base_video_PortType 
*)priv->ports[OMX_BASE_FILTER_INPUTPORT_INDEX];
  
+   assert(omx_screen);

+   assert(port);
+
+   pscreen = omx_screen->pscreen;
+
+   assert(pscreen);
+
 if (!priv->target) {
-  templat.buffer_format = PIPE_FORMAT_NV12;
+  memset(, 0, sizeof(templat));
+
templat.chroma_format = PIPE_VIDEO_CHROMA_FORMAT_420;
templat.width = port->sPortParam.format.video.nFrameWidth;
templat.height = port->sPortParam.format.video.nFrameHeight;
-  templat.interlaced = false;
+  templat.buffer_format = pscreen->get_video_param(
+pscreen,
+PIPE_VIDEO_PROFILE_UNKNOWN,
+PIPE_VIDEO_ENTRYPOINT_BITSTREAM,
+PIPE_VIDEO_CAP_PREFERED_FORMAT
+  );
+  templat.interlaced = pscreen->get_video_param(
+  pscreen,
+  PIPE_VIDEO_PROFILE_UNKNOWN,
+  PIPE_VIDEO_ENTRYPOINT_BITSTREAM,
+  PIPE_VIDEO_CAP_PREFERS_INTERLACED
+  );


This just simply break OMX transcode by setting interlace true.

Regards,
Leo


priv->target = priv->pipe->create_video_buffer(priv->pipe, );
 }
  }


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] Make single-buffered GLES representation internally consistent

2016-06-27 Thread Gurchetan Singh
There a few places in the code where clearing and reading are done on incorrect
buffers for GLES contexts.  See comments for details.  This fixes 75 GLES3
dEQP tests on the surfaceless platform with no regressions.

v2:  Corrected unclear comment
---
 src/mesa/main/buffers.c | 14 --
 src/mesa/main/clear.c   |  8 
 src/mesa/main/get.c |  9 +
 3 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/src/mesa/main/buffers.c b/src/mesa/main/buffers.c
index e8aedde..86696b8 100644
--- a/src/mesa/main/buffers.c
+++ b/src/mesa/main/buffers.c
@@ -173,12 +173,22 @@ draw_buffer_enum_to_bitmask(const struct gl_context *ctx, 
GLenum buffer)
  * return -1 for an invalid buffer.
  */
 static gl_buffer_index
-read_buffer_enum_to_index(GLenum buffer)
+read_buffer_enum_to_index(const struct gl_context *ctx, GLenum buffer)
 {
switch (buffer) {
   case GL_FRONT:
  return BUFFER_FRONT_LEFT;
   case GL_BACK:
+ if (_mesa_is_gles(ctx)) {
+/* In draw_buffer_enum_to_bitmask, when GLES contexts draw to
+ * GL_BACK with a single-buffered configuration, we actually end
+ * up drawing to the sole front buffer in our internal
+ * representation.  For consistency, we must read from that
+ * front left buffer too.
+ */
+if (!ctx->DrawBuffer->Visual.doubleBufferMode)
+   return BUFFER_FRONT_LEFT;
+ }
  return BUFFER_BACK_LEFT;
   case GL_RIGHT:
  return BUFFER_FRONT_RIGHT;
@@ -724,7 +734,7 @@ read_buffer(struct gl_context *ctx, struct gl_framebuffer 
*fb,
   if (_mesa_is_gles3(ctx) && !is_legal_es3_readbuffer_enum(buffer))
  srcBuffer = -1;
   else
- srcBuffer = read_buffer_enum_to_index(buffer);
+ srcBuffer = read_buffer_enum_to_index(ctx, buffer);
 
   if (srcBuffer == -1) {
  _mesa_error(ctx, GL_INVALID_ENUM,
diff --git a/src/mesa/main/clear.c b/src/mesa/main/clear.c
index 35b912c..a1bb36e 100644
--- a/src/mesa/main/clear.c
+++ b/src/mesa/main/clear.c
@@ -267,6 +267,14 @@ make_color_buffer_mask(struct gl_context *ctx, GLint 
drawbuffer)
  mask |= BUFFER_BIT_FRONT_RIGHT;
   break;
case GL_BACK:
+  /* For GLES contexts with a single buffered configuration, we actually
+   * only have a front renderbuffer, so any clear calls to GL_BACK should
+   * affect that buffer. See draw_buffer_enum_to_bitmask for details.
+   */
+  if (_mesa_is_gles(ctx))
+ if (!ctx->DrawBuffer->Visual.doubleBufferMode)
+if (att[BUFFER_FRONT_LEFT].Renderbuffer)
+   mask |= BUFFER_BIT_FRONT_LEFT;
   if (att[BUFFER_BACK_LEFT].Renderbuffer)
  mask |= BUFFER_BIT_BACK_LEFT;
   if (att[BUFFER_BACK_RIGHT].Renderbuffer)
diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
index 6ffa99c..ec6bc3e 100644
--- a/src/mesa/main/get.c
+++ b/src/mesa/main/get.c
@@ -627,6 +627,15 @@ find_custom_value(struct gl_context *ctx, const struct 
value_desc *d, union valu
   break;
 
case GL_READ_BUFFER:
+  /* In _mesa_initialize_window_framebuffer, for single-buffered visuals,
+   * the ColorReadBuffer is set to be GL_FRONT, even with GLES contexts.
+   * When calling read_buffer, we verify we are reading from GL_BACK in
+   * is_legal_es3_readbuffer_enum.  But the default is incorrect, and
+   * certain dEQP tests check this.  So fix it here.
+   */
+  if (_mesa_is_gles(ctx))
+ if (ctx->ReadBuffer->ColorReadBuffer == GL_FRONT)
+ctx->ReadBuffer->ColorReadBuffer = GL_BACK;
   v->value_enum = ctx->ReadBuffer->ColorReadBuffer;
   break;
 
-- 
2.1.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] docs: update MESA_DEBUG envvar documentation.

2016-06-27 Thread Alejandro Piñeiro
On 27/06/16 15:45, Brian Paul wrote:
> On 06/27/2016 02:10 AM, Alejandro Piñeiro wrote:
>> silent, flush, incomplete_tex and incomplete_fbo flags were not
>> documented (see src/mesa/main.debug.c for more info).
>>
>> FP is not checked anymore.
>> ---
>>
>> Didn't know about the flush option for MESA_DEBUG until Grazvydas
>> Ignotas
>> mentioned it on a freedesktop bug comment (thanks), so this patch
>> documents
>> it.
>>
>> CCing Eric Anholt because MESA_DEBUG=flush and VC4_DEBUG=always_flush
>> seems
>> to have basically the same purpose, in case one needs to be removed.
>>
>> CCing Brian because he added the incomplete_tex and incomplete_fbo flags
>> with commit 93bcf7, and I just added a really basic documentation
>> line for
>> those options, just in case it is worth more details.
>>
>> Note that I assume that all the specific flags for MESA_DEBUG are
>> handled
>> at debuc.c:set_debug_flags. As it is not taking into account FP, Im also
>> dropping the FP line with this patch.
>>
>>
>>   docs/envvars.html | 9 +++--
>>   1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/docs/envvars.html b/docs/envvars.html
>> index ed957bd..abd8a72 100644
>> --- a/docs/envvars.html
>> +++ b/docs/envvars.html
>> @@ -50,8 +50,13 @@ sometimes be useful for debugging end-user issues.
>>  if the application generates a GL_INVALID_ENUM error, a
>> corresponding error
>>  message indicating where the error occurred, and possibly why,
>> will be
>>  printed to stderr.
>> -   If the value of MESA_DEBUG is 'FP' floating point arithmetic
>> errors will
>> -   generate exceptions.
>> +  In addition to just set it, MESA_DEBUG accepts the following flags:
>
> The first part of that sentence doesn't sound right.

Ok, will try to find an alternative language.

>
>> +  
>> +silent - turn off debug messages
>> +flush - flush after each drawing command
>> +incomplete_tex - extra debug messages when a texture is
>> incomplete
>> +incomplete_fbo - extra debug messages when a fbo is
>> incomplete
>> +  
>
> I don't even remember, does MESA_DEBUG take a comma-separated list of
> these value?

Ok, I will check that.

>
> Some obscure env vars/settings haven't been documented because they're
> of limited use to end users.

Well, it is a MESA_DEBUG envvar after all, it is intended for
developers. INTEL_DEBUG and VC4_DEBUG are documented in detail, I don't
see why we can't do the same with MESA_DEBUG.

>
> Also, isn't MESA_DEBUG ignored in release builds?  Maybe you can check
> on that.

Ok, I will also check that.

I will not be able to do all those checks today, will try to take a look
during the week.

>
> -Brian
>
>
>>   MESA_LOG_FILE - specifies a file name for logging all errors,
>> warnings,
>>   etc., rather than stderr
>>   MESA_TEX_PROG - if set, implement conventional texture env
>> modes with
>>
>
>

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [Piglit] [RFC PATCH] arb_texture_barrier: call glTextureBarrier after each glDrawRangeElements

2016-06-27 Thread Alejandro Piñeiro
On 27/06/16 18:12, Ilia Mirkin wrote:
> On Mon, Jun 27, 2016 at 6:36 AM, Alejandro Piñeiro  
> wrote:
>>
>> On 27/06/16 12:33, Alejandro Piñeiro wrote:
>>> On 27/06/16 03:08, Grazvydas Ignotas wrote:
 On Sat, Jun 25, 2016 at 4:54 PM, Alejandro Piñeiro  
 wrote:
> In theory they don't overdrawn. The test has a square formed by N
> non-overlapping triangles. With just one call to glDrawRangeElements,
> this always works. But if we split it in M subsets, so M calls to
> glDrawRangeElements with N/M triangles per call, with low resolutions
> and a high amount of triangles, some texels start to fail. And not
> always the same amount of texels (so as you mentioned, undefined
> territory). Although it is a guess, I assume that is cause for artifacts
> on the triangle borders when drawing with so low resolution.
 Perhaps this is somehow related to this bug:
 https://bugs.freedesktop.org/show_bug.cgi?id=96624
 There a texture problem shows up after one of glDrawRangeElements
 calls with a lot of primitives.
>>> Not sure if they are related. Note that the issue I mention gets fully
>>> solved with glTextureBarrier and additionally, it happens on haswell too.
>>>
>>> In any case, reading that bug gave me an idea. I have been toying with
>>> MESA_DEBUG=flush (that until your email I didn't know that exists).
>>> Looking at the code, it adds the equivalent to a glFlush after each
>>> drawing call. Using it gets the test (all 48 subtests) passing. So right
>>> now Im more biased to think that is an issue on glTextureBarrier.
>> Another hint towards being an issue on glTextureBarrier implementation:
>> if instead of using MESA_DEBUG=flush, I replace glTextureBarrier for
>> glFlush on the original test (so one full flush after all the square,
>> but not after each drawing call) all the subtests keep passing too.
> I flipped ARB_texture_barrier on for i965 based on comments that
> others made about how the hw worked. I don't think anyone was *too*
> sure that it was correct, but the tests (at the time) passed, so I
> don't think anyone gave it much more thought. Commit 71e187430c, ftr.

Yes, I already found that one while researching. I found more
interesting the annotate section when you sent the patch:
https://lists.freedesktop.org/archives/mesa-dev/2015-August/091632.html

On that section you mention that the test passes even without the flush.
FWIW, with the new detailed test I sent last week to the mailing list,
if the glTextureBarrier call on the test is removed, it goes from 5-6
failed subtest to 12 failed subtests (out of 48). So I think that that
patch was a step in the correct direction.

FWIW2, if I replace the call to brw_emit_mi_flush with a call to
intel_batchbuffer_flush all the subtests passes correctly, but probably
that is an overkill. Before suggesting that change, I will check if it
is possible to solve it on the call of brw_emit_mi_flush.

BTW, this is a way to say that I NAK my own piglit patch. At this moment
I think that it is something to be solved on the mesa side.

BR





___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 19/34] i965/state: Add a helper for emitting a surface state using isl

2016-06-27 Thread Pohjolainen, Topi
On Thu, Jun 23, 2016 at 02:00:18PM -0700, Jason Ekstrand wrote:
> ---
>  src/mesa/drivers/dri/i965/brw_state.h|  8 +++
>  src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 83 
> 
>  2 files changed, 91 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_state.h 
> b/src/mesa/drivers/dri/i965/brw_state.h
> index b29412e..c582873 100644
> --- a/src/mesa/drivers/dri/i965/brw_state.h
> +++ b/src/mesa/drivers/dri/i965/brw_state.h
> @@ -275,6 +275,14 @@ GLuint translate_tex_format(struct brw_context *brw,
>  int brw_get_texture_swizzle(const struct gl_context *ctx,
>  const struct gl_texture_object *t);
>  
> +struct isl_view;
> +void brw_emit_surface_state(struct brw_context *brw,
> +struct intel_mipmap_tree *mt,
> +const struct isl_view *view,
> +uint32_t mocs, bool for_gather,
> +uint32_t *surf_offset, int surf_index,
> +unsigned read_domains, unsigned write_domains);
> +
>  void brw_update_renderbuffer_surfaces(struct brw_context *brw,
>const struct gl_framebuffer *fb,
>uint32_t render_target_start,
> diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
> b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> index cfce2c9..ab45959 100644
> --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> @@ -35,6 +35,7 @@
>  #include "main/mtypes.h"
>  #include "main/samplerobj.h"
>  #include "main/shaderimage.h"
> +#include "main/teximage.h"
>  #include "program/prog_parameter.h"
>  #include "program/prog_instruction.h"
>  #include "main/framebuffer.h"
> @@ -52,6 +53,88 @@
>  #include "brw_defines.h"
>  #include "brw_wm.h"
>  
> +struct surface_state_info {
> +   unsigned num_dwords;
> +   unsigned ss_align; /* Required alignment of RENDER_SURFACE_STATE in bytes 
> */
> +   unsigned reloc_dw;
> +   unsigned aux_reloc_dw;
> +   unsigned tex_mocs;
> +   unsigned rb_mocs;
> +};
> +
> +static const struct surface_state_info surface_state_infos[] = {
> +   [4] = {6,  32, 1,  0},
> +   [5] = {6,  32, 1,  0},
> +   [6] = {6,  32, 1,  0},
> +   [7] = {8,  32, 1,  6,  GEN7_MOCS_L3, GEN7_MOCS_L3},
> +   [8] = {13, 64, 8,  10, BDW_MOCS_WB,  BDW_MOCS_PTE},
> +   [9] = {16, 64, 8,  10, SKL_MOCS_WB,  SKL_MOCS_PTE},
> +};
> +
> +void
> +brw_emit_surface_state(struct brw_context *brw,
> +   struct intel_mipmap_tree *mt,
> +   const struct isl_view *view,
> +   uint32_t mocs, bool for_gather,
> +   uint32_t *surf_offset, int surf_index,
> +   unsigned read_domains, unsigned write_domains)
> +{
> +   /* TODO: This should go in the context */
> +   struct isl_device isl_dev;
> +   isl_device_init(_dev, brw->intelScreen->devinfo, brw->has_swizzling);
> +
> +   const struct surface_state_info ss_info = surface_state_infos[brw->gen];
> +
> +   struct isl_surf surf;
> +   intel_miptree_get_isl_surf(brw, mt, );
> +
> +   union isl_color_value clear_color = { .u32 = { 0, 0, 0, 0 } };
> +
> +   struct isl_surf *aux_surf = NULL, aux_surf_s;
> +   uint64_t aux_offset = 0;
> +   enum isl_aux_layout aux_layout = ISL_AUX_LAYOUT_NONE;
> +   if (mt->mcs_mt &&
> +   ((view->usage & ISL_SURF_USAGE_RENDER_TARGET_BIT) ||
> +mt->fast_clear_state != INTEL_FAST_CLEAR_STATE_RESOLVED)) {
> +  intel_miptree_get_aux_isl_surf(brw, mt, _surf_s, _layout);
> +  aux_surf = _surf_s;
> +  assert(mt->mcs_mt->offset == 0);
> +  aux_offset = mt->mcs_mt->bo->offset64;
> +
> +  /* We only really need a clear color if we also have an auxiliary
> +   * surfacae.  Without one, it does nothing.
> +   */
> +  clear_color = intel_miptree_get_isl_clear_color(brw, mt);
> +   }
> +
> +   uint32_t *dw = __brw_state_batch(brw, AUB_TRACE_SURFACE_STATE,
> +ss_info.num_dwords * 4, ss_info.ss_align,
> +surf_index, surf_offset);
> +
> +   isl_surf_fill_state(_dev, dw, .surf = , .view = view,
> +   .address = mt->bo->offset64 + mt->offset,
> +   .aux_surf = aux_surf, .aux_layout = aux_layout,
> +   .aux_address = aux_offset,
> +   .mocs = mocs, .clear_color = clear_color);
> +
> +   drm_intel_bo_emit_reloc(brw->batch.bo,
> +   *surf_offset + 4 * ss_info.reloc_dw,
> +   mt->bo, mt->offset,
> +   read_domains, write_domains);
> +
> +   if (aux_surf) {
> +  /* On gen7 and prior, the bottom 12 bits of the MCS base address are
> +   * used to store other information.  This should be ok, however, 
> because
> +   * surface buffer addresses are always 4K page 

Re: [Mesa-dev] [PATCH 1/2] glsl: add driconf to zero-init unintialized vars

2016-06-27 Thread Rob Clark
On Mon, Jun 27, 2016 at 3:06 PM, Kenneth Graunke  wrote:
> On Monday, June 27, 2016 11:43:28 AM PDT Matt Turner wrote:
>> On Mon, Jun 27, 2016 at 4:44 AM, Rob Clark  wrote:
>> > On Mon, Jun 27, 2016 at 7:13 AM, Alan Swanson  
>> > wrote:
>> >> On 2016-06-25 13:37, Rob Clark wrote:
>> >>>
>> >>> Some games are sloppy.. perhaps because it is defined behavior for DX or
>> >>> perhaps because nv blob driver defaults things to zero.
>> >>>
>> >>> So add driconf param to force uninitialized variables to default to zero.
>> >>>
>> >>> This issue was observed with rust, from steam store.  But has surfaced
>> >>> elsewhere in the past.
>> >>>
>> >>> Signed-off-by: Rob Clark 
>> >>> ---
>> >>> Note that I left out the drirc bit, since not entirely sure how to
>> >>> identify this game.  (I don't actually have the game, just working off
>> >>> of an apitrace)
>> >>>
>> >>> Possibly worth mentioning that for the shaders using uninitialized vars
>> >>> having zero-initializers lets constant-propagation get rid of a whole
>> >>> lot of instructions.  One shader I saw dropped to less than half of
>> >>> it's original instruction count.
>> >>
>> >>
>> >> If the default for uninitialised variables is undefined, then with the
>> >> reported shader optimisations why bother with the (DRI) option when
>> >> zeroing could still essentially be classed as undefined?
>> >>
>> >> Cuts the patch down to just the src/compiler/glsl/ast_to_hir.cpp change.
>> >
>> > I did suggest that on #dri-devel, but Jason had a theoretical example
>> > where it would hurt.. iirc something like:
>> >
>> >   float maybe_undef;
>> >   for (int i = 0; i < some_uniform_at_least_one; i++)
>> >  maybe_undef = ...
>> >
>> > also, he didn't want to hide shader bugs that app should fix.
>> >
>> > It would be interesting to rush shaderdb w/ glsl_zero_init=true and
>> > see what happens, but I didn't get around to that yet.
>>
>> Here's what I get on i965. It's not a clear win.
>>
>> total instructions in shared programs: 5249030 -> 5249002 (-0.00%)
>> instructions in affected programs: 28936 -> 28908 (-0.10%)
>> helped: 66
>> HURT: 132
>>
>> total cycles in shared programs: 57966694 -> 57956306 (-0.02%)
>> cycles in affected programs: 1136118 -> 1125730 (-0.91%)
>> helped: 78
>> HURT: 106
>
> I suspect most of the help is because we're missing undef optimizations,
> such as CSE...while zero could be CSE'd.  (I have a patch, but it hurts
> things too...)

right, I was thinking that treating undef as zero in constant-folding
would have the same effect.. ofc it might make shader bugs less
obvious.

Btw, does anyone know what fglrx does?  Afaiu nv blob treats undef as
zero.  If fglrx does the same, I suppose that strengthens the argument
for "just do this unconditionally".

(but I'm still leaning towards "make it conditional" so far..)

BR,
-R
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] radeonsi: set PA_SU_SMALL_PRIM_FILTER_CNTL register on Polaris

2016-06-27 Thread Alex Deucher
On Mon, Jun 27, 2016 at 3:01 PM, Marek Olšák  wrote:
> From: Marek Olšák 
>
> This was missing.
>
> Cc: 12.0 

Reviewed-by: Alex Deucher 

> ---
>  src/gallium/drivers/radeonsi/si_state.c | 5 +
>  src/gallium/drivers/radeonsi/sid.h  | 6 ++
>  2 files changed, 11 insertions(+)
>
> diff --git a/src/gallium/drivers/radeonsi/si_state.c 
> b/src/gallium/drivers/radeonsi/si_state.c
> index 2e2c5ca..0a2fdbf 100644
> --- a/src/gallium/drivers/radeonsi/si_state.c
> +++ b/src/gallium/drivers/radeonsi/si_state.c
> @@ -3882,6 +3882,11 @@ static void si_init_config(struct si_context *sctx)
> if (sctx->b.family == CHIP_STONEY)
> si_pm4_set_reg(pm4, R_028C40_PA_SC_SHADER_CONTROL, 0);
>
> +   if (sctx->b.family >= CHIP_POLARIS10)
> +   si_pm4_set_reg(pm4, R_028830_PA_SU_SMALL_PRIM_FILTER_CNTL,
> +  S_028830_SMALL_PRIM_FILTER_ENABLE(1) |
> +  S_028830_LINE_FILTER_DISABLE(1)); /* line bug 
> */
> +
> si_pm4_set_reg(pm4, R_028080_TA_BC_BASE_ADDR, border_color_va >> 8);
> if (sctx->b.chip_class >= CIK)
> si_pm4_set_reg(pm4, R_028084_TA_BC_BASE_ADDR_HI, 
> border_color_va >> 40);
> diff --git a/src/gallium/drivers/radeonsi/sid.h 
> b/src/gallium/drivers/radeonsi/sid.h
> index 1363a44..b973489 100644
> --- a/src/gallium/drivers/radeonsi/sid.h
> +++ b/src/gallium/drivers/radeonsi/sid.h
> @@ -7215,6 +7215,12 @@
>  #define   G_02882C_YMAX_BOTTOM_EXCLUSION(x)   (((x) 
> >> 31) & 0x1)
>  #define   C_02882C_YMAX_BOTTOM_EXCLUSION  
> 0x7FFF
>  /* */
> +#define R_028830_PA_SU_SMALL_PRIM_FILTER_CNTL   
> 0x028830 /* Polaris */
> +#define   S_028830_SMALL_PRIM_FILTER_ENABLE(x)(((x) 
> & 0x1) << 0)
> +#define   S_028830_TRIANGLE_FILTER_DISABLE(x) (((x) 
> & 0x1) << 1)
> +#define   S_028830_LINE_FILTER_DISABLE(x) (((x) 
> & 0x1) << 2)
> +#define   S_028830_POINT_FILTER_DISABLE(x)(((x) 
> & 0x1) << 3)
> +#define   S_028830_RECTANGLE_FILTER_DISABLE(x)(((x) 
> & 0x1) << 4)
>  #define R_028A00_PA_SU_POINT_SIZE   
> 0x028A00
>  #define   S_028A00_HEIGHT(x)  
> (((unsigned)(x) & 0x) << 0)
>  #define   G_028A00_HEIGHT(x)  (((x) 
> >> 0) & 0x)
> --
> 2.7.4
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] glsl: add driconf to zero-init unintialized vars

2016-06-27 Thread Kenneth Graunke
On Monday, June 27, 2016 11:43:28 AM PDT Matt Turner wrote:
> On Mon, Jun 27, 2016 at 4:44 AM, Rob Clark  wrote:
> > On Mon, Jun 27, 2016 at 7:13 AM, Alan Swanson  
> > wrote:
> >> On 2016-06-25 13:37, Rob Clark wrote:
> >>>
> >>> Some games are sloppy.. perhaps because it is defined behavior for DX or
> >>> perhaps because nv blob driver defaults things to zero.
> >>>
> >>> So add driconf param to force uninitialized variables to default to zero.
> >>>
> >>> This issue was observed with rust, from steam store.  But has surfaced
> >>> elsewhere in the past.
> >>>
> >>> Signed-off-by: Rob Clark 
> >>> ---
> >>> Note that I left out the drirc bit, since not entirely sure how to
> >>> identify this game.  (I don't actually have the game, just working off
> >>> of an apitrace)
> >>>
> >>> Possibly worth mentioning that for the shaders using uninitialized vars
> >>> having zero-initializers lets constant-propagation get rid of a whole
> >>> lot of instructions.  One shader I saw dropped to less than half of
> >>> it's original instruction count.
> >>
> >>
> >> If the default for uninitialised variables is undefined, then with the
> >> reported shader optimisations why bother with the (DRI) option when
> >> zeroing could still essentially be classed as undefined?
> >>
> >> Cuts the patch down to just the src/compiler/glsl/ast_to_hir.cpp change.
> >
> > I did suggest that on #dri-devel, but Jason had a theoretical example
> > where it would hurt.. iirc something like:
> >
> >   float maybe_undef;
> >   for (int i = 0; i < some_uniform_at_least_one; i++)
> >  maybe_undef = ...
> >
> > also, he didn't want to hide shader bugs that app should fix.
> >
> > It would be interesting to rush shaderdb w/ glsl_zero_init=true and
> > see what happens, but I didn't get around to that yet.
> 
> Here's what I get on i965. It's not a clear win.
> 
> total instructions in shared programs: 5249030 -> 5249002 (-0.00%)
> instructions in affected programs: 28936 -> 28908 (-0.10%)
> helped: 66
> HURT: 132
> 
> total cycles in shared programs: 57966694 -> 57956306 (-0.02%)
> cycles in affected programs: 1136118 -> 1125730 (-0.91%)
> helped: 78
> HURT: 106

I suspect most of the help is because we're missing undef optimizations,
such as CSE...while zero could be CSE'd.  (I have a patch, but it hurts
things too...)


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] radeonsi: set PA_SU_SMALL_PRIM_FILTER_CNTL register on Polaris

2016-06-27 Thread Marek Olšák
From: Marek Olšák 

This was missing.

Cc: 12.0 
---
 src/gallium/drivers/radeonsi/si_state.c | 5 +
 src/gallium/drivers/radeonsi/sid.h  | 6 ++
 2 files changed, 11 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index 2e2c5ca..0a2fdbf 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -3882,6 +3882,11 @@ static void si_init_config(struct si_context *sctx)
if (sctx->b.family == CHIP_STONEY)
si_pm4_set_reg(pm4, R_028C40_PA_SC_SHADER_CONTROL, 0);
 
+   if (sctx->b.family >= CHIP_POLARIS10)
+   si_pm4_set_reg(pm4, R_028830_PA_SU_SMALL_PRIM_FILTER_CNTL,
+  S_028830_SMALL_PRIM_FILTER_ENABLE(1) |
+  S_028830_LINE_FILTER_DISABLE(1)); /* line bug */
+
si_pm4_set_reg(pm4, R_028080_TA_BC_BASE_ADDR, border_color_va >> 8);
if (sctx->b.chip_class >= CIK)
si_pm4_set_reg(pm4, R_028084_TA_BC_BASE_ADDR_HI, 
border_color_va >> 40);
diff --git a/src/gallium/drivers/radeonsi/sid.h 
b/src/gallium/drivers/radeonsi/sid.h
index 1363a44..b973489 100644
--- a/src/gallium/drivers/radeonsi/sid.h
+++ b/src/gallium/drivers/radeonsi/sid.h
@@ -7215,6 +7215,12 @@
 #define   G_02882C_YMAX_BOTTOM_EXCLUSION(x)   (((x) >> 
31) & 0x1)
 #define   C_02882C_YMAX_BOTTOM_EXCLUSION  
0x7FFF
 /* */
+#define R_028830_PA_SU_SMALL_PRIM_FILTER_CNTL   
0x028830 /* Polaris */
+#define   S_028830_SMALL_PRIM_FILTER_ENABLE(x)(((x) & 
0x1) << 0)
+#define   S_028830_TRIANGLE_FILTER_DISABLE(x) (((x) & 
0x1) << 1)
+#define   S_028830_LINE_FILTER_DISABLE(x) (((x) & 
0x1) << 2)
+#define   S_028830_POINT_FILTER_DISABLE(x)(((x) & 
0x1) << 3)
+#define   S_028830_RECTANGLE_FILTER_DISABLE(x)(((x) & 
0x1) << 4)
 #define R_028A00_PA_SU_POINT_SIZE   
0x028A00
 #define   S_028A00_HEIGHT(x)  
(((unsigned)(x) & 0x) << 0)
 #define   G_028A00_HEIGHT(x)  (((x) >> 
0) & 0x)
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] glsl: add driconf to zero-init unintialized vars

2016-06-27 Thread Matt Turner
On Mon, Jun 27, 2016 at 4:44 AM, Rob Clark  wrote:
> On Mon, Jun 27, 2016 at 7:13 AM, Alan Swanson  
> wrote:
>> On 2016-06-25 13:37, Rob Clark wrote:
>>>
>>> Some games are sloppy.. perhaps because it is defined behavior for DX or
>>> perhaps because nv blob driver defaults things to zero.
>>>
>>> So add driconf param to force uninitialized variables to default to zero.
>>>
>>> This issue was observed with rust, from steam store.  But has surfaced
>>> elsewhere in the past.
>>>
>>> Signed-off-by: Rob Clark 
>>> ---
>>> Note that I left out the drirc bit, since not entirely sure how to
>>> identify this game.  (I don't actually have the game, just working off
>>> of an apitrace)
>>>
>>> Possibly worth mentioning that for the shaders using uninitialized vars
>>> having zero-initializers lets constant-propagation get rid of a whole
>>> lot of instructions.  One shader I saw dropped to less than half of
>>> it's original instruction count.
>>
>>
>> If the default for uninitialised variables is undefined, then with the
>> reported shader optimisations why bother with the (DRI) option when
>> zeroing could still essentially be classed as undefined?
>>
>> Cuts the patch down to just the src/compiler/glsl/ast_to_hir.cpp change.
>
> I did suggest that on #dri-devel, but Jason had a theoretical example
> where it would hurt.. iirc something like:
>
>   float maybe_undef;
>   for (int i = 0; i < some_uniform_at_least_one; i++)
>  maybe_undef = ...
>
> also, he didn't want to hide shader bugs that app should fix.
>
> It would be interesting to rush shaderdb w/ glsl_zero_init=true and
> see what happens, but I didn't get around to that yet.

Here's what I get on i965. It's not a clear win.

total instructions in shared programs: 5249030 -> 5249002 (-0.00%)
instructions in affected programs: 28936 -> 28908 (-0.10%)
helped: 66
HURT: 132

total cycles in shared programs: 57966694 -> 57956306 (-0.02%)
cycles in affected programs: 1136118 -> 1125730 (-0.91%)
helped: 78
HURT: 106
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 09/34] isl/state: Add support for handling color control surfaces

2016-06-27 Thread Pohjolainen, Topi
On Thu, Jun 23, 2016 at 02:00:08PM -0700, Jason Ekstrand wrote:
> ---
>  src/intel/isl/isl.h   |  7 +++
>  src/intel/isl/isl_surface_state.c | 39 
> ---
>  2 files changed, 43 insertions(+), 3 deletions(-)

Reviewed-by: Topi Pohjolainen 

> 
> diff --git a/src/intel/isl/isl.h b/src/intel/isl/isl.h
> index 5011d15..5f24d6e 100644
> --- a/src/intel/isl/isl.h
> +++ b/src/intel/isl/isl.h
> @@ -864,6 +864,13 @@ struct isl_surf_fill_state_info {
> uint32_t mocs;
>  
> /**
> +* The auxilary surface or NULL if no auxilary surface is to be used.
> +*/
> +   const struct isl_surf *aux_surf;
> +   enum isl_aux_layout aux_layout;
> +   uint64_t aux_address;
> +
> +   /**
>  * The clear color for this surface
>  *
>  * Valid values depend on hardware generation.
> diff --git a/src/intel/isl/isl_surface_state.c 
> b/src/intel/isl/isl_surface_state.c
> index 21b057e..b1c60c07 100644
> --- a/src/intel/isl/isl_surface_state.c
> +++ b/src/intel/isl/isl_surface_state.c
> @@ -86,6 +86,23 @@ static const uint32_t isl_to_gen_multisample_layout[] = {
>  };
>  #endif
>  
> +#if GEN_GEN >= 9
> +static const uint32_t isl_to_gen_aux_mode[] = {
> +   [ISL_AUX_LAYOUT_NONE] = AUX_NONE,
> +   [ISL_AUX_LAYOUT_HIZ] = AUX_HIZ,
> +   [ISL_AUX_LAYOUT_MCS] = AUX_CCS_D,
> +   [ISL_AUX_LAYOUT_CCS_D] = AUX_CCS_D,
> +   [ISL_AUX_LAYOUT_CCS_E] = AUX_CCS_E,
> +};
> +#elif GEN_GEN >= 8
> +static const uint32_t isl_to_gen_aux_mode[] = {
> +   [ISL_AUX_LAYOUT_NONE] = AUX_NONE,
> +   [ISL_AUX_LAYOUT_HIZ] = AUX_HIZ,
> +   [ISL_AUX_LAYOUT_MCS] = AUX_MCS,
> +   [ISL_AUX_LAYOUT_CCS_D] = AUX_MCS,
> +};
> +#endif
> +
>  static uint8_t
>  get_surftype(enum isl_surf_dim dim, isl_surf_usage_flags_t usage)
>  {
> @@ -406,10 +423,26 @@ isl_genX(surf_fill_state_s)(const struct isl_device 
> *dev, void *state,
> assert(info->y_offset == 0);
>  #endif
>  
> +#if GEN_GEN >= 7
> +   if (info->aux_surf && info->aux_layout != ISL_AUX_LAYOUT_NONE) {
> +  struct isl_tile_info tile_info;
> +  isl_surf_get_tile_info(dev, info->aux_surf, _info);
> +  uint32_t pitch_in_tiles = info->aux_surf->row_pitch / tile_info.width;
> +
>  #if GEN_GEN >= 8
> -   s.AuxiliarySurfaceMode = AUX_NONE;
> -#elif GEN_GEN >= 7
> -   s.MCSEnable = false;
> +  assert(GEN_GEN >= 9 || info->aux_layout != ISL_AUX_LAYOUT_CCS_E);
> +  s.AuxiliarySurfacePitch = pitch_in_tiles - 1;
> +  s.AuxiliarySurfaceQPitch = get_qpitch(info->aux_surf) >> 2;
> +  s.AuxiliarySurfaceBaseAddress = info->aux_address;
> +  s.AuxiliarySurfaceMode = isl_to_gen_aux_mode[info->aux_layout];
> +#else
> +  assert(info->aux_layout == ISL_AUX_LAYOUT_MCS ||
> + info->aux_layout == ISL_AUX_LAYOUT_CCS_D);
> +  s.MCSBaseAddress = info->aux_address,
> +  s.MCSSurfacePitch = pitch_in_tiles - 1;
> +  s.MCSEnable = true;
> +#endif
> +   }
>  #endif
>  
>  #if GEN_GEN >= 8
> -- 
> 2.5.0.400.gff86faf
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 5/6] gallium/radeon: add a heuristic dynamically enabling DCC for scanout surfaces

2016-06-27 Thread Marek Olšák
On Mon, Jun 27, 2016 at 10:35 AM, Nicolai Hähnle  wrote:
> On 24.06.2016 20:48, Marek Olšák wrote:
>>
>> On Fri, Jun 24, 2016 at 1:09 PM, Nicolai Hähnle 
>> wrote:
>>>
>>> On 22.06.2016 20:29, Marek Olšák wrote:


 From: Marek Olšák 

 DCC for displayable surfaces is allocated in a separate buffer and is
 enabled or disabled based on PS invocations from 2 frames ago (to let
 queries go idle) and the number of slow clears from the current frame.

 At least an equivalent of 5 fullscreen draws or slow clears must be done
 to enable DCC. (PS invocations / (width * height) + num_slow_clears >=
 5)

 Pipeline statistic queries are always active if a color buffer that can
 have separate DCC is bound, even if separate DCC is disabled. That means
 the window color buffer is always monitored and DCC is enabled only when
 the situation is right.

 The tracking of per-texture queries in r600_common_context is quite
 ugly,
 but I don't see a better way.

 The first fast clear always enables DCC. DCC decompression can disable
 it.
 A later fast clear can enable it again. Enable/disable typically happens
 only once per frame.

 The impact is expected to be negligible because games usually don't have
 a high level of overdraw. DCC usually activates when too much blending
 is happening (smoke rendering) or when testing glClear performance and
 CMASK isn't supported (Stoney).
>>>
>>>
>>>
>>> Nice stuff. One corner case to think of: what happens when DCC is enabled
>>> for a texture that is currently bound? Needs the same treatment as when
>>> DCC
>>> is disabled, right?
>>
>>
>> Not sure what you mean, can you be more specific? Note that it can't
>> be bound as a sampler by apps because it's a window framebuffer.
>
>
> Okay, that makes sense.
>
>
>>>
>>> More comments below...
>>>
>>>
 ---
src/gallium/drivers/radeon/r600_pipe_common.c |  15 ++
src/gallium/drivers/radeon/r600_pipe_common.h |  40 +
src/gallium/drivers/radeon/r600_texture.c | 239
 ++
src/gallium/drivers/radeonsi/si_blit.c|  14 +-
src/gallium/drivers/radeonsi/si_state.c   |  15 ++
src/gallium/drivers/radeonsi/si_state_draw.c  |   5 +-
6 files changed, 326 insertions(+), 2 deletions(-)

 diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c
 b/src/gallium/drivers/radeon/r600_pipe_common.c
 index 5d4a679..66afcfa 100644
 --- a/src/gallium/drivers/radeon/r600_pipe_common.c
 +++ b/src/gallium/drivers/radeon/r600_pipe_common.c
 @@ -397,6 +397,21 @@ bool r600_common_context_init(struct
 r600_common_context *rctx,

void r600_common_context_cleanup(struct r600_common_context *rctx)
{
 +   unsigned i,j;
 +
 +   /* Release DCC stats. */
 +   for (i = 0; i < ARRAY_SIZE(rctx->dcc_stats); i++) {
 +   assert(!rctx->dcc_stats[i].query_active);
 +
 +   for (j = 0; j < ARRAY_SIZE(rctx->dcc_stats[i].ps_stats);
 j++)
 +   if (rctx->dcc_stats[i].ps_stats[j])
 +   rctx->b.destroy_query(>b,
 +
 rctx->dcc_stats[i].ps_stats[j]);
 +
 +   pipe_resource_reference((struct pipe_resource**)
 +   >dcc_stats[i].tex, NULL);
 +   }
 +
  if (rctx->gfx.cs)
  rctx->ws->cs_destroy(rctx->gfx.cs);
  if (rctx->dma.cs)
 diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h
 b/src/gallium/drivers/radeon/r600_pipe_common.h
 index 92cba13..cdec907 100644
 --- a/src/gallium/drivers/radeon/r600_pipe_common.h
 +++ b/src/gallium/drivers/radeon/r600_pipe_common.h
 @@ -272,6 +272,25 @@ struct r600_texture {
   * dcc_offset contains the absolute GPUVM address, not the
 relative one.
   */
  struct r600_resource*dcc_separate_buffer;
 +   /* When DCC is temporarily disabled, the separate buffer is
 here.
 */
 +   struct r600_resource*last_dcc_separate_buffer;
 +   /* We need to track DCC dirtiness, because st/dri usually calls
 +* flush_resource twice per frame (not a bug) and we don't wanna
 +* decompress DCC twice. Also, the dirty tracking must be done
 even
 +* if DCC isn't used, because it's required by the DCC usage
 analysis
 +* for a possible future enablement.
 +*/
 +   boolseparate_dcc_dirty;
 +   /* Statistics gathering for the DCC enablement heuristic. */
 +   booldcc_gather_statistics;
 +   /* Estimate of how much this color buffer is written 

[Mesa-dev] [PATCH] miptree: Skip attempts to make unsupported images

2016-06-27 Thread Nanley Chery
This causes tests that attempt to create linear depth buffers on
Gen7+ (unsupported), to be skipped.

Signed-off-by: Nanley Chery 
---
 src/tests/func/miptree/miptree.c | 39 ++-
 1 file changed, 34 insertions(+), 5 deletions(-)

diff --git a/src/tests/func/miptree/miptree.c b/src/tests/func/miptree/miptree.c
index 0ced389..37002cb 100644
--- a/src/tests/func/miptree/miptree.c
+++ b/src/tests/func/miptree/miptree.c
@@ -410,6 +410,19 @@ miptree_destroy(miptree_t *mt)
 free(mt);
 }
 
+static void
+can_create_image(VkImageType type, VkImageTiling tiling,
+ uint32_t usage, VkFormat format)
+{
+VkImageFormatProperties fmt_properties;
+VkResult result =
+  vkGetPhysicalDeviceImageFormatProperties(t_physical_dev, format,
+   type, tiling, usage,
+   0, _properties);
+if (result == VK_ERROR_FORMAT_NOT_SUPPORTED)
+   t_end(TEST_RESULT_SKIP);
+}
+
 static const miptree_t *
 miptree_create(void)
 {
@@ -424,10 +437,19 @@ miptree_create(void)
 const uint32_t depth = params->depth;
 const uint32_t array_length = params->array_length;
 const size_t buffer_size = miptree_calc_buffer_size();
+const uint32_t usage_bits = VK_IMAGE_USAGE_TRANSFER_SRC_BIT |
+VK_IMAGE_USAGE_TRANSFER_DST_BIT |
+VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT |
+VK_IMAGE_USAGE_SAMPLED_BIT;
+VkImageType image_type = 
image_type_from_image_view_type(params->view_type);
+
+// Determine if an image can be created with this combination
+can_create_image(image_type, VK_IMAGE_TILING_OPTIMAL,
+ usage_bits, format);
 
 // Create the image that will contain the real miptree.
 VkImage image = qoCreateImage(t_device,
-.imageType = image_type_from_image_view_type(params->view_type),
+.imageType = image_type,
 .format = format,
 .mipLevels = levels,
 .arrayLayers = array_length,
@@ -437,10 +459,7 @@ miptree_create(void)
 .depth = depth,
 },
 .tiling = VK_IMAGE_TILING_OPTIMAL,
-.usage = VK_IMAGE_USAGE_TRANSFER_SRC_BIT |
- VK_IMAGE_USAGE_TRANSFER_DST_BIT |
- VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT |
- VK_IMAGE_USAGE_SAMPLED_BIT);
+.usage = usage_bits);
 VkBuffer src_buffer = qoCreateBuffer(t_device,
 .size = buffer_size,
 .usage = VK_BUFFER_USAGE_TRANSFER_SRC_BIT);
@@ -501,6 +520,11 @@ miptree_create(void)
 break;
 case MIPTREE_UPLOAD_METHOD_COPY_FROM_LINEAR_IMAGE:
 case MIPTREE_UPLOAD_METHOD_COPY_WITH_DRAW:
+
+// Determine if an image can be created with this combination
+can_create_image(VK_IMAGE_TYPE_2D, VK_IMAGE_TILING_LINEAR,
+ VK_IMAGE_USAGE_TRANSFER_SRC_BIT, format);
+
 src_vk_image = qoCreateImage(t_device,
 .format = format,
 .mipLevels = 1,
@@ -523,6 +547,11 @@ miptree_create(void)
 break;
 case MIPTREE_DOWNLOAD_METHOD_COPY_TO_LINEAR_IMAGE:
 case MIPTREE_DOWNLOAD_METHOD_COPY_WITH_DRAW:
+
+// Determine if an image can be created with this combination
+can_create_image(VK_IMAGE_TYPE_2D, VK_IMAGE_TILING_LINEAR,
+ VK_IMAGE_USAGE_TRANSFER_DST_BIT, format);
+
 dest_vk_image = qoCreateImage(t_device,
 .format = format,
 .mipLevels = 1,
-- 
2.9.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965: Simplify foreach_inst_in_block_safe() macro.

2016-06-27 Thread Matt Turner
We know what the end looks like without examining .tail: it's NULL. It's
always NULL.
---
 src/mesa/drivers/dri/i965/brw_cfg.h | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_cfg.h 
b/src/mesa/drivers/dri/i965/brw_cfg.h
index 5b770aa..1c90eab 100644
--- a/src/mesa/drivers/dri/i965/brw_cfg.h
+++ b/src/mesa/drivers/dri/i965/brw_cfg.h
@@ -334,9 +334,8 @@ struct cfg_t {
 
 #define foreach_inst_in_block_safe(__type, __inst, __block)\
for (__type *__inst = (__type *)__block->instructions.head, \
-   *__next = (__type *)__inst->next,   \
-   *__end = (__type *)__block->instructions.tail;  \
-__next != __end;   \
+   *__next = (__type *)__inst->next;   \
+__next != NULL;\
 __inst = __next,   \
 __next = (__type *)__next->next)
 
-- 
2.7.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 10/34] i965/miptree: Add a helper for getting an isl_surf from a miptree

2016-06-27 Thread Pohjolainen, Topi
On Thu, Jun 23, 2016 at 02:00:09PM -0700, Jason Ekstrand wrote:
> ---
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 171 
> +-
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.h |   6 +
>  2 files changed, 175 insertions(+), 2 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index b6265dc..83a9764 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -26,8 +26,6 @@
>  #include 
>  #include 
>  
> -#include "isl/isl.h"
> -
>  #include "intel_batchbuffer.h"
>  #include "intel_mipmap_tree.h"
>  #include "intel_resolve_map.h"
> @@ -2999,3 +2997,172 @@ intel_miptree_unmap(struct brw_context *brw,
>  
> intel_miptree_release_map(mt, level, slice);
>  }
> +
> +void
> +intel_miptree_get_isl_surf(struct brw_context *brw,
> +   const struct intel_mipmap_tree *mt,
> +   struct isl_surf *surf)
> +{
> +   switch (mt->target) {
> +   case GL_TEXTURE_1D:
> +   case GL_TEXTURE_1D_ARRAY: {
> +  surf->dim = ISL_SURF_DIM_1D;
> +  if (brw->gen >= 9 && mt->tiling == I915_TILING_NONE)
> + surf->dim_layout = ISL_DIM_LAYOUT_GEN9_1D;
> +  else
> + surf->dim_layout = ISL_DIM_LAYOUT_GEN4_2D;
> +  break;
> +   }
> +   case GL_TEXTURE_2D:
> +   case GL_TEXTURE_2D_ARRAY:
> +   case GL_TEXTURE_RECTANGLE:
> +   case GL_TEXTURE_CUBE_MAP:
> +   case GL_TEXTURE_CUBE_MAP_ARRAY:
> +   case GL_TEXTURE_2D_MULTISAMPLE:
> +   case GL_TEXTURE_2D_MULTISAMPLE_ARRAY:
> +   case GL_TEXTURE_EXTERNAL_OES:
> +  surf->dim = ISL_SURF_DIM_2D;
> +  surf->dim_layout = ISL_DIM_LAYOUT_GEN4_2D;
> +  break;
> +   case GL_TEXTURE_3D:
> +  surf->dim = ISL_SURF_DIM_3D;
> +  if (brw->gen >= 9)
> + surf->dim_layout = ISL_DIM_LAYOUT_GEN4_2D;
> +  else
> + surf->dim_layout = ISL_DIM_LAYOUT_GEN4_3D;
> +  break;
> +   default:
> +  unreachable("Invalid texture target");
> +   }
> +
> +   switch (mt->msaa_layout) {
> +   case INTEL_MSAA_LAYOUT_NONE:
> +  surf->msaa_layout = ISL_MSAA_LAYOUT_NONE;
> +  break;
> +   case INTEL_MSAA_LAYOUT_IMS:
> +  surf->msaa_layout = ISL_MSAA_LAYOUT_INTERLEAVED;
> +  break;
> +   case INTEL_MSAA_LAYOUT_UMS:
> +   case INTEL_MSAA_LAYOUT_CMS:
> +  surf->msaa_layout = ISL_MSAA_LAYOUT_ARRAY;
> +  break;
> +   default:
> +  unreachable("Invalid MSAA layout");
> +   }
> +
> +   if (mt->format == MESA_FORMAT_S_UINT8) {
> +  surf->tiling = ISL_TILING_W;
> +   } else {
> +  switch (mt->tiling) {
> +  case I915_TILING_NONE:
> + surf->tiling = ISL_TILING_LINEAR;
> + break;
> +  case I915_TILING_X:
> + surf->tiling = ISL_TILING_X;
> + break;
> +  case I915_TILING_Y:
> + switch (mt->tr_mode) {
> + case INTEL_MIPTREE_TRMODE_NONE:
> +surf->tiling = ISL_TILING_Y0;
> +break;
> + case INTEL_MIPTREE_TRMODE_YF:
> +surf->tiling = ISL_TILING_Yf;
> +break;
> + case INTEL_MIPTREE_TRMODE_YS:
> +surf->tiling = ISL_TILING_Ys;
> +break;
> + }
> + break;
> +  default:
> + unreachable("Invalid tiling mode");
> +  }
> +   }
> +
> +   surf->format = translate_tex_format(brw, mt->format, false);
> +
> +   if (brw->gen >= 9) {
> +  /* On gen9+, intel_mipmap_tree stores the horizontal and vertical
> +   * alignment in terms of surface elements like we want.
> +   */
> +  surf->image_alignment_el = isl_extent3d(mt->halign, mt->valign, 1);
> +   } else if (brw->gen == 6 && mt->valign > 4) {
> +  /* Sandy Bridge hardware doesn't support multiple mip levels on stencil
> +   * or HiZ buffers.  The miptree calculation code leaves us with a 
> valign
> +   * that we can't actually use.  Just pick something that won't assert
> +   * inside ISL when we try to emit surface state.
> +   */
> +  assert(mt->array_layout == ALL_SLICES_AT_EACH_LOD);
> +  assert(_mesa_is_depth_or_stencil_format(
> + _mesa_get_format_base_format(mt->format)));
> +  surf->image_alignment_el = isl_extent3d(8, 2, 1);
> +   } else {
> +  /* On earlier gens it's storred in pixels. */
> +  unsigned bw, bh;
> +  _mesa_get_format_block_size(mt->format, , );
> +  surf->image_alignment_el =
> + isl_extent3d(mt->halign / bw, mt->valign / bh, 1);
> +   }
> +
> +   surf->logical_level0_px.width = mt->logical_width0;
> +   surf->logical_level0_px.height = mt->logical_height0;
> +   if (surf->dim == ISL_SURF_DIM_3D) {
> +  surf->logical_level0_px.depth = mt->logical_depth0;
> +  surf->logical_level0_px.array_len = 1;
> +   } else if (mt->target == GL_TEXTURE_CUBE_MAP ||
> +  mt->target == GL_TEXTURE_CUBE_MAP_ARRAY) {
> +  /* For cube maps, mt->logical_depth0 is in number of cubes */
> +  

Re: [Mesa-dev] [PATCH v2 13/34] i965/blorp: Add a generic ISL-based surface state emit path

2016-06-27 Thread Pohjolainen, Topi
On Thu, Jun 23, 2016 at 02:00:12PM -0700, Jason Ekstrand wrote:
> ---
>  src/mesa/drivers/dri/i965/brw_blorp.c | 148 
> ++
>  src/mesa/drivers/dri/i965/brw_blorp.h |   6 ++
>  2 files changed, 154 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c 
> b/src/mesa/drivers/dri/i965/brw_blorp.c
> index 9590968..eac5802 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp.c
> +++ b/src/mesa/drivers/dri/i965/brw_blorp.c
> @@ -240,6 +240,154 @@ brw_blorp_compile_nir_shader(struct brw_context *brw, 
> struct nir_shader *nir,
> return program;
>  }
>  
> +static enum isl_msaa_layout
> +get_isl_msaa_layout(enum intel_msaa_layout layout)
> +{
> +   switch (layout) {
> +   case INTEL_MSAA_LAYOUT_NONE:
> +  return ISL_MSAA_LAYOUT_NONE;
> +   case INTEL_MSAA_LAYOUT_IMS:
> +  return ISL_MSAA_LAYOUT_INTERLEAVED;
> +   case INTEL_MSAA_LAYOUT_UMS:
> +   case INTEL_MSAA_LAYOUT_CMS:
> +  return ISL_MSAA_LAYOUT_ARRAY;
> +   default:
> +  unreachable("Invalid MSAA layout");
> +   }
> +}
> +
> +struct surface_state_info {
> +   unsigned num_dwords;
> +   unsigned ss_align; /* Required alignment of RENDER_SURFACE_STATE in bytes 
> */
> +   unsigned reloc_dw;
> +   unsigned aux_reloc_dw;
> +   unsigned tex_mocs;
> +   unsigned rb_mocs;
> +};
> +
> +static const struct surface_state_info surface_state_infos[] = {
> +   [6] = {6,  32, 1,  0},
> +   [7] = {8,  32, 1,  6,  GEN7_MOCS_L3, GEN7_MOCS_L3},
> +   [8] = {13, 64, 8,  10, BDW_MOCS_WB,  BDW_MOCS_PTE},
> +   [9] = {16, 64, 8,  10, SKL_MOCS_WB,  SKL_MOCS_PTE},
> +};
> +
> +uint32_t
> +brw_blorp_emit_surface_state(struct brw_context *brw,
> + const struct brw_blorp_surface_info *surface,
> + uint32_t read_domains, uint32_t write_domain,
> + bool is_render_target)
> +{
> +   /* TODO: This should go in the context */
> +   struct isl_device isl_dev;
> +   isl_device_init(_dev, brw->intelScreen->devinfo, brw->has_swizzling);

I assume you are going to address this in the follow-up series?

> +
> +   const struct surface_state_info ss_info = surface_state_infos[brw->gen];
> +
> +   struct isl_surf surf;
> +   intel_miptree_get_isl_surf(brw, surface->mt, );
> +
> +   /* Stomp surface dimensions and tiling (if needed) with info from blorp */
> +   surf.dim = ISL_SURF_DIM_2D;
> +   surf.dim_layout = ISL_DIM_LAYOUT_GEN4_2D;
> +   surf.msaa_layout = get_isl_msaa_layout(surface->msaa_layout);
> +   surf.logical_level0_px.width = surface->width;
> +   surf.logical_level0_px.height = surface->height;
> +   surf.levels = 1;
> +   surf.samples = MAX2(surface->num_samples, 1);
> +
> +   if (brw->gen == 6 && surface->num_samples > 1) {
> +  /* Since gen6 uses INTEL_MSAA_LAYOUT_IMS, width and height are measured
> +   * in samples.  But SURFACE_STATE wants them in pixels, so we need to
> +   * divide them each by 2.
> +   */
> +  surf.logical_level0_px.width /= 2;
> +  surf.logical_level0_px.height /= 2;
> +   }
> +
> +   if (surface->map_stencil_as_y_tiled) {
> +  /* We need to fake W-tiling with Y-tiling */
> +  surf.tiling = ISL_TILING_Y0;
> +  surf.row_pitch *= 2;
> +   }
> +
> +   union isl_color_value clear_color = { .u32 = { 0, 0, 0, 0 } };
> +
> +   struct isl_surf *aux_surf = NULL, aux_surf_s;
> +   uint64_t aux_offset = 0;
> +   enum isl_aux_layout aux_layout = ISL_AUX_LAYOUT_NONE;
> +   if (surface->mt->mcs_mt) {
> +  /* We should probably to similar stomping to above but most of the aux
> +   * surf gets ignored when we fill out the surface state anyway so
> +   * there's no point.
> +   */
> +  intel_miptree_get_aux_isl_surf(brw, surface->mt, _surf_s, 
> _layout);
> +  aux_surf = _surf_s;
> +  assert(surface->mt->mcs_mt->offset == 0);
> +  aux_offset = surface->mt->mcs_mt->bo->offset64;
> +
> +  /* We only really need a clear color if we also have an auxiliary
> +   * surfacae.  Without one, it does nothing.

surface

Other than that patches 13-18 look good to me:

Reviewed-by: Topi Pohjolainen 

> +   */
> +  clear_color = intel_miptree_get_isl_clear_color(brw, surface->mt);
> +   }
> +
> +   struct isl_view view = {
> +  .format = surface->brw_surfaceformat,
> +  .base_level = 0,
> +  .levels = 1,
> +  .base_array_layer = 0,
> +  .array_len = 1,
> +  .channel_select = {
> + ISL_CHANNEL_SELECT_RED,
> + ISL_CHANNEL_SELECT_GREEN,
> + ISL_CHANNEL_SELECT_BLUE,
> + ISL_CHANNEL_SELECT_ALPHA,
> +  },
> +  .usage = is_render_target ? ISL_SURF_USAGE_RENDER_TARGET_BIT :
> +  ISL_SURF_USAGE_TEXTURE_BIT,
> +   };
> +
> +   uint32_t offset, tile_x, tile_y;
> +   offset = brw_blorp_compute_tile_offsets(surface, _x, _y);
> +
> +   uint32_t surf_offset;
> +   uint32_t *dw = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE,
> +

Re: [Mesa-dev] [Mesa-stable] [PATCH] mapi: Export all GLES 3.1 functions in libGLESv2.so

2016-06-27 Thread Ian Romanick
On 06/24/2016 09:30 AM, Emil Velikov wrote:
> On 20 June 2016 at 19:14, Ian Romanick  wrote:
>> On 06/17/2016 11:15 AM, Emil Velikov wrote:
>>> On 17 June 2016 at 18:20, Ian Romanick  wrote:
 From: Ian Romanick 

 Khronos recommends that the GLES 3.1 library also be called libGLESv2.
 It also requires that functions be statically linkable from that
 library.

 NOTE: Mesa has supported the EGL_KHR_get_all_proc_addresses extension
 since at least Mesa 10.5, so applications targeting Linux should use
 eglGetProcAddress to avoid problems running binaries on systems with
 older, non-GLES 3.1 libGLESv2 libraries.

>>> Fwiw I'm inclined that we should go the "opposite direction". Namely:
>>> don't expose new symbols and stick to a predefined version (3.0 being
>>> the personal favour of choice).
>>>
>>> Why, you might ask - for a couple of reasons:
>>>  - If the list continues to grow programs will have unstable ABI -
>>> sort of how libGL ended up. Applications are going to link against 3.1
>>> or later symbols [1], even if they only optionally use them. Thus
>>> things will quite hairy and fragile.
>>
>> There are at least two solutions, and piglit uses both.  If use of a set
>> of functions is optional, you can still use GetProcAddress (when
>> EGL_KHR_get_all_proc_addresses is available) or you can use dlsym.
>>
>> For me, piglit is where this whole problem actually started.  Right now,
>> piglit follows the (unextended) rules and does not attempt to use
>> GetProcAddress on core functions.  It uses dlsym.  I tried to extend
>> shader_runner for separate shader objects on GLES.  Guess what?  Since
>> the symbols aren't exported by the library, it didn't work.  So... now
>> piglit would need TWO code paths... one that uses dlsym and one that
>> uses eglGetProcAddress... or require an optional extension.
>>
> I've started looking at piglit last night. There should be some fixes
> for it on the list later on today.
> 
>> If an application requires GLES 3.1 symbols, it should just be able to
>> link with them.  As far as I can tell, that's how it works on Android.
>>
> I look at the Android wrapper too closely for the following reasons:
> 
> - There is libGLESv3.so which is identical copy of the v2 one.
> - Their libGLESv2/3.so periodically grows new symbols, including GLES
> extensions.
> - Android has tight control what and/how it's run on their platform -
> something that Linux distributions cannot do afaict.
> - Applications using GLES should annotate the version used in the
> manifest, which (haven't checked exactly) could serve as a first line
> of defence for applications e.g. using GLES 3.1 on system/drivers
> supporting GLES 3.0.
> 
> That said, there is one very good thing:
> - They use dlsym and then eglGetProcAddress on all symbols. Thus mesa
> will just work.
> 
>>>  - The other desktop GLES* provider NVIDIA does not export even a
>>> single GLES 3.1/3.2 entry point (still going through the 3.0 list) in
>>> their libGLESv2.so.2 binary.
>>>
>>> So what to do with GLES (3.0?)/3.1 and later:
>>>  - tweak the spec so that said version of the API is only supported if
>>> the implementation can get core symbols via eglGetProcAddress. Be that
>>> props to the EGL_KHR_get_all_proc_addresses extension or EGL 1.5 [2].
>>>
> Any "sounds ok" or "that's a horrible idea" input on this suggestion ?

That ship has already sailed.  OpenGL ES 3.0 and 3.1 have both been
shipping for years.  I don't think changing that is how I would use my
time machine. :)

> Thanks
> -Emil
> 
>>> How does this sound ? I guess the best way would be to check with
>>> other implementations (note Catalyst still seems to be missing
>>> libGLES*) and choose the more common route ?
>>>
>>>
>>> Regards,
>>> Emil
>>>
>>> [1] Yes, in practise they should use libepoxy or a similar library,
>>> but from practise we all know that even those tend to have bugs. Sadly
>>> libepoxy in particular hasn't seen much action in a while.
>>>
>>> [2] https://github.com/anholt/libepoxy/issues/21
>>> ___
>>> mesa-stable mailing list
>>> mesa-sta...@lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/mesa-stable
>>
> ___
> mesa-stable mailing list
> mesa-sta...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-stable
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [Piglit] [RFC PATCH] arb_texture_barrier: call glTextureBarrier after each glDrawRangeElements

2016-06-27 Thread Ilia Mirkin
On Mon, Jun 27, 2016 at 6:36 AM, Alejandro Piñeiro  wrote:
>
>
> On 27/06/16 12:33, Alejandro Piñeiro wrote:
>> On 27/06/16 03:08, Grazvydas Ignotas wrote:
>>> On Sat, Jun 25, 2016 at 4:54 PM, Alejandro Piñeiro  
>>> wrote:
 In theory they don't overdrawn. The test has a square formed by N
 non-overlapping triangles. With just one call to glDrawRangeElements,
 this always works. But if we split it in M subsets, so M calls to
 glDrawRangeElements with N/M triangles per call, with low resolutions
 and a high amount of triangles, some texels start to fail. And not
 always the same amount of texels (so as you mentioned, undefined
 territory). Although it is a guess, I assume that is cause for artifacts
 on the triangle borders when drawing with so low resolution.
>>> Perhaps this is somehow related to this bug:
>>> https://bugs.freedesktop.org/show_bug.cgi?id=96624
>>> There a texture problem shows up after one of glDrawRangeElements
>>> calls with a lot of primitives.
>> Not sure if they are related. Note that the issue I mention gets fully
>> solved with glTextureBarrier and additionally, it happens on haswell too.
>>
>> In any case, reading that bug gave me an idea. I have been toying with
>> MESA_DEBUG=flush (that until your email I didn't know that exists).
>> Looking at the code, it adds the equivalent to a glFlush after each
>> drawing call. Using it gets the test (all 48 subtests) passing. So right
>> now Im more biased to think that is an issue on glTextureBarrier.
>
> Another hint towards being an issue on glTextureBarrier implementation:
> if instead of using MESA_DEBUG=flush, I replace glTextureBarrier for
> glFlush on the original test (so one full flush after all the square,
> but not after each drawing call) all the subtests keep passing too.

I flipped ARB_texture_barrier on for i965 based on comments that
others made about how the hw worked. I don't think anyone was *too*
sure that it was correct, but the tests (at the time) passed, so I
don't think anyone gave it much more thought. Commit 71e187430c, ftr.

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 12/34] i965/miptree: Add a helper for getting the aux isl_surf from a miptree

2016-06-27 Thread Pohjolainen, Topi
On Mon, Jun 27, 2016 at 08:40:41AM -0700, Jason Ekstrand wrote:
>On Mon, Jun 27, 2016 at 8:34 AM, Pohjolainen, Topi
><[1]topi.pohjolai...@intel.com> wrote:
> 
>On Thu, Jun 23, 2016 at 02:00:11PM -0700, Jason Ekstrand wrote:
>> ---
>>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 89
>+++
>>  src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  5 ++
>>  2 files changed, 94 insertions(+)
>>
>> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
>b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
>> index 8a746ec..0f17411 100644
>> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
>> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
>> @@ -3167,6 +3167,95 @@ intel_miptree_get_isl_surf(struct brw_context
>*brw,
>> surf->usage = 0; /* TODO */
>>  }
>>
>> +/* WARNING: THE SURFACE CREATED BY THIS FUNCTION IS NOT COMPLETE AND
>CANNOT BE
>> + * USED FOR ANY REAL CALCULATIONS.  THE ONLY VALID USE OF SUCH A
>SURFACE IS TO
>> + * PASS IT INTO isl_surf_fill_state.
>> + */
>> +void
>> +intel_miptree_get_aux_isl_surf(struct brw_context *brw,
>> +   const struct intel_mipmap_tree *mt,
>> +   struct isl_surf *surf,
>> +   enum isl_aux_layout *layout)
>> +{
>> +   /* Much is the same as the regular surface */
>> +   intel_miptree_get_isl_surf(brw, mt->mcs_mt, surf);
>> +
>> +   /* Figure out the layout */
>> +   if (mt->num_samples > 1) {
>> +  if (mt->msaa_layout == INTEL_MSAA_LAYOUT_CMS)
>> + *layout = ISL_AUX_LAYOUT_MCS;
>> +  else
>> + *layout = ISL_AUX_LAYOUT_NONE;
>> +   } else if (intel_miptree_is_lossless_compressed(brw, mt)) {
>> +  assert(brw->gen >= 9);
>> +  *layout = ISL_AUX_LAYOUT_CCS_E;
>> +   } else if (mt->fast_clear_state != INTEL_FAST_CLEAR_STATE_NO_MCS)
>{
>> +  *layout = ISL_AUX_LAYOUT_CCS_D;
>> +   } else {
>> +  *layout = ISL_AUX_LAYOUT_NONE;
>> +   }
> 
>  Logic below doesn't use the resolved value of "*layout". Would it be
>  cleaner
>  to have this if-else-ladder as its own small helper function?
> 
>One version of these patches did exactly that.  Then I realized that
>get_aux_isl_surf and get_aux_layout were always called together and
>thought it was more clear to actually have them together.  I can split
>it back out if you'd like.

Ok, that is what I sort of expected - I haven't gone through the rest of the
series yet. I'm fine leaving it as is - it is easy to split later on when
another user emerges.

>--Jason
> 
>> +
>> +   /* Figure out the format of the auxiliary surface */
>> +   switch (mt->num_samples) {
>> +   case 0:
>> +   case 1:
>> +  /*
>> +   * From the BDW PRM, Volume 2d, page 260
>(RENDER_SURFACE_STATE):
>> +   * "When MCS is enabled for non-MSRT, HALIGN_16 must be used"
>> +   *
>> +   * From the hardware spec for GEN9:
>> +   * "When Auxiliary Surface Mode is set to AUX_CCS_D or
>AUX_CCS_E, HALIGN
>> +   *  16 must be used."
>> +   */
>> +  if (brw->gen >= 9 || mt->num_samples == 1)
>> + assert(mt->halign == 16);
>> +
>> +  if (brw->gen >= 9) {
>> + assert(mt->tiling == I915_TILING_Y);
>> + switch (_mesa_get_format_bytes(mt->format)) {
>> + case 4:  surf->format = ISL_FORMAT_GEN9_CCS_32BPP;   break;
>> + case 8:  surf->format = ISL_FORMAT_GEN9_CCS_64BPP;   break;
>> + case 16: surf->format = ISL_FORMAT_GEN9_CCS_128BPP;  break;
>> + default:
>> +unreachable("Invalid format size for color
>compression");
>> + }
>> +  } else if (mt->tiling == I915_TILING_Y) {
>> + switch (_mesa_get_format_bytes(mt->format)) {
>> + case 4:  surf->format = ISL_FORMAT_GEN7_CCS_32BPP_Y;
>break;
>> + case 8:  surf->format = ISL_FORMAT_GEN7_CCS_64BPP_Y;
>break;
>> + case 16: surf->format = ISL_FORMAT_GEN7_CCS_128BPP_X;
>break;
>> + default:
>> +unreachable("Invalid format size for color
>compression");
>> + }
>> +  } else {
>> + assert(mt->tiling == I915_TILING_X);
>> + switch (_mesa_get_format_bytes(mt->format)) {
>> + case 4:  surf->format = ISL_FORMAT_GEN7_CCS_32BPP_X;
>break;
>> + case 8:  surf->format = ISL_FORMAT_GEN7_CCS_64BPP_X;
>break;
>> + case 16: surf->format = ISL_FORMAT_GEN7_CCS_128BPP_X;
>break;
>> + default:
>> +unreachable("Invalid format size for color
>compression");
>> + }
>> +  }
>> +  break;
>> +
>> +   case 2:
>> + 

Re: [Mesa-dev] [PATCH v2 12/34] i965/miptree: Add a helper for getting the aux isl_surf from a miptree

2016-06-27 Thread Jason Ekstrand
On Mon, Jun 27, 2016 at 8:34 AM, Pohjolainen, Topi <
topi.pohjolai...@intel.com> wrote:

> On Thu, Jun 23, 2016 at 02:00:11PM -0700, Jason Ekstrand wrote:
> > ---
> >  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 89
> +++
> >  src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  5 ++
> >  2 files changed, 94 insertions(+)
> >
> > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > index 8a746ec..0f17411 100644
> > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > @@ -3167,6 +3167,95 @@ intel_miptree_get_isl_surf(struct brw_context
> *brw,
> > surf->usage = 0; /* TODO */
> >  }
> >
> > +/* WARNING: THE SURFACE CREATED BY THIS FUNCTION IS NOT COMPLETE AND
> CANNOT BE
> > + * USED FOR ANY REAL CALCULATIONS.  THE ONLY VALID USE OF SUCH A
> SURFACE IS TO
> > + * PASS IT INTO isl_surf_fill_state.
> > + */
> > +void
> > +intel_miptree_get_aux_isl_surf(struct brw_context *brw,
> > +   const struct intel_mipmap_tree *mt,
> > +   struct isl_surf *surf,
> > +   enum isl_aux_layout *layout)
> > +{
> > +   /* Much is the same as the regular surface */
> > +   intel_miptree_get_isl_surf(brw, mt->mcs_mt, surf);
> > +
> > +   /* Figure out the layout */
> > +   if (mt->num_samples > 1) {
> > +  if (mt->msaa_layout == INTEL_MSAA_LAYOUT_CMS)
> > + *layout = ISL_AUX_LAYOUT_MCS;
> > +  else
> > + *layout = ISL_AUX_LAYOUT_NONE;
> > +   } else if (intel_miptree_is_lossless_compressed(brw, mt)) {
> > +  assert(brw->gen >= 9);
> > +  *layout = ISL_AUX_LAYOUT_CCS_E;
> > +   } else if (mt->fast_clear_state != INTEL_FAST_CLEAR_STATE_NO_MCS) {
> > +  *layout = ISL_AUX_LAYOUT_CCS_D;
> > +   } else {
> > +  *layout = ISL_AUX_LAYOUT_NONE;
> > +   }
>
> Logic below doesn't use the resolved value of "*layout". Would it be
> cleaner
> to have this if-else-ladder as its own small helper function?
>

One version of these patches did exactly that.  Then I realized that
get_aux_isl_surf and get_aux_layout were always called together and thought
it was more clear to actually have them together.  I can split it back out
if you'd like.
--Jason


> > +
> > +   /* Figure out the format of the auxiliary surface */
> > +   switch (mt->num_samples) {
> > +   case 0:
> > +   case 1:
> > +  /*
> > +   * From the BDW PRM, Volume 2d, page 260 (RENDER_SURFACE_STATE):
> > +   * "When MCS is enabled for non-MSRT, HALIGN_16 must be used"
> > +   *
> > +   * From the hardware spec for GEN9:
> > +   * "When Auxiliary Surface Mode is set to AUX_CCS_D or AUX_CCS_E,
> HALIGN
> > +   *  16 must be used."
> > +   */
> > +  if (brw->gen >= 9 || mt->num_samples == 1)
> > + assert(mt->halign == 16);
> > +
> > +  if (brw->gen >= 9) {
> > + assert(mt->tiling == I915_TILING_Y);
> > + switch (_mesa_get_format_bytes(mt->format)) {
> > + case 4:  surf->format = ISL_FORMAT_GEN9_CCS_32BPP;   break;
> > + case 8:  surf->format = ISL_FORMAT_GEN9_CCS_64BPP;   break;
> > + case 16: surf->format = ISL_FORMAT_GEN9_CCS_128BPP;  break;
> > + default:
> > +unreachable("Invalid format size for color compression");
> > + }
> > +  } else if (mt->tiling == I915_TILING_Y) {
> > + switch (_mesa_get_format_bytes(mt->format)) {
> > + case 4:  surf->format = ISL_FORMAT_GEN7_CCS_32BPP_Y;break;
> > + case 8:  surf->format = ISL_FORMAT_GEN7_CCS_64BPP_Y;break;
> > + case 16: surf->format = ISL_FORMAT_GEN7_CCS_128BPP_X;   break;
> > + default:
> > +unreachable("Invalid format size for color compression");
> > + }
> > +  } else {
> > + assert(mt->tiling == I915_TILING_X);
> > + switch (_mesa_get_format_bytes(mt->format)) {
> > + case 4:  surf->format = ISL_FORMAT_GEN7_CCS_32BPP_X;break;
> > + case 8:  surf->format = ISL_FORMAT_GEN7_CCS_64BPP_X;break;
> > + case 16: surf->format = ISL_FORMAT_GEN7_CCS_128BPP_X;   break;
> > + default:
> > +unreachable("Invalid format size for color compression");
> > + }
> > +  }
> > +  break;
> > +
> > +   case 2:
> > +  surf->format = ISL_FORMAT_MCS_2X;
> > +  break;
> > +   case 4:
> > +  surf->format = ISL_FORMAT_MCS_4X;
> > +  break;
> > +   case 8:
> > +  surf->format = ISL_FORMAT_MCS_8X;
> > +  break;
> > +   case 16:
> > +  surf->format = ISL_FORMAT_MCS_16X;
> > +  break;
> > +   default:
> > +  unreachable("Invalid number of samples");
> > +   }
> > +}
> > +
> >  union isl_color_value
> >  intel_miptree_get_isl_clear_color(struct brw_context *brw,
> >const struct intel_mipmap_tree *mt)
> > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
> 

Re: [Mesa-dev] [PATCH v2 12/34] i965/miptree: Add a helper for getting the aux isl_surf from a miptree

2016-06-27 Thread Pohjolainen, Topi
On Thu, Jun 23, 2016 at 02:00:11PM -0700, Jason Ekstrand wrote:
> ---
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 89 
> +++
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  5 ++
>  2 files changed, 94 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index 8a746ec..0f17411 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -3167,6 +3167,95 @@ intel_miptree_get_isl_surf(struct brw_context *brw,
> surf->usage = 0; /* TODO */
>  }
>  
> +/* WARNING: THE SURFACE CREATED BY THIS FUNCTION IS NOT COMPLETE AND CANNOT 
> BE
> + * USED FOR ANY REAL CALCULATIONS.  THE ONLY VALID USE OF SUCH A SURFACE IS 
> TO
> + * PASS IT INTO isl_surf_fill_state.
> + */
> +void
> +intel_miptree_get_aux_isl_surf(struct brw_context *brw,
> +   const struct intel_mipmap_tree *mt,
> +   struct isl_surf *surf,
> +   enum isl_aux_layout *layout)
> +{
> +   /* Much is the same as the regular surface */
> +   intel_miptree_get_isl_surf(brw, mt->mcs_mt, surf);
> +
> +   /* Figure out the layout */
> +   if (mt->num_samples > 1) {
> +  if (mt->msaa_layout == INTEL_MSAA_LAYOUT_CMS)
> + *layout = ISL_AUX_LAYOUT_MCS;
> +  else
> + *layout = ISL_AUX_LAYOUT_NONE;
> +   } else if (intel_miptree_is_lossless_compressed(brw, mt)) {
> +  assert(brw->gen >= 9);
> +  *layout = ISL_AUX_LAYOUT_CCS_E;
> +   } else if (mt->fast_clear_state != INTEL_FAST_CLEAR_STATE_NO_MCS) {
> +  *layout = ISL_AUX_LAYOUT_CCS_D;
> +   } else {
> +  *layout = ISL_AUX_LAYOUT_NONE;
> +   }

Logic below doesn't use the resolved value of "*layout". Would it be cleaner
to have this if-else-ladder as its own small helper function?

> +
> +   /* Figure out the format of the auxiliary surface */
> +   switch (mt->num_samples) {
> +   case 0:
> +   case 1:
> +  /*
> +   * From the BDW PRM, Volume 2d, page 260 (RENDER_SURFACE_STATE):
> +   * "When MCS is enabled for non-MSRT, HALIGN_16 must be used"
> +   *
> +   * From the hardware spec for GEN9:
> +   * "When Auxiliary Surface Mode is set to AUX_CCS_D or AUX_CCS_E, 
> HALIGN
> +   *  16 must be used."
> +   */
> +  if (brw->gen >= 9 || mt->num_samples == 1)
> + assert(mt->halign == 16);
> +
> +  if (brw->gen >= 9) {
> + assert(mt->tiling == I915_TILING_Y);
> + switch (_mesa_get_format_bytes(mt->format)) {
> + case 4:  surf->format = ISL_FORMAT_GEN9_CCS_32BPP;   break;
> + case 8:  surf->format = ISL_FORMAT_GEN9_CCS_64BPP;   break;
> + case 16: surf->format = ISL_FORMAT_GEN9_CCS_128BPP;  break;
> + default:
> +unreachable("Invalid format size for color compression");
> + }
> +  } else if (mt->tiling == I915_TILING_Y) {
> + switch (_mesa_get_format_bytes(mt->format)) {
> + case 4:  surf->format = ISL_FORMAT_GEN7_CCS_32BPP_Y;break;
> + case 8:  surf->format = ISL_FORMAT_GEN7_CCS_64BPP_Y;break;
> + case 16: surf->format = ISL_FORMAT_GEN7_CCS_128BPP_X;   break;
> + default:
> +unreachable("Invalid format size for color compression");
> + }
> +  } else {
> + assert(mt->tiling == I915_TILING_X);
> + switch (_mesa_get_format_bytes(mt->format)) {
> + case 4:  surf->format = ISL_FORMAT_GEN7_CCS_32BPP_X;break;
> + case 8:  surf->format = ISL_FORMAT_GEN7_CCS_64BPP_X;break;
> + case 16: surf->format = ISL_FORMAT_GEN7_CCS_128BPP_X;   break;
> + default:
> +unreachable("Invalid format size for color compression");
> + }
> +  }
> +  break;
> +
> +   case 2:
> +  surf->format = ISL_FORMAT_MCS_2X;
> +  break;
> +   case 4:
> +  surf->format = ISL_FORMAT_MCS_4X;
> +  break;
> +   case 8:
> +  surf->format = ISL_FORMAT_MCS_8X;
> +  break;
> +   case 16:
> +  surf->format = ISL_FORMAT_MCS_16X;
> +  break;
> +   default:
> +  unreachable("Invalid number of samples");
> +   }
> +}
> +
>  union isl_color_value
>  intel_miptree_get_isl_clear_color(struct brw_context *brw,
>const struct intel_mipmap_tree *mt)
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h 
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
> index a50f181..6422e42 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
> @@ -801,6 +801,11 @@ void
>  intel_miptree_get_isl_surf(struct brw_context *brw,
> const struct intel_mipmap_tree *mt,
> struct isl_surf *surf);
> +void
> +intel_miptree_get_aux_isl_surf(struct brw_context *brw,
> +   const struct intel_mipmap_tree *mt,
> +   

[Mesa-dev] [PATCH] glsl: atomic counters are different than their uniforms

2016-06-27 Thread Andres Gomez
The linker deals with atomic counters in terms of uniforms. This is OK
but when we want to know the number of used atomic counters since a 2
elements atomic counters array will use 2 counters but only 1 uniform.

Renamed the data structures used in the linker for disambiguation.

Fixes GL44-CTS.arrays_of_arrays_gl.AtomicDeclaration

Signed-off-by: Andres Gomez 
---
 src/compiler/glsl/link_atomics.cpp | 83 --
 1 file changed, 44 insertions(+), 39 deletions(-)

diff --git a/src/compiler/glsl/link_atomics.cpp 
b/src/compiler/glsl/link_atomics.cpp
index 277d473..10c3c1a 100644
--- a/src/compiler/glsl/link_atomics.cpp
+++ b/src/compiler/glsl/link_atomics.cpp
@@ -30,9 +30,9 @@
 
 namespace {
/*
-* Atomic counter as seen by the program.
+* Atomic counter uniform as seen by the program.
 */
-   struct active_atomic_counter {
+   struct active_atomic_counter_uniform {
   unsigned uniform_loc;
   ir_variable *var;
};
@@ -44,44 +44,44 @@ namespace {
 */
struct active_atomic_buffer {
   active_atomic_buffer()
- : counters(0), num_counters(0), stage_references(), size(0)
+ : uniforms(0), num_uniforms(0), stage_counter_references(), size(0)
   {}
 
   ~active_atomic_buffer()
   {
- free(counters);
+ free(uniforms);
   }
 
   void push_back(unsigned uniform_loc, ir_variable *var)
   {
- active_atomic_counter *new_counters;
+ active_atomic_counter_uniform *new_uniforms;
 
- new_counters = (active_atomic_counter *)
-realloc(counters, sizeof(active_atomic_counter) *
-(num_counters + 1));
+ new_uniforms = (active_atomic_counter_uniform *)
+realloc(uniforms, sizeof(active_atomic_counter_uniform) *
+(num_uniforms + 1));
 
- if (new_counters == NULL) {
+ if (new_uniforms == NULL) {
 _mesa_error_no_memory(__func__);
 return;
  }
 
- counters = new_counters;
- counters[num_counters].uniform_loc = uniform_loc;
- counters[num_counters].var = var;
- num_counters++;
+ uniforms = new_uniforms;
+ uniforms[num_uniforms].uniform_loc = uniform_loc;
+ uniforms[num_uniforms].var = var;
+ num_uniforms++;
   }
 
-  active_atomic_counter *counters;
-  unsigned num_counters;
-  unsigned stage_references[MESA_SHADER_STAGES];
+  active_atomic_counter_uniform *uniforms;
+  unsigned num_uniforms;
+  unsigned stage_counter_references[MESA_SHADER_STAGES];
   unsigned size;
};
 
int
cmp_actives(const void *a, const void *b)
{
-  const active_atomic_counter *const first = (active_atomic_counter *) a;
-  const active_atomic_counter *const second = (active_atomic_counter *) b;
+  const active_atomic_counter_uniform *const first = 
(active_atomic_counter_uniform *) a;
+  const active_atomic_counter_uniform *const second = 
(active_atomic_counter_uniform *) b;
 
   return int(first->var->data.offset) - int(second->var->data.offset);
}
@@ -103,9 +103,9 @@ namespace {
const unsigned shader_stage)
{
   /* FIXME: Arrays of arrays get counted separately. For example:
-   * x1[3][3][2] = 9 counters
-   * x2[3][2]= 3 counters
-   * x3[2]   = 1 counter
+   * x1[3][3][2] = 9 uniforms, 18 atomic counters
+   * x2[3][2]= 3 uniforms, 6 atomic counters
+   * x3[2]   = 1 uniforms, 2 atomic counters
*
* However this code marks all the counters as active even when they
* might not be used.
@@ -129,7 +129,12 @@ namespace {
 
  buf->push_back(*uniform_loc, var);
 
- buf->stage_references[shader_stage]++;
+ /* When checking for atomic counters we should count every member in
+  * an array as an atomic counter reference. */
+ if (t->is_array())
+buf->stage_counter_references[shader_stage] += t->length;
+ else
+buf->stage_counter_references[shader_stage]++;
  buf->size = MAX2(buf->size, *offset + t->atomic_size());
 
  storage->offset = *offset;
@@ -170,22 +175,22 @@ namespace {
  if (buffers[i].size == 0)
 continue;
 
- qsort(buffers[i].counters, buffers[i].num_counters,
-   sizeof(active_atomic_counter),
+ qsort(buffers[i].uniforms, buffers[i].num_uniforms,
+   sizeof(active_atomic_counter_uniform),
cmp_actives);
 
- for (unsigned j = 1; j < buffers[i].num_counters; j++) {
+ for (unsigned j = 1; j < buffers[i].num_uniforms; j++) {
 /* If an overlapping counter found, it must be a reference to the
  * same counter from a different shader stage.
  */
-if (check_atomic_counters_overlap(buffers[i].counters[j-1].var,
- 

Re: [Mesa-dev] [PATCH 1/2] i965: Make emit_urb_writes() not produce an EOT message for GS.

2016-06-27 Thread Kenneth Graunke
On Monday, June 27, 2016 2:50:02 PM PDT Iago Toral wrote:
> On Sun, 2016-06-26 at 01:53 -0700, Kenneth Graunke wrote:
> > emit_urb_writes() contains code to emit an EOT write with no actual
> > data when there are no output varyings.  This makes sense for the VS
> > and TES stages, where it's called once at the end of the program.
> > 
> > However, in the geometry shader stage, emit_urb_writes() is called once
> > for every EmitVertex().  We explicitly emit a URB write with EOT set at
> > the end of the shader, separately from this path.  So we'd better not
> > terminate the thread.  This could get us into trouble for shaders which
> > do EmitVertex() with no varyings followed by SSBO/image/atomic writes.
> > 
> > It also caused us to emit multiple sends with EOT set, which apparently
> > confuses the register allocator into not using g112-g127 for all but
> > the first one.  This caused EU validation failures in OglGSCloth
> > shaders in shader-db.  (The actual application was fine, but shader-db
> > thinks there are no outputs because it doesn't understand transform
> > feedback.)
> > 
> > Cc: mesa-sta...@lists.freedesktop.org
> > Signed-off-by: Kenneth Graunke 
> > ---
> >  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 4 
> >  1 file changed, 4 insertions(+)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
> > b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> > index 3a49794..4a1ff30 100644
> > --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> > @@ -594,6 +594,10 @@ fs_visitor::emit_urb_writes(const fs_reg 
> > _vertex_count)
> >  *"The write data payload can be between 1 and 8 message phases 
> > long."
> >  */
> > if (vue_map->slots_valid == 0) {
> > +  /* For GS, just turn EmitVertex() into a no-op. */
> 
> Maybe it would be better to explain in this comment why we can do this
> safely here, which as you say would be because for GS we will send a
> send with EOT set at the end of the shader in any case.
> 
> Both patches are:
> Reviewed-by: Iago Toral Quiroga 

Good call.  I've updated this locally to:

  /* For GS, just turn EmitVertex() into a no-op.  We don't want it to
   * end the thread, and emit_gs_thread_end() already emits a SEND with
   * EOT at the end of the program for us.
   */

Thanks Iago!


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Fwd: [PATCH] st/vdpau: use bicubic filter for scaling

2016-06-27 Thread Christian König

This code fragment:

+   /* t = frac(i_vtex*size)
...
+   ureg_MUL(shader, t, i_vtex, ureg_imm1f(shader, size));

Probably doesn't do what you expect it to do when the pixel center is at 
0.5 instead of 0.0.


For the matrix and most other filters the difference doesn't matter 
because you get the same offset on x/y as input you need to apply in the 
texture instructions as well.


Regards,
Christian.

Am 27.06.2016 um 15:51 schrieb Nayan Deshmukh:

Hi Christian,

I haven't taken that into account, but how will it any way affect my 
calculation. I have written

the code taking inspiration from the way matrix_filter uses offsets.

Regards,
Nayan.

On Mon, Jun 27, 2016 at 6:55 PM, Christian König 
> wrote:


Hi guys,

Nayan have you taken into account that the pixel center is at 0.5
and not 0.0?

Regards,
Christian.


Am 26.06.2016 um 22:30 schrieb Andy Furniss:

Nayan Deshmukh wrote:

Hi Andy,

On Sun, Jun 26, 2016 at 12:25 AM, Andy Furniss
> wrote:

Nayan Deshmukh wrote:

Hi Andy,

Thanks for testing the patches.

Please send me the videos and ratios with which
there is corruption.




https://drive.google.com/file/d/0BxP5-S1t9VEEaHZEM203RFpyNEE/view?usp=sharing


This has no aspect encoded and displayed fullscreen on
a 1920x1080
monitor shows vertical line artifacts over the first
2/3 of the image.

When I say lines they are not lines as such just that
the distortion
on the pendulum shows as it passes over imaginary
lines at fixed
points on the screen.

with mplayer -aspect 4/3 or 16/9 it doesn't.


I tested the videos and found out that the distortion is
because of the
amount
of calculation done in the fragment shader. I tested the
video with
vl_median_filter
and it showed no distortion however, with
vl_matrix_filter( which requires
more
calculations than vl_median_filter) it showed the same
distortion. I'll try
to make it
more efficient. But it still requires a lot of processing
for a single
pixel as it uses
15 neighbouring pixel.


Seems a bit strange, does the processing needed vary greatly with
similar scale amounts? I have a powerful GPU and can force clocks
high, but it makes no difference.

Below is a png showing the artifacts I see on pendulum fullscreen
are these what you see?

If rather than full screen I stretch out the window to scale,
there
will be many sizes that don't produce those.


https://drive.google.com/file/d/0BxP5-S1t9VEEd2hwNVp0ZXRSZTA/view?usp=sharing


Also I don't see any offsets with the videos, may be I am
missing something.
If could tell me more about the offsets, I'll try to debug
them.



https://drive.google.com/file/d/0BxP5-S1t9VEEUGZTbndOMzBNZnM/view?usp=sharing


Is a default scale, if you download both pngs and use something to
display them both at the same time and line up the windows one on
top of the other then flip between them you can see although the
windows are lined up the images contained are not.

You can make your own screen/window shots with xwd and display
them
with xwud. For me using fluxbox as a desktop it's easy to line up
windows as they snap a bit towards the edge of the screen YMMV.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org

https://lists.freedesktop.org/mailman/listinfo/mesa-dev





___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/4] gallium: add pipe_surface::alpha_one field

2016-06-27 Thread Brian Paul

On 06/24/2016 08:45 AM, Roland Scheidegger wrote:

For what formats is this really needed? I think that usually if you have
a rgb surface, the corresponding rgbx format should be used instead of
rgba (which implicitly has the alpha == 1 property for blending). But
maybe some formats are missing...


This all came about while finishing up support for GL_ARB_copy_image. 
Briefly, supporting RGBX caused quite a bit of grief that could be 
avoided by disabling the RGBX format and using RGBA instead.  But this 
leads to the blending issue (which, again, is a corner case not likely 
to be hit in practice).


Some 'fallback' paths for GL_ARB_copy_image use pipe_context::blit() and 
involves creating RGBA8_UNORM sampler views and surface views of the 
underlying resource.  Since our svga3d device doesn't have RGBX8_UINT, 
RGBX8_SNORM, etc. we use RGBA8_UINT, RGBA8_SNORM, etc. but we do support 
RGBX8_UNORM.  So copying from GL_RGB8UI to GL_RGB8, for example, 
involves creating a RGBA8_UNORM view of a RGBX8_TYPLESS surface, which 
IIRC, is disallowed by our svga3d device.


Before I quit on Friday, however, I think I found another piece of this 
puzzle that might make RGBX work.  I'll try to get back on that today.


-Brian



Am 24.06.2016 um 04:07 schrieb Brian Paul:

If the user requests an RGB drawing surface but we actually create an
RGBA surface, we need it to act as if A=1.  For blending, this means
adjusting the blending terms to use 1/0 instead of DST_ALPHA/INV_DST_ALPHA.
Drivers can use this flag to determine when that's needed.

A previous patch I posted last year did this entirely in the state tracker
but it involved making blend state dependent on the framebuffer state.
This approach avoids that dependency.
---
  src/gallium/include/pipe/p_state.h | 1 +
  1 file changed, 1 insertion(+)

diff --git a/src/gallium/include/pipe/p_state.h 
b/src/gallium/include/pipe/p_state.h
index 9c69355..8b0c3a2 100644
--- a/src/gallium/include/pipe/p_state.h
+++ b/src/gallium/include/pipe/p_state.h
@@ -387,6 +387,7 @@ struct pipe_surface
 unsigned height;  /**< logical height in pixels */

 unsigned writable:1;  /**< writable shader resource */
+   unsigned alpha_one:1;/**< Should an RGBA surface should act like RGB? */

 union {
struct {





___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Fwd: [PATCH] st/vdpau: use bicubic filter for scaling

2016-06-27 Thread Nayan Deshmukh
Hi Christian,

I haven't taken that into account, but how will it any way affect my
calculation. I have written
the code taking inspiration from the way matrix_filter uses offsets.

Regards,
Nayan.

On Mon, Jun 27, 2016 at 6:55 PM, Christian König 
wrote:

> Hi guys,
>
> Nayan have you taken into account that the pixel center is at 0.5 and not
> 0.0?
>
> Regards,
> Christian.
>
>
> Am 26.06.2016 um 22:30 schrieb Andy Furniss:
>
>> Nayan Deshmukh wrote:
>>
>>> Hi Andy,
>>>
>>> On Sun, Jun 26, 2016 at 12:25 AM, Andy Furniss 
>>> wrote:
>>>
>>> Nayan Deshmukh wrote:

 Hi Andy,
>
> Thanks for testing the patches.
>
> Please send me the videos and ratios with which there is corruption.
>
>


 https://drive.google.com/file/d/0BxP5-S1t9VEEaHZEM203RFpyNEE/view?usp=sharing

 This has no aspect encoded and displayed fullscreen on a 1920x1080
 monitor shows vertical line artifacts over the first 2/3 of the image.

 When I say lines they are not lines as such just that the distortion
 on the pendulum shows as it passes over imaginary lines at fixed
 points on the screen.

 with mplayer -aspect 4/3 or 16/9 it doesn't.


>>> I tested the videos and found out that the distortion is because of the
>>> amount
>>> of calculation done in the fragment shader. I tested the video with
>>> vl_median_filter
>>> and it showed no distortion however, with vl_matrix_filter( which
>>> requires
>>> more
>>> calculations than vl_median_filter) it showed the same distortion. I'll
>>> try
>>> to make it
>>> more efficient. But it still requires a lot of processing for a single
>>> pixel as it uses
>>> 15 neighbouring pixel.
>>>
>>
>> Seems a bit strange, does the processing needed vary greatly with
>> similar scale amounts? I have a powerful GPU and can force clocks
>> high, but it makes no difference.
>>
>> Below is a png showing the artifacts I see on pendulum fullscreen
>> are these what you see?
>>
>> If rather than full screen I stretch out the window to scale, there
>> will be many sizes that don't produce those.
>>
>>
>> https://drive.google.com/file/d/0BxP5-S1t9VEEd2hwNVp0ZXRSZTA/view?usp=sharing
>>
>> Also I don't see any offsets with the videos, may be I am missing
>>> something.
>>> If could tell me more about the offsets, I'll try to debug them.
>>>
>>
>>
>> https://drive.google.com/file/d/0BxP5-S1t9VEEUGZTbndOMzBNZnM/view?usp=sharing
>>
>> Is a default scale, if you download both pngs and use something to
>> display them both at the same time and line up the windows one on
>> top of the other then flip between them you can see although the
>> windows are lined up the images contained are not.
>>
>> You can make your own screen/window shots with xwd and display them
>> with xwud. For me using fluxbox as a desktop it's easy to line up
>> windows as they snap a bit towards the edge of the screen YMMV.
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>
>
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] docs: update MESA_DEBUG envvar documentation.

2016-06-27 Thread Brian Paul

On 06/27/2016 02:10 AM, Alejandro Piñeiro wrote:

silent, flush, incomplete_tex and incomplete_fbo flags were not
documented (see src/mesa/main.debug.c for more info).

FP is not checked anymore.
---

Didn't know about the flush option for MESA_DEBUG until Grazvydas Ignotas
mentioned it on a freedesktop bug comment (thanks), so this patch documents
it.

CCing Eric Anholt because MESA_DEBUG=flush and VC4_DEBUG=always_flush seems
to have basically the same purpose, in case one needs to be removed.

CCing Brian because he added the incomplete_tex and incomplete_fbo flags
with commit 93bcf7, and I just added a really basic documentation line for
those options, just in case it is worth more details.

Note that I assume that all the specific flags for MESA_DEBUG are handled
at debuc.c:set_debug_flags. As it is not taking into account FP, Im also
dropping the FP line with this patch.


  docs/envvars.html | 9 +++--
  1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/docs/envvars.html b/docs/envvars.html
index ed957bd..abd8a72 100644
--- a/docs/envvars.html
+++ b/docs/envvars.html
@@ -50,8 +50,13 @@ sometimes be useful for debugging end-user issues.
 if the application generates a GL_INVALID_ENUM error, a corresponding error
 message indicating where the error occurred, and possibly why, will be
 printed to stderr.
-   If the value of MESA_DEBUG is 'FP' floating point arithmetic errors will
-   generate exceptions.
+  In addition to just set it, MESA_DEBUG accepts the following flags:


The first part of that sentence doesn't sound right.


+  
+silent - turn off debug messages
+flush - flush after each drawing command
+incomplete_tex - extra debug messages when a texture is incomplete
+incomplete_fbo - extra debug messages when a fbo is incomplete
+  


I don't even remember, does MESA_DEBUG take a comma-separated list of 
these value?


Some obscure env vars/settings haven't been documented because they're 
of limited use to end users.


Also, isn't MESA_DEBUG ignored in release builds?  Maybe you can check 
on that.


-Brian



  MESA_LOG_FILE - specifies a file name for logging all errors, warnings,
  etc., rather than stderr
  MESA_TEX_PROG - if set, implement conventional texture env modes with



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Fwd: [PATCH] st/vdpau: use bicubic filter for scaling

2016-06-27 Thread Christian König

Hi guys,

Nayan have you taken into account that the pixel center is at 0.5 and 
not 0.0?


Regards,
Christian.

Am 26.06.2016 um 22:30 schrieb Andy Furniss:

Nayan Deshmukh wrote:

Hi Andy,

On Sun, Jun 26, 2016 at 12:25 AM, Andy Furniss  
wrote:



Nayan Deshmukh wrote:


Hi Andy,

Thanks for testing the patches.

Please send me the videos and ratios with which there is corruption.




https://drive.google.com/file/d/0BxP5-S1t9VEEaHZEM203RFpyNEE/view?usp=sharing 



This has no aspect encoded and displayed fullscreen on a 1920x1080
monitor shows vertical line artifacts over the first 2/3 of the image.

When I say lines they are not lines as such just that the distortion
on the pendulum shows as it passes over imaginary lines at fixed
points on the screen.

with mplayer -aspect 4/3 or 16/9 it doesn't.



I tested the videos and found out that the distortion is because of the
amount
of calculation done in the fragment shader. I tested the video with
vl_median_filter
and it showed no distortion however, with vl_matrix_filter( which 
requires

more
calculations than vl_median_filter) it showed the same distortion. 
I'll try

to make it
more efficient. But it still requires a lot of processing for a single
pixel as it uses
15 neighbouring pixel.


Seems a bit strange, does the processing needed vary greatly with
similar scale amounts? I have a powerful GPU and can force clocks
high, but it makes no difference.

Below is a png showing the artifacts I see on pendulum fullscreen
are these what you see?

If rather than full screen I stretch out the window to scale, there
will be many sizes that don't produce those.

https://drive.google.com/file/d/0BxP5-S1t9VEEd2hwNVp0ZXRSZTA/view?usp=sharing 



Also I don't see any offsets with the videos, may be I am missing 
something.

If could tell me more about the offsets, I'll try to debug them.


https://drive.google.com/file/d/0BxP5-S1t9VEEUGZTbndOMzBNZnM/view?usp=sharing 



Is a default scale, if you download both pngs and use something to
display them both at the same time and line up the windows one on
top of the other then flip between them you can see although the
windows are lined up the images contained are not.

You can make your own screen/window shots with xwd and display them
with xwud. For me using fluxbox as a desktop it's easy to line up
windows as they snap a bit towards the edge of the screen YMMV.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] i965: Make emit_urb_writes() not produce an EOT message for GS.

2016-06-27 Thread Iago Toral
On Sun, 2016-06-26 at 01:53 -0700, Kenneth Graunke wrote:
> emit_urb_writes() contains code to emit an EOT write with no actual
> data when there are no output varyings.  This makes sense for the VS
> and TES stages, where it's called once at the end of the program.
> 
> However, in the geometry shader stage, emit_urb_writes() is called once
> for every EmitVertex().  We explicitly emit a URB write with EOT set at
> the end of the shader, separately from this path.  So we'd better not
> terminate the thread.  This could get us into trouble for shaders which
> do EmitVertex() with no varyings followed by SSBO/image/atomic writes.
> 
> It also caused us to emit multiple sends with EOT set, which apparently
> confuses the register allocator into not using g112-g127 for all but
> the first one.  This caused EU validation failures in OglGSCloth
> shaders in shader-db.  (The actual application was fine, but shader-db
> thinks there are no outputs because it doesn't understand transform
> feedback.)
> 
> Cc: mesa-sta...@lists.freedesktop.org
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> index 3a49794..4a1ff30 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> @@ -594,6 +594,10 @@ fs_visitor::emit_urb_writes(const fs_reg 
> _vertex_count)
>  *"The write data payload can be between 1 and 8 message phases long."
>  */
> if (vue_map->slots_valid == 0) {
> +  /* For GS, just turn EmitVertex() into a no-op. */

Maybe it would be better to explain in this comment why we can do this
safely here, which as you say would be because for GS we will send a
send with EOT set at the end of the shader in any case.

Both patches are:
Reviewed-by: Iago Toral Quiroga 

> +  if (stage == MESA_SHADER_GEOMETRY)
> + return;
> +
>fs_reg payload = fs_reg(VGRF, alloc.allocate(2), BRW_REGISTER_TYPE_UD);
>bld.exec_all().MOV(payload, urb_handle);
>  


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] glsl: add driconf to zero-init unintialized vars

2016-06-27 Thread Rob Clark
On Mon, Jun 27, 2016 at 7:13 AM, Alan Swanson  wrote:
> On 2016-06-25 13:37, Rob Clark wrote:
>>
>> Some games are sloppy.. perhaps because it is defined behavior for DX or
>> perhaps because nv blob driver defaults things to zero.
>>
>> So add driconf param to force uninitialized variables to default to zero.
>>
>> This issue was observed with rust, from steam store.  But has surfaced
>> elsewhere in the past.
>>
>> Signed-off-by: Rob Clark 
>> ---
>> Note that I left out the drirc bit, since not entirely sure how to
>> identify this game.  (I don't actually have the game, just working off
>> of an apitrace)
>>
>> Possibly worth mentioning that for the shaders using uninitialized vars
>> having zero-initializers lets constant-propagation get rid of a whole
>> lot of instructions.  One shader I saw dropped to less than half of
>> it's original instruction count.
>
>
> If the default for uninitialised variables is undefined, then with the
> reported shader optimisations why bother with the (DRI) option when
> zeroing could still essentially be classed as undefined?
>
> Cuts the patch down to just the src/compiler/glsl/ast_to_hir.cpp change.

I did suggest that on #dri-devel, but Jason had a theoretical example
where it would hurt.. iirc something like:

  float maybe_undef;
  for (int i = 0; i < some_uniform_at_least_one; i++)
 maybe_undef = ...

also, he didn't want to hide shader bugs that app should fix.

It would be interesting to rush shaderdb w/ glsl_zero_init=true and
see what happens, but I didn't get around to that yet.

I did have the idea to try and only inject the initializer when the
compiler warns about uninitialized vars, but in the presence of flow
control there seem like some cases that glsl doesn't catch.

BR,
-R


> --
> Alan.
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] glsl: add driconf to zero-init unintialized vars

2016-06-27 Thread Alan Swanson

On 2016-06-25 13:37, Rob Clark wrote:
Some games are sloppy.. perhaps because it is defined behavior for DX 
or

perhaps because nv blob driver defaults things to zero.

So add driconf param to force uninitialized variables to default to 
zero.


This issue was observed with rust, from steam store.  But has surfaced
elsewhere in the past.

Signed-off-by: Rob Clark 
---
Note that I left out the drirc bit, since not entirely sure how to
identify this game.  (I don't actually have the game, just working off
of an apitrace)

Possibly worth mentioning that for the shaders using uninitialized vars
having zero-initializers lets constant-propagation get rid of a whole
lot of instructions.  One shader I saw dropped to less than half of
it's original instruction count.


If the default for uninitialised variables is undefined, then with the
reported shader optimisations why bother with the (DRI) option when
zeroing could still essentially be classed as undefined?

Cuts the patch down to just the src/compiler/glsl/ast_to_hir.cpp change.

--
Alan.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC] Coding style scripts (Was Re: [PATCH 1/2] gallium: replace [0-9]*.f with [0-9]*.0f)

2016-06-27 Thread Jose Fonseca

On 22/06/16 13:25, Emil Velikov wrote:

Hi All,

Seems like we have a few people are keen on the idea of having some
form of at least semi-automated way to handle coding style issues.

Some options/ideas:
  - Combine the emacs .dir-local.el + emacs -batch to do the checking:
Pros: rules aren't duplicated in multiple places (like the second
option). Cons: not everyone has emacs






  - or, check-in a few (as needed) xa-indent style scripts based on indent.
Pros: indent seems (imho) more widely spread. Cons: the style rules
are duplicated.

IMHO we don't have to 'enforce' one or the other throughout the tree.
Having either one would be beneficial, but definitely not a
requirement.

Once we're happy with that, we could have a simple toplevel
"check-all-style" script, which can be used by both developers and git
hooks. And with time patchwork/other solution will be able to
pre-emptively run these and provide feedback, at which time we'll
toggle the git hooks to reject 'non-compliant' pushes (or even before
the pw stuff is in place) ?

There's a couple small catches
  - I cannot convince emacs to honour .dir-locals.el in batch mode. Any takers ?
  - We might have a couple of initial "x: unify coding style" commits.

Rob, seems like most (all?) of your mode lines in freedreno are busted
(missing : after tab-width, typo(s) in tab-width). Considering there's
a fdno .dir-locals.el are you ok with just nuking them all together ?

Thanks
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev




I like the idea of using automated checks to check style, and maybe in 
the future do even more things like check for common errors (e.g., 
particularly portability issues that are not caught by any compiler 
warning, using unsafe functions, etc)



But using Emacs seems at odds with not enforcing.  Not enforcing means 
it must be easy for developers to runs the checks/formatters.  So a more 
lightweight dependency would be better.



BTW, I've been using http://editorconfig.org/ on several projects. It's 
widely supported by many editors including Emacs.


There's even Python based tools to check editorconfig ( 
https://github.com/editorconfig/editorconfig/wiki/FAQ#my-files-are-not-automatically-reformatted-the-editorconfig-plugin-is-not-working 
).  We already require Python, downloading a Python package is trivial 
to do nowadays with pip, so we could easily standardize on .editorconfig 
files everywhere.



Another solution could be http://clang.llvm.org/extra/clang-tidy/ .  I 
read it can handle C too.  And this could be a tool we could use beyond 
formatting.



Jose

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [Piglit] [RFC PATCH] arb_texture_barrier: call glTextureBarrier after each glDrawRangeElements

2016-06-27 Thread Alejandro Piñeiro
On 27/06/16 03:08, Grazvydas Ignotas wrote:
> On Sat, Jun 25, 2016 at 4:54 PM, Alejandro Piñeiro  
> wrote:
>> In theory they don't overdrawn. The test has a square formed by N
>> non-overlapping triangles. With just one call to glDrawRangeElements,
>> this always works. But if we split it in M subsets, so M calls to
>> glDrawRangeElements with N/M triangles per call, with low resolutions
>> and a high amount of triangles, some texels start to fail. And not
>> always the same amount of texels (so as you mentioned, undefined
>> territory). Although it is a guess, I assume that is cause for artifacts
>> on the triangle borders when drawing with so low resolution.
> Perhaps this is somehow related to this bug:
> https://bugs.freedesktop.org/show_bug.cgi?id=96624
> There a texture problem shows up after one of glDrawRangeElements
> calls with a lot of primitives.

Not sure if they are related. Note that the issue I mention gets fully
solved with glTextureBarrier and additionally, it happens on haswell too.

In any case, reading that bug gave me an idea. I have been toying with
MESA_DEBUG=flush (that until your email I didn't know that exists).
Looking at the code, it adds the equivalent to a glFlush after each
drawing call. Using it gets the test (all 48 subtests) passing. So right
now Im more biased to think that is an issue on glTextureBarrier.

BR

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [Piglit] [RFC PATCH] arb_texture_barrier: call glTextureBarrier after each glDrawRangeElements

2016-06-27 Thread Alejandro Piñeiro


On 27/06/16 12:33, Alejandro Piñeiro wrote:
> On 27/06/16 03:08, Grazvydas Ignotas wrote:
>> On Sat, Jun 25, 2016 at 4:54 PM, Alejandro Piñeiro  
>> wrote:
>>> In theory they don't overdrawn. The test has a square formed by N
>>> non-overlapping triangles. With just one call to glDrawRangeElements,
>>> this always works. But if we split it in M subsets, so M calls to
>>> glDrawRangeElements with N/M triangles per call, with low resolutions
>>> and a high amount of triangles, some texels start to fail. And not
>>> always the same amount of texels (so as you mentioned, undefined
>>> territory). Although it is a guess, I assume that is cause for artifacts
>>> on the triangle borders when drawing with so low resolution.
>> Perhaps this is somehow related to this bug:
>> https://bugs.freedesktop.org/show_bug.cgi?id=96624
>> There a texture problem shows up after one of glDrawRangeElements
>> calls with a lot of primitives.
> Not sure if they are related. Note that the issue I mention gets fully
> solved with glTextureBarrier and additionally, it happens on haswell too.
>
> In any case, reading that bug gave me an idea. I have been toying with
> MESA_DEBUG=flush (that until your email I didn't know that exists).
> Looking at the code, it adds the equivalent to a glFlush after each
> drawing call. Using it gets the test (all 48 subtests) passing. So right
> now Im more biased to think that is an issue on glTextureBarrier.

Another hint towards being an issue on glTextureBarrier implementation:
if instead of using MESA_DEBUG=flush, I replace glTextureBarrier for
glFlush on the original test (so one full flush after all the square,
but not after each drawing call) all the subtests keep passing too.

BR
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/3] st/omx: add initial support for nouveau

2016-06-27 Thread Christian König
Reviewed-by: Christian König  for the whole 
series.


Regards,
Christian.

Am 27.06.2016 um 10:16 schrieb Julien Isorce:

3 patches I made in last january to allow running st/omx with
the nouveau video driver. But currently the video is full of
blockiness because nouveau driver requires manual reference
frames handling. Which is not the case for AMD hardware.
At least these patches avoid to crash directly.

Tested with Bellagio library on Desktop with gstomxh264dec plugin.

Julien Isorce (3):
   st/omx: retrieve preferred interlaced and buffer_formats
   st/omx: add support for nouveau / interlaced
   st/omx: count number of slices

  src/gallium/state_trackers/omx/vid_dec.c  | 77 +--
  src/gallium/state_trackers/omx/vid_dec_h264.c |  3 ++
  2 files changed, 53 insertions(+), 27 deletions(-)



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] radeonsi: use optimal WD settings for primitive restart on Polaris

2016-06-27 Thread Nicolai Hähnle

Reviewed-by: Nicolai Hähnle 

On 24.06.2016 18:20, Marek Olšák wrote:

From: Marek Olšák 

ported from Vulkan
---
  src/gallium/drivers/radeonsi/si_state_draw.c | 12 ++--
  1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
b/src/gallium/drivers/radeonsi/si_state_draw.c
index 717149b..5f866d5 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -291,13 +291,21 @@ static unsigned si_get_ia_multi_vgt_param(struct 
si_context *sctx,
if (sctx->b.chip_class >= CIK) {
/* WD_SWITCH_ON_EOP has no effect on GPUs with less than
 * 4 shader engines. Set 1 to pass the assertion below.
-* The other cases are hardware requirements. */
+* The other cases are hardware requirements.
+*
+* Polaris supports primitive restart with WD_SWITCH_ON_EOP=0
+* for points, line strips, and tri strips.
+*/
if (sctx->b.screen->info.max_se < 4 ||
prim == PIPE_PRIM_POLYGON ||
prim == PIPE_PRIM_LINE_LOOP ||
prim == PIPE_PRIM_TRIANGLE_FAN ||
prim == PIPE_PRIM_TRIANGLE_STRIP_ADJACENCY ||
-   info->primitive_restart ||
+   (info->primitive_restart &&
+(sctx->b.family < CHIP_POLARIS10 ||
+ (prim != PIPE_PRIM_POINTS &&
+  prim != PIPE_PRIM_LINE_STRIP &&
+  prim != PIPE_PRIM_TRIANGLE_STRIP))) ||
info->count_from_stream_output)
wd_switch_on_eop = true;



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 5/6] gallium/radeon: add a heuristic dynamically enabling DCC for scanout surfaces

2016-06-27 Thread Nicolai Hähnle

On 24.06.2016 20:48, Marek Olšák wrote:

On Fri, Jun 24, 2016 at 1:09 PM, Nicolai Hähnle  wrote:

On 22.06.2016 20:29, Marek Olšák wrote:


From: Marek Olšák 

DCC for displayable surfaces is allocated in a separate buffer and is
enabled or disabled based on PS invocations from 2 frames ago (to let
queries go idle) and the number of slow clears from the current frame.

At least an equivalent of 5 fullscreen draws or slow clears must be done
to enable DCC. (PS invocations / (width * height) + num_slow_clears >= 5)

Pipeline statistic queries are always active if a color buffer that can
have separate DCC is bound, even if separate DCC is disabled. That means
the window color buffer is always monitored and DCC is enabled only when
the situation is right.

The tracking of per-texture queries in r600_common_context is quite ugly,
but I don't see a better way.

The first fast clear always enables DCC. DCC decompression can disable it.
A later fast clear can enable it again. Enable/disable typically happens
only once per frame.

The impact is expected to be negligible because games usually don't have
a high level of overdraw. DCC usually activates when too much blending
is happening (smoke rendering) or when testing glClear performance and
CMASK isn't supported (Stoney).



Nice stuff. One corner case to think of: what happens when DCC is enabled
for a texture that is currently bound? Needs the same treatment as when DCC
is disabled, right?


Not sure what you mean, can you be more specific? Note that it can't
be bound as a sampler by apps because it's a window framebuffer.


Okay, that makes sense.



More comments below...



---
   src/gallium/drivers/radeon/r600_pipe_common.c |  15 ++
   src/gallium/drivers/radeon/r600_pipe_common.h |  40 +
   src/gallium/drivers/radeon/r600_texture.c | 239
++
   src/gallium/drivers/radeonsi/si_blit.c|  14 +-
   src/gallium/drivers/radeonsi/si_state.c   |  15 ++
   src/gallium/drivers/radeonsi/si_state_draw.c  |   5 +-
   6 files changed, 326 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c
b/src/gallium/drivers/radeon/r600_pipe_common.c
index 5d4a679..66afcfa 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.c
+++ b/src/gallium/drivers/radeon/r600_pipe_common.c
@@ -397,6 +397,21 @@ bool r600_common_context_init(struct
r600_common_context *rctx,

   void r600_common_context_cleanup(struct r600_common_context *rctx)
   {
+   unsigned i,j;
+
+   /* Release DCC stats. */
+   for (i = 0; i < ARRAY_SIZE(rctx->dcc_stats); i++) {
+   assert(!rctx->dcc_stats[i].query_active);
+
+   for (j = 0; j < ARRAY_SIZE(rctx->dcc_stats[i].ps_stats);
j++)
+   if (rctx->dcc_stats[i].ps_stats[j])
+   rctx->b.destroy_query(>b,
+
rctx->dcc_stats[i].ps_stats[j]);
+
+   pipe_resource_reference((struct pipe_resource**)
+   >dcc_stats[i].tex, NULL);
+   }
+
 if (rctx->gfx.cs)
 rctx->ws->cs_destroy(rctx->gfx.cs);
 if (rctx->dma.cs)
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h
b/src/gallium/drivers/radeon/r600_pipe_common.h
index 92cba13..cdec907 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.h
+++ b/src/gallium/drivers/radeon/r600_pipe_common.h
@@ -272,6 +272,25 @@ struct r600_texture {
  * dcc_offset contains the absolute GPUVM address, not the
relative one.
  */
 struct r600_resource*dcc_separate_buffer;
+   /* When DCC is temporarily disabled, the separate buffer is here.
*/
+   struct r600_resource*last_dcc_separate_buffer;
+   /* We need to track DCC dirtiness, because st/dri usually calls
+* flush_resource twice per frame (not a bug) and we don't wanna
+* decompress DCC twice. Also, the dirty tracking must be done
even
+* if DCC isn't used, because it's required by the DCC usage
analysis
+* for a possible future enablement.
+*/
+   boolseparate_dcc_dirty;
+   /* Statistics gathering for the DCC enablement heuristic. */
+   booldcc_gather_statistics;
+   /* Estimate of how much this color buffer is written to in units
of
+* full-screen draws: ps_invocations / (width * height)
+* Shader kills, late Z, and blending with trivial discards make
it
+* inaccurate (we need to count CB updates, not PS invocations).
+*/
+   unsignedps_draw_ratio;
+   /* The number of clears since the last DCC usage analysis. */
+   unsignednum_slow_clears;

 /* Counter that should be non-zero if the texture is bound to a
  * framebuffer. Implemented in radeonsi only.
@@ -536,6 +555,21 @@ struct r600_common_context {
 float  

  1   2   >