Re: [Mesa-dev] [PATCH 2/7] glsl/glcpp: use ralloc_sprint_rewrite_tail to avoid slow vsprintf

2016-12-31 Thread Kenneth Graunke
On Sunday, January 1, 2017 1:34:27 AM PST Marek Olšák wrote:
> From: Marek Olšák 
> 
> This reduces compile times by 4.5% with the Gallium noop driver and
> gl_constants::GLSLOptimizeConservatively == true.

Compile times of...what exactly?  Do you have any statistics for this
by itself?

Assuming we add your helper, this patch looks reasonable.
Reviewed-by: Kenneth Graunke 

BTW, I suspect you could get some additional speed up by changing

   parser->output = ralloc_strdup(parser, "");

to something like:

   parser->output = ralloc_size(parser, strlen(orig_concatenated_src));
   parser->output[0] = '\0';

to try and avoid reallocations.  rewrite_tail will realloc just enough
space every time it allocates, which means once you reallocate, you're
going to be calling realloc on every single token.  Yuck!

ralloc/talloc's string libraries were never meant for serious string
processing like the preprocessor does.  They're meant for convenience
when constructing debug messages which don't need to be that efficient.

Perhaps a better approach would be to have the preprocessor do this
itself.  Just ralloc_size() the output and initialize the null byte.
reralloc to double the size if you need more space.  At the end of
preprocessing, reralloc to output_length to free any waste
from doubling.

I suspect that would be a *lot* more efficient, and is probably what
we should have done in the first place...
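
A minimal sketch of the doubling scheme described above, for illustration
only.  It uses plain malloc/realloc as a stand-in for ralloc_size/reralloc,
and struct out_buf with its out_buf_* helpers are hypothetical names, not
anything in glcpp or ralloc:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the preprocessor's output state. */
struct out_buf {
   char *data;       /* always null-terminated */
   size_t length;    /* bytes used, not counting the '\0' */
   size_t capacity;  /* bytes allocated, always >= length + 1 */
};

/* Allocate once up front; size_hint could be the length of the source. */
static int
out_buf_init(struct out_buf *b, size_t size_hint)
{
   assert(b != NULL);
   b->capacity = size_hint + 1;
   b->data = malloc(b->capacity);
   if (b->data == NULL)
      return 0;
   b->data[0] = '\0';
   b->length = 0;
   return 1;
}

/* Append text_length bytes, doubling the allocation only when needed. */
static int
out_buf_append(struct out_buf *b, const char *text, size_t text_length)
{
   size_t needed = b->length + text_length + 1;

   if (needed > b->capacity) {
      size_t new_capacity = b->capacity;
      while (new_capacity < needed)
         new_capacity *= 2;

      char *ptr = realloc(b->data, new_capacity);
      if (ptr == NULL)
         return 0;
      b->data = ptr;
      b->capacity = new_capacity;
   }

   memcpy(b->data + b->length, text, text_length);
   b->length += text_length;
   b->data[b->length] = '\0';
   return 1;
}

/* At the end, shrink to length + 1 to give back the waste from doubling. */
static void
out_buf_finish(struct out_buf *b)
{
   char *ptr = realloc(b->data, b->length + 1);
   if (ptr != NULL) {
      b->data = ptr;
      b->capacity = b->length + 1;
   }
}
```

With doubling, the number of realloc calls grows with the logarithm of the
final output size instead of with the number of tokens.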


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/7] ralloc: add a new printing helper ralloc_sprint_rewrite_tail

2016-12-31 Thread Kenneth Graunke
On Sunday, January 1, 2017 1:34:26 AM PST Marek Olšák wrote:
> From: Marek Olšák 
> 
> This one is much faster when you don't need vsprintf.
> ---
>  src/util/ralloc.c | 25 +
>  src/util/ralloc.h | 24 
>  2 files changed, 49 insertions(+)
> 
> diff --git a/src/util/ralloc.c b/src/util/ralloc.c
> index 980e4e4..7976ca6 100644
> --- a/src/util/ralloc.c
> +++ b/src/util/ralloc.c
> @@ -522,20 +522,45 @@ ralloc_vasprintf_rewrite_tail(char **str, size_t 
> *start, const char *fmt,
> ptr = resize(*str, *start + new_length + 1);
> if (unlikely(ptr == NULL))
>return false;
>  
> vsnprintf(ptr + *start, new_length + 1, fmt, args);
> *str = ptr;
> *start += new_length;
> return true;
>  }
>  
> +bool
> +ralloc_sprint_rewrite_tail(char **str, size_t *start, const char *text,
> +   unsigned text_length)
> +{

Cool.  I always thought about adding this.

Can we shorten it to "ralloc_rewrite_tail()" or
"ralloc_str_rewrite_tail()"?

> +   char *ptr;
> +
> +   assert(str != NULL);
> +
> +   if (unlikely(*str == NULL)) {
> +  /* Assuming a NULL context is probably bad, but it's expected 
> behavior. */
> +  *str = ralloc_strdup(NULL, text);
> +  *start = strlen(*str);
> +  return true;
> +   }
> +
> +   ptr = resize(*str, *start + text_length + 1);
> +   if (unlikely(ptr == NULL))
> +  return false;
> +
> +   memcpy(ptr + *start, text, text_length + 1); /* also copy '\0' */

I don't know how I feel about copying the \0 from the source string.
I suppose that works, and is probably more efficient.  But I noticed
that this has an awful lot of similarity to the cat() helper (for
ralloc_strncat), which copies n bytes and appends an \0 explicitly.

I suppose if we wanted to add a ralloc_strn_rewrite_tail(), we'd need
to do it that way.  I don't know that there's any use in that, though.

> +   *str = ptr;
> +   *start += text_length;
> +   return true;
> +}
> +
>  /***
>   * Linear allocator for short-lived allocations.
>   ***
>   *
>   * The allocator consists of a parent node (2K buffer), which requires
>   * a ralloc parent, and child nodes (allocations). Child nodes can't be freed
>   * directly, because the parent doesn't track them. You have to release
>   * the parent node in order to release all its children.
>   *
>   * The allocator uses a fixed-sized buffer with a monotonically increasing
> diff --git a/src/util/ralloc.h b/src/util/ralloc.h
> index 3e2d342..6c31a6d 100644
> --- a/src/util/ralloc.h
> +++ b/src/util/ralloc.h
> @@ -401,20 +401,44 @@ bool ralloc_asprintf_append (char **str, const char 
> *fmt, ...)
>   * \sa ralloc_strcat
>   *
>   * \p str will be updated to the new pointer unless allocation fails.
>   *
>   * \return True unless allocation failed.
>   */
>  bool ralloc_vasprintf_append(char **str, const char *fmt, va_list args);
>  /// @}
>  
>  /**
> + * Rewrite the tail of an existing string, starting at a given index.
> + *
> + * Overwrites the contents of *str starting at \p start with "text",
> + * including a new null-terminator.  Allocates more memory as necessary.
> + *
> + * This can be used to append formatted text when the length of the existing
> + * string is already known, saving a strlen() call.
> + *
> + * \sa ralloc_asprintf_rewrite_tail
> + *
> + * \param str          The string to be updated.
> + * \param start        The index to start appending new data at.
> + * \param text         The input string terminated by zero.
> + * \param text_length  The length of the input string.
> + *
> + * \p str will be updated to the new pointer unless allocation fails.
> + * \p start will be increased by the length of the newly formatted text.
> + *
> + * \return True unless allocation failed.
> + */
> +bool ralloc_sprint_rewrite_tail(char **str, size_t *start, const char *text,
> +unsigned text_length);
> +
> +/**
>   * Declare C++ new and delete operators which use ralloc.
>   *
>   * Placing this macro in the body of a class makes it possible to do:
>   *
>   * TYPE *var = new(mem_ctx) TYPE(...);
>   * delete var;
>   *
>   * which is more idiomatic in C++ than calling ralloc.
>   */
>  #define DECLARE_ALLOC_CXX_OPERATORS_TEMPLATE(TYPE, ALLOC_FUNC)   \
> 
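
For reference, the two ways of handling the terminator discussed above can
be sketched side by side.  This is only an illustration under assumptions:
it uses plain realloc instead of ralloc's resize(), and the function names
rewrite_tail_copy_nul and rewrite_tail_n are made up for the example:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Variant A (as in the patch): trust that text[text_length] == '\0'
 * and copy the terminator along with the text.
 */
static bool
rewrite_tail_copy_nul(char **str, size_t *start, const char *text,
                      size_t text_length)
{
   assert(str != NULL && start != NULL);

   char *ptr = realloc(*str, *start + text_length + 1);
   if (ptr == NULL)
      return false;

   memcpy(ptr + *start, text, text_length + 1); /* also copies '\0' */
   *str = ptr;
   *start += text_length;
   return true;
}

/* Variant B (like the cat() helper): copy exactly n bytes and write the
 * terminator explicitly, so text need not end at text[n].
 */
static bool
rewrite_tail_n(char **str, size_t *start, const char *text, size_t n)
{
   assert(str != NULL && start != NULL);

   char *ptr = realloc(*str, *start + n + 1);
   if (ptr == NULL)
      return false;

   memcpy(ptr + *start, text, n);
   ptr[*start + n] = '\0';
   *str = ptr;
   *start += n;
   return true;
}
```

Variant A reads one byte past text_length, so it is only correct when the
caller passes the true length of a null-terminated string.  Variant B also
works for a prefix of a longer string.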





[Mesa-dev] [PATCH 6/7] glsl_to_tgsi: do fewer optimizations with GLSLOptimizeConservatively

2016-12-31 Thread Marek Olšák
From: Marek Olšák 

---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 76 ++
 1 file changed, 67 insertions(+), 9 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 59d4d69..b68a02a 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -6785,40 +6785,88 @@ get_mesa_program(struct gl_context *ctx,
  break;
 
   default:
  unreachable("unhandled shader stage");
   }
}
 
return prog;
 }
 
+/* See if there are unsupported control flow statements. */
+class ir_control_flow_info_visitor : public ir_hierarchical_visitor {
+private:
+   const struct gl_shader_compiler_options *options;
+public:
+   ir_control_flow_info_visitor(const struct gl_shader_compiler_options 
*options)
+  : options(options),
+unsupported(false)
+   {
+   }
+
+   virtual ir_visitor_status visit_enter(ir_function *ir)
+   {
+  /* Other functions are skipped (same as glsl_to_tgsi). */
+  if (strcmp(ir->name, "main") == 0)
+ return visit_continue;
+
+  return visit_continue_with_parent;
+   }
+
+   virtual ir_visitor_status visit_enter(ir_call *ir)
+   {
+  if (!ir->callee->is_intrinsic()) {
+ unsupported = true; /* it's a function call */
+ return visit_stop;
+  }
+  return visit_continue;
+   }
+
+   virtual ir_visitor_status visit_enter(ir_return *ir)
+   {
+  if (options->EmitNoMainReturn) {
+ unsupported = true;
+ return visit_stop;
+  }
+  return visit_continue;
+   }
+
+   bool unsupported;
+};
+
+static bool
+has_unsupported_control_flow(exec_list *ir,
+ const struct gl_shader_compiler_options *options)
+{
+   ir_control_flow_info_visitor visitor(options);
+   visit_list_elements(&visitor, ir);
+   return visitor.unsupported;
+}
 
 extern "C" {
 
 /**
  * Link a shader.
  * Called via ctx->Driver.LinkShader()
  * This actually involves converting GLSL IR into an intermediate TGSI-like IR
  * with code lowering and other optimizations.
  */
 GLboolean
 st_link_shader(struct gl_context *ctx, struct gl_shader_program *prog)
 {
struct pipe_screen *pscreen = ctx->st->pipe->screen;
assert(prog->data->LinkStatus);
 
for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
   if (prog->_LinkedShaders[i] == NULL)
  continue;
 
-  bool progress;
   exec_list *ir = prog->_LinkedShaders[i]->ir;
   gl_shader_stage stage = prog->_LinkedShaders[i]->Stage;
   const struct gl_shader_compiler_options *options =
      &ctx->Const.ShaderCompilerOptions[stage];
   enum pipe_shader_type ptarget = st_shader_stage_to_ptarget(stage);
   bool have_dround = pscreen->get_shader_param(pscreen, ptarget,
                                                PIPE_SHADER_CAP_TGSI_DROUND_SUPPORTED);
   bool have_dfrexp = pscreen->get_shader_param(pscreen, ptarget,
                                                PIPE_SHADER_CAP_TGSI_DFRACEXP_DLDEXP_SUPPORTED);
   unsigned if_threshold = pscreen->get_shader_param(pscreen, ptarget,
@@ -6888,28 +6936,38 @@ st_link_shader(struct gl_context *ctx, struct 
gl_shader_program *prog)
   : 0));
 
   do_vec_index_to_cond_assign(ir);
   lower_vector_insert(ir, true);
   lower_quadop_vector(ir, false);
   lower_noise(ir);
   if (options->MaxIfDepth == 0) {
  lower_discard(ir);
   }
 
-  do {
- progress = do_common_optimization(ir, true, true, options,
-   ctx->Const.NativeIntegers);
- progress = lower_if_to_cond_assign((gl_shader_stage)i, ir,
-options->MaxIfDepth, if_threshold) 
||
-progress;
-
-  } while (progress);
+  if (ctx->Const.GLSLOptimizeConservatively) {
+ /* Do it once and repeat only if there's unsupported control flow. */
+ do {
+do_common_optimization(ir, true, true, options,
+   ctx->Const.NativeIntegers);
+lower_if_to_cond_assign((gl_shader_stage)i, ir,
+options->MaxIfDepth, if_threshold);
+ } while (has_unsupported_control_flow(ir, options));
+  } else {
+ /* Repeat it until it stops making changes. */
+ bool progress;
+ do {
+progress = do_common_optimization(ir, true, true, options,
+  ctx->Const.NativeIntegers);
+progress |= lower_if_to_cond_assign((gl_shader_stage)i, ir,
+options->MaxIfDepth, 
if_threshold);
+ } while (progress);
+  }
 
   validate_ir_tree(ir);
}
 
build_program_resource_list(ctx, prog);
 
for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
   struct gl_program 

[Mesa-dev] [PATCH 5/7] mesa: add gl_constants::GLSLOptimizeConservatively

2016-12-31 Thread Marek Olšák
From: Marek Olšák 

to reduce the amount of GLSL optimizations for drivers that can do better.
---
 src/compiler/glsl/glsl_parser_extras.cpp | 14 +++---
 src/compiler/glsl/linker.cpp | 16 
 src/mesa/main/ff_fragment_shader.cpp | 10 +++---
 src/mesa/main/mtypes.h   |  7 +++
 4 files changed, 37 insertions(+), 10 deletions(-)

diff --git a/src/compiler/glsl/glsl_parser_extras.cpp 
b/src/compiler/glsl/glsl_parser_extras.cpp
index b12cf3d..e97cbf4 100644
--- a/src/compiler/glsl/glsl_parser_extras.cpp
+++ b/src/compiler/glsl/glsl_parser_extras.cpp
@@ -1942,26 +1942,34 @@ _mesa_glsl_compile_shader(struct gl_context *ctx, 
struct gl_shader *shader,
   }
}
 
 
if (!state->error && !shader->ir->is_empty()) {
   struct gl_shader_compiler_options *options =
      &ctx->Const.ShaderCompilerOptions[shader->Stage];
 
   assign_subroutine_indexes(shader, state);
   lower_subroutine(shader->ir, state);
+
   /* Do some optimization at compile time to reduce shader IR size
* and reduce later work if the same shader is linked multiple times
*/
-  while (do_common_optimization(shader->ir, false, false, options,
-ctx->Const.NativeIntegers))
- ;
+  if (ctx->Const.GLSLOptimizeConservatively) {
+ /* Run it just once. */
+ do_common_optimization(shader->ir, false, false, options,
+ctx->Const.NativeIntegers);
+  } else {
+ /* Repeat it until it stops making changes. */
+ while (do_common_optimization(shader->ir, false, false, options,
+   ctx->Const.NativeIntegers))
+;
+  }
 
   validate_ir_tree(shader->ir);
 
   enum ir_variable_mode other;
   switch (shader->Stage) {
   case MESA_SHADER_VERTEX:
  other = ir_var_shader_in;
  break;
   case MESA_SHADER_FRAGMENT:
  other = ir_var_shader_out;
diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index f4f918a..13fbb30 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -5041,24 +5041,32 @@ link_shaders(struct gl_context *ctx, struct 
gl_shader_program *prog)
  goto done;
 
   if (ctx->Const.ShaderCompilerOptions[i].LowerCombinedClipCullDistance) {
  lower_clip_cull_distance(prog, prog->_LinkedShaders[i]);
   }
 
   if (ctx->Const.LowerTessLevel) {
  lower_tess_level(prog->_LinkedShaders[i]);
   }
 
-  while (do_common_optimization(prog->_LinkedShaders[i]->ir, true, false,
-&ctx->Const.ShaderCompilerOptions[i],
-ctx->Const.NativeIntegers))
- ;
+  if (ctx->Const.GLSLOptimizeConservatively) {
+ /* Run it just once. */
+ do_common_optimization(prog->_LinkedShaders[i]->ir, true, false,
+&ctx->Const.ShaderCompilerOptions[i],
+ctx->Const.NativeIntegers);
+  } else {
+ /* Repeat it until it stops making changes. */
+ while (do_common_optimization(prog->_LinkedShaders[i]->ir, true, false,
+   &ctx->Const.ShaderCompilerOptions[i],
+   ctx->Const.NativeIntegers))
+;
+  }
 
   lower_const_arrays_to_uniforms(prog->_LinkedShaders[i]->ir, i);
   propagate_invariance(prog->_LinkedShaders[i]->ir);
}
 
/* Validation for special cases where we allow sampler array indexing
 * with loop induction variable. This check emits a warning or error
 * depending if backend can handle dynamic indexing.
 */
if ((!prog->IsES && prog->data->Version < 130) ||
diff --git a/src/mesa/main/ff_fragment_shader.cpp 
b/src/mesa/main/ff_fragment_shader.cpp
index fd2c71f..48b84e8 100644
--- a/src/mesa/main/ff_fragment_shader.cpp
+++ b/src/mesa/main/ff_fragment_shader.cpp
@@ -1247,23 +1247,27 @@ create_new_program(struct gl_context *ctx, struct 
state_key *key)
 
   p.instructions = &main_sig->body;
   if (key->num_draw_buffers)
      emit_instructions(&p);
 
validate_ir_tree(p.shader->ir);
 
const struct gl_shader_compiler_options *options =
      &ctx->Const.ShaderCompilerOptions[MESA_SHADER_FRAGMENT];
 
-   while (do_common_optimization(p.shader->ir, false, false, options,
- ctx->Const.NativeIntegers))
-  ;
+   /* Conservative approach: Don't optimize here, the linker does it too. */
+   if (!ctx->Const.GLSLOptimizeConservatively) {
+  while (do_common_optimization(p.shader->ir, false, false, options,
+ctx->Const.NativeIntegers))
+ ;
+   }
+
reparent_ir(p.shader->ir, p.shader->ir);
 
p.shader->CompileStatus = true;
p.shader->Version = state->language_version;
p.shader_program->Shaders =
   (gl_shader 

[Mesa-dev] [PATCH 7/7] st/mesa: enable GLSLOptimizeConservatively for drivers that want it

2016-12-31 Thread Marek Olšák
From: Marek Olšák 

GLSL compilation now takes 24% less time with the Gallium noop driver.
I used my shader-db for the measurement. The difference for the whole
radeonsi driver can be ~10%.

The generated TGSI is mostly the same. For example, the compilation success
rate with a TGSI->GCN bytecode converter without any optimizations is
the same. Note that glsl_to_tgsi does its own copy propagation and simple
register allocation.

shader-db GCN report:
- Talos spills fewer SGPRs.
- DOTA 2 spills more SGPRs.
- The average shader-db score is better, but it's just due to randomness.

29045 shaders in 17564 tests
Totals:
SGPRS: 1325929 -> 1325017 (-0.07 %)
VGPRS: 1010808 -> 1010172 (-0.06 %)
Spilled SGPRs: 1432 -> 1399 (-2.30 %)
Spilled VGPRs: 93 -> 92 (-1.08 %)
Private memory VGPRs: 688 -> 688 (0.00 %)
Scratch size: 2540 -> 2484 (-2.20 %) dwords per thread
Code Size: 39336732 -> 39342936 (0.02 %) bytes
Max Waves: 217937 -> 217969 (0.01 %)
---
 src/mesa/state_tracker/st_extensions.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/mesa/state_tracker/st_extensions.c 
b/src/mesa/state_tracker/st_extensions.c
index ef926e4..7ff5716 100644
--- a/src/mesa/state_tracker/st_extensions.c
+++ b/src/mesa/state_tracker/st_extensions.c
@@ -303,20 +303,22 @@ void st_init_limits(struct pipe_screen *screen,
  65536);
   else
  options->MaxUnrollIterations =
 screen->get_shader_param(screen, sh,
   PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT);
 
   options->LowerCombinedClipCullDistance = true;
   options->LowerBufferInterfaceBlocks = true;
}
 
+   c->GLSLOptimizeConservatively =
+  screen->get_param(screen, PIPE_CAP_GLSL_OPTIMIZE_CONSERVATIVELY);
c->LowerTessLevel = true;
c->LowerCsDerivedVariables = true;
c->PrimitiveRestartForPatches =
   screen->get_param(screen, PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES);
 
c->MaxCombinedTextureImageUnits =
  _min(c->Program[MESA_SHADER_VERTEX].MaxTextureImageUnits +
   c->Program[MESA_SHADER_TESS_CTRL].MaxTextureImageUnits +
   c->Program[MESA_SHADER_TESS_EVAL].MaxTextureImageUnits +
   c->Program[MESA_SHADER_GEOMETRY].MaxTextureImageUnits +
-- 
2.7.4



[Mesa-dev] [PATCH 3/7] glsl: run do_lower_jumps properly in do_common_optimizations

2016-12-31 Thread Marek Olšák
From: Marek Olšák 

so that backends don't have to run it manually
---
 src/compiler/glsl/glsl_parser_extras.cpp   | 3 ++-
 src/mesa/program/ir_to_mesa.cpp| 2 --
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 8 +---
 3 files changed, 3 insertions(+), 10 deletions(-)

diff --git a/src/compiler/glsl/glsl_parser_extras.cpp 
b/src/compiler/glsl/glsl_parser_extras.cpp
index 4566aa9..b12cf3d 100644
--- a/src/compiler/glsl/glsl_parser_extras.cpp
+++ b/src/compiler/glsl/glsl_parser_extras.cpp
@@ -2099,21 +2099,22 @@ do_common_optimization(exec_list *ir, bool linked,
OPT(do_tree_grafting, ir);
OPT(do_constant_propagation, ir);
if (linked)
   OPT(do_constant_variable, ir);
else
   OPT(do_constant_variable_unlinked, ir);
OPT(do_constant_folding, ir);
OPT(do_minmax_prune, ir);
OPT(do_rebalance_tree, ir);
OPT(do_algebraic, ir, native_integers, options);
-   OPT(do_lower_jumps, ir);
+   OPT(do_lower_jumps, ir, true, true, options->EmitNoMainReturn,
+   options->EmitNoCont, options->EmitNoLoops);
OPT(do_vec_index_to_swizzle, ir);
OPT(lower_vector_insert, ir, false);
OPT(do_swizzle_swizzle, ir);
OPT(do_noop_swizzle, ir);
 
OPT(optimize_split_arrays, ir, linked);
OPT(optimize_redundant_jumps, ir);
 
if (options->MaxUnrollIterations) {
   loop_state *ls = analyze_loop_variables(ir);
diff --git a/src/mesa/program/ir_to_mesa.cpp b/src/mesa/program/ir_to_mesa.cpp
index 653b822..0089e80 100644
--- a/src/mesa/program/ir_to_mesa.cpp
+++ b/src/mesa/program/ir_to_mesa.cpp
@@ -2972,22 +2972,20 @@ _mesa_ir_link_shader(struct gl_context *ctx, struct 
gl_shader_program *prog)
 
   do {
 progress = false;
 
 /* Lowering */
 do_mat_op_to_vec(ir);
 lower_instructions(ir, (MOD_TO_FLOOR | DIV_TO_MUL_RCP | EXP_TO_EXP2
 | LOG_TO_LOG2 | INT_DIV_TO_MUL_RCP
 | ((options->EmitNoPow) ? POW_TO_EXP2 : 0)));
 
-progress = do_lower_jumps(ir, true, true, options->EmitNoMainReturn, 
options->EmitNoCont, options->EmitNoLoops) || progress;
-
 progress = do_common_optimization(ir, true, true,
options, ctx->Const.NativeIntegers)
   || progress;
 
 progress = lower_quadop_vector(ir, true) || progress;
 
 if (options->MaxIfDepth == 0)
progress = lower_discard(ir) || progress;
 
 progress = lower_if_to_cond_assign((gl_shader_stage)i, ir,
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index f738084..59d4d69 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -6889,28 +6889,22 @@ st_link_shader(struct gl_context *ctx, struct 
gl_shader_program *prog)
 
   do_vec_index_to_cond_assign(ir);
   lower_vector_insert(ir, true);
   lower_quadop_vector(ir, false);
   lower_noise(ir);
   if (options->MaxIfDepth == 0) {
  lower_discard(ir);
   }
 
   do {
- progress = false;
-
- progress = do_lower_jumps(ir, true, true, options->EmitNoMainReturn, 
options->EmitNoCont, options->EmitNoLoops) || progress;
-
  progress = do_common_optimization(ir, true, true, options,
-   ctx->Const.NativeIntegers)
-   || progress;
-
+   ctx->Const.NativeIntegers);
  progress = lower_if_to_cond_assign((gl_shader_stage)i, ir,
 options->MaxIfDepth, if_threshold) 
||
 progress;
 
   } while (progress);
 
   validate_ir_tree(ir);
}
 
build_program_resource_list(ctx, prog);
-- 
2.7.4



[Mesa-dev] [PATCH 4/7] gallium: add PIPE_CAP_GLSL_OPTIMIZE_CONSERVATIVELY

2016-12-31 Thread Marek Olšák
From: Marek Olšák 

Drivers with good compilers don't need aggressive optimizations before TGSI.
---
 src/gallium/docs/source/screen.rst   | 3 +++
 src/gallium/drivers/freedreno/freedreno_screen.c | 1 +
 src/gallium/drivers/i915/i915_screen.c   | 1 +
 src/gallium/drivers/ilo/ilo_screen.c | 1 +
 src/gallium/drivers/llvmpipe/lp_screen.c | 1 +
 src/gallium/drivers/nouveau/nv30/nv30_screen.c   | 1 +
 src/gallium/drivers/nouveau/nv50/nv50_screen.c   | 1 +
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c   | 1 +
 src/gallium/drivers/r300/r300_screen.c   | 1 +
 src/gallium/drivers/r600/r600_pipe.c | 1 +
 src/gallium/drivers/radeonsi/si_pipe.c   | 1 +
 src/gallium/drivers/softpipe/sp_screen.c | 1 +
 src/gallium/drivers/svga/svga_screen.c   | 1 +
 src/gallium/drivers/swr/swr_screen.cpp   | 1 +
 src/gallium/drivers/vc4/vc4_screen.c | 1 +
 src/gallium/drivers/virgl/virgl_screen.c | 1 +
 src/gallium/include/pipe/p_defines.h | 1 +
 17 files changed, 19 insertions(+)

diff --git a/src/gallium/docs/source/screen.rst 
b/src/gallium/docs/source/screen.rst
index 86aa259..000551a 100644
--- a/src/gallium/docs/source/screen.rst
+++ b/src/gallium/docs/source/screen.rst
@@ -359,20 +359,23 @@ The integer capabilities:
   UsageMask of  xy or yzw is allowed, but xz or yw isn't. Declarations with
   overlapping locations must have matching semantic names and indices, and
   equal interpolation qualifiers.
   Components may overlap, notably when the gaps in an array of dvec3 are
   filled in.
 * ``PIPE_CAP_STREAM_OUTPUT_INTERLEAVE_BUFFERS``: Whether interleaved stream
   output mode is able to interleave across buffers. This is required for
   ARB_transform_feedback3.
 * ``PIPE_CAP_TGSI_CAN_READ_OUTPUTS``: Whether every TGSI shader stage can read
   from the output file.
+* ``PIPE_CAP_GLSL_OPTIMIZE_CONSERVATIVELY``: Tell the GLSL compiler to use
+  the minimum amount of optimizations just to be able to do all the linking
+  and lowering.
 
 
 .. _pipe_capf:
 
 PIPE_CAPF_*
 
 
 The floating-point capabilities are:
 
 * ``PIPE_CAPF_MAX_LINE_WIDTH``: The maximum width of a regular line.
diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c 
b/src/gallium/drivers/freedreno/freedreno_screen.c
index d84cd82..3b631372 100644
--- a/src/gallium/drivers/freedreno/freedreno_screen.c
+++ b/src/gallium/drivers/freedreno/freedreno_screen.c
@@ -288,20 +288,21 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
case PIPE_CAP_FRAMEBUFFER_NO_ATTACHMENT:
case PIPE_CAP_ROBUST_BUFFER_ACCESS_BEHAVIOR:
case PIPE_CAP_CULL_DISTANCE:
case PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES:
case PIPE_CAP_TGSI_VOTE:
case PIPE_CAP_MAX_WINDOW_RECTANGLES:
case PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED:
case PIPE_CAP_VIEWPORT_SUBPIXEL_BITS:
case PIPE_CAP_TGSI_ARRAY_COMPONENTS:
case PIPE_CAP_TGSI_CAN_READ_OUTPUTS:
+case PIPE_CAP_GLSL_OPTIMIZE_CONSERVATIVELY:
return 0;
 
case PIPE_CAP_MAX_VIEWPORTS:
return 1;
 
case PIPE_CAP_SHAREABLE_SHADERS:
/* manage the variants for these ourself, to avoid breaking precompile: 
*/
case PIPE_CAP_FRAGMENT_COLOR_CLAMPED:
case PIPE_CAP_VERTEX_COLOR_CLAMPED:
if (is_ir3(screen))
diff --git a/src/gallium/drivers/i915/i915_screen.c 
b/src/gallium/drivers/i915/i915_screen.c
index 14f4271..18578c0 100644
--- a/src/gallium/drivers/i915/i915_screen.c
+++ b/src/gallium/drivers/i915/i915_screen.c
@@ -289,20 +289,21 @@ i915_get_param(struct pipe_screen *screen, enum pipe_cap 
cap)
case PIPE_CAP_TGSI_VS_LAYER_VIEWPORT:
case PIPE_CAP_BUFFER_MAP_PERSISTENT_COHERENT:
case PIPE_CAP_DRAW_INDIRECT:
case PIPE_CAP_MULTI_DRAW_INDIRECT:
case PIPE_CAP_MULTI_DRAW_INDIRECT_PARAMS:
case PIPE_CAP_TGSI_FS_FINE_DERIVATIVE:
case PIPE_CAP_SAMPLER_VIEW_TARGET:
case PIPE_CAP_VIEWPORT_SUBPIXEL_BITS:
case PIPE_CAP_TGSI_CAN_READ_OUTPUTS:
case PIPE_CAP_NATIVE_FENCE_FD:
+   case PIPE_CAP_GLSL_OPTIMIZE_CONSERVATIVELY:
   return 0;
 
case PIPE_CAP_MAX_VIEWPORTS:
   return 1;
 
case PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT:
   return 64;
 
case PIPE_CAP_GLSL_FEATURE_LEVEL:
   return 120;
diff --git a/src/gallium/drivers/ilo/ilo_screen.c 
b/src/gallium/drivers/ilo/ilo_screen.c
index c3fad73..20a0e8d 100644
--- a/src/gallium/drivers/ilo/ilo_screen.c
+++ b/src/gallium/drivers/ilo/ilo_screen.c
@@ -512,20 +512,21 @@ ilo_get_param(struct pipe_screen *screen, enum pipe_cap 
param)
case PIPE_CAP_ROBUST_BUFFER_ACCESS_BEHAVIOR:
case PIPE_CAP_CULL_DISTANCE:
case PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES:
case PIPE_CAP_TGSI_VOTE:
case PIPE_CAP_MAX_WINDOW_RECTANGLES:
case PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED:
case PIPE_CAP_VIEWPORT_SUBPIXEL_BITS:

[Mesa-dev] [PATCH 2/7] glsl/glcpp: use ralloc_sprint_rewrite_tail to avoid slow vsprintf

2016-12-31 Thread Marek Olšák
From: Marek Olšák 

This reduces compile times by 4.5% with the Gallium noop driver and
gl_constants::GLSLOptimizeConservatively == true.
---
 src/compiler/glsl/glcpp/glcpp-parse.y | 39 +++
 1 file changed, 21 insertions(+), 18 deletions(-)

diff --git a/src/compiler/glsl/glcpp/glcpp-parse.y 
b/src/compiler/glsl/glcpp/glcpp-parse.y
index 63012bc..b84e5ff 100644
--- a/src/compiler/glsl/glcpp/glcpp-parse.y
+++ b/src/compiler/glsl/glcpp/glcpp-parse.y
@@ -202,21 +202,21 @@ add_builtin_define(glcpp_parser_t *parser, const char 
*name, int value);
 input:
/* empty */
 |  input line
 ;
 
 line:
control_line
 |  SPACE control_line
 |  text_line {
_glcpp_parser_print_expanded_token_list (parser, $1);
-   ralloc_asprintf_rewrite_tail (&parser->output, &parser->output_length, "\n");
+   ralloc_sprint_rewrite_tail(&parser->output, &parser->output_length, "\n", 1);
}
 |  expanded_line
 ;
 
 expanded_line:
IF_EXPANDED expression NEWLINE {
if (parser->is_gles && $2.undefined_macro)
glcpp_error(& @1, parser, "undefined macro %s in 
expression (illegal in GLES)", $2.undefined_macro);
_glcpp_parser_skip_stack_push_if (parser, & @1, $2.value);
}
@@ -252,21 +252,21 @@ define:
 |  FUNC_IDENTIFIER '(' ')' replacement_list NEWLINE {
_define_function_macro (parser, & @1, $1, NULL, $4);
}
 |  FUNC_IDENTIFIER '(' identifier_list ')' replacement_list NEWLINE {
_define_function_macro (parser, & @1, $1, $3, $5);
}
 ;
 
 control_line:
control_line_success {
-   ralloc_asprintf_rewrite_tail (&parser->output, &parser->output_length, "\n");
+   ralloc_sprint_rewrite_tail(&parser->output, &parser->output_length, "\n", 1);
}
 |  control_line_error
 |  HASH_TOKEN LINE pp_tokens NEWLINE {
 
if (parser->skip_stack == NULL ||
parser->skip_stack->type == SKIP_NO_SKIP)
{
_glcpp_parser_expand_and_lex_from (parser,
   LINE_EXPANDED, $3,
   
EXPANSION_MODE_IGNORE_DEFINED);
@@ -428,21 +428,22 @@ control_line_success:
 |  HASH_TOKEN VERSION_TOKEN version_constant IDENTIFIER NEWLINE {
if (parser->version_set) {
glcpp_error(& @1, parser, "#version must appear on the 
first line");
}
_glcpp_parser_handle_version_declaration(parser, $3, $4, true);
}
 |  HASH_TOKEN NEWLINE {
glcpp_parser_resolve_implicit_version(parser);
}
 |  HASH_TOKEN PRAGMA NEWLINE {
-   ralloc_asprintf_rewrite_tail (&parser->output, &parser->output_length, "#%s", $2);
+   ralloc_sprint_rewrite_tail(&parser->output, &parser->output_length, "#", 1);
+   ralloc_sprint_rewrite_tail(&parser->output, &parser->output_length, $2, strlen($2));
}
 ;
 
 control_line_error:
HASH_TOKEN ERROR_TOKEN NEWLINE {
glcpp_error(& @1, parser, "#%s", $2);
}
 |  HASH_TOKEN DEFINE_TOKEN NEWLINE {
glcpp_error (& @1, parser, "#define without macro name");
}
@@ -1109,71 +1110,73 @@ _token_list_equal_ignoring_space(token_list_t *a, 
token_list_t *b)
   node_b = node_b->next;
}
 
return 1;
 }
 
 static void
 _token_print(char **out, size_t *len, token_t *token)
 {
if (token->type < 256) {
-  ralloc_asprintf_rewrite_tail (out, len, "%c", token->type);
+  char s[2] = {token->type, 0};
+  ralloc_sprint_rewrite_tail(out, len, s, 1);
   return;
}
 
switch (token->type) {
case INTEGER:
   ralloc_asprintf_rewrite_tail (out, len, "%" PRIiMAX, token->value.ival);
   break;
case IDENTIFIER:
case INTEGER_STRING:
case OTHER:
-  ralloc_asprintf_rewrite_tail (out, len, "%s", token->value.str);
+  ralloc_sprint_rewrite_tail(out, len, token->value.str,
+ strlen(token->value.str));
   break;
case SPACE:
-  ralloc_asprintf_rewrite_tail (out, len, " ");
+  ralloc_sprint_rewrite_tail(out, len, " ", 1);
   break;
case LEFT_SHIFT:
-  ralloc_asprintf_rewrite_tail (out, len, "<<");
+  ralloc_sprint_rewrite_tail(out, len, "<<", 2);
   break;
case RIGHT_SHIFT:
-  ralloc_asprintf_rewrite_tail (out, len, ">>");
+  ralloc_sprint_rewrite_tail(out, len, ">>", 2);
   break;
case LESS_OR_EQUAL:
-  ralloc_asprintf_rewrite_tail (out, len, "<=");
+  ralloc_sprint_rewrite_tail(out, len, "<=", 2);
   break;
case GREATER_OR_EQUAL:
-  ralloc_asprintf_rewrite_tail (out, len, ">=");
+  ralloc_sprint_rewrite_tail(out, len, ">=", 2);
   break;
case EQUAL:
-  ralloc_asprintf_rewrite_tail (out, len, "==");
+  ralloc_sprint_rewrite_tail(out, len, "==", 2);
   

[Mesa-dev] [PATCH 1/7] ralloc: add a new printing helper ralloc_sprint_rewrite_tail

2016-12-31 Thread Marek Olšák
From: Marek Olšák 

This one is much faster when you don't need vsprintf.
---
 src/util/ralloc.c | 25 +
 src/util/ralloc.h | 24 
 2 files changed, 49 insertions(+)

diff --git a/src/util/ralloc.c b/src/util/ralloc.c
index 980e4e4..7976ca6 100644
--- a/src/util/ralloc.c
+++ b/src/util/ralloc.c
@@ -522,20 +522,45 @@ ralloc_vasprintf_rewrite_tail(char **str, size_t *start, 
const char *fmt,
ptr = resize(*str, *start + new_length + 1);
if (unlikely(ptr == NULL))
   return false;
 
vsnprintf(ptr + *start, new_length + 1, fmt, args);
*str = ptr;
*start += new_length;
return true;
 }
 
+bool
+ralloc_sprint_rewrite_tail(char **str, size_t *start, const char *text,
+   unsigned text_length)
+{
+   char *ptr;
+
+   assert(str != NULL);
+
+   if (unlikely(*str == NULL)) {
+  /* Assuming a NULL context is probably bad, but it's expected behavior. 
*/
+  *str = ralloc_strdup(NULL, text);
+  *start = strlen(*str);
+  return true;
+   }
+
+   ptr = resize(*str, *start + text_length + 1);
+   if (unlikely(ptr == NULL))
+  return false;
+
+   memcpy(ptr + *start, text, text_length + 1); /* also copy '\0' */
+   *str = ptr;
+   *start += text_length;
+   return true;
+}
+
 /***
  * Linear allocator for short-lived allocations.
  ***
  *
  * The allocator consists of a parent node (2K buffer), which requires
  * a ralloc parent, and child nodes (allocations). Child nodes can't be freed
  * directly, because the parent doesn't track them. You have to release
  * the parent node in order to release all its children.
  *
  * The allocator uses a fixed-sized buffer with a monotonically increasing
diff --git a/src/util/ralloc.h b/src/util/ralloc.h
index 3e2d342..6c31a6d 100644
--- a/src/util/ralloc.h
+++ b/src/util/ralloc.h
@@ -401,20 +401,44 @@ bool ralloc_asprintf_append (char **str, const char *fmt, 
...)
  * \sa ralloc_strcat
  *
  * \p str will be updated to the new pointer unless allocation fails.
  *
  * \return True unless allocation failed.
  */
 bool ralloc_vasprintf_append(char **str, const char *fmt, va_list args);
 /// @}
 
 /**
+ * Rewrite the tail of an existing string, starting at a given index.
+ *
+ * Overwrites the contents of *str starting at \p start with "text",
+ * including a new null-terminator.  Allocates more memory as necessary.
+ *
+ * This can be used to append formatted text when the length of the existing
+ * string is already known, saving a strlen() call.
+ *
+ * \sa ralloc_asprintf_rewrite_tail
+ *
+ * \param str          The string to be updated.
+ * \param start        The index to start appending new data at.
+ * \param text         The input string terminated by zero.
+ * \param text_length  The length of the input string.
+ *
+ * \p str will be updated to the new pointer unless allocation fails.
+ * \p start will be increased by the length of the newly formatted text.
+ *
+ * \return True unless allocation failed.
+ */
+bool ralloc_sprint_rewrite_tail(char **str, size_t *start, const char *text,
+unsigned text_length);
+
+/**
  * Declare C++ new and delete operators which use ralloc.
  *
  * Placing this macro in the body of a class makes it possible to do:
  *
  * TYPE *var = new(mem_ctx) TYPE(...);
  * delete var;
  *
  * which is more idiomatic in C++ than calling ralloc.
  */
 #define DECLARE_ALLOC_CXX_OPERATORS_TEMPLATE(TYPE, ALLOC_FUNC)   \
-- 
2.7.4
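
For readers without a Mesa tree at hand, the behaviour of the new helper can be sketched outside ralloc. The following is only an illustration under stated assumptions: sketch_rewrite_tail is a hypothetical name, plain realloc() stands in for ralloc's internal resize(), and the NULL-context special case from the patch is folded into realloc(NULL, ...). It is not the Mesa implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for ralloc_sprint_rewrite_tail(), built on plain
 * realloc() so that it can be compiled outside Mesa.
 *
 * text must be NUL-terminated and text_length must equal strlen(text).
 * If *str is NULL, *start must be 0.
 */
static bool
sketch_rewrite_tail(char **str, size_t *start, const char *text,
                    size_t text_length)
{
   /* Grow to exactly: existing length + new text + terminating '\0'. */
   char *ptr = realloc(*str, *start + text_length + 1);
   if (ptr == NULL)
      return false;

   /* No strlen() and no format parsing: the caller already knows both
    * the current length (*start) and the length of the new text.
    */
   memcpy(ptr + *start, text, text_length + 1); /* also copies '\0' */
   *str = ptr;
   *start += text_length;
   return true;
}
```

A caller keeps the string pointer and its running length side by side, for example `char *out = NULL; size_t len = 0;`, and passes `&out, &len` on every append.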



[Mesa-dev] [PATCH 0/7] Faster GLSL compilation (for Gallium)

2016-12-31 Thread Marek Olšák
Hi,

The first 2 patches make the GLSL preprocessor a little faster.

The others add an optional CAP/Const flag to Mesa and Gallium that
reduces the number of GLSL optimizations executed by tweaking
the do_common_optimizations call sites. I have not seen a drop in the quality
of the produced TGSI (thanks to TGSI passes in glsl_to_tgsi?), but there is
a CAP just in case some other drivers don't want this. It reduces GLSL
compile times by 24% with the Gallium noop driver
(or 10% with full radeonsi?).

Please review.

BTW, after this series, it became more obvious that debug builds have much
slower compilation because of these two:
- validate_ir_tree
- tgsi_sanity_check

If you want to profile the compiler, disable those two or don't use a debug
build.

Marek


Re: [Mesa-dev] [PATCH 5/5] gallium: remove TGSI_OPCODE_SUB

2016-12-31 Thread Ilia Mirkin
On Sat, Dec 31, 2016 at 7:04 PM, Marek Olšák  wrote:
> +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> @@ -1695,21 +1695,22 @@ glsl_to_tgsi_visitor::visit_expression(ir_expression* 
> ir, st_src_reg *op)
> * driver.
> */
>emit_asm(ir, TGSI_OPCODE_MOV, result_dst, st_src_reg_for_float(0.5));
>break;
> }
>
> case ir_binop_add:
>emit_asm(ir, TGSI_OPCODE_ADD, result_dst, op[0], op[1]);
>break;
> case ir_binop_sub:
> -  emit_asm(ir, TGSI_OPCODE_SUB, result_dst, op[0], op[1]);
> +  op[1].negate = 1;
> +  emit_asm(ir, TGSI_OPCODE_ADD, result_dst, op[0], op[1]);
>break;

I think you want op[1].negate = ~op[1].negate, as it could have been
inverted by ir_unop_neg. [Note that this is not a full review, just
something I happened to notice.]
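
To illustrate the point about toggling versus setting the modifier, here is a small sketch. It assumes a heavily simplified operand (struct sketch_src and the sketch_* functions are hypothetical and are not the real st_src_reg or emit_asm), with a single boolean standing in for the negate modifier.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified source operand: a value plus a negate modifier.
 * The real st_src_reg carries more state; this is only an illustration.
 */
struct sketch_src {
   float value;
   bool negate;
};

/* Read an operand with its modifier applied. */
static float
sketch_read(struct sketch_src s)
{
   return s.negate ? -s.value : s.value;
}

/* a - b lowered to a + (-b) by *setting* the modifier.
 * Wrong when b already carried a negate, because the earlier negation
 * is silently kept instead of being cancelled.
 */
static float
sketch_sub_set(struct sketch_src a, struct sketch_src b)
{
   b.negate = true;
   return sketch_read(a) + sketch_read(b);
}

/* a - b lowered to a + (-b) by *toggling* the modifier. */
static float
sketch_sub_toggle(struct sketch_src a, struct sketch_src b)
{
   b.negate = !b.negate;
   return sketch_read(a) + sketch_read(b);
}
```

With a = 5 and b = 3 already negated (so b reads as -3), the expected result of a - b is 8; only the toggling variant produces it.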

Cheers,

  -ilia


[Mesa-dev] [PATCH 3/5] gallium/hud: add an option to reset the color counter

2016-12-31 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/auxiliary/hud/hud_context.c | 21 ++---
 src/gallium/auxiliary/hud/hud_private.h |  1 +
 2 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/src/gallium/auxiliary/hud/hud_context.c 
b/src/gallium/auxiliary/hud/hud_context.c
index 9e17d9b..eefbe60 100644
--- a/src/gallium/auxiliary/hud/hud_context.c
+++ b/src/gallium/auxiliary/hud/hud_context.c
@@ -817,31 +817,32 @@ hud_pane_add_graph(struct hud_pane *pane, struct 
hud_graph *gr)
   {1, 0.5, 0.5},
   {0.5, 1, 1},
   {1, 0.5, 1},
   {1, 1, 0.5},
   {0, 0.5, 0},
   {0.5, 0, 0},
   {0, 0.5, 0.5},
   {0.5, 0, 0.5},
   {0.5, 0.5, 0},
};
-   unsigned color = pane->num_graphs % ARRAY_SIZE(colors);
+   unsigned color = pane->next_color % ARRAY_SIZE(colors);
 
strip_hyphens(gr->name);
 
gr->vertices = MALLOC(pane->max_num_vertices * sizeof(float) * 2);
gr->color[0] = colors[color][0];
gr->color[1] = colors[color][1];
gr->color[2] = colors[color][2];
gr->pane = pane;
LIST_ADDTAIL(&gr->head, &pane->graph_list);
pane->num_graphs++;
+   pane->next_color++;
 }
 
 void
 hud_graph_add_value(struct hud_graph *gr, uint64_t value)
 {
gr->current_value = value;
value = value > gr->pane->ceiling ? gr->pane->ceiling : value;
 
if (gr->fd)
   fprintf(gr->fd, "%" PRIu64 "\n", value);
@@ -916,21 +917,22 @@ parse_string(const char *s, char *out)
   "parsing a string\n", *s, *s);
   fflush(stderr);
}
 
return i;
 }
 
 static char *
 read_pane_settings(char *str, unsigned * const x, unsigned * const y,
unsigned * const width, unsigned * const height,
-   uint64_t * const ceiling, boolean * const dyn_ceiling)
+   uint64_t * const ceiling, boolean * const dyn_ceiling,
+   boolean *reset_colors)
 {
char *ret = str;
unsigned tmp;
 
while (*str == '.') {
   ++str;
   switch (*str) {
   case 'x':
  ++str;
  *x = strtoul(str, &ret, 10);
@@ -967,20 +969,26 @@ read_pane_settings(char *str, unsigned * const x, 
unsigned * const y,
  *ceiling = tmp > 10 ? tmp : 10;
  str = ret;
  break;
 
   case 'd':
  ++str;
  ret = str;
  *dyn_ceiling = true;
  break;
 
+  case 'r':
+ ++str;
+ ret = str;
+ *reset_colors = true;
+ break;
+
   default:
  fprintf(stderr, "gallium_hud: syntax error: unexpected '%c'\n", *str);
  fflush(stderr);
   }
 
}
 
return ret;
 }
 
@@ -1008,56 +1016,62 @@ hud_parse_env_var(struct hud_context *hud, const char 
*env)
unsigned num, i;
char name_a[256], s[256];
char *name;
struct hud_pane *pane = NULL;
unsigned x = 10, y = 10;
unsigned width = 251, height = 100;
unsigned period = 500 * 1000;  /* default period (1/2 second) */
uint64_t ceiling = UINT64_MAX;
unsigned column_width = 251;
boolean dyn_ceiling = false;
+   boolean reset_colors = false;
const char *period_env;
 
/*
 * The GALLIUM_HUD_PERIOD env var sets the graph update rate.
 * The env var is in seconds (a float).
 * Zero means update after every frame.
 */
period_env = getenv("GALLIUM_HUD_PERIOD");
if (period_env) {
   float p = (float) atof(period_env);
   if (p >= 0.0f) {
  period = (unsigned) (p * 1000 * 1000);
   }
}
 
while ((num = parse_string(env, name_a)) != 0) {
   env += num;
 
   /* check for explicit location, size and etc. settings */
   name = read_pane_settings(name_a, &x, &y, &width, &height, &ceiling,
- &dyn_ceiling);
+ &dyn_ceiling, &reset_colors);
 
  /*
   * Keep track of overall column width to avoid pane overlapping in case
   * later we create a new column while the bottom pane in the current
   * column is less wide than the rest of the panes in it.
   */
  column_width = width > column_width ? width : column_width;
 
   if (!pane) {
  pane = hud_pane_create(x, y, x + width, y + height, period, 10,
  ceiling, dyn_ceiling);
  if (!pane)
 return;
   }
 
+  if (reset_colors) {
+ pane->next_color = 0;
+ reset_colors = false;
+  }
+
   /* Add a graph. */
 #if HAVE_GALLIUM_EXTRA_HUD || HAVE_LIBSENSORS
   char arg_name[64];
 #endif
   /* IF YOU CHANGE THIS, UPDATE print_help! */
   if (strcmp(name, "fps") == 0) {
  hud_fps_graph_install(pane);
   }
   else if (strcmp(name, "cpu") == 0) {
  hud_cpu_graph_install(pane, ALL_CPUS);
@@ -1306,20 +1320,21 @@ print_help(struct pipe_screen *screen)
puts(" to the upper-left corner of the viewport, in pixels.");
puts("  'y[value]' sets the location of the pane on the y axis relative");
puts(" to the upper-left corner of the viewport, in pixels.");
puts("  

[Mesa-dev] [PATCH 2/5] gallium/hud: allow more data sources per pane

2016-12-31 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/auxiliary/hud/hud_context.c | 28 +++-
 1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/src/gallium/auxiliary/hud/hud_context.c 
b/src/gallium/auxiliary/hud/hud_context.c
index 4c65af3..9e17d9b 100644
--- a/src/gallium/auxiliary/hud/hud_context.c
+++ b/src/gallium/auxiliary/hud/hud_context.c
@@ -806,37 +806,39 @@ strip_hyphens(char *s)
  */
 void
 hud_pane_add_graph(struct hud_pane *pane, struct hud_graph *gr)
 {
static const float colors[][3] = {
   {0, 1, 0},
   {1, 0, 0},
   {0, 1, 1},
   {1, 0, 1},
   {1, 1, 0},
-  {0.5, 0.5, 1},
-  {0.5, 0.5, 0.5},
+  {0.5, 1, 0.5},
+  {1, 0.5, 0.5},
+  {0.5, 1, 1},
+  {1, 0.5, 1},
+  {1, 1, 0.5},
+  {0, 0.5, 0},
+  {0.5, 0, 0},
+  {0, 0.5, 0.5},
+  {0.5, 0, 0.5},
+  {0.5, 0.5, 0},
};
-   char *name = gr->name;
+   unsigned color = pane->num_graphs % ARRAY_SIZE(colors);
 
-   /* replace '-' with a space */
-   while (*name) {
-  if (*name == '-')
- *name = ' ';
-  name++;
-   }
+   strip_hyphens(gr->name);
 
-   assert(pane->num_graphs < ARRAY_SIZE(colors));
gr->vertices = MALLOC(pane->max_num_vertices * sizeof(float) * 2);
-   gr->color[0] = colors[pane->num_graphs][0];
-   gr->color[1] = colors[pane->num_graphs][1];
-   gr->color[2] = colors[pane->num_graphs][2];
+   gr->color[0] = colors[color][0];
+   gr->color[1] = colors[color][1];
+   gr->color[2] = colors[color][2];
gr->pane = pane;
LIST_ADDTAIL(&gr->head, &pane->graph_list);
pane->num_graphs++;
 }
 
 void
 hud_graph_add_value(struct hud_graph *gr, uint64_t value)
 {
gr->current_value = value;
value = value > gr->pane->ceiling ? gr->pane->ceiling : value;
-- 
2.7.4
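
The effect of replacing the assert with a modulo in the patch above can be shown with a short sketch. The three-entry palette and the sketch_* names are hypothetical and chosen only to keep the example small; the real table lives in hud_pane_add_graph.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical, shortened palette (RGB triples). */
static const float sketch_colors[][3] = {
   {0, 1, 0},
   {1, 0, 0},
   {0, 1, 1},
};

/* Return the palette entry for the n-th graph in a pane.  The index wraps
 * around, so a pane may hold more graphs than there are colors; colors
 * simply repeat instead of tripping an assertion.
 */
static const float *
sketch_pick_color(unsigned graph_index)
{
   return sketch_colors[graph_index % SKETCH_ARRAY_SIZE(sketch_colors)];
}
```

With three colors, graph 3 gets the same color as graph 0, graph 4 the same as graph 1, and so on.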



[Mesa-dev] [PATCH 4/5] gallium/hud: add an option to sort items below graphs

2016-12-31 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/auxiliary/hud/hud_context.c | 37 -
 src/gallium/auxiliary/hud/hud_private.h |  1 +
 2 files changed, 33 insertions(+), 5 deletions(-)

diff --git a/src/gallium/auxiliary/hud/hud_context.c 
b/src/gallium/auxiliary/hud/hud_context.c
index eefbe60..6892289 100644
--- a/src/gallium/auxiliary/hud/hud_context.c
+++ b/src/gallium/auxiliary/hud/hud_context.c
@@ -476,21 +476,21 @@ void
 hud_draw(struct hud_context *hud, struct pipe_resource *tex)
 {
struct cso_context *cso = hud->cso;
struct pipe_context *pipe = hud->pipe;
struct pipe_framebuffer_state fb;
struct pipe_surface surf_templ, *surf;
struct pipe_viewport_state viewport;
const struct pipe_sampler_state *sampler_states[] =
  { &hud->font_sampler_state };
struct hud_pane *pane;
-   struct hud_graph *gr;
+   struct hud_graph *gr, *next;
 
if (!huds_visible)
   return;
 
hud->fb_width = tex->width0;
hud->fb_height = tex->height0;
hud->constants.two_div_fb_width = 2.0f / hud->fb_width;
hud->constants.two_div_fb_height = 2.0f / hud->fb_height;
 
cso_save_state(cso, (CSO_BIT_FRAMEBUFFER |
@@ -569,20 +569,37 @@ hud_draw(struct hud_context *hud, struct pipe_resource 
*tex)
hud_alloc_vertices(hud, &hud->text, 4 * 1024, 4 * sizeof(float));
 
/* prepare all graphs */
hud_batch_query_update(hud->batch_query);
 
LIST_FOR_EACH_ENTRY(pane, &hud->pane_list, head) {
   LIST_FOR_EACH_ENTRY(gr, &pane->graph_list, head) {
  gr->query_new_value(gr);
   }
 
+  if (pane->sort_items) {
+ LIST_FOR_EACH_ENTRY_SAFE(gr, next, &pane->graph_list, head) {
+/* ignore the last one */
+if (&gr->head == pane->graph_list.prev)
+   continue;
+
+/* This is an incremental bubble sort, because we only do one pass
+ * per frame. It will eventually reach an equilibrium.
+ */
+if (gr->current_value <
+LIST_ENTRY(struct hud_graph, next, head)->current_value) {
+   LIST_DEL(&gr->head);
+   LIST_ADD(&gr->head, &next->head);
+}
+ }
+  }
+
   hud_pane_accumulate_vertices(hud, pane);
}
 
/* unmap the uploader's vertex buffer before drawing */
u_upload_unmap(hud->uploader);
 
/* draw accumulated vertices for background quads */
cso_set_blend(cso, &hud->alpha_blend);
cso_set_fragment_shader_handle(hud->cso, hud->fs_color);
 
@@ -754,21 +771,21 @@ hud_pane_update_dyn_ceiling(struct hud_graph *gr, struct 
hud_pane *pane)
/*
 * Mark this adjustment run so we could avoid repeating a full update
 * again needlessly in case the pane has more than one graph.
 */
pane->dyn_ceil_last_ran = gr->index;
 }
 
 static struct hud_pane *
 hud_pane_create(unsigned x1, unsigned y1, unsigned x2, unsigned y2,
 unsigned period, uint64_t max_value, uint64_t ceiling,
-boolean dyn_ceiling)
+boolean dyn_ceiling, boolean sort_items)
 {
struct hud_pane *pane = CALLOC_STRUCT(hud_pane);
 
if (!pane)
   return NULL;
 
pane->x1 = x1;
pane->y1 = y1;
pane->x2 = x2;
pane->y2 = y2;
@@ -776,20 +793,21 @@ hud_pane_create(unsigned x1, unsigned y1, unsigned x2, 
unsigned y2,
pane->inner_x2 = x2 - 1;
pane->inner_y1 = y1 + 1;
pane->inner_y2 = y2 - 1;
pane->inner_width = pane->inner_x2 - pane->inner_x1;
pane->inner_height = pane->inner_y2 - pane->inner_y1;
pane->period = period;
pane->max_num_vertices = (x2 - x1 + 2) / 2;
pane->ceiling = ceiling;
pane->dyn_ceiling = dyn_ceiling;
pane->dyn_ceil_last_ran = 0;
+   pane->sort_items = sort_items;
pane->initial_max_value = max_value;
hud_pane_set_max_value(pane, max_value);
LIST_INITHEAD(&pane->graph_list);
return pane;
 }
 
 /* replace '-' with a space */
 static void
 strip_hyphens(char *s)
 {
@@ -918,21 +936,21 @@ parse_string(const char *s, char *out)
   fflush(stderr);
}
 
return i;
 }
 
 static char *
 read_pane_settings(char *str, unsigned * const x, unsigned * const y,
unsigned * const width, unsigned * const height,
uint64_t * const ceiling, boolean * const dyn_ceiling,
-   boolean *reset_colors)
+   boolean *reset_colors, boolean *sort_items)
 {
char *ret = str;
unsigned tmp;
 
while (*str == '.') {
   ++str;
   switch (*str) {
   case 'x':
  ++str;
  *x = strtoul(str, &ret, 10);
@@ -975,20 +993,26 @@ read_pane_settings(char *str, unsigned * const x, 
unsigned * const y,
  ret = str;
  *dyn_ceiling = true;
  break;
 
   case 'r':
  ++str;
  ret = str;
  *reset_colors = true;
  break;
 
+  case 's':
+ ++str;
+ ret = str;
+ *sort_items = true;
+ break;
+
   default:
  fprintf(stderr, "gallium_hud: syntax error: unexpected 

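The comment in the hud_draw hunk above describes the sort as an incremental bubble sort that does one pass per frame. A rough array-based analogue is sketched below; sketch_sort_pass is a hypothetical name, and the real code walks a linked list of graphs rather than an array of values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One bubble-sort pass in descending order.
 *
 * Each call moves every element at most one position towards its final
 * place, so calling it once per frame converges over several frames
 * instead of paying for a full sort in a single frame.
 *
 * Returns true if anything was swapped, i.e. the order has not settled yet.
 */
static bool
sketch_sort_pass(uint64_t *values, size_t count)
{
   bool swapped = false;

   for (size_t i = 0; i + 1 < count; i++) {
      if (values[i] < values[i + 1]) {
         uint64_t tmp = values[i];
         values[i] = values[i + 1];
         values[i + 1] = tmp;
         swapped = true;
      }
   }
   return swapped;
}
```

Starting from {1, 2, 3}, the first pass gives {2, 3, 1}, the second gives {3, 2, 1}, and the third reports that nothing moved.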
[Mesa-dev] [PATCH 5/5] gallium/hud: increase the vertex buffer size for text

2016-12-31 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/auxiliary/hud/hud_context.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/hud/hud_context.c 
b/src/gallium/auxiliary/hud/hud_context.c
index 6892289..50c2f80 100644
--- a/src/gallium/auxiliary/hud/hud_context.c
+++ b/src/gallium/auxiliary/hud/hud_context.c
@@ -559,21 +559,21 @@ hud_draw(struct hud_context *hud, struct pipe_resource 
*tex)
cso_set_vertex_elements(cso, 2, hud->velems);
cso_set_render_condition(cso, NULL, FALSE, 0);
cso_set_sampler_views(cso, PIPE_SHADER_FRAGMENT, 1,
  &hud->font_sampler_view);
cso_set_samplers(cso, PIPE_SHADER_FRAGMENT, 1, sampler_states);
cso_set_constant_buffer(cso, PIPE_SHADER_VERTEX, 0, &hud->constbuf);
 
/* prepare vertex buffers */
hud_alloc_vertices(hud, &hud->bg, 4 * 256, 2 * sizeof(float));
hud_alloc_vertices(hud, &hud->whitelines, 4 * 256, 2 * sizeof(float));
-   hud_alloc_vertices(hud, &hud->text, 4 * 1024, 4 * sizeof(float));
+   hud_alloc_vertices(hud, &hud->text, 16 * 1024, 4 * sizeof(float));
 
/* prepare all graphs */
hud_batch_query_update(hud->batch_query);
 
LIST_FOR_EACH_ENTRY(pane, &hud->pane_list, head) {
   LIST_FOR_EACH_ENTRY(gr, &pane->graph_list, head) {
  gr->query_new_value(gr);
   }
 
   if (pane->sort_items) {
-- 
2.7.4



[Mesa-dev] [PATCH 1/5] gallium/hud: add an option to rename each data source

2016-12-31 Thread Marek Olšák
From: Marek Olšák 

useful for radeonsi performance counters
---
 src/gallium/auxiliary/hud/hud_context.c | 40 -
 1 file changed, 30 insertions(+), 10 deletions(-)

diff --git a/src/gallium/auxiliary/hud/hud_context.c 
b/src/gallium/auxiliary/hud/hud_context.c
index 779c116..4c65af3 100644
--- a/src/gallium/auxiliary/hud/hud_context.c
+++ b/src/gallium/auxiliary/hud/hud_context.c
@@ -782,20 +782,31 @@ hud_pane_create(unsigned x1, unsigned y1, unsigned x2, 
unsigned y2,
pane->max_num_vertices = (x2 - x1 + 2) / 2;
pane->ceiling = ceiling;
pane->dyn_ceiling = dyn_ceiling;
pane->dyn_ceil_last_ran = 0;
pane->initial_max_value = max_value;
hud_pane_set_max_value(pane, max_value);
LIST_INITHEAD(&pane->graph_list);
return pane;
 }
 
+/* replace '-' with a space */
+static void
+strip_hyphens(char *s)
+{
+   while (*s) {
+  if (*s == '-')
+ *s = ' ';
+  s++;
+   }
+}
+
 /**
  * Add a graph to an existing pane.
  * One pane can contain multiple graphs over each other.
  */
 void
 hud_pane_add_graph(struct hud_pane *pane, struct hud_graph *gr)
 {
static const float colors[][3] = {
   {0, 1, 0},
   {1, 0, 0},
@@ -885,21 +896,21 @@ hud_graph_set_dump_file(struct hud_graph *gr)
 /**
  * Read a string from the environment variable.
  * The separators "+", ",", ":", and ";" terminate the string.
  * Return the number of read characters.
  */
 static int
 parse_string(const char *s, char *out)
 {
int i;
 
-   for (i = 0; *s && *s != '+' && *s != ',' && *s != ':' && *s != ';';
+   for (i = 0; *s && *s != '+' && *s != ',' && *s != ':' && *s != ';' && *s != '=';
 s++, out++, i++)
   *out = *s;
 
*out = 0;
 
if (*s && !i) {
   fprintf(stderr, "gallium_hud: syntax error: unexpected '%c' (%i) while "
   "parsing a string\n", *s, *s);
   fflush(stderr);
}
@@ -1164,41 +1175,48 @@ hud_parse_env_var(struct hud_context *hud, const char 
*env)
  /* driver queries */
  if (!processed) {
 if (!hud_driver_query_install(&hud->batch_query, pane, hud->pipe,
   name)) {
fprintf(stderr, "gallium_hud: unknown driver query '%s'\n", 
name);
fflush(stderr);
 }
  }
   }
 
-  if (*env == ':') {
+  if (*env == ':' || *env == '=') {
+ char key = *env;
  env++;
 
  if (!pane) {
 fprintf(stderr, "gallium_hud: syntax error: unexpected ':', "
 "expected a name\n");
 fflush(stderr);
 break;
  }
 
  num = parse_string(env, s);
  env += num;
 
- if (num && sscanf(s, "%u", &i) == 1) {
-hud_pane_set_max_value(pane, i);
-pane->initial_max_value = i;
- }
- else {
-fprintf(stderr, "gallium_hud: syntax error: unexpected '%c' (%i) "
-"after ':'\n", *env, *env);
-fflush(stderr);
+ if (key == ':') {
+if (num && sscanf(s, "%u", &i) == 1) {
+   hud_pane_set_max_value(pane, i);
+   pane->initial_max_value = i;
+}
+else {
+   fprintf(stderr, "gallium_hud: syntax error: unexpected '%c' 
(%i) "
+   "after ':'\n", *env, *env);
+   fflush(stderr);
+}
+ } else if (key == '=') {
+strip_hyphens(s);
+strcpy(LIST_ENTRY(struct hud_graph,
+  pane->graph_list.prev, head)->name, s);
  }
   }
 
   if (*env == 0)
  break;
 
   /* parse a separator */
   switch (*env) {
   case '+':
  env++;
@@ -1264,20 +1282,22 @@ print_help(struct pipe_screen *screen)
puts("");
puts("  Names are identifiers of data sources which will be drawn as 
graphs");
puts("  in panes. Multiple graphs can be drawn in the same pane.");
puts("  There can be multiple panes placed in rows and columns.");
puts("");
puts("  '+' separates names which will share a pane.");
puts("  ':[value]' specifies the initial maximum value of the Y axis");
puts(" for the given pane.");
puts("  ',' creates a new pane below the last one.");
puts("  ';' creates a new pane at the top of the next column.");
+   puts("  '=' followed by a string, changes the name of the last data 
source");
+   puts("  to that string");
puts("");
puts("  Example: GALLIUM_HUD=\"cpu,fps;primitives-generated\"");
puts("");
puts("  Additionally, by prepending '.[identifier][value]' modifiers to");
puts("  a name, it is possible to explicitly set the location and size");
puts("  of a pane, along with limiting overall maximum value of the");
puts("  Y axis and activating dynamic readjustment of the Y axis.");
puts("  Several modifiers may be applied to the same pane simultaneously.");
 

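How the new '=' separator interacts with parse_string and strip_hyphens in the patch above can be sketched in isolation. The sketch_* functions below are hypothetical reductions of the patched helpers: error reporting is omitted, the caller is assumed to supply a large enough buffer, and the example string "cpu=CPU-load,fps" is made up for illustration.

```c
#include <assert.h>
#include <string.h>

/* Copy characters from s to out until a separator ('+', ',', ':', ';',
 * '=') or the end of the string, then NUL-terminate out.
 * Returns the number of characters copied.
 */
static int
sketch_parse_string(const char *s, char *out)
{
   int i;

   for (i = 0; *s && strchr("+,:;=", *s) == NULL; s++, out++, i++)
      *out = *s;

   *out = 0;
   return i;
}

/* Replace every '-' with a space, as done for names and labels. */
static void
sketch_strip_hyphens(char *s)
{
   while (*s) {
      if (*s == '-')
         *s = ' ';
      s++;
   }
}
```

For "cpu=CPU-load,fps", the first call yields the data source "cpu" and stops at '='; a second call on the text after '=' yields "CPU-load", which becomes the label "CPU load" after hyphen stripping, and parsing then stops at ','.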
[Mesa-dev] [PATCH 4/5] gallium: remove TGSI_OPCODE_ABS

2016-12-31 Thread Marek Olšák
From: Marek Olšák 

It's redundant with the source modifier.
---
 src/gallium/auxiliary/gallivm/lp_bld_tgsi.c|  2 +-
 src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c | 16 +-
 src/gallium/auxiliary/gallivm/lp_bld_tgsi_aos.c|  2 +-
 src/gallium/auxiliary/nir/tgsi_to_nir.c|  1 -
 src/gallium/auxiliary/tgsi/tgsi_exec.c |  4 ---
 src/gallium/auxiliary/tgsi/tgsi_info.c |  2 +-
 src/gallium/auxiliary/tgsi/tgsi_opcode_tmp.h   |  1 -
 src/gallium/auxiliary/tgsi/tgsi_util.c |  1 -
 src/gallium/drivers/freedreno/a2xx/fd2_compiler.c  |  5 
 src/gallium/drivers/i915/i915_fpc_optimize.c   |  1 -
 src/gallium/drivers/i915/i915_fpc_translate.c  |  9 --
 src/gallium/drivers/ilo/shader/toy_tgsi.c  |  4 ---
 .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  |  3 --
 src/gallium/drivers/nouveau/nv30/nvfx_fragprog.c   |  3 --
 src/gallium/drivers/nouveau/nv30/nvfx_vertprog.c   |  3 --
 src/gallium/drivers/r300/r300_tgsi_to_rc.c |  1 -
 src/gallium/drivers/r600/r600_shader.c |  9 ++
 src/gallium/drivers/radeonsi/si_shader_tgsi_alu.c  |  2 --
 src/gallium/drivers/svga/svga_tgsi_insn.c  |  1 -
 src/gallium/drivers/svga/svga_tgsi_vgpu10.c| 24 ---
 src/gallium/include/pipe/p_shader_tokens.h |  2 +-
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 35 ++
 src/mesa/state_tracker/st_mesa_to_tgsi.c   |  6 ++--
 23 files changed, 41 insertions(+), 96 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.c 
b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.c
index 68ac695..d368f38 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.c
@@ -353,21 +353,21 @@ lp_build_emit_fetch(
   assert(0 && "invalid src register in emit_fetch()");
   return bld_base->base.undef;
}
 
if (reg->Register.Absolute) {
   switch (stype) {
   case TGSI_TYPE_FLOAT:
   case TGSI_TYPE_DOUBLE:
   case TGSI_TYPE_UNTYPED:
   /* modifiers on movs assume data is float */
- res = lp_build_emit_llvm_unary(bld_base, TGSI_OPCODE_ABS, res);
+ res = lp_build_abs(&bld_base->base, res);
  break;
   case TGSI_TYPE_UNSIGNED:
   case TGSI_TYPE_SIGNED:
   case TGSI_TYPE_UNSIGNED64:
   case TGSI_TYPE_SIGNED64:
   case TGSI_TYPE_VOID:
   default:
  /* abs modifier is only legal on floating point types */
  assert(0);
  break;
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c 
b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
index 9c6fc4b..7d939e8 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
@@ -492,22 +492,21 @@ static struct lp_build_tgsi_action lit_action = {
 static void
 log_emit(
const struct lp_build_tgsi_action * action,
struct lp_build_tgsi_context * bld_base,
struct lp_build_emit_data * emit_data)
 {
 
LLVMValueRef abs_x, log_abs_x, flr_log_abs_x, ex2_flr_log_abs_x;
 
/* abs( src0.x) */
-   abs_x = lp_build_emit_llvm_unary(bld_base, TGSI_OPCODE_ABS,
-emit_data->args[0] /* src0.x */);
+   abs_x = lp_build_abs(&bld_base->base, emit_data->args[0] /* src0.x */);
 
/* log( abs( src0.x ) ) */
log_abs_x = lp_build_emit_llvm_unary(bld_base, TGSI_OPCODE_LG2,
 abs_x);
 
/* floor( log( abs( src0.x ) ) ) */
flr_log_abs_x = lp_build_emit_llvm_unary(bld_base, TGSI_OPCODE_FLR,
 log_abs_x);
/* dst.x */
emit_data->output[TGSI_CHAN_X] = flr_log_abs_x;
@@ -1405,32 +1404,20 @@ lp_set_default_actions(struct lp_build_tgsi_context * 
bld_base)
bld_base->op_actions[TGSI_OPCODE_U642D].emit = u642d_emit;
 
 }
 
 /* CPU Only default actions */
 
 /* These actions are CPU only, because they could potentially output SSE
  * intrinsics.
  */
 
-/* TGSI_OPCODE_ABS (CPU Only)*/
-
-static void
-abs_emit_cpu(
-   const struct lp_build_tgsi_action * action,
-   struct lp_build_tgsi_context * bld_base,
-   struct lp_build_emit_data * emit_data)
-{
-   emit_data->output[emit_data->chan] = lp_build_abs(&bld_base->base,
-   emit_data->args[0]);
-}
-
 /* TGSI_OPCODE_ADD (CPU Only) */
 static void
 add_emit_cpu(
const struct lp_build_tgsi_action * action,
struct lp_build_tgsi_context * bld_base,
struct lp_build_emit_data * emit_data)
 {
emit_data->output[emit_data->chan] = lp_build_add(&bld_base->base,
emit_data->args[0], emit_data->args[1]);
 }
@@ -2581,21 +2568,20 @@ u64shr_emit_cpu(
LLVMValueRef masked_count = lp_build_and(uint_bld, emit_data->args[1], 
mask);
emit_data->output[emit_data->chan] = lp_build_shr(uint_bld, 
emit_data->args[0],
  

[Mesa-dev] [PATCH 5/5] gallium: remove TGSI_OPCODE_SUB

2016-12-31 Thread Marek Olšák
From: Marek Olšák 

It's redundant with the source modifier.
---
 src/gallium/auxiliary/draw/draw_pipe_aaline.c  |  2 +-
 src/gallium/auxiliary/draw/draw_pipe_aapoint.c | 20 ++--
 src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c | 38 +++---
 src/gallium/auxiliary/gallivm/lp_bld_tgsi_aos.c|  6 
 src/gallium/auxiliary/nir/tgsi_to_nir.c|  1 -
 src/gallium/auxiliary/tgsi/tgsi_aa_point.c | 20 ++--
 src/gallium/auxiliary/tgsi/tgsi_exec.c |  4 ---
 src/gallium/auxiliary/tgsi/tgsi_info.c |  2 +-
 src/gallium/auxiliary/tgsi/tgsi_lowering.c | 22 -
 src/gallium/auxiliary/tgsi/tgsi_opcode_tmp.h   |  1 -
 src/gallium/auxiliary/tgsi/tgsi_point_sprite.c | 12 +++
 src/gallium/auxiliary/tgsi/tgsi_transform.h|  8 +++--
 src/gallium/auxiliary/tgsi/tgsi_util.c |  1 -
 src/gallium/auxiliary/util/u_pstipple.c|  2 +-
 src/gallium/auxiliary/vl/vl_bicubic_filter.c   |  4 +--
 src/gallium/auxiliary/vl/vl_compositor.c   |  4 +--
 src/gallium/auxiliary/vl/vl_deint_filter.c |  8 ++---
 src/gallium/drivers/i915/i915_fpc_optimize.c   |  1 -
 src/gallium/drivers/i915/i915_fpc_translate.c  | 11 ---
 src/gallium/drivers/ilo/shader/toy_tgsi.c  |  6 
 .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  |  2 --
 src/gallium/drivers/nouveau/nv30/nvfx_fragprog.c   |  3 --
 src/gallium/drivers/nouveau/nv30/nvfx_vertprog.c   |  3 --
 src/gallium/drivers/r300/r300_tgsi_to_rc.c |  1 -
 src/gallium/drivers/r600/r600_shader.c | 14 
 src/gallium/drivers/svga/svga_tgsi_insn.c  | 27 ---
 src/gallium/drivers/svga/svga_tgsi_vgpu10.c| 25 --
 src/gallium/include/pipe/p_shader_tokens.h |  2 +-
 src/gallium/state_trackers/xa/xa_tgsi.c|  4 +--
 src/mesa/state_tracker/st_atifs_to_tgsi.c  | 18 +-
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp |  3 +-
 src/mesa/state_tracker/st_mesa_to_tgsi.c   |  6 ++--
 src/mesa/state_tracker/st_tgsi_lower_yuv.c |  3 +-
 33 files changed, 82 insertions(+), 202 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_pipe_aaline.c 
b/src/gallium/auxiliary/draw/draw_pipe_aaline.c
index c236caa..57ca12e 100644
--- a/src/gallium/auxiliary/draw/draw_pipe_aaline.c
+++ b/src/gallium/auxiliary/draw/draw_pipe_aaline.c
@@ -278,21 +278,21 @@ aa_transform_epilog(struct tgsi_transform_context *ctx)
   tgsi_transform_op1_inst(ctx, TGSI_OPCODE_MOV,
   TGSI_FILE_OUTPUT, aactx->colorOutput,
   TGSI_WRITEMASK_XYZ,
   TGSI_FILE_TEMPORARY, aactx->colorTemp);
 
   /* MUL alpha */
   tgsi_transform_op2_inst(ctx, TGSI_OPCODE_MUL,
   TGSI_FILE_OUTPUT, aactx->colorOutput,
   TGSI_WRITEMASK_W,
   TGSI_FILE_TEMPORARY, aactx->colorTemp,
-  TGSI_FILE_TEMPORARY, aactx->texTemp);
+  TGSI_FILE_TEMPORARY, aactx->texTemp, false);
}
 }
 
 
 /**
  * TGSI instruction transform callback.
  * Replace writes to result.color w/ a temp reg.
  */
 static void
 aa_transform_inst(struct tgsi_transform_context *ctx,
diff --git a/src/gallium/auxiliary/draw/draw_pipe_aapoint.c 
b/src/gallium/auxiliary/draw/draw_pipe_aapoint.c
index 33ef8ec..2b96b8a 100644
--- a/src/gallium/auxiliary/draw/draw_pipe_aapoint.c
+++ b/src/gallium/auxiliary/draw/draw_pipe_aapoint.c
@@ -206,88 +206,88 @@ aa_transform_prolog(struct tgsi_transform_context *ctx)
 *  t0.x = distance of fragment from center point
 *  t0.y = boolean, is t0.x > 1.0, also misc temp usage
 *  t0.z = temporary for computing 1/(1-k) value
 *  t0.w = final coverage value
 */
 
/* MUL t0.xy, tex, tex;  # compute x^2, y^2 */
tgsi_transform_op2_inst(ctx, TGSI_OPCODE_MUL,
TGSI_FILE_TEMPORARY, tmp0, TGSI_WRITEMASK_XY,
TGSI_FILE_INPUT, texInput,
-   TGSI_FILE_INPUT, texInput);
+   TGSI_FILE_INPUT, texInput, false);
 
/* ADD t0.x, t0.x, t0.y;  # x^2 + y^2 */
tgsi_transform_op2_swz_inst(ctx, TGSI_OPCODE_ADD,
TGSI_FILE_TEMPORARY, tmp0, TGSI_WRITEMASK_X,
TGSI_FILE_TEMPORARY, tmp0, TGSI_SWIZZLE_X,
-   TGSI_FILE_TEMPORARY, tmp0, TGSI_SWIZZLE_Y);
+   TGSI_FILE_TEMPORARY, tmp0, TGSI_SWIZZLE_Y, 
false);
 
 #if NORMALIZE  /* OPTIONAL normalization of length */
/* RSQ t0.x, t0.x; */
tgsi_transform_op1_inst(ctx, TGSI_OPCODE_RSQ,
TGSI_FILE_TEMPORARY, tmp0, TGSI_WRITEMASK_X,
TGSI_FILE_TEMPORARY, tmp0);
 
/* RCP t0.x, t0.x; */

[Mesa-dev] [PATCH 3/5] st/nine: Remove all usage of ureg_SUB in nine_shader

2016-12-31 Thread Marek Olšák
From: Axel Davy 

This is required to drop gallium SUB.

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/nine_shader.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/src/gallium/state_trackers/nine/nine_shader.c 
b/src/gallium/state_trackers/nine/nine_shader.c
index a1e0070..0a75c07 100644
--- a/src/gallium/state_trackers/nine/nine_shader.c
+++ b/src/gallium/state_trackers/nine/nine_shader.c
@@ -1075,22 +1075,22 @@ tx_src_param(struct shader_translator *tx, const struct 
sm1_src_param *param)
 src = ureg_src(tx->regs.address);
 break;
 case D3DSPR_MISCTYPE:
 switch (param->idx) {
 case D3DSMO_POSITION:
if (ureg_src_is_undef(tx->regs.vPos))
   tx->regs.vPos = nine_get_position_input(tx);
if (tx->shift_wpos) {
/* TODO: do this only once */
struct ureg_dst wpos = tx_scratch(tx);
-   ureg_SUB(ureg, wpos, tx->regs.vPos,
-ureg_imm4f(ureg, 0.5f, 0.5f, 0.0f, 0.0f));
+   ureg_ADD(ureg, wpos, tx->regs.vPos,
+ureg_imm4f(ureg, -0.5f, -0.5f, 0.0f, 0.0f));
src = ureg_src(wpos);
} else {
src = tx->regs.vPos;
}
break;
 case D3DSMO_FACE:
if (ureg_src_is_undef(tx->regs.vFace)) {
if (tx->face_is_sysval_integer) {
tmp = tx_scratch(tx);
tx->regs.vFace =
@@ -1154,39 +1154,39 @@ tx_src_param(struct shader_translator *tx, const struct 
sm1_src_param *param)
 src = ureg_abs(src);
 break;
 case NINED3DSPSM_ABSNEG:
 src = ureg_negate(ureg_abs(src));
 break;
 case NINED3DSPSM_NEG:
 src = ureg_negate(src);
 break;
 case NINED3DSPSM_BIAS:
 tmp = tx_scratch(tx);
-ureg_SUB(ureg, tmp, src, ureg_imm1f(ureg, 0.5f));
+ureg_ADD(ureg, tmp, src, ureg_imm1f(ureg, -0.5f));
 src = ureg_src(tmp);
 break;
 case NINED3DSPSM_BIASNEG:
 tmp = tx_scratch(tx);
-ureg_SUB(ureg, tmp, ureg_imm1f(ureg, 0.5f), src);
+ureg_ADD(ureg, tmp, ureg_imm1f(ureg, 0.5f), ureg_negate(src));
 src = ureg_src(tmp);
 break;
 case NINED3DSPSM_NOT:
 if (tx->native_integers) {
 tmp = tx_scratch(tx);
 ureg_NOT(ureg, tmp, src);
 src = ureg_src(tmp);
 break;
 }
 /* fall through */
 case NINED3DSPSM_COMP:
 tmp = tx_scratch(tx);
-ureg_SUB(ureg, tmp, ureg_imm1f(ureg, 1.0f), src);
+ureg_ADD(ureg, tmp, ureg_imm1f(ureg, 1.0f), ureg_negate(src));
 src = ureg_src(tmp);
 break;
 case NINED3DSPSM_DZ:
 case NINED3DSPSM_DW:
 /* Already handled*/
 break;
 case NINED3DSPSM_SIGN:
 tmp = tx_scratch(tx);
 ureg_MAD(ureg, tmp, src, ureg_imm1f(ureg, 2.0f), ureg_imm1f(ureg, 
-1.0f));
 src = ureg_src(tmp);
@@ -2545,21 +2545,21 @@ DECL_SPECIAL(TEXM3x3SPEC)
 ureg_DP3(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), ureg_src(dst), 
ureg_src(dst));
 ureg_RCP(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), 
ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X));
 /* at this step tmp.x = 1/N.N */
 ureg_DP3(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_Y), ureg_src(dst), E);
 /* at this step tmp.y = N.E */
 ureg_MUL(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), 
ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X), ureg_scalar(ureg_src(tmp), 
TGSI_SWIZZLE_Y));
 /* at this step tmp.x = N.E/N.N */
 ureg_MUL(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), 
ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X), ureg_imm1f(ureg, 2.0f));
 ureg_MUL(ureg, tmp, ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X), 
ureg_src(dst));
 /* at this step tmp.xyz = 2 * (N.E / N.N) * N */
-ureg_SUB(ureg, tmp, ureg_src(tmp), E);
+ureg_ADD(ureg, tmp, ureg_src(tmp), ureg_negate(E));
 ureg_TEX(ureg, dst, ps1x_sampler_type(tx->info, m + 2), ureg_src(tmp), 
sample);
 
 return D3D_OK;
 }
 
 DECL_SPECIAL(TEXREG2RGB)
 {
 struct ureg_program *ureg = tx->ureg;
 struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
 struct ureg_src sample;
@@ -2684,21 +2684,21 @@ DECL_SPECIAL(TEXM3x3)
 ureg_DP3(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), ureg_src(dst), 
ureg_src(dst));
 ureg_RCP(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), 
ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X));
 /* at this step tmp.x = 1/N.N */
 ureg_DP3(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_Y), ureg_src(dst), 
ureg_src(E));
 /* at this step tmp.y = N.E */
 ureg_MUL(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), 
ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X), ureg_scalar(ureg_src(tmp), 
TGSI_SWIZZLE_Y));
 /* at this step tmp.x = N.E/N.N */
 ureg_MUL(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), 

[Mesa-dev] [PATCH 1/5] st/nine: Do not map SUB and ABS to their gallium equivalent.

2016-12-31 Thread Marek Olšák
From: Axel Davy 

This is required for gallium SUB and ABS to be removed.

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/nine_shader.c | 25 +++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/src/gallium/state_trackers/nine/nine_shader.c 
b/src/gallium/state_trackers/nine/nine_shader.c
index 5effc2c..a1e0070 100644
--- a/src/gallium/state_trackers/nine/nine_shader.c
+++ b/src/gallium/state_trackers/nine/nine_shader.c
@@ -1573,20 +1573,41 @@ d3dsio_to_string( unsigned opcode )
 static HRESULT
 NineTranslateInstruction_Generic(struct shader_translator *);
 
 DECL_SPECIAL(NOP)
 {
 /* Nothing to do. NOP was used to avoid hangs
  * with very old d3d drivers. */
 return D3D_OK;
 }
 
+DECL_SPECIAL(SUB)
+{
+struct ureg_program *ureg = tx->ureg;
+struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
+struct ureg_src src0 = tx_src_param(tx, &tx->insn.src[0]);
+struct ureg_src src1 = tx_src_param(tx, &tx->insn.src[1]);
+
+ureg_ADD(ureg, dst, src0, ureg_negate(src1));
+return D3D_OK;
+}
+
+DECL_SPECIAL(ABS)
+{
+struct ureg_program *ureg = tx->ureg;
+struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
+struct ureg_src src = tx_src_param(tx, &tx->insn.src[0]);
+
+ureg_MOV(ureg, dst, ureg_abs(src));
+return D3D_OK;
+}
+
 DECL_SPECIAL(M4x4)
 {
 return NineTranslateInstruction_Mkxn(tx, 4, 4);
 }
 
 DECL_SPECIAL(M4x3)
 {
 return NineTranslateInstruction_Mkxn(tx, 4, 3);
 }
 
@@ -2866,21 +2887,21 @@ DECL_SPECIAL(COMMENT)
 
 
 #define _OPI(o,t,vv1,vv2,pv1,pv2,d,s,h) \
 { D3DSIO_##o, TGSI_OPCODE_##t, { vv1, vv2 }, { pv1, pv2, }, d, s, h }
 
 struct sm1_op_info inst_table[] =
 {
 _OPI(NOP, NOP, V(0,0), V(3,0), V(0,0), V(3,0), 0, 0, SPECIAL(NOP)), /* 0 */
 _OPI(MOV, MOV, V(0,0), V(3,0), V(0,0), V(3,0), 1, 1, NULL),
 _OPI(ADD, ADD, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 2 */
-_OPI(SUB, SUB, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 3 */
+_OPI(SUB, NOP, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, SPECIAL(SUB)), /* 3 */
 _OPI(MAD, MAD, V(0,0), V(3,0), V(0,0), V(3,0), 1, 3, NULL), /* 4 */
 _OPI(MUL, MUL, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 5 */
 _OPI(RCP, RCP, V(0,0), V(3,0), V(0,0), V(3,0), 1, 1, NULL), /* 6 */
 _OPI(RSQ, RSQ, V(0,0), V(3,0), V(0,0), V(3,0), 1, 1, SPECIAL(RSQ)), /* 7 */
 _OPI(DP3, DP3, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 8 */
 _OPI(DP4, DP4, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 9 */
 _OPI(MIN, MIN, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 10 */
 _OPI(MAX, MAX, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 11 */
 _OPI(SLT, SLT, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 12 */
 _OPI(SGE, SGE, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 13 */
@@ -2902,21 +2923,21 @@ struct sm1_op_info inst_table[] =
 _OPI(LOOP,BGNLOOP, V(2,0), V(3,0), V(3,0), V(3,0), 0, 2, 
SPECIAL(LOOP)),
 _OPI(RET, RET, V(2,0), V(3,0), V(2,1), V(3,0), 0, 0, SPECIAL(RET)),
 _OPI(ENDLOOP, ENDLOOP, V(2,0), V(3,0), V(3,0), V(3,0), 0, 0, 
SPECIAL(ENDLOOP)),
 _OPI(LABEL,   NOP, V(2,0), V(3,0), V(2,1), V(3,0), 0, 1, 
SPECIAL(LABEL)),
 
 _OPI(DCL, NOP, V(0,0), V(3,0), V(0,0), V(3,0), 0, 0, SPECIAL(DCL)),
 
 _OPI(POW, POW, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, SPECIAL(POW)),
 _OPI(CRS, XPD, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* XXX: .w */
 _OPI(SGN, SSG, V(2,0), V(3,0), V(0,0), V(0,0), 1, 3, SPECIAL(SGN)), /* 
ignore src1,2 */
-_OPI(ABS, ABS, V(0,0), V(3,0), V(0,0), V(3,0), 1, 1, NULL),
+_OPI(ABS, NOP, V(0,0), V(3,0), V(0,0), V(3,0), 1, 1, SPECIAL(ABS)),
 _OPI(NRM, NOP, V(0,0), V(3,0), V(0,0), V(3,0), 1, 1, SPECIAL(NRM)), /* NRM 
doesn't fit */
 
 _OPI(SINCOS, SCS, V(2,0), V(2,1), V(2,0), V(2,1), 1, 3, SPECIAL(SINCOS)),
 _OPI(SINCOS, SCS, V(3,0), V(3,0), V(3,0), V(3,0), 1, 1, SPECIAL(SINCOS)),
 
 /* More flow control */
 _OPI(REP,NOP,V(2,0), V(3,0), V(2,1), V(3,0), 0, 1, SPECIAL(REP)),
 _OPI(ENDREP, NOP,V(2,0), V(3,0), V(2,1), V(3,0), 0, 0, 
SPECIAL(ENDREP)),
 _OPI(IF, IF, V(2,0), V(3,0), V(2,1), V(3,0), 0, 1, SPECIAL(IF)),
 _OPI(IFC,IF, V(2,1), V(3,0), V(2,1), V(3,0), 0, 2, SPECIAL(IFC)),
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
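
The rewrite used by the patch above can be sketched on plain floats. This is only an illustrative model, not nine or gallium code: the helper names (src_negate, src_abs, op_add, op_mov, lower_sub, lower_abs) are hypothetical stand-ins for the TGSI source modifiers and opcodes, assuming SUB is expressed as ADD with a negated second source and ABS as MOV with an absolute-value source modifier.

```c
/* Hypothetical scalar stand-ins for the negate/abs source modifiers. */
static float src_negate(float x) { return -x; }
static float src_abs(float x)    { return x < 0.0f ? -x : x; }

/* The only two "opcodes" this model needs. */
static float op_add(float a, float b) { return a + b; }
static float op_mov(float a)          { return a; }

/* SUB dst, a, b  becomes  ADD dst, a, -b */
static float lower_sub(float a, float b)
{
   return op_add(a, src_negate(b));
}

/* ABS dst, a  becomes  MOV dst, |a| */
static float lower_abs(float a)
{
   return op_mov(src_abs(a));
}
```

Because the modifier is applied to the source operand, no dedicated SUB or ABS opcode is needed in this model.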


[Mesa-dev] [PATCH 2/5] st/nine: Remove all usage of ureg_SUB in nine_ff

2016-12-31 Thread Marek Olšák
From: Axel Davy 

This is required to remove gallium SUB.

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/nine_ff.c | 40 +++
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/src/gallium/state_trackers/nine/nine_ff.c 
b/src/gallium/state_trackers/nine/nine_ff.c
index a0a33cd..7cbe3f7 100644
--- a/src/gallium/state_trackers/nine/nine_ff.c
+++ b/src/gallium/state_trackers/nine/nine_ff.c
@@ -442,23 +442,23 @@ nine_ff_build_vs(struct NineDevice9 *device, struct 
vs_build_ctx *vs)
 ureg_MOV(ureg, oPos, vs->aVtx);
 } else {
 struct ureg_dst tmp = ureg_DECL_temporary(ureg);
 /* vs->aVtx contains the coordinates buffer wise.
 * later in the pipeline, clipping, viewport and division
 * by w (rhw = 1/w) are going to be applied, so do the reverse
 * of these transformations (except clipping) to have the good
 * position at the end.*/
 ureg_MOV(ureg, tmp, vs->aVtx);
 /* X from [X_min, X_min + width] to [-1, 1], same for Y. Z to [0, 1] */
-ureg_SUB(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_XYZ), ureg_src(tmp), _CONST(101));
+ureg_ADD(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_XYZ), ureg_src(tmp), ureg_negate(_CONST(101)));
 ureg_MUL(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_XYZ), ureg_src(tmp), _CONST(100));
-ureg_SUB(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_XY), ureg_src(tmp), ureg_imm1f(ureg, 1.0f));
+ureg_ADD(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_XY), ureg_src(tmp), ureg_imm1f(ureg, -1.0f));
 /* Y needs to be reversed */
 ureg_MOV(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_Y), 
ureg_negate(ureg_src(tmp)));
 /* inverse rhw */
 ureg_RCP(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_W), _W(tmp));
 /* multiply X, Y, Z by w */
 ureg_MUL(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_XYZ), 
ureg_src(tmp), _W(tmp));
 ureg_MOV(ureg, oPos, ureg_src(tmp));
 ureg_release_temporary(ureg, tmp);
 }
 } else if (key->vertexblend) {
@@ -504,21 +504,21 @@ nine_ff_build_vs(struct NineDevice9 *device, struct 
vs_build_ctx *vs)
 ureg_MAD(ureg, tmp2, _(vs->aNrm), cWM[1], ureg_src(tmp2));
 ureg_MAD(ureg, tmp2, _(vs->aNrm), cWM[2], ureg_src(tmp2));
 }
 
 if (i < (key->vertexblend - 1)) {
 /* accumulate weighted position value */
 ureg_MAD(ureg, aVtx_dst, ureg_src(tmp), ureg_scalar(vs->aWgt, 
i), ureg_src(aVtx_dst));
 if (has_aNrm)
 ureg_MAD(ureg, aNrm_dst, ureg_src(tmp2), 
ureg_scalar(vs->aWgt, i), ureg_src(aNrm_dst));
 /* subtract weighted position value for last value */
-ureg_SUB(ureg, sum_blendweights, ureg_src(sum_blendweights), ureg_scalar(vs->aWgt, i));
+ureg_ADD(ureg, sum_blendweights, ureg_src(sum_blendweights), ureg_negate(ureg_scalar(vs->aWgt, i)));
 }
 }
 
 /* the last weighted position is always 1 - sum_of_previous_weights */
 ureg_MAD(ureg, aVtx_dst, ureg_src(tmp), 
ureg_scalar(ureg_src(sum_blendweights), key->vertexblend - 1), 
ureg_src(aVtx_dst));
 if (has_aNrm)
 ureg_MAD(ureg, aNrm_dst, ureg_src(tmp2), 
ureg_scalar(ureg_src(sum_blendweights), key->vertexblend - 1), 
ureg_src(aNrm_dst));
 
 /* multiply by VIEW_PROJ */
 ureg_MUL(ureg, tmp, _X(aVtx_dst), _CONST(8));
@@ -654,36 +654,36 @@ nine_ff_build_vs(struct NineDevice9 *device, struct 
vs_build_ctx *vs)
 ureg_MOV(ureg, ureg_writemask(input_coord, TGSI_WRITEMASK_W), 
ureg_imm1f(ureg, 1.0f));
 dim_input = 4;
 break;
 case NINED3DTSS_TCI_CAMERASPACEREFLECTIONVECTOR:
 tmp.WriteMask = TGSI_WRITEMASK_XYZ;
 aVtx_normed = ureg_DECL_temporary(ureg);
 ureg_normalize3(ureg, aVtx_normed, vs->aVtx);
 ureg_DP3(ureg, tmp_x, ureg_src(aVtx_normed), vs->aNrm);
 ureg_MUL(ureg, tmp, vs->aNrm, _X(tmp));
 ureg_ADD(ureg, tmp, ureg_src(tmp), ureg_src(tmp));
-ureg_SUB(ureg, ureg_writemask(input_coord, TGSI_WRITEMASK_XYZ), ureg_src(aVtx_normed), ureg_src(tmp));
+ureg_ADD(ureg, ureg_writemask(input_coord, TGSI_WRITEMASK_XYZ), ureg_src(aVtx_normed), ureg_negate(ureg_src(tmp)));
 ureg_MOV(ureg, ureg_writemask(input_coord, TGSI_WRITEMASK_W), 
ureg_imm1f(ureg, 1.0f));
 ureg_release_temporary(ureg, aVtx_normed);
 dim_input = 4;
 tmp.WriteMask = TGSI_WRITEMASK_XYZW;
 break;
 case NINED3DTSS_TCI_SPHEREMAP:
 /* Implement the formula of GL_SPHERE_MAP */
 tmp.WriteMask = TGSI_WRITEMASK_XYZ;
 aVtx_normed = ureg_DECL_temporary(ureg);
 tmp2 = 

[Mesa-dev] [Bug 99125] Log to a file all GALLIUM_HUD infos

2016-12-31 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=99125

--- Comment #2 from Edmondo Tommasina  ---

FYI: Marek pushed the series of patches to mesa git master.

https://cgit.freedesktop.org/mesa/mesa/commit/?id=3f5fba8a7be61bfc0f46a5ea058108f6e0e1c268

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [AppVeyor] mesa master #3012 failed

2016-12-31 Thread Ilia Mirkin
src\gallium\auxiliary\hud\hud_context.c(874) : warning C4013: 'access'
undefined; assuming extern returning int
src\gallium\auxiliary\hud\hud_context.c(874) : error C2065: 'W_OK' :
undeclared identifier
scons: *** [build\windows-x86-debug\gallium\auxiliary\hud\hud_context.obj]
Error 2

On Sat, Dec 31, 2016 at 6:31 PM, AppVeyor  wrote:
> Build mesa 3012 failed
>
> Commit 3f5fba8a7b by Edmondo Tommasina on 12/21/2016 9:58 PM:
> docs: document GALLIUM_HUD_DUMP_DIR envvar\n\nSigned-off-by: Marek Olšák
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
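
The MSVC errors quoted above come from access() and W_OK, which are declared in <unistd.h> on POSIX systems. A minimal sketch of one possible guard follows. It is not the fix that was applied to Mesa: hud_access and dump_dir_is_writable are hypothetical names, and the Windows branch assumes the CRT's _access() from <io.h>, where mode 2 requests a write-permission check.

```c
#include <stddef.h>

#ifdef _WIN32
#include <io.h>
#define hud_access _access      /* hypothetical alias, not a Mesa macro */
#ifndef W_OK
#define W_OK 2                  /* _access() mode for write permission */
#endif
#else
#include <unistd.h>
#define hud_access access
#endif

/* Returns 1 if `dir` exists and is writable by the caller, 0 otherwise. */
static int dump_dir_is_writable(const char *dir)
{
   return dir != NULL && hud_access(dir, W_OK) == 0;
}
```

Another option would be to compile the dump feature out entirely on platforms without <unistd.h>.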


[Mesa-dev] [AppVeyor] mesa master #3012 failed

2016-12-31 Thread AppVeyor



Build mesa 3012 failed


Commit 3f5fba8a7b by Edmondo Tommasina on 12/21/2016 9:58 PM:

docs: document GALLIUM_HUD_DUMP_DIR envvar\n\nSigned-off-by: Marek Olšák 



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/6] gallium/hud: dump hud_driver_query values to files

2016-12-31 Thread Marek Olšák
FYI, I've pushed the series and squashed the first 2 patches.

Thanks,

Marek

On Sat, Dec 31, 2016 at 10:15 PM, Marek Olšák  wrote:
> On Wed, Dec 21, 2016 at 10:58 PM, Edmondo Tommasina
>  wrote:
>> Dump values for every selected data source in GALLIUM_HUD.
>>
>> Every data source has its own file and the filename is
>> equal to the data source identifier.
>> ---
>>  src/gallium/auxiliary/hud/hud_context.c  | 6 ++
>>  src/gallium/auxiliary/hud/hud_driver_query.c | 2 ++
>>  src/gallium/auxiliary/hud/hud_private.h  | 1 +
>>  3 files changed, 9 insertions(+)
>>
>> diff --git a/src/gallium/auxiliary/hud/hud_context.c 
>> b/src/gallium/auxiliary/hud/hud_context.c
>> index ceb157a..edd831a 100644
>> --- a/src/gallium/auxiliary/hud/hud_context.c
>> +++ b/src/gallium/auxiliary/hud/hud_context.c
>> @@ -33,6 +33,7 @@
>>   * Set GALLIUM_HUD=help for more info.
>>   */
>>
>> +#include <inttypes.h>
>>  #include <signal.h>
>>  #include <stdio.h>
>>
>> @@ -829,6 +830,9 @@ hud_graph_add_value(struct hud_graph *gr, uint64_t value)
>> gr->current_value = value;
>> value = value > gr->pane->ceiling ? gr->pane->ceiling : value;
>>
>> +   if (gr->fd)
>> +  fprintf(gr->fd, "%" PRIu64 "\n", value);
>> +
>> if (gr->index == gr->pane->max_num_vertices) {
>>gr->vertices[0] = 0;
>>gr->vertices[1] = gr->vertices[(gr->index-1)*2+1];
>> @@ -856,6 +860,8 @@ hud_graph_destroy(struct hud_graph *graph)
>> FREE(graph->vertices);
>> if (graph->free_query_data)
>>graph->free_query_data(graph->query_data);
>> +   if (graph->fd)
>> +  fclose(graph->fd);
>> FREE(graph);
>>  }
>>
>> diff --git a/src/gallium/auxiliary/hud/hud_driver_query.c 
>> b/src/gallium/auxiliary/hud/hud_driver_query.c
>> index 40ea120..bfde16a 100644
>> --- a/src/gallium/auxiliary/hud/hud_driver_query.c
>> +++ b/src/gallium/auxiliary/hud/hud_driver_query.c
>> @@ -378,6 +378,8 @@ hud_pipe_query_install(struct hud_batch_query_context 
>> **pbq,
>>info->result_index = result_index;
>> }
>>
>> +   gr->fd = fopen(gr->name, "w+");
>
> This opens the file unconditionally. Did you forget to check the env var here?
>
> Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/6] gallium/hud: add dump directory enviroment variable

2016-12-31 Thread Marek Olšák
Ignore my comment on patch 1. This patch can be merged with the first one.

Marek

On Wed, Dec 21, 2016 at 10:58 PM, Edmondo Tommasina
 wrote:
> Set GALLIUM_HUD_DUMP_DIR to dump values to files in this directory.
>
> No values are dumped if the environment variable is not set, the
> directory doesn't exist or the user doesn't have write access.
> ---
>  src/gallium/auxiliary/hud/hud_driver_query.c | 12 +++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/auxiliary/hud/hud_driver_query.c 
> b/src/gallium/auxiliary/hud/hud_driver_query.c
> index bfde16a..23fda01 100644
> --- a/src/gallium/auxiliary/hud/hud_driver_query.c
> +++ b/src/gallium/auxiliary/hud/hud_driver_query.c
> @@ -351,6 +351,8 @@ hud_pipe_query_install(struct hud_batch_query_context 
> **pbq,
>  {
> struct hud_graph *gr;
> struct query_info *info;
> +   const char *hud_dump_dir = getenv("GALLIUM_HUD_DUMP_DIR");
> +   char *dump_file;
>
> gr = CALLOC_STRUCT(hud_graph);
> if (!gr)
> @@ -378,7 +380,15 @@ hud_pipe_query_install(struct hud_batch_query_context 
> **pbq,
>info->result_index = result_index;
> }
>
> -   gr->fd = fopen(gr->name, "w+");
> +   if (hud_dump_dir && access(hud_dump_dir, W_OK) == 0) {
> +  dump_file = malloc(strlen(hud_dump_dir) + sizeof(gr->name));
> +  if (dump_file) {
> + strcpy(dump_file, hud_dump_dir);
> + strcat(dump_file, gr->name);
> + gr->fd = fopen(dump_file, "w+");
> + free(dump_file);
> +  }
> +   }
>
> hud_pane_add_graph(pane, gr);
> pane->type = type; /* must be set before updating the max_value */
> --
> 2.10.0
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
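
The quoted hunk concatenates the directory and the graph name directly, so as far as the hunk shows, GALLIUM_HUD_DUMP_DIR would need a trailing slash. The following is a hedged sketch of an alternative that sizes the buffer from both strings and inserts the separator itself. build_dump_path is a hypothetical helper, not part of the patch or of Mesa.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc'ed "<dir>/<name>" string, or NULL
 * if the allocation fails. The caller frees the result. */
static char *build_dump_path(const char *dir, const char *name)
{
   /* +1 for the '/' separator, +1 for the terminating NUL */
   size_t len = strlen(dir) + 1 + strlen(name) + 1;
   char *path = malloc(len);

   if (path)
      snprintf(path, len, "%s/%s", dir, name);
   return path;
}
```

A caller would pass the result to fopen() and free it afterwards, as the patch already does with dump_file.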


Re: [Mesa-dev] [PATCH 18/27] i965/miptree: Add a return for updating of winsys

2016-12-31 Thread Ben Widawsky

On 16-12-31 14:40:42, Ben Widawsky wrote:
> On 16-12-10 15:39:12, Pohjolainen, Topi wrote:
>> On Thu, Dec 01, 2016 at 02:09:59PM -0800, Ben Widawsky wrote:
>>
>> [snip]
>>
>> We don't seem to use "zero for success"-style at least in i965. Could you
>> change this to bool and flip the check earlier for consistency?
>
> What do you mean by flip the check earlier?

nvm. I realized what you meant.

[snip]


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 18/27] i965/miptree: Add a return for updating of winsys

2016-12-31 Thread Ben Widawsky

On 16-12-10 15:39:12, Pohjolainen, Topi wrote:

On Thu, Dec 01, 2016 at 02:09:59PM -0800, Ben Widawsky wrote:

From: Ben Widawsky 

There is nothing particularly useful to do currently if the update
fails, but there is no point carrying on either. As a result, this has a
behavior change.

Signed-off-by: Ben Widawsky 
---
 src/mesa/drivers/dri/i965/brw_context.c   | 14 --
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c |  6 +++---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  2 +-
 3 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index b928f94..593fa67 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -1645,9 +1645,10 @@ intel_process_dri2_buffer(struct brw_context *brw,
   return;
}

-   intel_update_winsys_renderbuffer_miptree(brw, rb, bo,
-drawable->w, drawable->h,
-buffer->pitch);
+   if (intel_update_winsys_renderbuffer_miptree(brw, rb, bo,
+drawable->w, drawable->h,
+buffer->pitch))
+  return;

if (_mesa_is_front_buffer_drawing(fb) &&
(buffer->attachment == __DRI_BUFFER_FRONT_LEFT ||
@@ -1703,9 +1704,10 @@ intel_update_image_buffer(struct brw_context *intel,
if (last_mt && last_mt->bo == buffer->bo)
   return;

-   intel_update_winsys_renderbuffer_miptree(intel, rb, buffer->bo,
-buffer->width, buffer->height,
-buffer->pitch);
+   if (intel_update_winsys_renderbuffer_miptree(intel, rb, buffer->bo,
+buffer->width, buffer->height,
+buffer->pitch))
+  return;

if (_mesa_is_front_buffer_drawing(fb) &&
buffer_type == __DRI_IMAGE_BUFFER_FRONT &&
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index d002546..74db507 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -908,7 +908,7 @@ intel_miptree_create_for_image(struct brw_context *intel,
  * that will contain the actual rendering (which is lazily resolved to
  * irb->singlesample_mt).
  */
-void
+int


We don't seem to use "zero for success"-style at least in i965. Could you
change this to bool and flip the check earlier for consistency?



What do you mean by flip the check earlier?


 intel_update_winsys_renderbuffer_miptree(struct brw_context *intel,
  struct intel_renderbuffer *irb,
  drm_intel_bo *bo,
@@ -974,12 +974,12 @@ intel_update_winsys_renderbuffer_miptree(struct 
brw_context *intel,
  irb->mt = multisample_mt;
   }
}
-   return;
+   return 0;

 fail:
 intel_miptree_release(&irb->singlesample_mt);
 intel_miptree_release(&irb->mt);
-   return;
+   return -1;
 }

 struct intel_mipmap_tree*
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
index 7b9a7be..85fe118 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
@@ -726,7 +726,7 @@ intel_miptree_create_for_image(struct brw_context *intel,
uint32_t pitch,
uint32_t layout_flags);

-void
+int
 intel_update_winsys_renderbuffer_miptree(struct brw_context *intel,
  struct intel_renderbuffer *irb,
  drm_intel_bo *bo,
--
2.10.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
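
The review comment above asks for a bool return instead of "zero for success". A minimal sketch of that style, using hypothetical stand-in functions (update_miptree and process_buffer are not the i965 functions), might look like this:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an update that can fail. */
static bool update_miptree(const void *bo)
{
   if (bo == NULL)
      return false;   /* failure path, where the patch returns -1 */
   return true;       /* success path, where the patch returns 0 */
}

/* The caller's check is flipped: bail out when the update did NOT succeed. */
static int process_buffer(const void *bo)
{
   if (!update_miptree(bo))
      return 0;       /* nothing further to do */
   return 1;          /* carry on with the rest of the processing */
}
```

With a bool, the call site reads as "if the update failed, return", which avoids having to remember whether zero means success.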


Re: [Mesa-dev] [PATCH 1/6] gallium/hud: dump hud_driver_query values to files

2016-12-31 Thread Marek Olšák
On Wed, Dec 21, 2016 at 10:58 PM, Edmondo Tommasina
 wrote:
> Dump values for every selected data source in GALLIUM_HUD.
>
> Every data source has its own file and the filename is
> equal to the data source identifier.
> ---
>  src/gallium/auxiliary/hud/hud_context.c  | 6 ++
>  src/gallium/auxiliary/hud/hud_driver_query.c | 2 ++
>  src/gallium/auxiliary/hud/hud_private.h  | 1 +
>  3 files changed, 9 insertions(+)
>
> diff --git a/src/gallium/auxiliary/hud/hud_context.c 
> b/src/gallium/auxiliary/hud/hud_context.c
> index ceb157a..edd831a 100644
> --- a/src/gallium/auxiliary/hud/hud_context.c
> +++ b/src/gallium/auxiliary/hud/hud_context.c
> @@ -33,6 +33,7 @@
>   * Set GALLIUM_HUD=help for more info.
>   */
>
> +#include <inttypes.h>
>  #include <signal.h>
>  #include <stdio.h>
>
> @@ -829,6 +830,9 @@ hud_graph_add_value(struct hud_graph *gr, uint64_t value)
> gr->current_value = value;
> value = value > gr->pane->ceiling ? gr->pane->ceiling : value;
>
> +   if (gr->fd)
> +  fprintf(gr->fd, "%" PRIu64 "\n", value);
> +
> if (gr->index == gr->pane->max_num_vertices) {
>gr->vertices[0] = 0;
>gr->vertices[1] = gr->vertices[(gr->index-1)*2+1];
> @@ -856,6 +860,8 @@ hud_graph_destroy(struct hud_graph *graph)
> FREE(graph->vertices);
> if (graph->free_query_data)
>graph->free_query_data(graph->query_data);
> +   if (graph->fd)
> +  fclose(graph->fd);
> FREE(graph);
>  }
>
> diff --git a/src/gallium/auxiliary/hud/hud_driver_query.c 
> b/src/gallium/auxiliary/hud/hud_driver_query.c
> index 40ea120..bfde16a 100644
> --- a/src/gallium/auxiliary/hud/hud_driver_query.c
> +++ b/src/gallium/auxiliary/hud/hud_driver_query.c
> @@ -378,6 +378,8 @@ hud_pipe_query_install(struct hud_batch_query_context 
> **pbq,
>info->result_index = result_index;
> }
>
> +   gr->fd = fopen(gr->name, "w+");

This opens the file unconditionally. Did you forget to check the env var here?

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 17/27] i965: Create correctly sized mcs for an image

2016-12-31 Thread Ben Widawsky

On 16-12-10 15:36:06, Pohjolainen, Topi wrote:

On Thu, Dec 01, 2016 at 02:09:58PM -0800, Ben Widawsky wrote:

From: Ben Widawsky 

Signed-off-by: Ben Widawsky 
---
 src/mesa/drivers/dri/i965/intel_screen.c | 37 
 1 file changed, 33 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
b/src/mesa/drivers/dri/i965/intel_screen.c
index 0f19a6e..91eb7ec 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -545,8 +545,11 @@ create_image_with_modifier(struct intel_screen *screen,
 {
uint32_t tiling;
unsigned long pitch;
+   unsigned ccs_height = 0;

switch (modifier) {
+   case /* I915_FORMAT_MOD_CCS */ fourcc_mod_code(INTEL, 4):
+  ccs_height = ALIGN(DIV_ROUND_UP(height, 16), 32);
case I915_FORMAT_MOD_Y_TILED:
   tiling = I915_TILING_Y;
}
@@ -554,10 +557,35 @@ create_image_with_modifier(struct intel_screen *screen,
/* For now, all modifiers require some tiling */
assert(tiling);

+   /*
+* CCS width is always going to be less than or equal to the image's width.
+* All we need to do is make sure we add extra rows (height) for the CCS.
+*
+* A pair of CCS bits correspond to 8x4 pixels, and must be cacheline
+* granularity. Each CCS tile is laid out in 8b strips, which corresponds to
+* 1024x512 pixel region. In memory, it looks like the following:
+*
+* ?
+* ??? ???
+* ??? ???
+* ??? ???
+* ???  Image  ???
+* ??? ???
+* ??? ???
+* ???x???
+* ?
+* ??? ???   |
+* ???ccs  ???  unused   |
+* ?---???


I guess this looks okay as actual source code and Mutt just displays it like
this for me.



It's UTF-8 codes. The problem is more likely your terminal than mutt. I can use
ASCII if people prefer.


+* <--pitch-->
+*/
+   unsigned y_tiled_height = ALIGN(height, 32);
+
cpp = _mesa_get_format_bytes(image->format);
-   image->bo = drm_intel_bo_alloc_tiled(screen->bufmgr, "image+mod",
-width, height, cpp, &tiling,
-&pitch, 0);
+   image->bo = drm_intel_bo_alloc_tiled(screen->bufmgr,
+ccs_height ? "image+ccs" : "image",


Do want to keep "image+mod" for the non-ccs case?



Yes, thanks. This was originally a bit different before as the function was
called regardless of whether or not there were modifiers. I'll change it.


+width, y_tiled_height + ccs_height,
+cpp, &tiling, &pitch, 0);
if (image->bo == NULL)
   return false;

@@ -575,7 +603,8 @@ create_image_with_modifier(struct intel_screen *screen,
if (image->planar_format)
   assert(image->planar_format->nplanes == 1);

-   image->aux_offset = 0; /* y_tiled_height * pitch; */
+   if (ccs_height)
+  image->aux_offset = y_tiled_height * pitch;


Here you set 'aux_offset' relative to the beginning of the actual color region
and therefore in the previous patch you shouldn't add mt->offset to it,
right? (Just like I wrote in the previous patch, I think mt->offset is only
used for moving to subsequent slices within color region).



Originally I did add offset here. The formula was:
aux_offset = mt->offset + y_tiled_height * pitch;

So it was consistent at least with the previous patch. We can finish the
discussion of what to do on that patch, and I'll have this part match it.


return true;
 }
--
2.10.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
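
The size arithmetic in the patch above can be worked through in isolation. The sketch below copies the three expressions from the diff (the Y-tiled height, the CCS height, and the aux offset without mt->offset). The struct, the function name and the local macro definitions are assumptions made so the example is self-contained; Mesa has its own ALIGN and DIV_ROUND_UP.

```c
#include <stdint.h>

/* Local definitions for the sketch; the alignment must be a power of two. */
#define DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))
#define ALIGN(v, a)        (((v) + (a) - 1) & ~((a) - 1))

/* Hypothetical container for the values the patch computes. */
struct ccs_layout {
   unsigned y_tiled_height;   /* image rows, padded to a Y-tile (32 rows) */
   unsigned ccs_height;       /* extra rows appended for the CCS */
   unsigned total_rows;       /* height passed to the BO allocation */
   uint64_t aux_offset;       /* byte offset of the CCS inside the BO */
};

static struct ccs_layout ccs_layout_for(unsigned height, unsigned pitch)
{
   struct ccs_layout l;

   l.y_tiled_height = ALIGN(height, 32u);
   l.ccs_height = ALIGN(DIV_ROUND_UP(height, 16u), 32u);
   l.total_rows = l.y_tiled_height + l.ccs_height;
   l.aux_offset = (uint64_t)l.y_tiled_height * pitch;
   return l;
}
```

For a 1080-row image with a hypothetical pitch of 7680 bytes, this gives 1088 image rows, 96 CCS rows, 1184 rows in total, and an aux offset of 8355840 bytes.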


Re: [Mesa-dev] [PATCH 00/27] Renderbuffer Decompression (and GBM modifiers)

2016-12-31 Thread Ben Widawsky

On 16-12-29 17:34:19, Ben Widawsky wrote:

On 16-12-06 13:34:02, Paulo Zanoni wrote:

2016-12-01 20:09 GMT-02:00 Ben Widawsky :

From: Ben Widawsky 

This patch series ultimately adds support within the i965 driver for
Renderbuffer Decompression with GBM. In short, this feature reduces memory
bandwidth by allowing the GPU to work with losslessly compressed data and having
that compression scheme understood by the display engine for decompression. The
display engine will decompress on the fly and scanout the image.

Quoting from the final patch, the bandwidth savings on a SKL GT4 with a 19x10
display running kmscube:

Without compression:
   Read bandwidth: 603.91 MiB/s
   Write bandwidth: 615.28 MiB/s

With compression:
   Read bandwidth: 259.34 MiB/s
   Write bandwidth: 337.83 MiB/s
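
As a quick check of the figures quoted above, the relative savings can be computed directly. reduction_percent is a hypothetical helper written for this note, not something from the series.

```c
/* Percentage by which `after` is lower than `before`. */
static double reduction_percent(double before, double after)
{
   return (before - after) / before * 100.0;
}
```

Applied to the numbers above, reads drop by roughly 57% (603.91 to 259.34 MiB/s) and writes by roughly 45% (615.28 to 337.83 MiB/s).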


The hardware achieves this savings by maintaining an auxiliary buffer
containing "opaque" compression information. It's opaque in the sense that
knowledge of the low-level compression scheme is not needed, but knowledge of
the overall layout of the compressed data is required. The auxiliary buffer is created by
the driver on behalf of the client when requested. That buffer needs to be
passed along wherever the main image's buffer goes.

The overall strategy is that the buffer/surface is created with a list of
modifiers. The list of modifiers the hardware is capable of using will come from
a new kernel API that is aware of the hardware and general constraints. A client
will request the list of modifiers and pass it directly back in during buffer
creation (potentially the client can prune the list, but as of now there is no
reason to.) This new API is being developed by Kristian. I did not get far
enough to play with that.

For EGL, a similar mechanism would exist whereby when importing a buffer into
EGL, one would provide a modifier and probably a pointer to the auxiliary data
upon import. (Import therefore might require multiple dma-buf fds), but for i965
and Intel, this wouldn't be necessary.

Here is a brief description of the series:
1-6 Adds support in GBM for per plane functions where necessary. This is
required because the kernel expects the auxiliary buffer to be passed along as a
plane. It has its own offset, and stride, and the client shouldn't need to
calculate those.

7-9 Adds support in GBM to understand modifiers. When creating a buffer or
surface, the client is expected to pass in a list of modifiers that the driver
will optimally choose from. As a result of this, the GBM APIs need to support
modifiers.

10-12 Support Y-tiled modifier. Y-tiling was already a modifier exposed by the
kernel. With the previous patches in place, it's easy to support this too.

13-26 Plumbing to support sending CCS buffers to display. Leveraging much of the
existing code for MCS buffers, these patches create an MCS for the scanout
buffer. The trickery here is that a single BO contains both the main surface and
the auxiliary data. Previously, auxiliary data always lived in its own BO.

27 Support CCS-modifier. Finally, the code can parse the CCS fb modifier(s) and
realize the bandwidth savings that come with it.

This was tested using kmscube
(https://github.com/bwidawsk/kmscube/tree/modifiers). The kmscube implementation
is missing support for GET_PLANE2 - which is currently being worked on by
Kristian.

Upstream plan:


First of all, I'd like to point out that I haven't really been following
this feature closely, so maybe my questions are irrelevant to this
series. But still, I feel I have to point these things out since maybe
they are relevant. Please tell me if I'm not talking about the same
thing as you are.

The main question is: where's the matching i915.ko series? Shouldn't
that be step 0 in your upstream plan?



Ville is working on it. All patches except the last can be merged without kernel
support. That is assuming that we agree upon the general solution, using the
modifiers and having both buffers be part of the same BO. There is also a
requisite series from Kristian which will allow the client to query per plane
modifiers.



I guess this is a lie actually. I depend on fourcc_mod_code(INTEL, 4) being
Y-tiled CCS modifier. I can figure out a way to defer this until the last patch.


I do recall seeing BSpec text containing "do this thing if render
decompression is enabled" and, at that time, our code wasn't
implementing those instructions. AFAIU, the kernel didn't really have
support for render decompression, so its specific bits were just
ignored. I was assuming that whoever implemented the feature would add
all the necessary bits, especially since we didn't seem to have any
sort of "if (has_render_decompression(dev_priv))" to call. I am 100%
sure there's such an example in the Gen 9 Watermarks instructions, but
I'm sure I saw more somewhere else (Display WA page?). And remember:
missing watermarks workarounds equals flickering screens.

Is this relevant to your series? How will Mesa be 

[Mesa-dev] [AMD] Screen flickering with 4K and RX 480, would be glad to help debugging

2016-12-31 Thread Romain Failliot
Hi!

I've recently bought a 4K display and an RX 480, but I've got some
troubles with Dota when using high settings. I've created a thread on
Phoronix forum (because maybe other people have the same problem...):

https://www.phoronix.com/forums/forum/linux-graphics-x-org-drivers/open-source-amd-linux/921416-screen-flickering-with-an-rx-480-dota-2-4k-full-settings

The idea is not to have support (I know this mailing list isn't for
that), but to try to find the source of the problem. Maybe it's a
hardware problem (that's on my side), but maybe it's a driver problem.

I'd be glad to give you any logs, outputs, info you ask.
For instance, is there a log somewhere where we can see that the
display actually tried to change its resolution?

Thanks!

-- 
Romain "Creak" Failliot
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Patch for freedreno features

2016-12-31 Thread Romain Failliot
I tried to use git send-email but it doesn't seem to work (although
the output says otherwise).

So in the end it's simpler to just copy/paste the patch generated by
git format-patch:

---
 docs/features.txt | 38 +++---
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/docs/features.txt b/docs/features.txt
index c27d521..63b45af 100644
--- a/docs/features.txt
+++ b/docs/features.txt
@@ -33,7 +33,7 @@ are exposed in the 3.0 context as extensions.
 Feature Status
 ---


-GL 3.0, GLSL 1.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi,
llvmpipe, softpipe, swr
+GL 3.0, GLSL 1.30 --- all DONE: freedreno, i965, nv50, nvc0, r600,
radeonsi, llvmpipe, softpipe, swr

   glBindFragDataLocation, glGetFragDataLocation DONE
   GL_NV_conditional_render (Conditional rendering)  DONE ()
@@ -60,12 +60,12 @@ GL 3.0, GLSL 1.30 --- all DONE: i965, nv50, nvc0,
r600, radeonsi, llvmpipe, soft
   glVertexAttribI commands  DONE
   Depth format cube texturesDONE ()
   GLX_ARB_create_context (GLX 1.4 is required)  DONE
-  Multisample anti-aliasing DONE
(llvmpipe (*), softpipe (*), swr (*))
+  Multisample anti-aliasing DONE
(freedreno (*), llvmpipe (*), softpipe (*), swr (*))

-(*) llvmpipe, softpipe, and swr have fake Multisample anti-aliasing support
+(*) freedreno, llvmpipe, softpipe, and swr have fake Multisample
anti-aliasing support


-GL 3.1, GLSL 1.40 --- all DONE: i965, nv50, nvc0, r600, radeonsi,
llvmpipe, softpipe, swr
+GL 3.1, GLSL 1.40 --- all DONE: freedreno, i965, nv50, nvc0, r600,
radeonsi, llvmpipe, softpipe, swr

   Forward compatible context support/deprecations   DONE ()
   GL_ARB_draw_instanced (Instanced drawing) DONE ()
@@ -82,34 +82,34 @@ GL 3.2, GLSL 1.50 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, soft

   Core/compatibility profiles   DONE
   Geometry shaders  DONE ()
-  GL_ARB_vertex_array_bgra (BGRA vertex order)  DONE (swr)
-  GL_ARB_draw_elements_base_vertex (Base vertex offset) DONE (swr)
-  GL_ARB_fragment_coord_conventions (Frag shader coord) DONE (swr)
-  GL_ARB_provoking_vertex (Provoking vertex)  DONE (swr)
-  GL_ARB_seamless_cube_map (Seamless cubemaps)  DONE (swr)
+  GL_ARB_vertex_array_bgra (BGRA vertex order)  DONE (freedreno, swr)
+  GL_ARB_draw_elements_base_vertex (Base vertex offset) DONE (freedreno, swr)
+  GL_ARB_fragment_coord_conventions (Frag shader coord) DONE (freedreno, swr)
+  GL_ARB_provoking_vertex (Provoking vertex)  DONE (freedreno, swr)
+  GL_ARB_seamless_cube_map (Seamless cubemaps)  DONE (freedreno, swr)
   GL_ARB_texture_multisample (Multisample textures) DONE (swr)
-  GL_ARB_depth_clamp (Frag depth clamp) DONE (swr)
-  GL_ARB_sync (Fence objects)   DONE (swr)
+  GL_ARB_depth_clamp (Frag depth clamp) DONE (freedreno, swr)
+  GL_ARB_sync (Fence objects)   DONE (freedreno, swr)
   GLX_ARB_create_context_profile  DONE


 GL 3.3, GLSL 3.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe

-  GL_ARB_blend_func_extended  DONE (swr)
+  GL_ARB_blend_func_extended  DONE (freedreno/a3xx, swr)
   GL_ARB_explicit_attrib_location   DONE (all drivers that support GLSL)
-  GL_ARB_occlusion_query2   DONE (swr)
+  GL_ARB_occlusion_query2   DONE (freedreno, swr)
   GL_ARB_sampler_objects  DONE (all drivers)
-  GL_ARB_shader_bit_encoding  DONE (swr)
-  GL_ARB_texture_rgb10_a2ui DONE (swr)
-  GL_ARB_texture_swizzle  DONE (swr)
+  GL_ARB_shader_bit_encoding  DONE (freedreno, swr)
+  GL_ARB_texture_rgb10_a2ui DONE (freedreno, swr)
+  GL_ARB_texture_swizzle  DONE (freedreno, swr)
   GL_ARB_timer_query  DONE (swr)
-  GL_ARB_instanced_arrays   DONE (swr)
-  GL_ARB_vertex_type_2_10_10_10_rev DONE (swr)
+  GL_ARB_instanced_arrays   DONE (freedreno, swr)
+  GL_ARB_vertex_type_2_10_10_10_rev DONE (freedreno, swr)


 GL 4.0, GLSL 4.00 --- all DONE: i965/gen8+, nvc0, r600, radeonsi

-  GL_ARB_draw_buffers_blend DONE (i965/gen6+, nv50, llvmpipe, softpipe, swr)
+  GL_ARB_draw_buffers_blend DONE (freedreno, i965/gen6+, nv50, llvmpipe, softpipe, 

Re: [Mesa-dev] Patch for freedreno features

2016-12-31 Thread Romain Failliot
I'll try to do the git patch!

I know features.txt isn't the official support source and is more
for the devs to track their work, so it's really up to you if you
want to add freedreno to features.txt. I simply don't have a device
for each driver, which is why I'm parsing features.txt in mesamatrix.
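
For readers wondering what "parsing features.txt" can look like, here is a
minimal sketch in C. It is not mesamatrix's actual parser; the function name
line_lists_driver() is hypothetical, and it assumes the line layout seen in
the patch in this thread: a feature name, the word DONE, then a
parenthesised, comma-separated driver list.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Return true if `driver` appears in the parenthesised list that follows
 * "DONE" on a features.txt-style line, for example:
 *   "  GL_ARB_sync (Fence objects)   DONE (freedreno, swr)"
 * An entry such as "freedreno/a3xx" or "llvmpipe (*)" is matched on the
 * driver name that precedes the '/' or the space.
 */
static bool line_lists_driver(const char *line, const char *driver)
{
    assert(line != NULL && driver != NULL && driver[0] != '\0');

    const char *p = strstr(line, "DONE (");
    if (p == NULL)
        return false;           /* no driver list on this line */
    p += strlen("DONE (");

    size_t dlen = strlen(driver);
    for (;;) {
        while (*p == ' ')
            p++;                /* skip blanks before the entry */

        /* Length of the bare driver name at the start of the entry. */
        size_t len = strcspn(p, ",/)( ");
        if (len == dlen && strncmp(p, driver, dlen) == 0)
            return true;

        /* Move on to the next comma-separated entry, if there is one. */
        const char *comma = strchr(p, ',');
        if (comma == NULL)
            return false;
        p = comma + 1;
    }
}
```

Names are compared by exact length, so "swr" does not match a longer name
that merely starts with the same letters. Free-form entries such as
"all drivers" are not interpreted.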

brb with a patch!

Thanks!

2016-12-31 12:08 GMT-05:00 Rob Clark:
> hey, I don't suppose you could send a git patch?  I can push (although
> tbh glxinfo is the authoritative source when it comes to which
> extensions are supported on which generations of adreno)
>
> BR,
> -R


Re: [Mesa-dev] Patch for freedreno features

2016-12-31 Thread Rob Clark
hey, I don't suppose you could send a git patch?  I can push (although
tbh glxinfo is the authoritative source when it comes to which
extensions are supported on which generations of adreno)

BR,
-R

On Fri, Dec 30, 2016 at 2:09 PM, Romain Failliot wrote:
> Hi!
>
> There's a patch by Rob Clark that sits in bugzilla for while now:
> https://bugs.freedesktop.org/show_bug.cgi?id=95460
>
> I've just updated it to HEAD. It would be nice to merge it, especially
> since there hasn't been much changes in features.txt for a while.
>
> Cheers!
>
> --
> Romain "Creak" Failliot
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 99237] Impossible to create transparent X11/EGL windows while respecting EGL_NATIVE_VISUAL_ID

2016-12-31 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=99237

--- Comment #1 from nfx...@gmail.com ---
> Second it has some code that explicitly excludes RGBA X visuals,

Looking at the code I linked again, this is wrong. But if xcb_depth_iterator_t
returns the RGB visual before the RGBA one, the RGBA one will of course never
be added. I have no idea how the iterator sorts visuals, though. On my system,
the only 32-bit (i.e. RGBA) visual is the very last one according to glxinfo
and xdpyinfo.
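
To illustrate the ordering concern, here is a small self-contained sketch.
It does not use the real xcb or Mesa code: struct fake_visual and
add_one_per_class() are hypothetical stand-ins. It only models a loop that
accepts the first visual seen for each color class, which is the behaviour
described above, under the assumption that the RGB and RGBA visuals share
the same class.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an X visual; not the real xcb_visualtype_t. */
struct fake_visual {
    unsigned id;          /* visual ID */
    int      color_class; /* X11 color class, e.g. 4 for TrueColor */
    int      depth;       /* 24 = RGB, 32 = RGBA */
};

/*
 * Models "only one config per color class": the first visual of each
 * class is accepted, later visuals of the same class are skipped.
 * Returns the number of accepted visuals written to out[].
 */
static size_t add_one_per_class(const struct fake_visual *in, size_t n,
                                struct fake_visual *out)
{
    unsigned seen_classes = 0; /* bitmask of classes already added */
    size_t count = 0;

    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        assert(in[i].color_class >= 0 && in[i].color_class < 32);
        unsigned bit = 1u << in[i].color_class;
        if (seen_classes & bit)
            continue; /* class already has a config: this visual is dropped */
        seen_classes |= bit;
        out[count++] = in[i];
    }
    return count;
}
```

Fed a list in which a 24-bit TrueColor visual comes first and the 32-bit
TrueColor visual comes last, the 32-bit one is never accepted, no matter
whether any explicit RGBA exclusion exists.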

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [Bug 99237] Impossible to create transparent X11/EGL windows while respecting EGL_NATIVE_VISUAL_ID

2016-12-31 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=99237

            Bug ID: 99237
           Summary: Impossible to create transparent X11/EGL windows while
                    respecting EGL_NATIVE_VISUAL_ID
           Product: Mesa
           Version: 13.0
          Hardware: Other
                OS: All
            Status: NEW
          Severity: normal
          Priority: medium
         Component: EGL
          Assignee: mesa-dev@lists.freedesktop.org
          Reporter: nfx...@gmail.com
        QA Contact: mesa-dev@lists.freedesktop.org

eglGetConfigs() never lists EGLConfigs that point to RGBA X visuals in their
EGL_NATIVE_VISUAL_ID attribute. If you create an EGL context and X11 window the
"proper" way, i.e. use eglGetConfigs() or eglChooseConfig() and then create a
window with the visual noted in the chosen EGLConfig's EGL_NATIVE_VISUAL_ID
attribute, the resulting window's GL-rendered contents will never be
alpha-blended by the compositor.

nVidia's EGL implementation on the other hand does list EGLConfigs with RGBA
visuals. (They have two almost identical EGLConfigs following each other: first
one with an RGB X visual, then one with an RGBA visual. All fields except
EGL_CONFIG_ID and EGL_NATIVE_VISUAL_ID are the same.)
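
As a rough sketch of what an application can do with such a config list, the
following C code picks the first config whose native visual has a given
depth. The structs and function names are hypothetical models, not EGL or
Xlib types; in a real program the two lookups would presumably go through
eglGetConfigAttrib() with EGL_NATIVE_VISUAL_ID and an Xlib query such as
XGetVisualInfo().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of the two EGLConfig fields that differ here. */
struct fake_config {
    int      config_id;        /* models EGL_CONFIG_ID */
    unsigned native_visual_id; /* models EGL_NATIVE_VISUAL_ID */
};

/* Hypothetical mirror of an X visual: ID plus depth (24 = RGB, 32 = RGBA). */
struct fake_xvisual {
    unsigned id;
    int      depth;
};

/* Look up the depth of a visual by ID; returns -1 if the ID is unknown. */
static int visual_depth(const struct fake_xvisual *v, size_t nv, unsigned id)
{
    for (size_t i = 0; i < nv; i++)
        if (v[i].id == id)
            return v[i].depth;
    return -1;
}

/*
 * Return the index of the first config whose native visual has the
 * requested depth, or -1 if no config qualifies.
 */
static int pick_config_by_depth(const struct fake_config *c, size_t nc,
                                const struct fake_xvisual *v, size_t nv,
                                int want_depth)
{
    assert(c != NULL && v != NULL);
    for (size_t i = 0; i < nc; i++)
        if (visual_depth(v, nv, c[i].native_visual_id) == want_depth)
            return (int)i;
    return -1;
}
```

With a list in the nVidia style (RGB-backed config first, RGBA-backed config
second), asking for depth 32 returns the second entry. With a list that
contains only RGB-backed configs, the same request returns -1, which is the
situation this report describes for Mesa.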

I think that Mesa should do it the same way, and that not doing this is a bug
and/or a missing feature that should be added.

Seems like this is done explicitly in dri2_x11_add_configs_for_visuals:
https://cgit.freedesktop.org/mesa/mesa/tree/src/egl/drivers/dri2/platform_x11.c#n752

First, it allows only 1 config/visual per color class. It really should add two
visuals at least for TrueColor visuals (an RGB and then an RGBA one). Second it
has some code that explicitly excludes RGBA X visuals, with comments about how
applications don't want alpha-composited windows. I don't fully understand why
this can't just be done via the EGL_ALPHA_SIZE attribute (if it's 0, select an
RGB-only config), but maybe there are reasons. (Wayland apparently does not
care about those reasons, and if you get an RGBA config, your window is always
alpha-composited, but I didn't double-check this.) nVidia/EGL and GLX avoid
such problems by always listing a config backed by an RGB X visual first.

You could also argue that the API user should just use an RGBA visual when
creating the X window, either by completely ignoring the chosen EGLConfig's
EGL_NATIVE_VISUAL_ID attribute, or by attempting to select a "compatible" one.
However, not all drivers allow using a different X visual (even if it's
compatible). The nVidia driver is one that does not. The EGL spec also
explicitly states that it's implementation-specific whether another underlying
pixel format can be used (and mentions different X visual IDs as an example).
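
The "ignore EGL_NATIVE_VISUAL_ID" workaround can be sketched as below. This
is a model only: struct fake_win_visual, window_visual() and the
driver_allows_other_visual flag are hypothetical, and a real application has
no portable way to query that flag, which is exactly the problem described
above.

```c
#include <assert.h>
#include <stddef.h>

#define FAKE_TRUE_COLOR 4 /* X11's TrueColor class has the value 4 */

/* Hypothetical stand-in for an X visual. */
struct fake_win_visual {
    unsigned id;
    int      color_class;
    int      depth; /* 24 = RGB, 32 = RGBA */
};

/*
 * Pick the visual to create the X window with.
 * If the driver accepts a visual other than the config's native one,
 * prefer the first 32-bit TrueColor visual. Otherwise, or if no such
 * visual exists, stay with native_id.
 */
static unsigned window_visual(unsigned native_id,
                              const struct fake_win_visual *v, size_t n,
                              int driver_allows_other_visual)
{
    assert(v != NULL || n == 0);
    if (!driver_allows_other_visual)
        return native_id;
    for (size_t i = 0; i < n; i++)
        if (v[i].color_class == FAKE_TRUE_COLOR && v[i].depth == 32)
            return v[i].id;
    return native_id;
}
```

On a driver that rejects foreign visuals, this code path can only fall back
to the native visual, so the window stays opaque unless the config list
itself offers an RGBA-backed entry.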

I'm sure application developers should be spared from having to support both
Mesa and nVidia the hard way by implementing both methods. While I'm
personally less than enchanted about needing platform-specific code in the EGL
config selection just to get X11 transparency, nVidia's idea seems to be the
best way to implement it. It's similar to how GLX behaves, it's
backwards-compatible, and the application does not need to try different
visuals for window creation.



[Mesa-dev] [Bug 99010] --disable-gallium-llvm no longer recognized

2016-12-31 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=99010

Jonathan Gray changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |j...@openbsd.org
