date:20171027

Re: [Mesa-dev] [PATCH] nir: Don't print swizzles when there are more than 4 components

2017-10-27 Thread Jason Ekstrand


On October 27, 2017 18:21:45 Matt Turner  wrote:


... as can happen with various types like mat4, or else we'll smash the
stack writing past the end of components_local[].


Ugh... Ideally, I think you'd want without_matrix() because a matrix is 
just an array as far as interfaces go.  That said, I think this probably 
fine for now.  R-b.


Tim, if you'd like to fix it for all the "fun" cases, that would he nice.



Fixes: 5a0d3e1129b7 ("nir: Print the components referenced for split or
  packed shader in/outs.")
---
Reproduce with INTEL_DEBUG=vs and
generated_tests/spec/arb_vertex_attrib_64bit/execution/vs_in/vs-input-float_mat4-position-double_double.shader_test

We could probably print better swizzles for dvecs too, which have more than
four components. Exercise to the reader :)

 src/compiler/nir/nir_print.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/compiler/nir/nir_print.c b/src/compiler/nir/nir_print.c
index 4b7ad5c6ba..fcc8025346 100644
--- a/src/compiler/nir/nir_print.c
+++ b/src/compiler/nir/nir_print.c
@@ -457,7 +457,7 @@ print_var_decl(nir_variable *var, print_state *state)
   switch (var->data.mode) {
   case nir_var_shader_in:
   case nir_var_shader_out:
- if (num_components != 4 && num_components != 0) {
+ if (num_components < 4 && num_components != 0) {
 const char *xyzw = "xyzw";
 for (int i = 0; i < num_components; i++)
components_local[i + 1] = xyzw[i + var->data.location_frac];
--
2.13.6

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] nir: Don't print swizzles when there are more than 4 components

2017-10-27 Thread Matt Turner

... as can happen with various types like mat4, or else we'll smash the
stack writing past the end of components_local[].

Fixes: 5a0d3e1129b7 ("nir: Print the components referenced for split or
  packed shader in/outs.")
---
Reproduce with INTEL_DEBUG=vs and
generated_tests/spec/arb_vertex_attrib_64bit/execution/vs_in/vs-input-float_mat4-position-double_double.shader_test

We could probably print better swizzles for dvecs too, which have more than
four components. Exercise to the reader :)

 src/compiler/nir/nir_print.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/compiler/nir/nir_print.c b/src/compiler/nir/nir_print.c
index 4b7ad5c6ba..fcc8025346 100644
--- a/src/compiler/nir/nir_print.c
+++ b/src/compiler/nir/nir_print.c
@@ -457,7 +457,7 @@ print_var_decl(nir_variable *var, print_state *state)
   switch (var->data.mode) {
   case nir_var_shader_in:
   case nir_var_shader_out:
- if (num_components != 4 && num_components != 0) {
+ if (num_components < 4 && num_components != 0) {
 const char *xyzw = "xyzw";
 for (int i = 0; i < num_components; i++)
components_local[i + 1] = xyzw[i + var->data.location_frac];
-- 
2.13.6

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [Mesa-stable] [PATCH v3 14/48] i965/fs: Extend the live ranges of VGRFs which leave loops

2017-10-27 Thread Francisco Jerez

I've submitted an alternative to this approach here [1], more along the
lines of the v1 patch you borrowed from my jenkins branch.  Most of the
reasons for that we have discussed already in the office (except for the
last point), but FTR:

 - This patch only works around the problem by tweaking the live
   intervals, which makes the live intervals inconsistent with the block
   live sets (the latter are used by some back-end passes that this
   patch isn't going to help with, like DCE and scheduling).

 - This patch doesn't fix the same bug on the VEC4 back-end.

 - I feel that the alternative approach [1] admits more straightforward
   extension to more exotic control-flow constructs.

 - It doesn't actually fix the problem for all possible patterns of
   divergent loop execution.  Consider the following GLSL pseudocode:

   | b = 0;
   | do {
   |use(b);
   |b = non_uniform_condition();
   |if (b)
   |   continue;
   | 
   |stomp_neighboring_channels();
   |break;
   | 
   | } while(true);

   With this series applied, the live interval of 'b' will be calculated
   to be between the 'b = 0' and 'continue' statements, which allows
   stomp_neighboring_channels() to overwrite the contents of 'b'
   depending on how the register allocator lays things out in the
   register file.  OTOH with [1] applied the live interval calculated
   for 'b' will be the region between the 'b = 0' and 'break'
   statements, which overlaps the whole region of divergent control flow
   as expected.

Cheers.

[1] https://lists.freedesktop.org/archives/mesa-dev/2017-October/174591.html

Jason Ekstrand  writes:

> No Shader-db changes.
>
> Cc: mesa-sta...@lists.freedesktop.org
> ---
>  src/intel/compiler/brw_fs_live_variables.cpp | 55 
> 
>  1 file changed, 55 insertions(+)
>
> diff --git a/src/intel/compiler/brw_fs_live_variables.cpp 
> b/src/intel/compiler/brw_fs_live_variables.cpp
> index c449672..380060d 100644
> --- a/src/intel/compiler/brw_fs_live_variables.cpp
> +++ b/src/intel/compiler/brw_fs_live_variables.cpp
> @@ -223,6 +223,61 @@ fs_live_variables::compute_start_end()
>   }
>}
> }
> +
> +   /* Due to the explicit way the SIMD data is handled on GEN, we need to be 
> a
> +* bit more careful with live ranges and loops.  Consider the following
> +* example:
> +*
> +*vec4 color2;
> +*while (1) {
> +*   vec4 color = texture();
> +*   if (...) {
> +*  color2 = color * 2;
> +*  break;
> +*   }
> +*}
> +*gl_FragColor = color2;
> +*
> +* In this case, the definition of color2 dominates the use because the
> +* loop only has the one exit.  This means that the live range interval 
> for
> +* color2 goes from the statement in the if to it's use below the loop.
> +* Now suppose that the texture operation has a header register that gets
> +* assigned one of the registers used for color2.  If the loop condition 
> is
> +* non-uniform and some of the threads will take the and others will
> +* continue.  In this case, the next pass through the loop, the WE_all
> +* setup of the header register will stomp the disabled channels of color2
> +* and corrupt the value.
> +*
> +* This same problem can occur if you have a mix of 64, 32, and 16-bit
> +* registers because the channels do not line up or if you have a SIMD16
> +* program and the first half of one value overlaps the second half of the
> +* other.
> +*
> +* To solve this problem, we take any VGRFs whose live ranges cross the
> +* while instruction of a loop and extend their live ranges to the top of
> +* the loop.  This more accurately models the hardware because the value 
> in
> +* the VGRF needs to be carried through subsequent loop iterations in 
> order
> +* to remain valid when we finally do break.
> +*/
> +   foreach_block (block, cfg) {
> +  if (block->end()->opcode != BRW_OPCODE_WHILE)
> + continue;
> +
> +  /* This is a WHILE instrution. Find the DO block. */
> +  bblock_t *do_block = NULL;
> +  foreach_list_typed(bblock_link, child_link, link, >children) {
> + if (child_link->block->start_ip < block->end_ip) {
> +assert(do_block == NULL);
> +do_block = child_link->block;
> + }
> +  }
> +  assert(do_block);
> +
> +  for (int i = 0; i < num_vars; i++) {
> + if (start[i] < block->end_ip && end[i] > block->end_ip)
> +start[i] = MIN2(start[i], do_block->start_ip);
> +  }
> +   }
>  }
>  
>  fs_live_variables::fs_live_variables(fs_visitor *v, const cfg_t *cfg)
> -- 
> 2.5.0.400.gff86faf
>
> ___
> mesa-stable mailing list
> mesa-sta...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-stable


signature.asc
Description: PGP signature

Re: [Mesa-dev] [PATCH 2/2] i965: Check CCS_E compatibility for texture view rendering

2017-10-27 Thread Jason Ekstrand

On Fri, Oct 27, 2017 at 3:16 PM, Nanley Chery  wrote:

> On Fri, Oct 27, 2017 at 12:52:30PM -0700, Jason Ekstrand wrote:
> > On Fri, Oct 27, 2017 at 12:24 PM, Nanley Chery 
> > wrote:
> >
> > > Only use CCS_E to render to a texture that is CCS_E-compatible with the
> > > original texture's miptree (linear) format. This prevents render
> > > operations from writing data that can't be decoded with the original
> > > miptree format.
> > >
> > > On Gen10, with the new CCS_E-enabled formats handled, this enables the
> > > driver to pass the arb_texture_view-rendering-formats piglit test.
> > >
> > > Cc: 
> > > Signed-off-by: Nanley Chery 
> > > ---
> > >  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 28
> > > +--
> > >  1 file changed, 26 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > > b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > > index a850f4d17b..59c57c227b 100644
> > > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > > @@ -241,6 +241,27 @@ intel_miptree_supports_hiz(const struct
> brw_context
> > > *brw,
> > > }
> > >  }
> > >
> > > +/**
> > > + * Return true if the format that will be used to access the miptree
> is
> > > + * CCS_E-compatible with the miptree's linear/non-sRGB format.
> > > + *
> > > + * Why use the linear format? Well, although the miptree may be
> specified
> > > with
> > > + * an sRGB format, the usage of that color space/format can be
> toggled.
> > > Since
> > > + * our HW tends to support more linear formats than sRGB ones, we use
> this
> > > + * format variant for check for CCS_E compatibility.
> > > + */
> > > +static bool
> > > +format_ccs_e_compat_with_miptree(const struct gen_device_info
> *devinfo,
> > > + const struct intel_mipmap_tree *mt,
> > > + enum isl_format access_format)
> > > +{
> > > +   assert(mt->aux_usage == ISL_AUX_USAGE_CCS_E);
> > > +
> > > +   mesa_format linear_format = _mesa_get_srgb_format_linear(
> mt->format);
> > > +   enum isl_format isl_format = brw_isl_format_for_mesa_
> > > format(linear_format);
> > > +   return isl_formats_are_ccs_e_compatible(devinfo, isl_format,
> > > access_format);
> > > +}
> > > +
> > >  static bool
> > >  intel_miptree_supports_ccs_e(struct brw_context *brw,
> > >   const struct intel_mipmap_tree *mt)
> > > @@ -2662,8 +2683,11 @@ intel_miptree_render_aux_usage(struct
> brw_context
> > > *brw,
> > >return mt->mcs_buf ? ISL_AUX_USAGE_CCS_D : ISL_AUX_USAGE_NONE;
> > >
> > > case ISL_AUX_USAGE_CCS_E: {
> > > -  /* If the format supports CCS_E, then we can just use it */
> > > -  if (isl_format_supports_ccs_e(>screen->devinfo,
> > > render_format))
> > > +  /* If the format supports CCS_E and is compatible with the
> miptree,
> > > +   * then we can use it.
> > > +   */
> > > +  if (format_ccs_e_compat_with_miptree(>screen->devinfo,
> > > +   mt, render_format))
> > >
> >
> > You don't need the helper if you just use mt->surf.format.  That's what
> > can_texture_with_ccs does.
> >
> >
>
> Isn't using mt->surf.format making the code more restrictive than
> necessary? That field may give you the sRGB format of the surface, but
> the helper would always give you the linear variant. Therefore the
> helper allows for CCS_E to be used in more cases.
>

You're right.  I completely missed that.  In that case, we probably want to
use your helper for can_texture_with_ccs as well.  Maybe two patches: one
which adds the helper and makes can_texture_with_ccs use it and a second to
add the render target stuff?  Or you can keep this one as-is and do
can_texture_with_ccs as a follow-on, I don't really care.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/3] intel/fs: Don't let undefined values prevent copy propagation.

2017-10-27 Thread Francisco Jerez

This makes the dataflow propagation logic of the copy propagation pass
more intelligent in cases where the destination of a copy is known to
be undefined for some incoming CFG edges, building upon the
definedness information provided by the last patch.  Helps a few
programs, and avoids a handful shader-db regressions from the next
patch.

shader-db results on ILK:

  total instructions in shared programs: 6541547 -> 6541523 (-0.00%)
  instructions in affected programs: 360 -> 336 (-6.67%)
  helped: 8
  HURT: 0

  LOST:   0
  GAINED: 10

shader-db results on BDW:

  total instructions in shared programs: 8174323 -> 8173882 (-0.01%)
  instructions in affected programs: 7730 -> 7289 (-5.71%)
  helped: 5
  HURT: 2

  LOST:   0
  GAINED: 4

shader-db results on SKL:

  total instructions in shared programs: 8185669 -> 8184598 (-0.01%)
  instructions in affected programs: 10364 -> 9293 (-10.33%)
  helped: 5
  HURT: 2

  LOST:   0
  GAINED: 2
---
 src/intel/compiler/brw_fs_copy_propagation.cpp | 50 --
 1 file changed, 47 insertions(+), 3 deletions(-)

diff --git a/src/intel/compiler/brw_fs_copy_propagation.cpp 
b/src/intel/compiler/brw_fs_copy_propagation.cpp
index cb117396089..5897cff35be 100644
--- a/src/intel/compiler/brw_fs_copy_propagation.cpp
+++ b/src/intel/compiler/brw_fs_copy_propagation.cpp
@@ -36,9 +36,12 @@
 
 #include "util/bitset.h"
 #include "brw_fs.h"
+#include "brw_fs_live_variables.h"
 #include "brw_cfg.h"
 #include "brw_eu.h"
 
+using namespace brw;
+
 namespace { /* avoid conflict with opt_copy_propagation_elements */
 struct acp_entry : public exec_node {
fs_reg dst;
@@ -77,12 +80,19 @@ struct block_data {
 * course of this block.
 */
BITSET_WORD *kill;
+
+   /**
+* Which entries in the fs_copy_prop_dataflow acp table are guaranteed to
+* have a fully uninitialized destination at the end of this block.
+*/
+   BITSET_WORD *undef;
 };
 
 class fs_copy_prop_dataflow
 {
 public:
fs_copy_prop_dataflow(void *mem_ctx, cfg_t *cfg,
+ const fs_live_variables *live,
  exec_list *out_acp[ACP_HASH_SIZE]);
 
void setup_initial_values();
@@ -92,6 +102,7 @@ public:
 
void *mem_ctx;
cfg_t *cfg;
+   const fs_live_variables *live;
 
acp_entry **acp;
int num_acp;
@@ -102,8 +113,9 @@ public:
 } /* anonymous namespace */
 
 fs_copy_prop_dataflow::fs_copy_prop_dataflow(void *mem_ctx, cfg_t *cfg,
+ const fs_live_variables *live,
  exec_list *out_acp[ACP_HASH_SIZE])
-   : mem_ctx(mem_ctx), cfg(cfg)
+   : mem_ctx(mem_ctx), cfg(cfg), live(live)
 {
bd = rzalloc_array(mem_ctx, struct block_data, cfg->num_blocks);
 
@@ -124,6 +136,7 @@ fs_copy_prop_dataflow::fs_copy_prop_dataflow(void *mem_ctx, 
cfg_t *cfg,
   bd[block->num].liveout = rzalloc_array(bd, BITSET_WORD, bitset_words);
   bd[block->num].copy = rzalloc_array(bd, BITSET_WORD, bitset_words);
   bd[block->num].kill = rzalloc_array(bd, BITSET_WORD, bitset_words);
+  bd[block->num].undef = rzalloc_array(bd, BITSET_WORD, bitset_words);
 
   for (int i = 0; i < ACP_HASH_SIZE; i++) {
  foreach_in_list(acp_entry, entry, _acp[block->num][i]) {
@@ -189,6 +202,18 @@ fs_copy_prop_dataflow::setup_initial_values()
  }
   }
}
+
+   /* Initialize the undef set. */
+   foreach_block (block, cfg) {
+  for (int i = 0; i < num_acp; i++) {
+ BITSET_SET(bd[block->num].undef, i);
+ for (unsigned off = 0; off < acp[i]->size_written; off += REG_SIZE) {
+if (BITSET_TEST(live->block_data[block->num].defout,
+live->var_from_reg(byte_offset(acp[i]->dst, off
+   BITSET_CLEAR(bd[block->num].undef, i);
+ }
+  }
+   }
 }
 
 /**
@@ -229,13 +254,30 @@ fs_copy_prop_dataflow::run()
 
  for (int i = 0; i < bitset_words; i++) {
 const BITSET_WORD old_livein = bd[block->num].livein[i];
+BITSET_WORD livein_from_any_block = 0;
 
 bd[block->num].livein[i] = ~0u;
 foreach_list_typed(bblock_link, parent_link, link, 
>parents) {
bblock_t *parent = parent_link->block;
-   bd[block->num].livein[i] &= bd[parent->num].liveout[i];
+   /* Consider ACP entries with a known-undefined destination to
+* be available from the parent.  This is valid because we're
+* free to set the undefined variable equal to the source of
+* the ACP entry without breaking the application's
+* expectations, since the variable is undefined.
+*/
+   bd[block->num].livein[i] &= (bd[parent->num].liveout[i] |
+bd[parent->num].undef[i]);
+   livein_from_any_block |= bd[parent->num].liveout[i];
 }
 
+/* Limit to the set of ACP entries

[Mesa-dev] [PATCH 1/3] intel/fs: Restrict live intervals to the subset possibly reachable from any definition.

2017-10-27 Thread Francisco Jerez

Currently the liveness analysis pass would extend a live interval up
to the top of the program when no unconditional and complete
definition of the variable is found that dominates all of its uses.

This can lead to a serious performance problem in shaders containing
many partial writes, like scalar arithmetic, FP64 and soon FP16
operations.  The number of oversize live intervals in such workloads
can cause the compilation time of the shader to explode because of the
worse than quadratic behavior of the register allocator and scheduler
when running out of registers, and it can also cause the running time
of the shader to explode due to the amount of spilling it leads to,
which is orders of magnitude slower than GRF memory.

This patch fixes it by computing the intersection of our current live
intervals with the subset of the program that can possibly be reached
from any definition of the variable.  Extending the storage allocation
of the variable beyond that is pretty useless because its value is
guaranteed to be undefined at a point that cannot be reached from any
definition.

No significant change in the running time of shader-db (with 5%
statistical significance).

shader-db results on IVB:

  total cycles in shared programs: 61108780 -> 60932856 (-0.29%)
  cycles in affected programs: 16335482 -> 16159558 (-1.08%)
  helped: 5121
  HURT: 4347

  total spills in shared programs: 1309 -> 1288 (-1.60%)
  spills in affected programs: 249 -> 228 (-8.43%)
  helped: 3
  HURT: 0

  total fills in shared programs: 1652 -> 1597 (-3.33%)
  fills in affected programs: 262 -> 207 (-20.99%)
  helped: 4
  HURT: 0

  LOST:   2
  GAINED: 209

shader-db results on BDW:

  total cycles in shared programs: 67617262 -> 67361220 (-0.38%)
  cycles in affected programs: 23397142 -> 23141100 (-1.09%)
  helped: 8045
  HURT: 6488

  total spills in shared programs: 1456 -> 1252 (-14.01%)
  spills in affected programs: 465 -> 261 (-43.87%)
  helped: 3
  HURT: 0

  total fills in shared programs: 1720 -> 1465 (-14.83%)
  fills in affected programs: 471 -> 216 (-54.14%)
  helped: 4
  HURT: 0

  LOST:   2
  GAINED: 162

shader-db results on SKL:

  total cycles in shared programs: 65436248 -> 65245186 (-0.29%)
  cycles in affected programs: 22560936 -> 22369874 (-0.85%)
  helped: 8457
  HURT: 6247

  total spills in shared programs: 437 -> 437 (0.00%)
  spills in affected programs: 0 -> 0
  helped: 0
  HURT: 0

  total fills in shared programs: 870 -> 854 (-1.84%)
  fills in affected programs: 16 -> 0
  helped: 1
  HURT: 0

  LOST:   0
  GAINED: 107

Reviewed-by: Jason Ekstrand 
---
 src/intel/compiler/brw_fs_live_variables.cpp | 34 
 src/intel/compiler/brw_fs_live_variables.h   | 12 ++
 2 files changed, 42 insertions(+), 4 deletions(-)

diff --git a/src/intel/compiler/brw_fs_live_variables.cpp 
b/src/intel/compiler/brw_fs_live_variables.cpp
index c449672a519..059f076fa51 100644
--- a/src/intel/compiler/brw_fs_live_variables.cpp
+++ b/src/intel/compiler/brw_fs_live_variables.cpp
@@ -83,9 +83,11 @@ fs_live_variables::setup_one_write(struct block_data *bd, 
fs_inst *inst,
/* The def[] bitset marks when an initialization in a block completely
 * screens off previous updates of that variable (VGRF channel).
 */
-   if (inst->dst.file == VGRF && !inst->is_partial_write()) {
-  if (!BITSET_TEST(bd->use, var))
+   if (inst->dst.file == VGRF) {
+  if (!inst->is_partial_write() && !BITSET_TEST(bd->use, var))
  BITSET_SET(bd->def, var);
+
+  BITSET_SET(bd->defout, var);
}
 }
 
@@ -199,6 +201,28 @@ fs_live_variables::compute_live_variables()
  }
   }
}
+
+   /* Propagate defin and defout down the CFG to calculate the union of live
+* variables potentially defined along any possible control flow path.
+*/
+   do {
+  cont = false;
+
+  foreach_block (block, cfg) {
+ const struct block_data *bd = _data[block->num];
+
+foreach_list_typed(bblock_link, child_link, link, >children) {
+struct block_data *child_bd = _data[child_link->block->num];
+
+   for (int i = 0; i < bitset_words; i++) {
+   const BITSET_WORD new_def = bd->defout[i] & ~child_bd->defin[i];
+   child_bd->defin[i] |= new_def;
+   child_bd->defout[i] |= new_def;
+   cont |= new_def;
+   }
+}
+  }
+   } while (cont);
 }
 
 /**
@@ -212,12 +236,12 @@ fs_live_variables::compute_start_end()
   struct block_data *bd = _data[block->num];
 
   for (int i = 0; i < num_vars; i++) {
- if (BITSET_TEST(bd->livein, i)) {
+ if (BITSET_TEST(bd->livein, i) && BITSET_TEST(bd->defin, i)) {
 start[i] = MIN2(start[i], block->start_ip);
 end[i] = MAX2(end[i], block->start_ip);
  }
 
- if (BITSET_TEST(bd->liveout, i)) {
+ if (BITSET_TEST(bd->liveout, i) && BITSET_TEST(bd->defout, i)) {
 start[i] =

[Mesa-dev] [PATCH 3/3] intel/cfg: Represent divergent control flow paths caused by non-uniform loop execution.

2017-10-27 Thread Francisco Jerez

This addresses a long-standing back-end compiler bug that could lead
to cross-channel data corruption in loops executed non-uniformly.  In
some cases live variables extending through a loop divergence point
(e.g. a non-uniform break) into a convergence point (e.g. the end of
the loop) wouldn't be considered live along all physical control flow
paths the SIMD thread could possibly have taken in between due to some
channels remaining in the loop for additional iterations.

This patch fixes the problem by extending the CFG with physical edges
that don't exist in the idealized non-vectorized program, but
represent valid control flow paths the SIMD EU may take due to the
divergence of logical threads.  This makes sense because the i965 IR
is explicitly SIMD, and it's not uncommon for instructions to have an
influence on neighboring channels (e.g. a force_writemask_all header
setup), so the behavior of the SIMD thread as a whole needs to be
considered.

No changes in shader-db.
---
 src/intel/compiler/brw_cfg.cpp | 75 ++
 1 file changed, 69 insertions(+), 6 deletions(-)

diff --git a/src/intel/compiler/brw_cfg.cpp b/src/intel/compiler/brw_cfg.cpp
index fad12eec588..600b428a492 100644
--- a/src/intel/compiler/brw_cfg.cpp
+++ b/src/intel/compiler/brw_cfg.cpp
@@ -98,6 +98,7 @@ ends_block(const backend_instruction *inst)
   op == BRW_OPCODE_ELSE ||
   op == BRW_OPCODE_CONTINUE ||
   op == BRW_OPCODE_BREAK ||
+  op == BRW_OPCODE_DO ||
   op == BRW_OPCODE_WHILE;
 }
 
@@ -268,13 +269,57 @@ cfg_t::cfg_t(exec_list *instructions)
  }
 
  cur->instructions.push_tail(inst);
+
+ /* Represent divergent execution of the loop as a pair of alternative
+  * edges coming out of the DO instruction: For any physical iteration
+  * of the loop a given logical thread can either start off enabled
+  * (which is represented as the "next" successor), or disabled (if it
+  * has reached a non-uniform exit of the loop during a previous
+  * iteration, which is represented as the "cur_while" successor).
+  *
+  * The disabled edge will be taken by the logical thread anytime we
+  * arrive at the DO instruction through a back-edge coming from a
+  * conditional exit of the loop where divergent control flow started.
+  *
+  * This guarantees that there is a control-flow path from any
+  * divergence point of the loop into the convergence point
+  * (immediately past the WHILE instruction) such that it overlaps the
+  * whole IP region of divergent control flow (potentially the whole
+  * loop) *and* doesn't imply the execution of any instructions part
+  * of the loop (since the corresponding execution mask bit will be
+  * disabled for a diverging thread).
+  *
+  * This way we make sure that any variables that are live throughout
+  * the region of divergence for an inactive logical thread are also
+  * considered to interfere with any other variables assigned by
+  * active logical threads within the same physical region of the
+  * program, since otherwise we would risk cross-channel data
+  * corruption.
+  */
+ next = new_block();
+ cur->add_successor(mem_ctx, next);
+ cur->add_successor(mem_ctx, cur_while);
+ set_next_block(, next, ip);
 break;
 
   case BRW_OPCODE_CONTINUE:
  cur->instructions.push_tail(inst);
 
+ /* A conditional CONTINUE may start a region of divergent control
+  * flow until the start of the next loop iteration (*not* until the
+  * end of the loop which is why the successor is not the top-level
+  * divergence point at cur_do).  The live interval of any variable
+  * extending through a CONTINUE edge is guaranteed to overlap the
+  * whole region of divergent execution, because any variable live-out
+  * at the CONTINUE instruction will also be live-in at the top of the
+  * loop, and therefore also live-out at the bottom-most point of the
+  * loop which is reachable from the top (since a control flow path
+  * exists from a definition of the variable through this CONTINUE
+  * instruction, the top of the loop, the (reachable) bottom of the
+  * loop, the top of the loop again, into a use of the variable).
+  */
  assert(cur_do != NULL);
-cur->add_successor(mem_ctx, cur_do);
+ cur->add_successor(mem_ctx, cur_do->next());
 
 next = new_block();
 if (inst->predicate)
@@ -286,8 +331,18 @@ cfg_t::cfg_t(exec_list *instructions)
   case BRW_OPCODE_BREAK:
  cur->instructions.push_tail(inst);
 
- assert(cur_while != NULL);
-cur->add_successor(mem_ctx, cur_while);
+ /* A conditional BREAK

Re: [Mesa-dev] [PATCH v2] wayland-drm: static inline wayland_drm_buffer_get

2017-10-27 Thread Emil Velikov

On 24 October 2017 at 17:14, Emil Velikov  wrote:
> From: Emil Velikov 
>
> The function is effectively a direct function call into
> libwayland-server.so.
>
> Thus GBM no longer depends on the wayland-drm static library, making the
> build more straight forward. And the resulting binary is a bit smaller.
>
> Note: we need to move struct wayland_drm_callbacks further up,
> otherwise we'll get an error since the type is incomplete.
>
> v2: Rebase, beef-up commit message, update meson, move struct
> wayland_drm_callbacks.
>
> Cc: Dylan Baker 
> Cc: Daniel Stone 
> Signed-off-by: Emil Velikov 
> Reviewed-by: Daniel Stone  (v1)
> ---
> Dylan can you check the meson bits? Can one say to meson, build object X
> while only using the depA CFLAGS? It seems to me that it currently links
> against depA even when you don't want it to.
>
Dylan, can you please take a look at the meson bit?

-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2] clover/llvm: Drop support for LLVM < 3.9.

2017-10-27 Thread Francisco Jerez

Vedran Miletić  writes:

> v2: remove/inline compat stuff
>
> Reviewed-by: Francisco Jerez 
> ---
>  .../state_trackers/clover/llvm/codegen/native.cpp  |   8 +-
>  src/gallium/state_trackers/clover/llvm/compat.hpp  | 109 
> +
>  .../state_trackers/clover/llvm/invocation.cpp  |  22 +++--
>  .../state_trackers/clover/llvm/metadata.hpp|  30 +-
>  4 files changed, 20 insertions(+), 149 deletions(-)
>
> diff --git a/src/gallium/state_trackers/clover/llvm/codegen/native.cpp 
> b/src/gallium/state_trackers/clover/llvm/codegen/native.cpp
> index 12c83a92b6..38028597a3 100644
> --- a/src/gallium/state_trackers/clover/llvm/codegen/native.cpp
> +++ b/src/gallium/state_trackers/clover/llvm/codegen/native.cpp
> @@ -114,7 +114,7 @@ namespace {
>  
>std::unique_ptr tm {
>   t->createTargetMachine(target.triple, target.cpu, "", {},
> -compat::default_reloc_model,
> +::llvm::None,
>  compat::default_code_model,
>  ::llvm::CodeGenOpt::Default) };
>if (!tm)
> @@ -124,11 +124,11 @@ namespace {
>::llvm::SmallVector data;
>  
>{
> - compat::pass_manager pm;
> + ::llvm::legacy::PassManager pm;
>   ::llvm::raw_svector_ostream os { data };
> - compat::raw_ostream_to_emit_file fos(os);
> + ::llvm::raw_svector_ostream (os);
>

This is just a reference to the other local variable above.  Mind
cleaning it up?

> - mod.setDataLayout(compat::get_data_layout(*tm));
> + mod.setDataLayout(tm->createDataLayout());
>   tm->Options.MCOptions.AsmVerbose =
>  (ft == TargetMachine::CGFT_AssemblyFile);
>  
> diff --git a/src/gallium/state_trackers/clover/llvm/compat.hpp 
> b/src/gallium/state_trackers/clover/llvm/compat.hpp
> index f8b56516d5..ce3a29f7d5 100644
> --- a/src/gallium/state_trackers/clover/llvm/compat.hpp
> +++ b/src/gallium/state_trackers/clover/llvm/compat.hpp
> @@ -36,6 +36,8 @@
>  
>  #include "util/algorithm.hpp"
>  
> +#include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -46,16 +48,6 @@
>  #include 
>  #endif
>  
> -#if HAVE_LLVM >= 0x0307
> -#include 
> -#include 
> -#else
> -#include 
> -#include 
> -#include 
> -#include 
> -#endif
> -
>  #include 
>  #include 
>  #include 
> @@ -63,12 +55,6 @@
>  namespace clover {
> namespace llvm {
>namespace compat {
> -#if HAVE_LLVM >= 0x0307
> - typedef ::llvm::TargetLibraryInfoImpl target_library_info;
> -#else
> - typedef ::llvm::TargetLibraryInfo target_library_info;
> -#endif
> -
>  #if HAVE_LLVM >= 0x0500
>   const auto lang_as_offset = 0;
>   const clang::InputKind ik_opencl = clang::InputKind::OpenCL;
> @@ -77,19 +63,6 @@ namespace clover {
>   const clang::InputKind ik_opencl = clang::IK_OpenCL;
>  #endif
>  
> - inline void
> - set_lang_defaults(clang::CompilerInvocation ,
> -   clang::LangOptions , clang::InputKind ik,
> -   const ::llvm::Triple ,
> -   clang::PreprocessorOptions ,
> -   clang::LangStandard::Kind std) {
> -#if HAVE_LLVM >= 0x0309
> -inv.setLangDefaults(lopts, ik, t, ppopts, std);
> -#else
> -inv.setLangDefaults(lopts, ik, std);
> -#endif
> - }
> -
>   inline void
>   add_link_bitcode_file(clang::CodeGenOptions ,
> const std::string ) {
> @@ -100,78 +73,8 @@ namespace clover {
>  F.PropagateAttrs = true;
>  F.LinkFlags = ::llvm::Linker::Flags::None;
>  opts.LinkBitcodeFiles.emplace_back(F);
> -#elif HAVE_LLVM >= 0x0308
> -opts.LinkBitcodeFiles.emplace_back(::llvm::Linker::Flags::None, 
> path);
> -#else
> -opts.LinkBitcodeFile = path;
> -#endif
> - }
> -
> -#if HAVE_LLVM >= 0x0307
> - typedef ::llvm::legacy::PassManager pass_manager;
> -#else
> - typedef ::llvm::PassManager pass_manager;
> -#endif
> -
> - inline void
> - add_data_layout_pass(pass_manager ) {
> -#if HAVE_LLVM < 0x0307
> -pm.add(new ::llvm::DataLayoutPass());
> -#endif
> - }
> -
> - inline void
> - add_internalize_pass(pass_manager ,
> -  const std::vector ) {
> -#if HAVE_LLVM >= 0x0309
> -pm.add(::llvm::createInternalizePass(
> -  [=](const ::llvm::GlobalValue ) {
> - return std::find(names.begin(), names.end(),
> -  gv.getName()) != names.end();
> -  }));
> -#else
> -pm.add(::llvm::createInternalizePass(std::vector(
> -  map(std::mem_fn(::string::data), names;
> -#endif
> - }
> -
> -

Re: [Mesa-dev] [PATCH 1/7] util: move futex helpers into futex.h

2017-10-27 Thread Timothy Arceri




On 23/10/17 05:33, Nicolai Hähnle wrote:

From: Nicolai Hähnle 

---
  src/util/Makefile.sources |  1 +
  src/util/futex.h  | 51 +++
  src/util/meson.build  |  1 +
  src/util/simple_mtx.h | 20 +--
  4 files changed, 54 insertions(+), 19 deletions(-)
  create mode 100644 src/util/futex.h

diff --git a/src/util/Makefile.sources b/src/util/Makefile.sources
index c7f6516a992..407b2825624 100644
--- a/src/util/Makefile.sources
+++ b/src/util/Makefile.sources
@@ -6,20 +6,21 @@ MESA_UTIL_FILES := \
build_id.h \
crc32.c \
crc32.h \
debug.c \
debug.h \
disk_cache.c \
disk_cache.h \
format_r11g11b10f.h \
format_rgb9e5.h \
format_srgb.h \
+   futex.h \
half_float.c \
half_float.h \
hash_table.c \
hash_table.h \
list.h \
macros.h \
mesa-sha1.c \
mesa-sha1.h \
sha1/sha1.c \
sha1/sha1.h \
diff --git a/src/util/futex.h b/src/util/futex.h
new file mode 100644
index 000..a79daaf209b
--- /dev/null
+++ b/src/util/futex.h
@@ -0,0 +1,51 @@
+/*
+ * Copyright © 2015 Intel
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#ifndef UTIL_FUTEX_H
+#define UTIL_FUTEX_H
+
+#if defined(HAVE_FUTEX)
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static inline long sys_futex(void *addr1, int op, int val1, struct timespec 
*timeout, void *addr2, int val3)
+{
+   return syscall(SYS_futex, addr1, op, val1, timeout, addr2, val3);
+}
+
+static inline int futex_wake(uint32_t *addr) {
+   return sys_futex(addr, FUTEX_WAKE, 1, NULL, NULL, 0);
+}
+
+static inline int futex_wait(uint32_t *addr, int32_t value) {


These have the same style mistakes as in the fast mtx patch i.e the { 
needs to be on the next line.


It looks like this series is written on top of that patch but I haven't 
yet got a rb/ack for it yet.




+   return sys_futex(addr, FUTEX_WAIT, value, NULL, NULL, 0);
+}
+
+#endif
+
+#endif /* UTIL_FUTEX_H */
diff --git a/src/util/meson.build b/src/util/meson.build
index c9cb3e861e9..0940e4d12a7 100644
--- a/src/util/meson.build
+++ b/src/util/meson.build
@@ -30,20 +30,21 @@ files_mesa_util = files(
'build_id.h',
'crc32.c',
'crc32.h',
'debug.c',
'debug.h',
'disk_cache.c',
'disk_cache.h',
'format_r11g11b10f.h',
'format_rgb9e5.h',
'format_srgb.h',
+  'futex.h',
'half_float.c',
'half_float.h',
'hash_table.c',
'hash_table.h',
'list.h',
'macros.h',
'mesa-sha1.c',
'mesa-sha1.h',
'sha1/sha1.c',
'sha1/sha1.h',
diff --git a/src/util/simple_mtx.h b/src/util/simple_mtx.h
index d23a4303c80..0c2602d03b6 100644
--- a/src/util/simple_mtx.h
+++ b/src/util/simple_mtx.h
@@ -14,20 +14,21 @@
   *
   * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
   * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
   * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
   * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
   * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
   * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 
DEALINGS
   * IN THE SOFTWARE.
   */
  
+#include "util/futex.h"

  #include "util/u_thread.h"
  
  #if defined(__GNUC__) && defined(HAVE_FUTEX)
  
  /* mtx_t - Fast, simple mutex

   *
   * While modern pthread mutexes are very fast (implemented using futex), they
   * still incur a call to an external DSO and overhead of the generality and
   * features of pthread mutexes.  Most mutexes in mesa only needs lock/unlock,
   * and the idea here is that we can inline the atomic operation and make the
@@ -48,39 +49,20 @@
   * A fast mutex only supports lock/unlock, can't

Re: [Mesa-dev] [PATCH V2 1/2] i965/gen10: Don't set Smooth Point Enable in 3DSTATE_SF if num_samples > 1

2017-10-27 Thread Anuj Phogat

On Fri, Oct 27, 2017 at 4:00 PM, Kenneth Graunke  wrote:
> On Friday, October 27, 2017 3:17:20 PM PDT Anuj Phogat wrote:
>> V2: Use _NEW_BUFFERS in place of _NEW_MULTISAMPLE
>>
>> Signed-off-by: Anuj Phogat 
>> Cc: Kenneth Graunke 
>> ---
>>  src/mesa/drivers/dri/i965/genX_state_upload.c | 13 -
>>  1 file changed, 12 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c 
>> b/src/mesa/drivers/dri/i965/genX_state_upload.c
>> index 4ccfd48919..25240f01d9 100644
>> --- a/src/mesa/drivers/dri/i965/genX_state_upload.c
>> +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
>> @@ -1626,6 +1626,16 @@ genX(upload_sf)(struct brw_context *brw)
>>   sf.SmoothPointEnable = true;
>>  #endif
>>
>> +#if GEN_GEN == 10
>> +  /* _NEW_BUFFERS
>> +   * Smooth Point Enable bit MUST not be set when NUM_MULTISAMPLES > 1.
>> +   */
>> +  const bool multisampled_fbo =
>> + _mesa_geometric_samples(ctx->DrawBuffer) > 1;
>> +  if (multisampled_fbo)
>> + sf.SmoothPointEnable = false;
>> +#endif
>> +
>>  #if GEN_IS_G4X || GEN_GEN >= 5
>>sf.AALineDistanceMode = AALINEDISTANCE_TRUE;
>>  #endif
>> @@ -1681,7 +1691,8 @@ static const struct brw_tracked_state genX(sf_state) = 
>> {
>> _NEW_POINT |
>> _NEW_PROGRAM |
>> (GEN_GEN >= 6 ? _NEW_MULTISAMPLE : 0) |
>> -   (GEN_GEN <= 7 ? _NEW_BUFFERS | _NEW_POLYGON : 0),
>> +   (GEN_GEN <= 7 ? _NEW_POLYGON : 0) |
>> +   (GEN_GEN <= 7 || GEN_GEN == 10 ? _NEW_BUFFERS : 0),
>
> Let's leave the GEN <= 7 bit as is and just add
>
> (GEN_GEN == 10 ? _NEW_BUFFERS : 0)
Done. Thanks.
>
> With that, both are
> Reviewed-by: Kenneth Graunke 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH V2 1/2] i965/gen10: Don't set Smooth Point Enable in 3DSTATE_SF if num_samples > 1

2017-10-27 Thread Kenneth Graunke

On Friday, October 27, 2017 3:17:20 PM PDT Anuj Phogat wrote:
> V2: Use _NEW_BUFFERS in place of _NEW_MULTISAMPLE
> 
> Signed-off-by: Anuj Phogat 
> Cc: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/genX_state_upload.c | 13 -
>  1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c 
> b/src/mesa/drivers/dri/i965/genX_state_upload.c
> index 4ccfd48919..25240f01d9 100644
> --- a/src/mesa/drivers/dri/i965/genX_state_upload.c
> +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
> @@ -1626,6 +1626,16 @@ genX(upload_sf)(struct brw_context *brw)
>   sf.SmoothPointEnable = true;
>  #endif
>  
> +#if GEN_GEN == 10
> +  /* _NEW_BUFFERS
> +   * Smooth Point Enable bit MUST not be set when NUM_MULTISAMPLES > 1.
> +   */
> +  const bool multisampled_fbo =
> + _mesa_geometric_samples(ctx->DrawBuffer) > 1;
> +  if (multisampled_fbo)
> + sf.SmoothPointEnable = false;
> +#endif
> +
>  #if GEN_IS_G4X || GEN_GEN >= 5
>sf.AALineDistanceMode = AALINEDISTANCE_TRUE;
>  #endif
> @@ -1681,7 +1691,8 @@ static const struct brw_tracked_state genX(sf_state) = {
> _NEW_POINT |
> _NEW_PROGRAM |
> (GEN_GEN >= 6 ? _NEW_MULTISAMPLE : 0) |
> -   (GEN_GEN <= 7 ? _NEW_BUFFERS | _NEW_POLYGON : 0),
> +   (GEN_GEN <= 7 ? _NEW_POLYGON : 0) |
> +   (GEN_GEN <= 7 || GEN_GEN == 10 ? _NEW_BUFFERS : 0),

Let's leave the GEN <= 7 bit as is and just add

(GEN_GEN == 10 ? _NEW_BUFFERS : 0)

With that, both are
Reviewed-by: Kenneth Graunke 


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] scons: fix OSMesa driver build

2017-10-27 Thread Roland Scheidegger

Looks alright too.

Reviewed-by: Roland Scheidegger 

Am 28.10.2017 um 00:35 schrieb Brian Paul:
> Fixes: ea53d9a8eb5d4b2 "glapi: include generated headers without path"
> ---
>  src/mesa/drivers/osmesa/SConscript | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/src/mesa/drivers/osmesa/SConscript 
> b/src/mesa/drivers/osmesa/SConscript
> index 7280314..9391dc3 100644
> --- a/src/mesa/drivers/osmesa/SConscript
> +++ b/src/mesa/drivers/osmesa/SConscript
> @@ -7,6 +7,7 @@ env.Prepend(CPPPATH = [
>  '#src/mapi',
>  '#src/mesa',
>  Dir('../../../mapi'), # src/mapi build path for python-generated GL API 
> files/headers
> +Dir('../../../mapi/glapi'), # src/mapi/glapi build path
>  ])
>  
>  env.Prepend(LIBS = [
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] scons: fix OSMesa driver build

2017-10-27 Thread Dylan Baker

Ah, I see that you fixed that already,
Reviewed-by: Dylan Baker 

Quoting Brian Paul (2017-10-27 15:35:09)
> Fixes: ea53d9a8eb5d4b2 "glapi: include generated headers without path"
> ---
>  src/mesa/drivers/osmesa/SConscript | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/src/mesa/drivers/osmesa/SConscript 
> b/src/mesa/drivers/osmesa/SConscript
> index 7280314..9391dc3 100644
> --- a/src/mesa/drivers/osmesa/SConscript
> +++ b/src/mesa/drivers/osmesa/SConscript
> @@ -7,6 +7,7 @@ env.Prepend(CPPPATH = [
>  '#src/mapi',
>  '#src/mesa',
>  Dir('../../../mapi'), # src/mapi build path for python-generated GL API 
> files/headers
> +Dir('../../../mapi/glapi'), # src/mapi/glapi build path
>  ])
>  
>  env.Prepend(LIBS = [
> -- 
> 1.9.1
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


signature.asc
Description: signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] scons: fix OSMesa driver build

2017-10-27 Thread Dylan Baker

I sent almost the same patch at exactly the same time. I also needed to add this
path to src/map/glapi/Sconscript to get scons building. Maybe that's unrelated?

Dylan

Quoting Brian Paul (2017-10-27 15:35:09)
> Fixes: ea53d9a8eb5d4b2 "glapi: include generated headers without path"
> ---
>  src/mesa/drivers/osmesa/SConscript | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/src/mesa/drivers/osmesa/SConscript 
> b/src/mesa/drivers/osmesa/SConscript
> index 7280314..9391dc3 100644
> --- a/src/mesa/drivers/osmesa/SConscript
> +++ b/src/mesa/drivers/osmesa/SConscript
> @@ -7,6 +7,7 @@ env.Prepend(CPPPATH = [
>  '#src/mapi',
>  '#src/mesa',
>  Dir('../../../mapi'), # src/mapi build path for python-generated GL API 
> files/headers
> +Dir('../../../mapi/glapi'), # src/mapi/glapi build path
>  ])
>  
>  env.Prepend(LIBS = [
> -- 
> 1.9.1
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


signature.asc
Description: signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] scons: fix OSMesa driver build

2017-10-27 Thread Brian Paul

Fixes: ea53d9a8eb5d4b2 "glapi: include generated headers without path"
---
 src/mesa/drivers/osmesa/SConscript | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/drivers/osmesa/SConscript 
b/src/mesa/drivers/osmesa/SConscript
index 7280314..9391dc3 100644
--- a/src/mesa/drivers/osmesa/SConscript
+++ b/src/mesa/drivers/osmesa/SConscript
@@ -7,6 +7,7 @@ env.Prepend(CPPPATH = [
 '#src/mapi',
 '#src/mesa',
 Dir('../../../mapi'), # src/mapi build path for python-generated GL API 
files/headers
+Dir('../../../mapi/glapi'), # src/mapi/glapi build path
 ])
 
 env.Prepend(LIBS = [
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [AppVeyor] mesa master #5970 completed

2017-10-27 Thread AppVeyor



Build mesa 5970 completed



Commit 05592cebd4 by Brian Paul on 10/27/2017 9:14 PM:

scons: fix scons build to find generated glapitable.h\n\nFixes: ea53d9a8eb5d4b2 "glapi: include generated headers without path"\n\nReviewed-by: Roland Scheidegger 


Configure your notification preferences

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] scons: Fix glapi include path changes

2017-10-27 Thread Dylan Baker

fixes: 5daed06da2c0aaa4 ("osmesa: Include generated headers without path")
cc: Mark Janes 
Signed-off-by: Dylan Baker 
---
 src/mapi/glapi/SConscript  | 1 +
 src/mesa/drivers/osmesa/SConscript | 4 +++-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/mapi/glapi/SConscript b/src/mapi/glapi/SConscript
index 994778a8fb9..11f2b6dddfd 100644
--- a/src/mapi/glapi/SConscript
+++ b/src/mapi/glapi/SConscript
@@ -31,6 +31,7 @@ env.Append(CPPPATH = [
 '#/src/mapi',
 '#/src/mesa',
 Dir('..'), # src/mapi build path
+Dir('.'), # src/mapi/glapi build path
 ])
 
 glapi_sources = [
diff --git a/src/mesa/drivers/osmesa/SConscript 
b/src/mesa/drivers/osmesa/SConscript
index 728031446e5..43da571f085 100644
--- a/src/mesa/drivers/osmesa/SConscript
+++ b/src/mesa/drivers/osmesa/SConscript
@@ -6,7 +6,9 @@ env.Prepend(CPPPATH = [
 '#src',
 '#src/mapi',
 '#src/mesa',
-Dir('../../../mapi'), # src/mapi build path for python-generated GL API 
files/headers
+# src/mapi{/glapi} build path for python-generated GL API files/headers
+Dir('../../../mapi'),
+Dir('../../../mapi/glapi'),
 ])
 
 env.Prepend(LIBS = [
-- 
2.14.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH V2 2/2] i965/gen10: Don't set Antialiasing Enable in 3DSTATE_RASTER if num_samples > 1

2017-10-27 Thread Anuj Phogat

V2: Use _NEW_BUFFERS in place of _NEW_MULTISAMPLE.
Fix indentation.

Signed-off-by: Anuj Phogat 
Cc: Kenneth Graunke 
Cc: Matt Turner 
---
 src/mesa/drivers/dri/i965/genX_state_upload.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c 
b/src/mesa/drivers/dri/i965/genX_state_upload.c
index 25240f01d9..65ec30833c 100644
--- a/src/mesa/drivers/dri/i965/genX_state_upload.c
+++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
@@ -4363,6 +4363,16 @@ genX(upload_raster)(struct brw_context *brw)
   /* _NEW_LINE */
   raster.AntialiasingEnable = ctx->Line.SmoothFlag;
 
+#if GEN_GEN == 10
+  /* _NEW_BUFFERS
+   * Antialiasing Enable bit MUST not be set when NUM_MULTISAMPLES > 1.
+   */
+  const bool multisampled_fbo =
+ _mesa_geometric_samples(ctx->DrawBuffer) > 1;
+  if (multisampled_fbo)
+ raster.AntialiasingEnable = false;
+#endif
+
   /* _NEW_SCISSOR */
   raster.ScissorRectangleEnable = ctx->Scissor.EnableFlags;
 
-- 
2.13.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v3.5] intel/compiler: Add union types for prog_data and prog_key stages

2017-10-27 Thread Jason Ekstrand

On Thu, Oct 26, 2017 at 11:14 PM, Jordan Justen 
wrote:

> Signed-off-by: Jordan Justen 
> Reviewed-by: Jason Ekstrand 
> Cc: Jason Ekstrand 
> Cc: Kenneth Graunke 
> ---
>
>  * Add comment (Ken)
>  * No typedef (Jason)
>
>  src/intel/compiler/brw_compiler.h | 20 
>  1 file changed, 20 insertions(+)
>
> diff --git a/src/intel/compiler/brw_compiler.h b/src/intel/compiler/brw_
> compiler.h
> index 701b4a80bf1..6ad89171ce4 100644
> --- a/src/intel/compiler/brw_compiler.h
> +++ b/src/intel/compiler/brw_compiler.h
> @@ -403,6 +403,16 @@ struct brw_cs_prog_key {
> struct brw_sampler_prog_key_data tex;
>  };
>
> +/* brw_any_prog_key is any of the keys that map to an API stage */
> +union brw_any_prog_key {
> +   struct brw_vs_prog_key vs;
> +   struct brw_tcs_prog_key tcs;
> +   struct brw_tes_prog_key tes;
> +   struct brw_gs_prog_key gs;
> +   struct brw_wm_prog_key wm;
> +   struct brw_cs_prog_key cs;
> +};
> +
>  /*
>   * Image metadata structure as laid out in the shader parameter
>   * buffer.  Entries have to be 16B-aligned for the vec4 back-end to be
> @@ -1066,6 +1076,16 @@ struct brw_clip_prog_data {
> uint32_t total_grf;
>  };
>
> +/* brw_any_prog_data is prog_data for any stage that maps to an API stage
> */
> +union brw_any_prog_data {
>

For my usage, it'd be nice if you also throw stage_prog_data in here.


> +   struct brw_vs_prog_data vs;
> +   struct brw_tcs_prog_data tcs;
> +   struct brw_tes_prog_data tes;
> +   struct brw_gs_prog_data gs;
> +   struct brw_wm_prog_data wm;
> +   struct brw_cs_prog_data cs;
> +};
> +
>  #define DEFINE_PROG_DATA_DOWNCAST(stage)   \
>  static inline struct brw_##stage##_prog_data * \
>  brw_##stage##_prog_data(struct brw_stage_prog_data *prog_data) \
> --
> 2.15.0.rc2
>
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [RFC 1/2] intel/isl: Disable some gen10 CCS_E formats for now

2017-10-27 Thread Nanley Chery

On Fri, Oct 27, 2017 at 12:50:45PM -0700, Jason Ekstrand wrote:
> Thanks for doing this in isl_format_supports_ccs_e instead of changing the
> table!
> 
> Reviewed-by: Jason Ekstrand 
> 

Thank you for the review!

> On Fri, Oct 27, 2017 at 12:24 PM, Nanley Chery 
> wrote:
> 
> > CannonLake additionally supports R11G11B10_FLOAT and four 10-10-10-2
> > formats with CCS_E. None of these formats fit within the current
> > blorp_copy framework so disable them until support is added.
> >
> > Signed-off-by: Nanley Chery 
> > Cc: Jason Ekstrand 
> > ---
> >
> > Jason, do you think we modify blorp instead of moving forward with this
> > patch? In any case, I've sent this to the list to add context to my next
> > patch.
> >
> >  src/intel/isl/isl_format.c | 24 
> >  1 file changed, 24 insertions(+)
> >
> > diff --git a/src/intel/isl/isl_format.c b/src/intel/isl/isl_format.c
> > index fba3ac5e1a..03c591071b 100644
> > --- a/src/intel/isl/isl_format.c
> > +++ b/src/intel/isl/isl_format.c
> > @@ -536,6 +536,30 @@ isl_format_supports_ccs_e(const struct
> > gen_device_info *devinfo,
> > if (!format_info[format].exists)
> >return false;
> >
> > +   /* For simplicity, only report that a format supports CCS_E if blorp
> > can
> > +* perform bit-for-bit copies with an image of that format while
> > compressed.
> > +* This allows ISL users to avoid having to resolve the image before
> > +* performing such a copy. We may want to change this behavior in the
> > +* future.
> > +*
> > +* R11G11B10_FLOAT has no equivalent UINT format. Given how blorp_copy
> > +* currently works, bit-for-bit copy operations are not possible
> > without an
> > +* intermediate resolve.
> > +*/
> > +   if (format == ISL_FORMAT_R11G11B10_FLOAT)
> > +  return false;
> > +
> > +   /* blorp_copy currently doesn't support formats with different
> > bit-widths
> > +* per-channel. Until that support is added, report that these formats
> > don't
> > +* support CCS_E. FIXME: Add support for these formats.
> > +*/
> > +   if (format == ISL_FORMAT_B10G10R10A2_UNORM ||
> > +   format == ISL_FORMAT_B10G10R10A2_UNORM_SRGB ||
> > +   format == ISL_FORMAT_R10G10B10A2_UNORM ||
> > +   format == ISL_FORMAT_R10G10B10A2_UINT) {
> > +  return false;
> > +   }
> > +
> > return format_gen(devinfo) >= format_info[format].ccs_e;
> >  }
> >
> > --
> > 2.14.3
> >
> >
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH V2 1/2] i965/gen10: Don't set Smooth Point Enable in 3DSTATE_SF if num_samples > 1

2017-10-27 Thread Anuj Phogat

V2: Use _NEW_BUFFERS in place of _NEW_MULTISAMPLE

Signed-off-by: Anuj Phogat 
Cc: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/genX_state_upload.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c 
b/src/mesa/drivers/dri/i965/genX_state_upload.c
index 4ccfd48919..25240f01d9 100644
--- a/src/mesa/drivers/dri/i965/genX_state_upload.c
+++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
@@ -1626,6 +1626,16 @@ genX(upload_sf)(struct brw_context *brw)
  sf.SmoothPointEnable = true;
 #endif
 
+#if GEN_GEN == 10
+  /* _NEW_BUFFERS
+   * Smooth Point Enable bit MUST not be set when NUM_MULTISAMPLES > 1.
+   */
+  const bool multisampled_fbo =
+ _mesa_geometric_samples(ctx->DrawBuffer) > 1;
+  if (multisampled_fbo)
+ sf.SmoothPointEnable = false;
+#endif
+
 #if GEN_IS_G4X || GEN_GEN >= 5
   sf.AALineDistanceMode = AALINEDISTANCE_TRUE;
 #endif
@@ -1681,7 +1691,8 @@ static const struct brw_tracked_state genX(sf_state) = {
_NEW_POINT |
_NEW_PROGRAM |
(GEN_GEN >= 6 ? _NEW_MULTISAMPLE : 0) |
-   (GEN_GEN <= 7 ? _NEW_BUFFERS | _NEW_POLYGON : 0),
+   (GEN_GEN <= 7 ? _NEW_POLYGON : 0) |
+   (GEN_GEN <= 7 || GEN_GEN == 10 ? _NEW_BUFFERS : 0),
   .brw   = BRW_NEW_BLORP |
BRW_NEW_VUE_MAP_GEOM_OUT |
(GEN_GEN <= 5 ? BRW_NEW_BATCH |
-- 
2.13.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/2] i965: Check CCS_E compatibility for texture view rendering

2017-10-27 Thread Nanley Chery

On Fri, Oct 27, 2017 at 12:52:30PM -0700, Jason Ekstrand wrote:
> On Fri, Oct 27, 2017 at 12:24 PM, Nanley Chery 
> wrote:
> 
> > Only use CCS_E to render to a texture that is CCS_E-compatible with the
> > original texture's miptree (linear) format. This prevents render
> > operations from writing data that can't be decoded with the original
> > miptree format.
> >
> > On Gen10, with the new CCS_E-enabled formats handled, this enables the
> > driver to pass the arb_texture_view-rendering-formats piglit test.
> >
> > Cc: 
> > Signed-off-by: Nanley Chery 
> > ---
> >  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 28
> > +--
> >  1 file changed, 26 insertions(+), 2 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > index a850f4d17b..59c57c227b 100644
> > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > @@ -241,6 +241,27 @@ intel_miptree_supports_hiz(const struct brw_context
> > *brw,
> > }
> >  }
> >
> > +/**
> > + * Return true if the format that will be used to access the miptree is
> > + * CCS_E-compatible with the miptree's linear/non-sRGB format.
> > + *
> > + * Why use the linear format? Well, although the miptree may be specified
> > with
> > + * an sRGB format, the usage of that color space/format can be toggled.
> > Since
> > + * our HW tends to support more linear formats than sRGB ones, we use this
> > + * format variant for check for CCS_E compatibility.
> > + */
> > +static bool
> > +format_ccs_e_compat_with_miptree(const struct gen_device_info *devinfo,
> > + const struct intel_mipmap_tree *mt,
> > + enum isl_format access_format)
> > +{
> > +   assert(mt->aux_usage == ISL_AUX_USAGE_CCS_E);
> > +
> > +   mesa_format linear_format = _mesa_get_srgb_format_linear(mt->format);
> > +   enum isl_format isl_format = brw_isl_format_for_mesa_
> > format(linear_format);
> > +   return isl_formats_are_ccs_e_compatible(devinfo, isl_format,
> > access_format);
> > +}
> > +
> >  static bool
> >  intel_miptree_supports_ccs_e(struct brw_context *brw,
> >   const struct intel_mipmap_tree *mt)
> > @@ -2662,8 +2683,11 @@ intel_miptree_render_aux_usage(struct brw_context
> > *brw,
> >return mt->mcs_buf ? ISL_AUX_USAGE_CCS_D : ISL_AUX_USAGE_NONE;
> >
> > case ISL_AUX_USAGE_CCS_E: {
> > -  /* If the format supports CCS_E, then we can just use it */
> > -  if (isl_format_supports_ccs_e(>screen->devinfo,
> > render_format))
> > +  /* If the format supports CCS_E and is compatible with the miptree,
> > +   * then we can use it.
> > +   */
> > +  if (format_ccs_e_compat_with_miptree(>screen->devinfo,
> > +   mt, render_format))
> >
> 
> You don't need the helper if you just use mt->surf.format.  That's what
> can_texture_with_ccs does.
> 
> 

Isn't using mt->surf.format making the code more restrictive than
necessary? That field may give you the sRGB format of the surface, but
the helper would always give you the linear variant. Therefore the
helper allows for CCS_E to be used in more cases.

> >   return ISL_AUX_USAGE_CCS_E;
> >
> >/* Otherwise, we have to fall back to CCS_D */
> > --
> > 2.14.3
> >
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> >
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 3/3] st/mesa: guard sampler views changes with a mutex

2017-10-27 Thread Marek Olšák

For the series:

Reviewed-by: Marek Olšák 

Marek

On Sun, Oct 22, 2017 at 8:39 PM, Nicolai Hähnle  wrote:
> From: Nicolai Hähnle 
>
> Some locking is unfortunately required, because well-formed GL programs
> can have multiple threads racing to access the same texture, e.g.: two
> threads/contexts rendering from the same texture, or one thread destroying
> a context while the other is rendering from or modifying a texture.
>
> Since even the simple mutex caused noticable slowdowns in the piglit
> drawoverhead micro-benchmark, this patch uses a slightly more involved
> approach to keep locks out of the fast path:
>
> - the initial lookup of sampler views happens without taking a lock
> - a per-texture lock is only taken when we have to modify the sampler
>   view(s)
> - since each thread mostly operates only on the entry corresponding to
>   its context, the main issue is re-allocation of the sampler view array
>   when it needs to be grown, but the old copy is not freed
>
> Old copies of the sampler views array are kept around in a linked list
> until the entire texture object is deleted. The total memory wasted
> in this way is roughly equal to the size of the current sampler views
> array.
>
> Fixes non-deterministic memory corruption in some
> dEQP-EGL.functional.sharing.gles2.multithread.* tests, e.g.
> dEQP-EGL.functional.sharing.gles2.multithread.simple.images.texture_source.create_texture_render
> ---
>  src/mesa/state_tracker/st_atom_sampler.c |  21 +--
>  src/mesa/state_tracker/st_cb_texture.c   |  12 ++
>  src/mesa/state_tracker/st_sampler_view.c | 270 
> ++-
>  src/mesa/state_tracker/st_sampler_view.h |   3 +
>  src/mesa/state_tracker/st_texture.h  |  40 -
>  5 files changed, 250 insertions(+), 96 deletions(-)
>
> diff --git a/src/mesa/state_tracker/st_atom_sampler.c 
> b/src/mesa/state_tracker/st_atom_sampler.c
> index ff3f49fa4e4..6d83f963f0a 100644
> --- a/src/mesa/state_tracker/st_atom_sampler.c
> +++ b/src/mesa/state_tracker/st_atom_sampler.c
> @@ -36,20 +36,21 @@
>  #include "main/mtypes.h"
>  #include "main/glformats.h"
>  #include "main/samplerobj.h"
>  #include "main/teximage.h"
>  #include "main/texobj.h"
>
>  #include "st_context.h"
>  #include "st_cb_texture.h"
>  #include "st_format.h"
>  #include "st_atom.h"
> +#include "st_sampler_view.h"
>  #include "st_texture.h"
>  #include "pipe/p_context.h"
>  #include "pipe/p_defines.h"
>
>  #include "cso_cache/cso_context.h"
>
>  #include "util/u_format.h"
>
>
>  /**
> @@ -157,40 +158,32 @@ st_convert_sampler(const struct st_context *st,
> (sampler->wrap_s | sampler->wrap_t | sampler->wrap_r) & 0x1 &&
> (msamp->BorderColor.ui[0] ||
>  msamp->BorderColor.ui[1] ||
>  msamp->BorderColor.ui[2] ||
>  msamp->BorderColor.ui[3])) {
>const GLboolean is_integer = texobj->_IsIntegerFormat;
>GLenum texBaseFormat = _mesa_base_tex_image(texobj)->_BaseFormat;
>
>if (st->apply_texture_swizzle_to_border_color) {
>   const struct st_texture_object *stobj = 
> st_texture_object_const(texobj);
> - const struct pipe_sampler_view *sv = NULL;
> -
> - /* Just search for the first used view. We can do this because the
> -swizzle is per-texture, not per context. */
>   /* XXX: clean that up to not use the sampler view at all */
> - for (unsigned i = 0; i < stobj->num_sampler_views; ++i) {
> -if (stobj->sampler_views[i].view) {
> -   sv = stobj->sampler_views[i].view;
> -   break;
> -}
> - }
> + const struct st_sampler_view *sv = 
> st_texture_get_current_sampler_view(st, stobj);
>
>   if (sv) {
> +struct pipe_sampler_view *view = sv->view;
>  union pipe_color_union tmp;
>  const unsigned char swz[4] =
>  {
> -   sv->swizzle_r,
> -   sv->swizzle_g,
> -   sv->swizzle_b,
> -   sv->swizzle_a,
> +   view->swizzle_r,
> +   view->swizzle_g,
> +   view->swizzle_b,
> +   view->swizzle_a,
>  };
>
>  st_translate_color(>BorderColor, ,
> texBaseFormat, is_integer);
>
>  util_format_apply_color_swizzle(>border_color,
>  , swz, is_integer);
>   } else {
>  st_translate_color(>BorderColor,
> >border_color,
> diff --git a/src/mesa/state_tracker/st_cb_texture.c 
> b/src/mesa/state_tracker/st_cb_texture.c
> index 3b7a3b5e985..7766273381b 100644
> --- a/src/mesa/state_tracker/st_cb_texture.c
> +++ b/src/mesa/state_tracker/st_cb_texture.c
> @@ -144,40 +144,52 @@ st_DeleteTextureImage(struct gl_context * ctx, struct 
> gl_texture_image *img)
> /* nothing special (yet) for st_texture_image */
>

Re: [Mesa-dev] [PATCH 1/2] i965/gen10: Don't set Smooth Point Enable in 3DSTATE_SF if num_samples > 1

2017-10-27 Thread Anuj Phogat

On Fri, Oct 27, 2017 at 2:50 PM, Matt Turner  wrote:
> On Fri, Oct 27, 2017 at 10:50 AM, Anuj Phogat  wrote:
>> Signed-off-by: Anuj Phogat 
>> ---
>>  src/mesa/drivers/dri/i965/genX_state_upload.c | 10 ++
>>  1 file changed, 10 insertions(+)
>>
>> diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c 
>> b/src/mesa/drivers/dri/i965/genX_state_upload.c
>> index 4ccfd48919..b6e800aa90 100644
>> --- a/src/mesa/drivers/dri/i965/genX_state_upload.c
>> +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
>> @@ -1626,6 +1626,16 @@ genX(upload_sf)(struct brw_context *brw)
>>   sf.SmoothPointEnable = true;
>>  #endif
>>
>> +#if GEN_GEN == 10
>> +  /* _NEW_MULTISAMPLE
>> +   * Smooth Point Enable bit MUST not be set when NUM_MULTISAMPLES > 1.
>> +   */
>> +  const bool multisampled_fbo =
>> +  _mesa_geometric_samples(ctx->DrawBuffer) > 1;
>
> Indent this line.
Fixed locally.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/2] i965/gen10: Don't set Smooth Point Enable in 3DSTATE_SF if num_samples > 1

2017-10-27 Thread Anuj Phogat

On Fri, Oct 27, 2017 at 2:44 PM, Kenneth Graunke  wrote:
> On Friday, October 27, 2017 10:50:23 AM PDT Anuj Phogat wrote:
>> Signed-off-by: Anuj Phogat 
>> ---
>>  src/mesa/drivers/dri/i965/genX_state_upload.c | 10 ++
>>  1 file changed, 10 insertions(+)
>>
>> diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c 
>> b/src/mesa/drivers/dri/i965/genX_state_upload.c
>> index 4ccfd48919..b6e800aa90 100644
>> --- a/src/mesa/drivers/dri/i965/genX_state_upload.c
>> +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
>> @@ -1626,6 +1626,16 @@ genX(upload_sf)(struct brw_context *brw)
>>   sf.SmoothPointEnable = true;
>>  #endif
>>
>> +#if GEN_GEN == 10
>> +  /* _NEW_MULTISAMPLE
>> +   * Smooth Point Enable bit MUST not be set when NUM_MULTISAMPLES > 1.
>> +   */
>> +  const bool multisampled_fbo =
>> +  _mesa_geometric_samples(ctx->DrawBuffer) > 1;
>> +  if (multisampled_fbo)
>> + sf.SmoothPointEnable = false;
>> +#endif
>> +
>>  #if GEN_IS_G4X || GEN_GEN >= 5
>>sf.AALineDistanceMode = AALINEDISTANCE_TRUE;
>>  #endif
>>
>
> In both patches...accessing ctx->DrawBuffer requires _NEW_BUFFERS, but
> doesn't need _NEW_MULTISAMPLE.
Right. Fixed locally.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] scons: fix scons build to find generated glapitable.h

2017-10-27 Thread Roland Scheidegger

Am 27.10.2017 um 23:16 schrieb Brian Paul:
> Fixes: ea53d9a8eb5d4b2 "glapi: include generated headers without path"
> ---
>  src/mapi/glapi/SConscript | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/src/mapi/glapi/SConscript b/src/mapi/glapi/SConscript
> index 994778a..08fed6c 100644
> --- a/src/mapi/glapi/SConscript
> +++ b/src/mapi/glapi/SConscript
> @@ -30,6 +30,7 @@ env.Append(CPPPATH = [
>  '#/src',
>  '#/src/mapi',
>  '#/src/mesa',
> +Dir('.'), # src/mapi/glapi build path
>  Dir('..'), # src/mapi build path
>  ])
>  
> 

Makes sense to me.

Reviewed-by: Roland Scheidegger 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/2] i965/gen10: Don't set Smooth Point Enable in 3DSTATE_SF if num_samples > 1

2017-10-27 Thread Matt Turner

On Fri, Oct 27, 2017 at 10:50 AM, Anuj Phogat  wrote:
> Signed-off-by: Anuj Phogat 
> ---
>  src/mesa/drivers/dri/i965/genX_state_upload.c | 10 ++
>  1 file changed, 10 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c 
> b/src/mesa/drivers/dri/i965/genX_state_upload.c
> index 4ccfd48919..b6e800aa90 100644
> --- a/src/mesa/drivers/dri/i965/genX_state_upload.c
> +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
> @@ -1626,6 +1626,16 @@ genX(upload_sf)(struct brw_context *brw)
>   sf.SmoothPointEnable = true;
>  #endif
>
> +#if GEN_GEN == 10
> +  /* _NEW_MULTISAMPLE
> +   * Smooth Point Enable bit MUST not be set when NUM_MULTISAMPLES > 1.
> +   */
> +  const bool multisampled_fbo =
> +  _mesa_geometric_samples(ctx->DrawBuffer) > 1;

Indent this line.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/2] i965/gen10: Don't set Smooth Point Enable in 3DSTATE_SF if num_samples > 1

2017-10-27 Thread Kenneth Graunke

On Friday, October 27, 2017 10:50:23 AM PDT Anuj Phogat wrote:
> Signed-off-by: Anuj Phogat 
> ---
>  src/mesa/drivers/dri/i965/genX_state_upload.c | 10 ++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c 
> b/src/mesa/drivers/dri/i965/genX_state_upload.c
> index 4ccfd48919..b6e800aa90 100644
> --- a/src/mesa/drivers/dri/i965/genX_state_upload.c
> +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
> @@ -1626,6 +1626,16 @@ genX(upload_sf)(struct brw_context *brw)
>   sf.SmoothPointEnable = true;
>  #endif
>  
> +#if GEN_GEN == 10
> +  /* _NEW_MULTISAMPLE
> +   * Smooth Point Enable bit MUST not be set when NUM_MULTISAMPLES > 1.
> +   */
> +  const bool multisampled_fbo =
> +  _mesa_geometric_samples(ctx->DrawBuffer) > 1;
> +  if (multisampled_fbo)
> + sf.SmoothPointEnable = false;
> +#endif
> +
>  #if GEN_IS_G4X || GEN_GEN >= 5
>sf.AALineDistanceMode = AALINEDISTANCE_TRUE;
>  #endif
> 

In both patches...accessing ctx->DrawBuffer requires _NEW_BUFFERS, but
doesn't need _NEW_MULTISAMPLE.


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] gallium; s/unsigned/enum pipe_prim_type/

2017-10-27 Thread Roland Scheidegger

Reviewed-by: Roland Scheidegger 

Am 27.10.2017 um 23:16 schrieb Brian Paul:
> In the vbuf_render::set_primitive() functions.
> ---
>  src/gallium/auxiliary/draw/draw_vbuf.h   | 3 ++-
>  src/gallium/drivers/i915/i915_prim_vbuf.c| 2 +-
>  src/gallium/drivers/llvmpipe/lp_setup_vbuf.c | 2 +-
>  src/gallium/drivers/nouveau/nv30/nv30_draw.c | 2 +-
>  src/gallium/drivers/r300/r300_render.c   | 2 +-
>  src/gallium/drivers/softpipe/sp_prim_vbuf.c  | 4 ++--
>  6 files changed, 8 insertions(+), 7 deletions(-)
> 
> diff --git a/src/gallium/auxiliary/draw/draw_vbuf.h 
> b/src/gallium/auxiliary/draw/draw_vbuf.h
> index 194796b..8faccda 100644
> --- a/src/gallium/auxiliary/draw/draw_vbuf.h
> +++ b/src/gallium/auxiliary/draw/draw_vbuf.h
> @@ -38,6 +38,7 @@
>  
>  
>  #include "pipe/p_compiler.h"
> +#include "pipe/p_defines.h"
>  
>  
>  struct pipe_rasterizer_state;
> @@ -96,7 +97,7 @@ struct vbuf_render {
>  * the discretion of the driver, for the benefit of the passthrough
>  * path.
>  */
> -   void (*set_primitive)( struct vbuf_render *, unsigned prim );
> +   void (*set_primitive)( struct vbuf_render *, enum pipe_prim_type prim );
>  
> /**
>  * Draw indexed primitives.  Note that indices are ushort.  The driver
> diff --git a/src/gallium/drivers/i915/i915_prim_vbuf.c 
> b/src/gallium/drivers/i915/i915_prim_vbuf.c
> index c0ba23b..8f2f5c1 100644
> --- a/src/gallium/drivers/i915/i915_prim_vbuf.c
> +++ b/src/gallium/drivers/i915/i915_prim_vbuf.c
> @@ -324,7 +324,7 @@ i915_vbuf_ensure_index_bounds(struct vbuf_render *render,
>  
>  static void
>  i915_vbuf_render_set_primitive(struct vbuf_render *render, 
> -   unsigned prim)
> +   enum pipe_prim_type prim)
>  {
> struct i915_vbuf_render *i915_render = i915_vbuf_render(render);
> i915_render->prim = prim;
> diff --git a/src/gallium/drivers/llvmpipe/lp_setup_vbuf.c 
> b/src/gallium/drivers/llvmpipe/lp_setup_vbuf.c
> index d7fa9fd..28a48d4 100644
> --- a/src/gallium/drivers/llvmpipe/lp_setup_vbuf.c
> +++ b/src/gallium/drivers/llvmpipe/lp_setup_vbuf.c
> @@ -115,7 +115,7 @@ lp_setup_unmap_vertices(struct vbuf_render *vbr,
>  
>  
>  static void
> -lp_setup_set_primitive(struct vbuf_render *vbr, unsigned prim)
> +lp_setup_set_primitive(struct vbuf_render *vbr, enum pipe_prim_type prim)
>  {
> lp_setup_context(vbr)->prim = prim;
>  }
> diff --git a/src/gallium/drivers/nouveau/nv30/nv30_draw.c 
> b/src/gallium/drivers/nouveau/nv30/nv30_draw.c
> index 4c587fc..798ec14 100644
> --- a/src/gallium/drivers/nouveau/nv30/nv30_draw.c
> +++ b/src/gallium/drivers/nouveau/nv30/nv30_draw.c
> @@ -111,7 +111,7 @@ nv30_render_unmap_vertices(struct vbuf_render *render,
>  }
>  
>  static void
> -nv30_render_set_primitive(struct vbuf_render *render, unsigned prim)
> +nv30_render_set_primitive(struct vbuf_render *render, enum pipe_prim_type 
> prim)
>  {
> struct nv30_render *r = nv30_render(render);
>  
> diff --git a/src/gallium/drivers/r300/r300_render.c 
> b/src/gallium/drivers/r300/r300_render.c
> index 4cae766..9397aae 100644
> --- a/src/gallium/drivers/r300/r300_render.c
> +++ b/src/gallium/drivers/r300/r300_render.c
> @@ -964,7 +964,7 @@ static void r300_render_release_vertices(struct 
> vbuf_render* render)
>  }
>  
>  static void r300_render_set_primitive(struct vbuf_render* render,
> -  unsigned prim)
> +  enum pipe_prim_type prim)
>  {
>  struct r300_render* r300render = r300_render(render);
>  
> diff --git a/src/gallium/drivers/softpipe/sp_prim_vbuf.c 
> b/src/gallium/drivers/softpipe/sp_prim_vbuf.c
> index 95d1ac1..1ce04a2 100644
> --- a/src/gallium/drivers/softpipe/sp_prim_vbuf.c
> +++ b/src/gallium/drivers/softpipe/sp_prim_vbuf.c
> @@ -60,7 +60,7 @@ struct softpipe_vbuf_render
> struct softpipe_context *softpipe;
> struct setup_context *setup;
>  
> -   uint prim;
> +   enum pipe_prim_type prim;
> uint vertex_size;
> uint nr_vertices;
> uint vertex_buffer_size;
> @@ -133,7 +133,7 @@ sp_vbuf_unmap_vertices(struct vbuf_render *vbr,
>  
>  
>  static void
> -sp_vbuf_set_primitive(struct vbuf_render *vbr, unsigned prim)
> +sp_vbuf_set_primitive(struct vbuf_render *vbr, enum pipe_prim_type prim)
>  {
> struct softpipe_vbuf_render *cvbr = softpipe_vbuf_render(vbr);
> struct setup_context *setup_ctx = cvbr->setup;
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2] i965 : optimized bucket index calculation

2017-10-27 Thread Ian Romanick

On 10/24/2017 02:42 AM, Muthukumar, Aravindan wrote:
>> -Original Message-
>> From: Ian Romanick [mailto:i...@freedesktop.org]
>> Sent: Friday, October 20, 2017 9:51 AM
>> To: Muthukumar, Aravindan ; mesa-
>> d...@lists.freedesktop.org
>> Cc: J Karanje, Kedar ; Tamminen, Eero T
>> 
>> Subject: Re: [Mesa-dev] [PATCH v2] i965 : optimized bucket index calculation
>>
>> On 09/13/2017 11:43 PM, aravindan.muthuku...@intel.com wrote:
>>> From: Aravindan Muthukumar 
>>>
>>> Avoiding the loop which was running with O(n) complexity.
>>> Now the complexity has been reduced to O(1)
>>>
>>> Algorithm calculates the index using matrix method.
>>> Matrix arrangement is as below:
>>> Assuming PAGE_SIZE is 4096.
>>>
>>>  1*4096   2*40963*40964*4096
>>>  5*4096   6*40967*40968*4096
>>
>> I think adding one more row to this chart would make it more clear.  The two
>> rows shown also follow a simpler pattern, and that made some of the
>> complexity below seem confusing.
>>
>>10*4096  12*4096   14*4096   16*4096
>>
>>>   ...  ...   ...   ...
>>>   ...  ...   ...   ...
>>>   ...  ...   ...   max_cache_size
>>>
>>> From this matrix its cleary seen that every row
>>clearly
>>
>>> follows the below way:
>>>  ...   ...   ...n
>>>n+(1/4)n  n+(1/2)n  n+(3/4)n2n
>>>
>>> Row is calulated as log2(size/PAGE_SIZE) Column is calculated as
>>> converting the difference between the elements to fit into power size
>>> of two and indexing it.
>>>
>>> Final Index is (row*4)+(col-1)
>>>
>>> Tested with Intel Mesa CI.
>>>
>>> Improves performance of 3d Mark on Broxton.
>>> Analyzed using Compare Perf Analyser:
>>> Average : 201.2 +/- 65.4836 (n=20)
>>
>> Is 201 the improvement or the absolute score?  Do not quote absolute scores.
>>
>>> Percentage : 0.705966% +/- 0.229767% (n=20)
>>
>> Eero: Can you reproduce this result on BXT or other platforms?  Just 
>> curious...
>>
>>> v2: Review comments regarding cosmetics and asserts implemented
>>>
>>> Signed-off-by: Aravindan Muthukumar 
>>> Signed-off-by: Kedar Karanje 
>>> Reviewed-by: Yogesh Marathe 
>>> ---
>>>  src/mesa/drivers/dri/i965/brw_bufmgr.c | 46
>>> --
>>>  1 file changed, 39 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c
>>> b/src/mesa/drivers/dri/i965/brw_bufmgr.c
>>> index 8017219..8013ccb 100644
>>> --- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
>>> +++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
>>> @@ -87,6 +87,8 @@
>>>
>>>  #define memclear(s) memset(, 0, sizeof(s))
>>>
>>> +#define PAGE_SIZE 4096
>>> +
>>>  #define FILE_DEBUG_FLAG DEBUG_BUFMGR
>>>
>>>  static inline int
>>> @@ -181,19 +183,45 @@ bo_tile_pitch(struct brw_bufmgr *bufmgr, uint32_t
>> pitch, uint32_t tiling)
>>> return ALIGN(pitch, tile_width);
>>>  }
>>>
>>> +static inline int
>>> +ilog2_round_up(int value)
>>> +{
>>> +   assert(value != 0);
>>> +   return 32 - __builtin_clz(value - 1); }
>>> +
>>> +/*
>>> + * This function finds the correct bucket fit for the input size.
>>> + * The function works with O(1) complexity when the requested size
>>> + * was queried instead of iterating the size through all the buckets.
>>> + */
>>>  static struct bo_cache_bucket *
>>>  bucket_for_size(struct brw_bufmgr *bufmgr, uint64_t size)  {
>>> -   int i;
>>> +   int index = -1;
>>
>> I don't see how execution could get to that test without index being set 
>> again,
>> so it should be safe to remove the initialization.
>>
> 
> We will skip this initialization since it doesn’t impact the index result.
> 
>   
>>> +   int row, col = 0;
>>> +   int pages, pages_log2;
>>
>> Move the declarations of row, col, pages, and pages_log2 into the else case,
>> initialize them with the only assignment to them, and make them const.
>>
>> Since all of these values, including index, get values derived from size, I 
>> believe
>> the should all be unsigned.  In that case, remove the index >= 0 test below.
>>
> 
> We will move as well as initialize the above variables in the respective else 
> part. 
> 
>>>
>>> -   for (i = 0; i < bufmgr->num_buckets; i++) {
>>> -  struct bo_cache_bucket *bucket = >cache_bucket[i];
>>> -  if (bucket->size >= size) {
>>> - return bucket;
>>> -  }
>>> +   /* condition for size less  than 4*4096 (16KB) page size */
>>  ^   ^ Remove one space here.
>>  |
>>  Capitalize and add a period at the end of the sentence.
>>
>>> +   if(size <= 4 * PAGE_SIZE) {
>> ^ Add a space here.
>>
>>> +  index = DIV_ROUND_UP(size, PAGE_SIZE) - 1;;
>>   Remove extra trailing semicolon. ^
>>
>>> +   } else {
>>> +

[Mesa-dev] [PATCH] scons: fix scons build to find generated glapitable.h

2017-10-27 Thread Brian Paul

Fixes: ea53d9a8eb5d4b2 "glapi: include generated headers without path"
---
 src/mapi/glapi/SConscript | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mapi/glapi/SConscript b/src/mapi/glapi/SConscript
index 994778a..08fed6c 100644
--- a/src/mapi/glapi/SConscript
+++ b/src/mapi/glapi/SConscript
@@ -30,6 +30,7 @@ env.Append(CPPPATH = [
 '#/src',
 '#/src/mapi',
 '#/src/mesa',
+Dir('.'), # src/mapi/glapi build path
 Dir('..'), # src/mapi build path
 ])
 
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] gallium; s/unsigned/enum pipe_prim_type/

2017-10-27 Thread Brian Paul

In the vbuf_render::set_primitive() functions.
---
 src/gallium/auxiliary/draw/draw_vbuf.h   | 3 ++-
 src/gallium/drivers/i915/i915_prim_vbuf.c| 2 +-
 src/gallium/drivers/llvmpipe/lp_setup_vbuf.c | 2 +-
 src/gallium/drivers/nouveau/nv30/nv30_draw.c | 2 +-
 src/gallium/drivers/r300/r300_render.c   | 2 +-
 src/gallium/drivers/softpipe/sp_prim_vbuf.c  | 4 ++--
 6 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_vbuf.h 
b/src/gallium/auxiliary/draw/draw_vbuf.h
index 194796b..8faccda 100644
--- a/src/gallium/auxiliary/draw/draw_vbuf.h
+++ b/src/gallium/auxiliary/draw/draw_vbuf.h
@@ -38,6 +38,7 @@
 
 
 #include "pipe/p_compiler.h"
+#include "pipe/p_defines.h"
 
 
 struct pipe_rasterizer_state;
@@ -96,7 +97,7 @@ struct vbuf_render {
 * the discretion of the driver, for the benefit of the passthrough
 * path.
 */
-   void (*set_primitive)( struct vbuf_render *, unsigned prim );
+   void (*set_primitive)( struct vbuf_render *, enum pipe_prim_type prim );
 
/**
 * Draw indexed primitives.  Note that indices are ushort.  The driver
diff --git a/src/gallium/drivers/i915/i915_prim_vbuf.c 
b/src/gallium/drivers/i915/i915_prim_vbuf.c
index c0ba23b..8f2f5c1 100644
--- a/src/gallium/drivers/i915/i915_prim_vbuf.c
+++ b/src/gallium/drivers/i915/i915_prim_vbuf.c
@@ -324,7 +324,7 @@ i915_vbuf_ensure_index_bounds(struct vbuf_render *render,
 
 static void
 i915_vbuf_render_set_primitive(struct vbuf_render *render, 
-   unsigned prim)
+   enum pipe_prim_type prim)
 {
struct i915_vbuf_render *i915_render = i915_vbuf_render(render);
i915_render->prim = prim;
diff --git a/src/gallium/drivers/llvmpipe/lp_setup_vbuf.c 
b/src/gallium/drivers/llvmpipe/lp_setup_vbuf.c
index d7fa9fd..28a48d4 100644
--- a/src/gallium/drivers/llvmpipe/lp_setup_vbuf.c
+++ b/src/gallium/drivers/llvmpipe/lp_setup_vbuf.c
@@ -115,7 +115,7 @@ lp_setup_unmap_vertices(struct vbuf_render *vbr,
 
 
 static void
-lp_setup_set_primitive(struct vbuf_render *vbr, unsigned prim)
+lp_setup_set_primitive(struct vbuf_render *vbr, enum pipe_prim_type prim)
 {
lp_setup_context(vbr)->prim = prim;
 }
diff --git a/src/gallium/drivers/nouveau/nv30/nv30_draw.c 
b/src/gallium/drivers/nouveau/nv30/nv30_draw.c
index 4c587fc..798ec14 100644
--- a/src/gallium/drivers/nouveau/nv30/nv30_draw.c
+++ b/src/gallium/drivers/nouveau/nv30/nv30_draw.c
@@ -111,7 +111,7 @@ nv30_render_unmap_vertices(struct vbuf_render *render,
 }
 
 static void
-nv30_render_set_primitive(struct vbuf_render *render, unsigned prim)
+nv30_render_set_primitive(struct vbuf_render *render, enum pipe_prim_type prim)
 {
struct nv30_render *r = nv30_render(render);
 
diff --git a/src/gallium/drivers/r300/r300_render.c 
b/src/gallium/drivers/r300/r300_render.c
index 4cae766..9397aae 100644
--- a/src/gallium/drivers/r300/r300_render.c
+++ b/src/gallium/drivers/r300/r300_render.c
@@ -964,7 +964,7 @@ static void r300_render_release_vertices(struct 
vbuf_render* render)
 }
 
 static void r300_render_set_primitive(struct vbuf_render* render,
-  unsigned prim)
+  enum pipe_prim_type prim)
 {
 struct r300_render* r300render = r300_render(render);
 
diff --git a/src/gallium/drivers/softpipe/sp_prim_vbuf.c 
b/src/gallium/drivers/softpipe/sp_prim_vbuf.c
index 95d1ac1..1ce04a2 100644
--- a/src/gallium/drivers/softpipe/sp_prim_vbuf.c
+++ b/src/gallium/drivers/softpipe/sp_prim_vbuf.c
@@ -60,7 +60,7 @@ struct softpipe_vbuf_render
struct softpipe_context *softpipe;
struct setup_context *setup;
 
-   uint prim;
+   enum pipe_prim_type prim;
uint vertex_size;
uint nr_vertices;
uint vertex_buffer_size;
@@ -133,7 +133,7 @@ sp_vbuf_unmap_vertices(struct vbuf_render *vbr,
 
 
 static void
-sp_vbuf_set_primitive(struct vbuf_render *vbr, unsigned prim)
+sp_vbuf_set_primitive(struct vbuf_render *vbr, enum pipe_prim_type prim)
 {
struct softpipe_vbuf_render *cvbr = softpipe_vbuf_render(vbr);
struct setup_context *setup_ctx = cvbr->setup;
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] Android: move drivers' symlinks to /vendor (v2)

2017-10-27 Thread Mauro Rossi

Having moved gallium_dri.so library to /vendor/lib/dri
also symlinks need to be coherently created using TARGET_OUT_VENDOR instead of 
TARGET_OUT
or all non Intel drivers will not be loaded with Android N and earlier,
thus causing SurfaceFlinger SIGABRT

(v2) simplification of post install command

Fixes: c3f75d483c ("Android: move libraries to /vendor")

Cc: 17.3 
---
 src/gallium/targets/dri/Android.mk | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/gallium/targets/dri/Android.mk 
b/src/gallium/targets/dri/Android.mk
index 61b65769ff..bd3613388f 100644
--- a/src/gallium/targets/dri/Android.mk
+++ b/src/gallium/targets/dri/Android.mk
@@ -70,8 +70,9 @@ LOCAL_SHARED_LIBRARIES += $(sort $(GALLIUM_SHARED_LIBS))
 ifneq ($(filter 5 6 7, $(MESA_ANDROID_MAJOR_VERSION)),)
 LOCAL_POST_INSTALL_CMD := \
$(foreach l, lib $(if $(filter true,$(TARGET_IS_64_BIT)),lib64), \
- mkdir -p $(TARGET_OUT)/$(l)/$(MESA_DRI_MODULE_REL_PATH); \
- $(foreach d, $(GALLIUM_TARGET_DRIVERS), ln -sf gallium_dri.so 
$(TARGET_OUT)/$(l)/$(MESA_DRI_MODULE_REL_PATH)/$(d)_dri.so;) \
+ $(eval MESA_DRI_MODULE_PATH := 
$(TARGET_OUT_VENDOR)/$(l)/$(MESA_DRI_MODULE_REL_PATH)) \
+ mkdir -p $(MESA_DRI_MODULE_PATH); \
+ $(foreach d, $(GALLIUM_TARGET_DRIVERS), ln -sf gallium_dri.so 
$(MESA_DRI_MODULE_PATH)/$(d)_dri.so;) \
)
 else
 LOCAL_MODULE_SYMLINKS := $(foreach d, $(GALLIUM_TARGET_DRIVERS), $(d)_dri.so)
-- 
2.14.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/2] i965: Check CCS_E compatibility for texture view rendering

2017-10-27 Thread Jason Ekstrand

On Fri, Oct 27, 2017 at 12:24 PM, Nanley Chery 
wrote:

> Only use CCS_E to render to a texture that is CCS_E-compatible with the
> original texture's miptree (linear) format. This prevents render
> operations from writing data that can't be decoded with the original
> miptree format.
>
> On Gen10, with the new CCS_E-enabled formats handled, this enables the
> driver to pass the arb_texture_view-rendering-formats piglit test.
>
> Cc: 
> Signed-off-by: Nanley Chery 
> ---
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 28
> +--
>  1 file changed, 26 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index a850f4d17b..59c57c227b 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -241,6 +241,27 @@ intel_miptree_supports_hiz(const struct brw_context
> *brw,
> }
>  }
>
> +/**
> + * Return true if the format that will be used to access the miptree is
> + * CCS_E-compatible with the miptree's linear/non-sRGB format.
> + *
> + * Why use the linear format? Well, although the miptree may be specified
> with
> + * an sRGB format, the usage of that color space/format can be toggled.
> Since
> + * our HW tends to support more linear formats than sRGB ones, we use this
> + * format variant for check for CCS_E compatibility.
> + */
> +static bool
> +format_ccs_e_compat_with_miptree(const struct gen_device_info *devinfo,
> + const struct intel_mipmap_tree *mt,
> + enum isl_format access_format)
> +{
> +   assert(mt->aux_usage == ISL_AUX_USAGE_CCS_E);
> +
> +   mesa_format linear_format = _mesa_get_srgb_format_linear(mt->format);
> +   enum isl_format isl_format = brw_isl_format_for_mesa_
> format(linear_format);
> +   return isl_formats_are_ccs_e_compatible(devinfo, isl_format,
> access_format);
> +}
> +
>  static bool
>  intel_miptree_supports_ccs_e(struct brw_context *brw,
>   const struct intel_mipmap_tree *mt)
> @@ -2662,8 +2683,11 @@ intel_miptree_render_aux_usage(struct brw_context
> *brw,
>return mt->mcs_buf ? ISL_AUX_USAGE_CCS_D : ISL_AUX_USAGE_NONE;
>
> case ISL_AUX_USAGE_CCS_E: {
> -  /* If the format supports CCS_E, then we can just use it */
> -  if (isl_format_supports_ccs_e(>screen->devinfo,
> render_format))
> +  /* If the format supports CCS_E and is compatible with the miptree,
> +   * then we can use it.
> +   */
> +  if (format_ccs_e_compat_with_miptree(>screen->devinfo,
> +   mt, render_format))
>

You don't need the helper if you just use mt->surf.format.  That's what
can_texture_with_ccs does.


>   return ISL_AUX_USAGE_CCS_E;
>
>/* Otherwise, we have to fall back to CCS_D */
> --
> 2.14.3
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] intel: common: silence compiler warning

2017-10-27 Thread Jordan Justen

Reviewed-by: Jordan Justen 

On 2017-10-27 09:44:14, Lionel Landwerlin wrote:
> Signed-off-by: Lionel Landwerlin 
> ---
>  src/intel/common/gen_decoder.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/intel/common/gen_decoder.c b/src/intel/common/gen_decoder.c
> index 395ff02908a..55e7305117c 100644
> --- a/src/intel/common/gen_decoder.c
> +++ b/src/intel/common/gen_decoder.c
> @@ -556,7 +556,7 @@ gen_spec_load(const struct gen_device_info *devinfo)
>  {
> struct parser_context ctx;
> void *buf;
> -   uint8_t *text_data;
> +   uint8_t *text_data = NULL;
> uint32_t text_offset = 0, text_length = 0, total_length;
> uint32_t gen_10 = devinfo_to_gen(devinfo);
>  
> -- 
> 2.15.0.rc2
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [RFC 1/2] intel/isl: Disable some gen10 CCS_E formats for now

2017-10-27 Thread Jason Ekstrand

Thanks for doing this in isl_format_supports_ccs_e instead of changing the
table!

Reviewed-by: Jason Ekstrand 

On Fri, Oct 27, 2017 at 12:24 PM, Nanley Chery 
wrote:

> CannonLake additionally supports R11G11B10_FLOAT and four 10-10-10-2
> formats with CCS_E. None of these formats fit within the current
> blorp_copy framework so disable them until support is added.
>
> Signed-off-by: Nanley Chery 
> Cc: Jason Ekstrand 
> ---
>
> Jason, do you think we modify blorp instead of moving forward with this
> patch? In any case, I've sent this to the list to add context to my next
> patch.
>
>  src/intel/isl/isl_format.c | 24 
>  1 file changed, 24 insertions(+)
>
> diff --git a/src/intel/isl/isl_format.c b/src/intel/isl/isl_format.c
> index fba3ac5e1a..03c591071b 100644
> --- a/src/intel/isl/isl_format.c
> +++ b/src/intel/isl/isl_format.c
> @@ -536,6 +536,30 @@ isl_format_supports_ccs_e(const struct
> gen_device_info *devinfo,
> if (!format_info[format].exists)
>return false;
>
> +   /* For simplicity, only report that a format supports CCS_E if blorp
> can
> +* perform bit-for-bit copies with an image of that format while
> compressed.
> +* This allows ISL users to avoid having to resolve the image before
> +* performing such a copy. We may want to change this behavior in the
> +* future.
> +*
> +* R11G11B10_FLOAT has no equivalent UINT format. Given how blorp_copy
> +* currently works, bit-for-bit copy operations are not possible
> without an
> +* intermediate resolve.
> +*/
> +   if (format == ISL_FORMAT_R11G11B10_FLOAT)
> +  return false;
> +
> +   /* blorp_copy currently doesn't support formats with different
> bit-widths
> +* per-channel. Until that support is added, report that these formats
> don't
> +* support CCS_E. FIXME: Add support for these formats.
> +*/
> +   if (format == ISL_FORMAT_B10G10R10A2_UNORM ||
> +   format == ISL_FORMAT_B10G10R10A2_UNORM_SRGB ||
> +   format == ISL_FORMAT_R10G10B10A2_UNORM ||
> +   format == ISL_FORMAT_R10G10B10A2_UINT) {
> +  return false;
> +   }
> +
> return format_gen(devinfo) >= format_info[format].ccs_e;
>  }
>
> --
> 2.14.3
>
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/2] intel/genxml: Fix decoding of groups with fields smaller than a DWord.

2017-10-27 Thread Jordan Justen

On 2017-10-25 21:17:13, Kenneth Graunke wrote:
> Groups containing fields smaller than a DWord were not being decoded
> correctly.  For example:
> 
> 
>   
> 
> 
> gen_field_iterator_next would properly walk over each element of the
> array, incrementing group_iter, and calling iter_group_offset_bits()
> to advance to the proper DWord.  However, the code to print the actual
> values only considered iter->field->start/end, which are 0 and 3 in the
> above example.  So it would always fetch bits 3:0 of the current DWord
> when printing values, instead of advancing to each element of the array,
> printing bits 0-3, 4-7, 8-11, and so on.
> 
> To fix this, we add new iter->start/end tracking, which properly
> advances for each instance of a group's field.
> 
> Caught by Matt Turner while working on 3DSTATE_VF_COMPONENT_PACKING,
> with a patch to convert it to use an array of bitfields (the example
> above).
> 
> This also fixes the decoding of 3DSTATE_SBE's "Attribute Active
> Component Format" fields.
> ---
>  src/intel/common/gen_decoder.c | 24 ++--
>  src/intel/common/gen_decoder.h |  2 ++
>  2 files changed, 16 insertions(+), 10 deletions(-)
> 
> diff --git a/src/intel/common/gen_decoder.c b/src/intel/common/gen_decoder.c
> index 85880143f00..1bf69ac4f94 100644
> --- a/src/intel/common/gen_decoder.c
> +++ b/src/intel/common/gen_decoder.c
> @@ -843,8 +843,12 @@ iter_advance_field(struct gen_field_iterator *iter)
> strncpy(iter->name, iter->field->name, sizeof(iter->name));
> else
>memset(iter->name, 0, sizeof(iter->name));
> -   iter->dword = iter_group_offset_bits(iter, iter->group_iter) / 32 +
> -  iter->field->start / 32;
> +
> +   int group_member_offset = iter_group_offset_bits(iter, iter->group_iter);
> +
> +   iter->start = group_member_offset + iter->field->start;
> +   iter->end = group_member_offset + iter->field->end;
> +   iter->dword = iter->start / 32;
> iter->struct_desc = NULL;
>  
> return true;
> @@ -861,7 +865,7 @@ gen_field_iterator_next(struct gen_field_iterator *iter)
> if (!iter_advance_field(iter))
>return false;
>  
> -   if ((iter->field->end - iter->field->start) > 32)
> +   if ((iter->end - iter->start) > 32)
>v.qw = ((uint64_t) iter->p[iter->dword+1] << 32) | 
> iter->p[iter->dword];
> else
>v.qw = iter->p[iter->dword];
> @@ -871,13 +875,13 @@ gen_field_iterator_next(struct gen_field_iterator *iter)
> switch (iter->field->type.kind) {
> case GEN_TYPE_UNKNOWN:
> case GEN_TYPE_INT: {
> -  uint64_t value = field(v.qw, iter->field->start, iter->field->end);
> +  uint64_t value = field(v.qw, iter->start, iter->end);
>snprintf(iter->value, sizeof(iter->value), "%"PRId64, value);
>enum_name = gen_get_enum_name(>field->inline_enum, value);
>break;
> }
> case GEN_TYPE_UINT: {
> -  uint64_t value = field(v.qw, iter->field->start, iter->field->end);
> +  uint64_t value = field(v.qw, iter->start, iter->end);
>snprintf(iter->value, sizeof(iter->value), "%"PRIu64, value);
>enum_name = gen_get_enum_name(>field->inline_enum, value);
>break;
> @@ -886,7 +890,7 @@ gen_field_iterator_next(struct gen_field_iterator *iter)
>const char *true_string =
>   iter->print_colors ? "\e[0;35mtrue\e[0m" : "true";
>snprintf(iter->value, sizeof(iter->value), "%s",
> -   field(v.qw, iter->field->start, iter->field->end) ?
> +   field(v.qw, iter->start, iter->end) ?
> true_string : "false");
>break;
> }
> @@ -896,7 +900,7 @@ gen_field_iterator_next(struct gen_field_iterator *iter)
> case GEN_TYPE_ADDRESS:
> case GEN_TYPE_OFFSET:
>snprintf(iter->value, sizeof(iter->value), "0x%08"PRIx64,
> -   field_address(v.qw, iter->field->start, iter->field->end));
> +   field_address(v.qw, iter->start, iter->end));
>break;
> case GEN_TYPE_STRUCT:
>snprintf(iter->value, sizeof(iter->value), "",
> @@ -907,8 +911,8 @@ gen_field_iterator_next(struct gen_field_iterator *iter)
>break;
> case GEN_TYPE_UFIXED:
>snprintf(iter->value, sizeof(iter->value), "%f",
> -   (float) field(v.qw, iter->field->start,
> - iter->field->end) / (1 << iter->field->type.f));
> +   (float) field(v.qw, iter->start,
> + iter->end) / (1 << iter->field->type.f));

This line break seems odd. Could 'iter->end) /' move to the previous line?

Either way, Reviewed-by: Jordan Justen 

>break;
> case GEN_TYPE_SFIXED:
>/* FIXME: Sign extend extracted field. */
> @@ -917,7 +921,7 @@ gen_field_iterator_next(struct gen_field_iterator *iter)
> case GEN_TYPE_MBO:
> break;
> case GEN_TYPE_ENUM: {
> -  uint64_t value = field(v.qw, iter->field->start, iter->field->end);
> +  uint64_t value =

Re: [Mesa-dev] [PATCH v3 31/48] intel/cs: Re-run final NIR optimizations for each SIMD size

2017-10-27 Thread Jason Ekstrand

On Fri, Oct 27, 2017 at 3:03 AM, Iago Toral  wrote:

> This should be squashed into the previous commit
>

Oops... Fixed.


> On Wed, 2017-10-25 at 16:26 -0700, Jason Ekstrand wrote:
> > With the advent of SPIR-V subgroup operations, compute shaders will
> > have
> > to be slightly different depending on the SIMD size at which they
> > execute.  In order to allow us to do dispatch-width specific things
> > in
> > NIR, we re-run the final NIR stages for each sIMD width.
> >
> > As a side-effect of this change, we start using ralloc on fs_visitor
> > so
> > we need to add DECLARE_RALLOC_OPERATORS to fs_visitor.
> > ---
> >  src/intel/compiler/brw_fs.h | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/src/intel/compiler/brw_fs.h
> > b/src/intel/compiler/brw_fs.h
> > index d3ab385..9ff06b6 100644
> > --- a/src/intel/compiler/brw_fs.h
> > +++ b/src/intel/compiler/brw_fs.h
> > @@ -60,7 +60,7 @@ offset(const fs_reg , const brw::fs_builder
> > , unsigned delta)
> >  class fs_visitor : public backend_shader
> >  {
> >  public:
> > -   DECLARE_RALLOC_CXX_OPERATORS(fs_reg)
> > +   DECLARE_RALLOC_CXX_OPERATORS(fs_visitor)
> >
> > fs_visitor(const struct brw_compiler *compiler, void *log_data,
> >void *mem_ctx,
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v3 38/48] intel/fs: Don't use automatic exec size

2017-10-27 Thread Jason Ekstrand

On Fri, Oct 27, 2017 at 4:47 AM, Iago Toral  wrote:

> On Wed, 2017-10-25 at 16:26 -0700, Jason Ekstrand wrote:
> > The automatic exec size inference can accidentally mess things up if
> > we're not careful.  For instance, if we have
> >
> > add(4)g38.2<4>Dg38.1<8,2,4>Dg38.2<8,2,4>D
> >
> > then the destination register will end up having a width of 2 with a
> > horizontal stride of 4 and a vertical stride of 8.  The EU emit code
> > sees the width of 2 and decides that we really wanted an exec size of
> > 2
> > which doesn't do what we wanted.
>
> Right :-/
>
> I have to say that this change makes me a little nervous, mostly
> because it doesn't look easy to identify all the cases where we were
> relying on automatic execsizes to fix things up for us... since this is
> not as easy as to look for places where we use 'vec1' or something like
> that. How did you get the list of things that needed explicit sizes?
>

Lots of grep :)  If I recall correctly, I searched for EXECUTE_1, vec1,
EXECUTE_2, and vec2.


> Also, both commits before this address cases of exec_size = 1, but we
> rely on automatic exec sizes for exec_size = 2 as well, I guess we have
> none of these?
>
> Anyway, I guess Jenkins would have caught at least most omissions so
> maybe I am being too paranoid.


That's my hope as well. :-)  Durring the debugging of this chunk of the
series, Jenkins found quite a few errors.  Once Jenkins was happy, I did
one more pass of grep to be sure.

--Jason


> > ---
> >  src/intel/compiler/brw_fs_generator.cpp | 12 +---
> >  1 file changed, 9 insertions(+), 3 deletions(-)
> >
> > diff --git a/src/intel/compiler/brw_fs_generator.cpp
> > b/src/intel/compiler/brw_fs_generator.cpp
> > index 8322be1..46f9a33 100644
> > --- a/src/intel/compiler/brw_fs_generator.cpp
> > +++ b/src/intel/compiler/brw_fs_generator.cpp
> > @@ -190,6 +190,12 @@ fs_generator::fs_generator(const struct
> > brw_compiler *compiler, void *log_data,
> >  {
> > p = rzalloc(mem_ctx, struct brw_codegen);
> > brw_init_codegen(devinfo, p, mem_ctx);
> > +
> > +   /* In the FS code generator, we are very careful to ensure that
> > we always
> > +* set the right execution size so we don't need the EU code to
> > "help" us
> > +* by trying to infer it.  Sometimes, it infers the wrong thing.
> > +*/
> > +   p->automatic_exec_sizes = false;
> >  }
> >
> >  fs_generator::~fs_generator()
> > @@ -395,17 +401,17 @@ fs_generator::generate_fb_write(fs_inst *inst,
> > struct brw_reg payload)
> >struct brw_reg v1_null_ud = vec1(retype(brw_null_reg(),
> > BRW_REGISTER_TYPE_UD));
> >
> >/* Check runtime bit to detect if we have to send AA data or
> > not */
> > -  brw_set_default_compression_control(p, BRW_COMPRESSION_NONE);
> >brw_push_insn_state(p);
> > -  brw_inst_set_exec_size(p->devinfo, brw_last_inst,
> > BRW_EXECUTE_1);
> > +  brw_set_default_compression_control(p, BRW_COMPRESSION_NONE);
> > +  brw_set_default_exec_size(p, BRW_EXECUTE_1);
> >brw_AND(p,
> >v1_null_ud,
> >retype(brw_vec1_grf(1, 6), BRW_REGISTER_TYPE_UD),
> >brw_imm_ud(1<<26));
> >brw_inst_set_cond_modifier(p->devinfo, brw_last_inst,
> > BRW_CONDITIONAL_NZ);
> > -  brw_pop_insn_state(p);
> >
> >int jmp = brw_JMPI(p, brw_imm_ud(0), BRW_PREDICATE_NORMAL) -
> > p->store;
> > +  brw_pop_insn_state(p);
> >{
> >   /* Don't send AA data */
> >   fire_fb_write(inst, offset(payload, 1), implied_header,
> > inst->mlen-1);
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v3 23/48] intel/fs: Assign constant locations if they haven't been assigned

2017-10-27 Thread Jason Ekstrand

On Fri, Oct 27, 2017 at 12:35 AM, Iago Toral  wrote:

> This sounds good to me, but I guess it is not really fixing anything,
> right? I ask because the subject claims that this patch does something
> that the original code was already supposed to be doing.
>

This patch is a bit of an artifact of history.  I needed it at one point in
the development of the series but I think it may have ended up not
mattering in the end.  I still think it's a nice clean-up.

--Jason


> On Wed, 2017-10-25 at 16:25 -0700, Jason Ekstrand wrote:
> > Before, we bailing in assign_constant_locations based on the minimum
> > dispatch size.  The more direct thing to do is simply to check for
> > whether or not we have constant locations and bail if we do.  For
> > nir_setup_uniforms, it's completely safe to do it multiple times
> > because
> > we just copy a value from the NIR shader.
> > ---
> >  src/intel/compiler/brw_fs.cpp | 4 +++-
> >  src/intel/compiler/brw_fs_nir.cpp | 5 -
> >  2 files changed, 7 insertions(+), 2 deletions(-)
> >
> > diff --git a/src/intel/compiler/brw_fs.cpp
> > b/src/intel/compiler/brw_fs.cpp
> > index 52079d3..75139fd 100644
> > --- a/src/intel/compiler/brw_fs.cpp
> > +++ b/src/intel/compiler/brw_fs.cpp
> > @@ -1956,8 +1956,10 @@ void
> >  fs_visitor::assign_constant_locations()
> >  {
> > /* Only the first compile gets to decide on locations. */
> > -   if (dispatch_width != min_dispatch_width)
> > +   if (push_constant_loc) {
> > +  assert(pull_constant_loc);
> >return;
> > +   }
> >
> > bool is_live[uniforms];
> > memset(is_live, 0, sizeof(is_live));
> > diff --git a/src/intel/compiler/brw_fs_nir.cpp
> > b/src/intel/compiler/brw_fs_nir.cpp
> > index 7556576..05efee3 100644
> > --- a/src/intel/compiler/brw_fs_nir.cpp
> > +++ b/src/intel/compiler/brw_fs_nir.cpp
> > @@ -81,8 +81,11 @@ fs_visitor::nir_setup_outputs()
> >  void
> >  fs_visitor::nir_setup_uniforms()
> >  {
> > -   if (dispatch_width != min_dispatch_width)
> > +   /* Only the first compile gets to set up uniforms. */
> > +   if (push_constant_loc) {
> > +  assert(pull_constant_loc);
> >return;
> > +   }
> >
> > uniforms = nir->num_uniforms / 4;
> >  }
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v3 29/48] intel/cs: Rework the way thread local ID is handled

2017-10-27 Thread Jason Ekstrand

On Fri, Oct 27, 2017 at 2:11 AM, Iago Toral  wrote:

> On Wed, 2017-10-25 at 16:26 -0700, Jason Ekstrand wrote:
> > Previously, brw_nir_lower_intrinsics added the param and then emitted
> > a
> > load_uniform intrinsic to load it directly.  This commit switches
> > things
> > over to use a specific NIR intrinsic for the thread id.  The one
> > thing I
> > don't like about this approach is that we have to copy
> > thread_local_id
> > over to the new visitor in import_uniforms.
>
> It is not clear to me why you are doing this... why do you like this
> better?
>

For compute shaders, the SPIR-V subgroups stuff has a gl_subgroupId system
value which subgroup in the dispatch you are.  That information is
basically the same as the thread_local_id only off by a factor of the SIMD
size.  It's fairly arbitrary, but I figured we might as well switch over to
pushing the value that's defined in SPIR-V.


> > ---
> >  src/compiler/nir/nir_intrinsics.h|  3 ++
> >  src/intel/compiler/brw_fs.cpp|  4 +-
> >  src/intel/compiler/brw_fs.h  |  1 +
> >  src/intel/compiler/brw_fs_nir.cpp| 14 +++
> >  src/intel/compiler/brw_nir.h |  3 +-
> >  src/intel/compiler/brw_nir_lower_cs_intrinsics.c | 53 +-
> > --
> >  6 files changed, 32 insertions(+), 46 deletions(-)
> >
> > diff --git a/src/compiler/nir/nir_intrinsics.h
> > b/src/compiler/nir/nir_intrinsics.h
> > index cefd18b..47022dd 100644
> > --- a/src/compiler/nir/nir_intrinsics.h
> > +++ b/src/compiler/nir/nir_intrinsics.h
> > @@ -364,6 +364,9 @@ SYSTEM_VALUE(blend_const_color_a_float, 1, 0, xx,
> > xx, xx)
> >  SYSTEM_VALUE(blend_const_color_rgba_unorm, 1, 0, xx, xx, xx)
> >  SYSTEM_VALUE(blend_const_color__unorm, 1, 0, xx, xx, xx)
> >
> > +/* Intel specific system values */
> > +SYSTEM_VALUE(intel_thread_local_id, 1, 0, xx, xx, xx)
> > +
> >  /**
> >   * Barycentric coordinate intrinsics.
> >   *
> > diff --git a/src/intel/compiler/brw_fs.cpp
> > b/src/intel/compiler/brw_fs.cpp
> > index 2acd838..c0d4c05 100644
> > --- a/src/intel/compiler/brw_fs.cpp
> > +++ b/src/intel/compiler/brw_fs.cpp
> > @@ -996,6 +996,7 @@ fs_visitor::import_uniforms(fs_visitor *v)
> > this->push_constant_loc = v->push_constant_loc;
> > this->pull_constant_loc = v->pull_constant_loc;
> > this->uniforms = v->uniforms;
> > +   this->thread_local_id = v->thread_local_id;
> >  }
> >
> >  void
> > @@ -6781,8 +6782,7 @@ brw_compile_cs(const struct brw_compiler
> > *compiler, void *log_data,
> >  {
> > nir_shader *shader = nir_shader_clone(mem_ctx, src_shader);
> > shader = brw_nir_apply_sampler_key(shader, compiler, >tex,
> > true);
> > -
> > -   brw_nir_lower_cs_intrinsics(shader, prog_data);
> > +   brw_nir_lower_cs_intrinsics(shader);
> > shader = brw_postprocess_nir(shader, compiler, true);
> >
> > prog_data->local_size[0] = shader->info.cs.local_size[0];
> > diff --git a/src/intel/compiler/brw_fs.h
> > b/src/intel/compiler/brw_fs.h
> > index da32593..f51a4d8 100644
> > --- a/src/intel/compiler/brw_fs.h
> > +++ b/src/intel/compiler/brw_fs.h
> > @@ -315,6 +315,7 @@ public:
> >  */
> > int *push_constant_loc;
> >
> > +   fs_reg thread_local_id;
> > fs_reg frag_depth;
> > fs_reg frag_stencil;
> > fs_reg sample_mask;
> > diff --git a/src/intel/compiler/brw_fs_nir.cpp
> > b/src/intel/compiler/brw_fs_nir.cpp
> > index 05efee3..fdc6fc6 100644
> > --- a/src/intel/compiler/brw_fs_nir.cpp
> > +++ b/src/intel/compiler/brw_fs_nir.cpp
> > @@ -88,6 +88,16 @@ fs_visitor::nir_setup_uniforms()
> > }
> >
> > uniforms = nir->num_uniforms / 4;
> > +
> > +   if (stage == MESA_SHADER_COMPUTE) {
> > +  /* Add a uniform for the thread local id.  It must be the last
> > uniform
> > +   * on the list.
> > +   */
> > +  assert(uniforms == prog_data->nr_params);
> > +  uint32_t *param = brw_stage_prog_data_add_params(prog_data,
> > 1);
> > +  *param = BRW_PARAM_BUILTIN_THREAD_LOCAL_ID;
> > +  thread_local_id = fs_reg(UNIFORM, uniforms++,
> > BRW_REGISTER_TYPE_UD);
> > +   }
> >  }
> >
> >  static bool
> > @@ -3409,6 +3419,10 @@ fs_visitor::nir_emit_cs_intrinsic(const
> > fs_builder ,
> >cs_prog_data->uses_barrier = true;
> >break;
> >
> > +   case nir_intrinsic_load_intel_thread_local_id:
> > +  bld.MOV(retype(dest, BRW_REGISTER_TYPE_UD), thread_local_id);
> > +  break;
> > +
> > case nir_intrinsic_load_local_invocation_id:
> > case nir_intrinsic_load_work_group_id: {
> >gl_system_value sv = nir_system_value_from_intrinsic(instr-
> > >intrinsic);
> > diff --git a/src/intel/compiler/brw_nir.h
> > b/src/intel/compiler/brw_nir.h
> > index 1493b74..3e40712 100644
> > --- a/src/intel/compiler/brw_nir.h
> > +++ b/src/intel/compiler/brw_nir.h
> > @@ -95,8 +95,7 @@ void brw_nir_analyze_boolean_resolves(nir_shader
> > *nir);
> >  nir_shader *brw_preprocess_nir(const struct

Re: [Mesa-dev] [PATCH v3 25/48] intel/cs: Drop max_dispatch_width checks from compile_cs

2017-10-27 Thread Jason Ekstrand

On Fri, Oct 27, 2017 at 1:18 AM, Iago Toral  wrote:

> On Wed, 2017-10-25 at 16:25 -0700, Jason Ekstrand wrote:
> > The only things that adjust fs_visitor::max_dispatch_width are render
> > target writes which don't happen in compute shaders so they're
> > pointless.
> > ---
> >  src/intel/compiler/brw_fs.cpp | 6 ++
> >  1 file changed, 2 insertions(+), 4 deletions(-)
> >
> > diff --git a/src/intel/compiler/brw_fs.cpp
> > b/src/intel/compiler/brw_fs.cpp
> > index a23366b..4c362ba 100644
> > --- a/src/intel/compiler/brw_fs.cpp
> > +++ b/src/intel/compiler/brw_fs.cpp
>
> Maybe add an assert before this to check that max_dispatch_width is >=
> 32 as expected here?
>

Done.


> > @@ -6818,8 +6818,7 @@ brw_compile_cs(const struct brw_compiler
> > *compiler, void *log_data,
> >   NULL, /* Never used in core profile */
> >   shader, 16, shader_time_index);
> > if (likely(!(INTEL_DEBUG & DEBUG_NO16)) &&
> > -   !fail_msg && v8.max_dispatch_width >= 16 &&
> > -   min_dispatch_width <= 16) {
> > +   !fail_msg && min_dispatch_width <= 16) {
> >/* Try a SIMD16 compile */
> >if (min_dispatch_width <= 8)
> >   v16.import_uniforms();
> > @@ -6843,8 +6842,7 @@ brw_compile_cs(const struct brw_compiler
> > *compiler, void *log_data,
> > fs_visitor v32(compiler, log_data, mem_ctx, key, _data-
> > >base,
> >   NULL, /* Never used in core profile */
> >   shader, 32, shader_time_index);
> > -   if (!fail_msg && v8.max_dispatch_width >= 32 &&
> > -   (min_dispatch_width > 16 || (INTEL_DEBUG & DEBUG_DO32))) {
> > +   if (!fail_msg && (min_dispatch_width > 16 || (
>
> Maybe use unlikely() with (INTEL_DEBUG & DEBUG_DO32)?
>

That doesn't really go along with this change.  I've been meaning to rework
DO32 and I'm happy to fix it as part of that.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 1/3] glsl_parser_extra: Add utility to copy symbols between symbol tables

2017-10-27 Thread Neil Roberts

From: Eduardo Lima Mitev 

Some symbols gathered in the symbols table during parsing are needed
later for the compile and link stages, so they are moved along the
process. Currently, only functions and non-temporary variables are
copied between symbol tables. However, the built-in gl_PerVertex
interface blocks are also needed during the linking stage (the last
step), to match re-declared blocks of inter-stage shaders.

This patch adds a new utility function that will factorize current code
that copies functions and variables between two symbol tables, and in
addition will copy explicitly declared gl_PerVertex blocks too.

The function will be used in a subsequent patch.

v2 (Neil Roberts):
Allow the src symbol table to be NULL and explicitly copy the
gl_PerVertex symbols in case they are not referenced in the exec_list.

Signed-off-by: Eduardo Lima Mitev 
Signed-off-by: Neil Roberts 
---
 src/compiler/glsl/glsl_parser_extras.cpp | 43 
 src/compiler/glsl/glsl_parser_extras.h   |  5 
 2 files changed, 48 insertions(+)

diff --git a/src/compiler/glsl/glsl_parser_extras.cpp 
b/src/compiler/glsl/glsl_parser_extras.cpp
index 51d835b..ec4603d 100644
--- a/src/compiler/glsl/glsl_parser_extras.cpp
+++ b/src/compiler/glsl/glsl_parser_extras.cpp
@@ -1863,6 +1863,49 @@ set_shader_inout_layout(struct gl_shader *shader,
shader->bound_image = state->bound_image_specified;
 }
 
+/* src can be NULL if only the symbols found in the exec_list should be
+ * copied
+ */
+void
+_mesa_glsl_copy_symbols_from_table(struct exec_list *shader_ir,
+   struct glsl_symbol_table *src,
+   struct glsl_symbol_table *dest)
+{
+   foreach_in_list (ir_instruction, ir, shader_ir) {
+  switch (ir->ir_type) {
+  case ir_type_function:
+ dest->add_function((ir_function *) ir);
+ break;
+  case ir_type_variable: {
+ ir_variable *const var = (ir_variable *) ir;
+
+ if (var->data.mode != ir_var_temporary)
+dest->add_variable(var);
+ break;
+  }
+  default:
+ break;
+  }
+   }
+
+   if (src != NULL) {
+  /* Explicitly copy the gl_PerVertex interface definitions because these
+   * are needed to check they are the same during the interstage link.
+   * They can’t necessarily be found via the exec_list because the members
+   * might not be referenced. The GL spec still requires that they match
+   * in that case.
+   */
+  const glsl_type *iface =
+ src->get_interface("gl_PerVertex", ir_var_shader_in);
+  if (iface)
+ dest->add_interface(iface->name, iface, ir_var_shader_in);
+
+  iface = src->get_interface("gl_PerVertex", ir_var_shader_out);
+  if (iface)
+ dest->add_interface(iface->name, iface, ir_var_shader_out);
+   }
+}
+
 extern "C" {
 
 static void
diff --git a/src/compiler/glsl/glsl_parser_extras.h 
b/src/compiler/glsl/glsl_parser_extras.h
index fb35813..2e98bc7 100644
--- a/src/compiler/glsl/glsl_parser_extras.h
+++ b/src/compiler/glsl/glsl_parser_extras.h
@@ -948,6 +948,11 @@ extern int glcpp_preprocess(void *ctx, const char 
**shader, char **info_log,
 extern void _mesa_destroy_shader_compiler(void);
 extern void _mesa_destroy_shader_compiler_caches(void);
 
+extern void
+_mesa_glsl_copy_symbols_from_table(struct exec_list *shader_ir,
+   struct glsl_symbol_table *src,
+   struct glsl_symbol_table *dest);
+
 #ifdef __cplusplus
 }
 #endif
-- 
2.9.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 3/3] glsl/linker: Check that re-declared, inter-shader built-in blocks match

2017-10-27 Thread Neil Roberts

From: Eduardo Lima Mitev 

>From GLSL 4.5 spec, section "7.1 Built-In Language Variables", page 130 of
the PDF states:

"If multiple shaders using members of a built-in block belonging to
 the same interface are linked together in the same program, they must
 all redeclare the built-in block in the same way, as described in
 section 4.3.9 “Interface Blocks” for interface-block matching, or a
 link-time error will result."

Fixes:
* GL45-CTS.CommonBugs.CommonBug_PerVertexValidation

v2 (Neil Roberts):
Explicitly look for gl_PerVertex in the symbol tables instead of
waiting to find a variable in the interface.

Signed-off-by: Eduardo Lima Mitev 
Signed-off-by: Neil Roberts 
---
 src/compiler/glsl/link_interface_blocks.cpp | 29 +
 1 file changed, 29 insertions(+)

diff --git a/src/compiler/glsl/link_interface_blocks.cpp 
b/src/compiler/glsl/link_interface_blocks.cpp
index 7037c77..510d4f7 100644
--- a/src/compiler/glsl/link_interface_blocks.cpp
+++ b/src/compiler/glsl/link_interface_blocks.cpp
@@ -364,6 +364,35 @@ validate_interstage_inout_blocks(struct gl_shader_program 
*prog,
consumer->Stage != MESA_SHADER_FRAGMENT) ||
   consumer->Stage == MESA_SHADER_GEOMETRY;
 
+   /* Check that block re-declarations of gl_PerVertex are compatible
+* across shaders: From OpenGL Shading Language 4.5, section
+* "7.1 Built-In Language Variables", page 130 of the PDF:
+*
+*"If multiple shaders using members of a built-in block belonging
+* to the same interface are linked together in the same program,
+* they must all redeclare the built-in block in the same way, as
+* described in section 4.3.9 “Interface Blocks” for interface-block
+* matching, or a link-time error will result."
+*
+* This is done explicitly outside of iterating the member variable
+* declarations because it is possible that the variables are not used and
+* so they would have been optimised out.
+*/
+   const glsl_type *consumer_iface =
+  consumer->symbols->get_interface("gl_PerVertex",
+   ir_var_shader_in);
+
+   const glsl_type *producer_iface =
+  producer->symbols->get_interface("gl_PerVertex",
+   ir_var_shader_out);
+
+   if (producer_iface && consumer_iface &&
+   interstage_member_mismatch(prog, consumer_iface, producer_iface)) {
+  linker_error(prog, "Incompatible or missing gl_PerVertex re-declaration "
+   "in consecutive shaders");
+  return;
+   }
+
/* Add output interfaces from the producer to the symbol table. */
foreach_in_list(ir_instruction, node, producer->ir) {
   ir_variable *var = node->as_variable();
-- 
2.9.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 2/3] glsl: Use the utility function to copy symbols between symbol tables

2017-10-27 Thread Neil Roberts

From: Eduardo Lima Mitev 

This effectively factorizes a couple of similar routines.

v2 (Neil Roberts): Non-trivial rebase on master

Signed-off-by: Eduardo Lima Mitev 
Signed-off-by: Neil Roberts 
---
 src/compiler/glsl/glsl_parser_extras.cpp | 25 +++--
 src/compiler/glsl/linker.cpp | 16 +++-
 2 files changed, 10 insertions(+), 31 deletions(-)

diff --git a/src/compiler/glsl/glsl_parser_extras.cpp 
b/src/compiler/glsl/glsl_parser_extras.cpp
index ec4603d..1fea4d6 100644
--- a/src/compiler/glsl/glsl_parser_extras.cpp
+++ b/src/compiler/glsl/glsl_parser_extras.cpp
@@ -1979,6 +1979,7 @@ do_late_parsing_checks(struct _mesa_glsl_parse_state 
*state)
 
 static void
 opt_shader_and_create_symbol_table(struct gl_context *ctx,
+   struct glsl_symbol_table *source_symbols,
struct gl_shader *shader)
 {
assert(shader->CompileStatus != compile_failure &&
@@ -2036,22 +2037,8 @@ opt_shader_and_create_symbol_table(struct gl_context 
*ctx,
 * We don't have to worry about types or interface-types here because those
 * are fly-weights that are looked up by glsl_type.
 */
-   foreach_in_list (ir_instruction, ir, shader->ir) {
-  switch (ir->ir_type) {
-  case ir_type_function:
- shader->symbols->add_function((ir_function *) ir);
- break;
-  case ir_type_variable: {
- ir_variable *const var = (ir_variable *) ir;
-
- if (var->data.mode != ir_var_temporary)
-shader->symbols->add_variable(var);
- break;
-  }
-  default:
- break;
-  }
-   }
+   _mesa_glsl_copy_symbols_from_table(shader->ir, source_symbols,
+  shader->symbols);
 }
 
 void
@@ -2088,7 +2075,9 @@ _mesa_glsl_compile_shader(struct gl_context *ctx, struct 
gl_shader *shader,
  return;
 
   if (shader->CompileStatus == compiled_no_opts) {
- opt_shader_and_create_symbol_table(ctx, shader);
+ opt_shader_and_create_symbol_table(ctx,
+NULL, /* source_symbols */
+shader);
  shader->CompileStatus = compile_success;
  return;
   }
@@ -2149,7 +2138,7 @@ _mesa_glsl_compile_shader(struct gl_context *ctx, struct 
gl_shader *shader,
   lower_subroutine(shader->ir, state);
 
   if (!ctx->Cache || force_recompile)
- opt_shader_and_create_symbol_table(ctx, shader);
+ opt_shader_and_create_symbol_table(ctx, state->symbols, shader);
   else {
  reparent_ir(shader->ir, shader->ir);
  shader->CompileStatus = compiled_no_opts;
diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index f827b68..0045291 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -1255,21 +1255,11 @@ interstage_cross_validate_uniform_blocks(struct 
gl_shader_program *prog,
  * Populates a shaders symbol table with all global declarations
  */
 static void
-populate_symbol_table(gl_linked_shader *sh)
+populate_symbol_table(gl_linked_shader *sh, glsl_symbol_table *symbols)
 {
sh->symbols = new(sh) glsl_symbol_table;
 
-   foreach_in_list(ir_instruction, inst, sh->ir) {
-  ir_variable *var;
-  ir_function *func;
-
-  if ((func = inst->as_function()) != NULL) {
- sh->symbols->add_function(func);
-  } else if ((var = inst->as_variable()) != NULL) {
- if (var->data.mode != ir_var_temporary)
-sh->symbols->add_variable(var);
-  }
-   }
+   _mesa_glsl_copy_symbols_from_table(sh->ir, symbols, sh->symbols);
 }
 
 
@@ -2288,7 +2278,7 @@ link_intrastage_shaders(void *mem_ctx,
 
link_bindless_layout_qualifiers(prog, shader_list, num_shaders);
 
-   populate_symbol_table(linked);
+   populate_symbol_table(linked, shader_list[0]->symbols);
 
/* The pointer to the main function in the final linked shader (i.e., the
 * copy of the original shader that contained the main function).
-- 
2.9.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 0/3] Fix for PerVertexValidation CTS test

2017-10-27 Thread Neil Roberts

Hi,

Here is a v2 of Eduardo’s patches to fix the PerVertexValidation CTS
test. They were originally posted here:

https://lists.freedesktop.org/archives/mesa-dev/2017-March/146667.html

The patches needed a non-trivial rebase. The previous discussion was
left with a note saying that there are more cases that should fail but
that weren’t handled by the patches and aren’t checked by the CTS
test. Ie, if the consuming shader doesn’t reference any of the members
of gl_PerVertex then it wouldn’t find the block redefinition. This v2
addresses that by directly fetching the glsl_type for the gl_PerVertex
block from the symbol table rather waiting for an instruction which
references it.

I’ve tested it with Piglit and the public CTS repo and it doesn’t
cause any regressions, and of course it fixes the CTS test.

It might be nice to add a Piglit or CTS test to check this missing
case.

- Neil

Eduardo Lima Mitev (3):
  glsl_parser_extra: Add utility to copy symbols between symbol tables
  glsl: Use the utility function to copy symbols between symbol tables
  glsl/linker: Check that re-declared, inter-shader built-in blocks
match

 src/compiler/glsl/glsl_parser_extras.cpp| 68 +
 src/compiler/glsl/glsl_parser_extras.h  |  5 +++
 src/compiler/glsl/link_interface_blocks.cpp | 29 
 src/compiler/glsl/linker.cpp| 16 ++-
 4 files changed, 87 insertions(+), 31 deletions(-)

-- 
2.9.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v3 20/48] intel/fs: Protect opt_algebraic from OOB BROADCAST indices

2017-10-27 Thread Jason Ekstrand

On Fri, Oct 27, 2017 at 12:09 AM, Iago Toral  wrote:

>
> On Wed, 2017-10-25 at 16:25 -0700, Jason Ekstrand wrote:
> > ---
> >  src/intel/compiler/brw_fs.cpp | 10 --
> >  1 file changed, 8 insertions(+), 2 deletions(-)
> >
> > diff --git a/src/intel/compiler/brw_fs.cpp
> > b/src/intel/compiler/brw_fs.cpp
> > index 1c4351b..52079d3 100644
> > --- a/src/intel/compiler/brw_fs.cpp
> > +++ b/src/intel/compiler/brw_fs.cpp
> > @@ -2416,8 +2416,14 @@ fs_visitor::opt_algebraic()
> >  progress = true;
> >   } else if (inst->src[1].file == IMM) {
> >  inst->opcode = BRW_OPCODE_MOV;
> > -inst->src[0] = component(inst->src[0],
> > - inst->src[1].ud);
> > +/* It's possible that the selected component will be too
> > large and
> > + * overflow the register.  If this happens and we some
> > how manage
> > + * to constant fold it in and get here, it would cause
> > an assert
> > + * in component() below.  Instead, just let it wrap
> > around if it
> > + * goes over exec_size.
> > + */
>
> component() is really a horiz_offset() call which is in turn a
> byte_offset() call, which doesn't assert on anything other than invalid
> register files. I guess you mean that the byte offset computed by the
> component() call below can later lead to hitting assertions as we
> attempt to read outside the allocated space for the vgrf, right?
>

Yes.  Nothing asserts here, we just end up reading way past the end of the
VGRF and the validator which checks whether or not we keep all our reads
in-bounds throws a fit.

> My question is whether this is supposed to happen at all, it seems like
> we should not be emitting broadcast operations like this at all since
> they are invalid and here we are instead papering over that bug.
>

This can happen if someone does a read_invocation from SPIR-V or GLSL that
has an invocation index that is too large.  We're allowed to give them
garbage values, but asserting isn't nice.

I've updated the comment to:

/* It's possible that the selected component will be too large
and
 * overflow the register.  This can happen if someone does a
 * readInvocation() from GLSL or SPIR-V and provides an OOB
 * invocationIndex.  If this happens and we some how manage
 * to constant fold it in and get here, then component() may
cause
 * us to start reading outside of the VGRF which will lead to an
 * assert later.  Instead, just let it wrap around if it goes
over
 * exec_size.
 */

--Jason
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [RFC 1/2] intel/isl: Disable some gen10 CCS_E formats for now

2017-10-27 Thread Nanley Chery

CannonLake additionally supports R11G11B10_FLOAT and four 10-10-10-2
formats with CCS_E. None of these formats fit within the current
blorp_copy framework so disable them until support is added.

Signed-off-by: Nanley Chery 
Cc: Jason Ekstrand 
---

Jason, do you think we modify blorp instead of moving forward with this
patch? In any case, I've sent this to the list to add context to my next
patch.

 src/intel/isl/isl_format.c | 24 
 1 file changed, 24 insertions(+)

diff --git a/src/intel/isl/isl_format.c b/src/intel/isl/isl_format.c
index fba3ac5e1a..03c591071b 100644
--- a/src/intel/isl/isl_format.c
+++ b/src/intel/isl/isl_format.c
@@ -536,6 +536,30 @@ isl_format_supports_ccs_e(const struct gen_device_info 
*devinfo,
if (!format_info[format].exists)
   return false;
 
+   /* For simplicity, only report that a format supports CCS_E if blorp can
+* perform bit-for-bit copies with an image of that format while compressed.
+* This allows ISL users to avoid having to resolve the image before
+* performing such a copy. We may want to change this behavior in the
+* future.
+*
+* R11G11B10_FLOAT has no equivalent UINT format. Given how blorp_copy
+* currently works, bit-for-bit copy operations are not possible without an
+* intermediate resolve.
+*/
+   if (format == ISL_FORMAT_R11G11B10_FLOAT)
+  return false;
+
+   /* blorp_copy currently doesn't support formats with different bit-widths
+* per-channel. Until that support is added, report that these formats don't
+* support CCS_E. FIXME: Add support for these formats.
+*/
+   if (format == ISL_FORMAT_B10G10R10A2_UNORM ||
+   format == ISL_FORMAT_B10G10R10A2_UNORM_SRGB ||
+   format == ISL_FORMAT_R10G10B10A2_UNORM ||
+   format == ISL_FORMAT_R10G10B10A2_UINT) {
+  return false;
+   }
+
return format_gen(devinfo) >= format_info[format].ccs_e;
 }
 
-- 
2.14.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/2] i965: Check CCS_E compatibility for texture view rendering

2017-10-27 Thread Nanley Chery

Only use CCS_E to render to a texture that is CCS_E-compatible with the
original texture's miptree (linear) format. This prevents render
operations from writing data that can't be decoded with the original
miptree format.

On Gen10, with the new CCS_E-enabled formats handled, this enables the
driver to pass the arb_texture_view-rendering-formats piglit test.

Cc: 
Signed-off-by: Nanley Chery 
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 28 +--
 1 file changed, 26 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index a850f4d17b..59c57c227b 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -241,6 +241,27 @@ intel_miptree_supports_hiz(const struct brw_context *brw,
}
 }
 
+/**
+ * Return true if the format that will be used to access the miptree is
+ * CCS_E-compatible with the miptree's linear/non-sRGB format.
+ *
+ * Why use the linear format? Well, although the miptree may be specified with
+ * an sRGB format, the usage of that color space/format can be toggled. Since
+ * our HW tends to support more linear formats than sRGB ones, we use this
+ * format variant for check for CCS_E compatibility.
+ */
+static bool
+format_ccs_e_compat_with_miptree(const struct gen_device_info *devinfo,
+ const struct intel_mipmap_tree *mt,
+ enum isl_format access_format)
+{
+   assert(mt->aux_usage == ISL_AUX_USAGE_CCS_E);
+
+   mesa_format linear_format = _mesa_get_srgb_format_linear(mt->format);
+   enum isl_format isl_format = brw_isl_format_for_mesa_format(linear_format);
+   return isl_formats_are_ccs_e_compatible(devinfo, isl_format, access_format);
+}
+
 static bool
 intel_miptree_supports_ccs_e(struct brw_context *brw,
  const struct intel_mipmap_tree *mt)
@@ -2662,8 +2683,11 @@ intel_miptree_render_aux_usage(struct brw_context *brw,
   return mt->mcs_buf ? ISL_AUX_USAGE_CCS_D : ISL_AUX_USAGE_NONE;
 
case ISL_AUX_USAGE_CCS_E: {
-  /* If the format supports CCS_E, then we can just use it */
-  if (isl_format_supports_ccs_e(>screen->devinfo, render_format))
+  /* If the format supports CCS_E and is compatible with the miptree,
+   * then we can use it.
+   */
+  if (format_ccs_e_compat_with_miptree(>screen->devinfo,
+   mt, render_format))
  return ISL_AUX_USAGE_CCS_E;
 
   /* Otherwise, we have to fall back to CCS_D */
-- 
2.14.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v3 19/48] i965/fs/nir: Don't stomp 64-bit values to D in get_nir_src

2017-10-27 Thread Jason Ekstrand

On Thu, Oct 26, 2017 at 11:53 PM, Iago Toral  wrote:

> On Wed, 2017-10-25 at 16:25 -0700, Jason Ekstrand wrote:
> > ---
> >  src/intel/compiler/brw_fs_nir.cpp | 33 +--
> > --
> >  1 file changed, 21 insertions(+), 12 deletions(-)
> >
> > diff --git a/src/intel/compiler/brw_fs_nir.cpp
> > b/src/intel/compiler/brw_fs_nir.cpp
> > index e008e2e..a441f57 100644
> > --- a/src/intel/compiler/brw_fs_nir.cpp
> > +++ b/src/intel/compiler/brw_fs_nir.cpp
> > @@ -1441,11 +1441,19 @@ fs_visitor::get_nir_src(const nir_src )
> > src.reg.base_offset * src.reg.reg-
> > >num_components);
> > }
> >
> > -   /* to avoid floating-point denorm flushing problems, set the type
> > by
> > -* default to D - instructions that need floating point semantics
> > will set
> > -* this to F if they need to
> > -*/
> > -   return retype(reg, BRW_REGISTER_TYPE_D);
> > +   if (nir_src_bit_size(src) == 64 && devinfo->gen == 7) {
> > +  /* The only 64-bit type available on gen7 is DF, so use that.
> > */
> > +  reg.type = BRW_REGISTER_TYPE_DF;
> > +   } else {
> > +  /* To avoid floating-point denorm flushing problems, set the
> > type by
> > +   * default to an integer type - instructions that need
> > floating point
> > +   * semantics will set this to F if they need to
> > +   */
> > +  reg.type = brw_reg_type_from_bit_size(nir_src_bit_size(src),
> > +BRW_REGISTER_TYPE_D);
> > +   }
> > +
> > +   return reg;
> >  }
> >
> >  /**
> > @@ -1455,6 +1463,10 @@ fs_reg
> >  fs_visitor::get_nir_src_imm(const nir_src )
> >  {
> > nir_const_value *val = nir_src_as_const_value(src);
> > +   /* This function shouldn't be called on anything which can even
> > +* possibly be 64 bits as it can't do what it claims.
> > +*/
>
> What would be wrong with something like this?
>
> if (nir_src_bit_size(src) == 32)
>return val ? fs_reg(brw_imm_d(val->i32[0])) : get_nir_src(src);
> else
>return val ? fs_reg(brw_imm_df(val->f64[0])) : get_nir_src(src);
>

Because double immediates only really work on BDW+ and I didn't want
someone to call this function and get tripped up by that.


> > +   assert(nir_src_bit_size(src) == 32);
> > return val ? fs_reg(brw_imm_d(val->i32[0])) : get_nir_src(src);
> >  }
> >
> > @@ -2648,8 +2660,7 @@ fs_visitor::nir_emit_tcs_intrinsic(const
> > fs_builder ,
> >  */
> > unsigned channel = iter * 2 + i;
> > fs_reg dest = shuffle_64bit_data_for_32bit_write(bld,
> > -  retype(offset(value, bld, 2 * channel),
> > BRW_REGISTER_TYPE_DF),
> > -  1);
> > +  offset(value, bld, channel), 1);
> >
> > srcs[header_regs + (i + first_component) * 2] = dest;
> > srcs[header_regs + (i + first_component) * 2 + 1] =
> > @@ -3505,8 +3516,7 @@ fs_visitor::nir_emit_cs_intrinsic(const
> > fs_builder ,
> >if (nir_src_bit_size(instr->src[0]) == 64) {
> >   type_size = 8;
> >   val_reg = shuffle_64bit_data_for_32bit_write(bld,
> > -retype(val_reg, BRW_REGISTER_TYPE_DF),
> > -instr->num_components);
> > +val_reg, instr->num_components);
> >}
> >
> >unsigned type_slots = type_size / 4;
> > @@ -4005,8 +4015,7 @@ fs_visitor::nir_emit_intrinsic(const fs_builder
> > , nir_intrinsic_instr *instr
> >if (nir_src_bit_size(instr->src[0]) == 64) {
> >   type_size = 8;
> >   val_reg = shuffle_64bit_data_for_32bit_write(bld,
> > -retype(val_reg, BRW_REGISTER_TYPE_DF),
> > -instr->num_components);
> > +val_reg, instr->num_components);
> >}
> >
> >unsigned type_slots = type_size / 4;
> > @@ -4063,7 +4072,7 @@ fs_visitor::nir_emit_intrinsic(const fs_builder
> > , nir_intrinsic_instr *instr
> >unsigned first_component = nir_intrinsic_component(instr);
> >if (nir_src_bit_size(instr->src[0]) == 64) {
> >   fs_reg tmp = shuffle_64bit_data_for_32bit_write(bld,
> > -retype(src, BRW_REGISTER_TYPE_DF), num_components);
> > +src, num_components);
> >   src = tmp;
> >   num_components *= 2;
> >}
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v3 18/48] i965/fs/nir: Minor refactor of store_output

2017-10-27 Thread Jason Ekstrand

On Thu, Oct 26, 2017 at 11:35 PM, Iago Toral  wrote:

> On Wed, 2017-10-25 at 16:25 -0700, Jason Ekstrand wrote:
> > Stop retyping the output of shuffle_64bit_data_for_32bit_write.  It's
> > always BRW_REGISTER_TYPE_D which is perfectly fine for writing out.
> > Also, when we change get_nir_src to return something with a 64-bit
> > type
> > for 64-bit values, the retyping will not be at all what we
> > want.  Also,
> > retyping the output based on src.type before we whack it back to 32
> > bits
> > is a problem because the output is always 32 bits.
> > ---
> >  src/intel/compiler/brw_fs_nir.cpp | 6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/src/intel/compiler/brw_fs_nir.cpp
> > b/src/intel/compiler/brw_fs_nir.cpp
> > index 5bcdb1a..e008e2e 100644
> > --- a/src/intel/compiler/brw_fs_nir.cpp
> > +++ b/src/intel/compiler/brw_fs_nir.cpp
> > @@ -4058,18 +4058,18 @@ fs_visitor::nir_emit_intrinsic(const
> > fs_builder , nir_intrinsic_instr *instr
> >
> >nir_const_value *const_offset = nir_src_as_const_value(instr-
> > >src[1]);
> >assert(const_offset && "Indirect output stores not allowed");
> > -  fs_reg new_dest = retype(offset(outputs[instr-
> > >const_index[0]], bld,
> > -  4 * const_offset->u32[0]),
> > src.type);
> >
> >unsigned num_components = instr->num_components;
> >unsigned first_component = nir_intrinsic_component(instr);
> >if (nir_src_bit_size(instr->src[0]) == 64) {
> >   fs_reg tmp = shuffle_64bit_data_for_32bit_write(bld,
> >  retype(src, BRW_REGISTER_TYPE_DF), num_components);
> > - src = retype(tmp, src.type);
> > + src = tmp;
>
> Maybe just make this:
>
> src = suffle_64bit_data_for_32bit_write(...) ?
>

Fixed locally.


> >   num_components *= 2;
> >}
> >
> > +  fs_reg new_dest = retype(offset(outputs[instr-
> > >const_index[0]], bld,
> > +  4 * const_offset->u32[0]),
> > src.type);
> >for (unsigned j = 0; j < num_components; j++) {
> >   bld.MOV(offset(new_dest, bld, j + first_component),
> >   offset(src, bld, j));
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] meson: declare an empty xcb dependency

2017-10-27 Thread Dylan Baker

Well, I went to push this, apparently this was in one of the osmesa
patches as an unrelated change.. :/

So it's fixed.

Dylan

Quoting Dylan Baker (2017-10-27 10:25:33)
> This is needed in cases where xcb isn't actually needed as a dependency,
> but may still be included somewhere.
> 
> cc: Erik Faye-Lund 
> cc: Eric Engestrom 
> Signed-off-by: Dylan Baker 
> ---
>  meson.build | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/meson.build b/meson.build
> index 875f9d4d294..102b75c2320 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -740,6 +740,7 @@ dep_xext = []
>  dep_xdamage = []
>  dep_xfixes = []
>  dep_x11_xcb = []
> +dep_xcb = []
>  dep_xcb_glx = []
>  dep_xcb_dri2 = []
>  dep_xcb_dri3 = []
> -- 
> 2.14.2
> 


signature.asc
Description: signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] Android: move drivers' symlinks to /vendor

2017-10-27 Thread Mauro Rossi

2017-10-27 14:28 GMT+02:00 Rob Herring :

> On Fri, Oct 27, 2017 at 6:41 AM, Emil Velikov 
> wrote:
> > On 26 October 2017 at 23:48, Mauro Rossi  wrote:
> >> Having moved gallium_dri.so library to /vendor/lib/dri
> >> also symlinks need to be coherently created using TARGET_OUT_VENDOR
> insted of TARGET_OUT
> >> or all non Intel drivers will not be loaded with Android N and earlier,
> >> thus causing SurfaceFlinger SIGABRT
> >>
> >> Fixes: c3f75d483c ("Android: move libraries to /vendor")
> >>
> >> Cc: 17.3 
> >> ---
> >>  src/gallium/targets/dri/Android.mk | 4 ++--
> >>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> Reviewed-by: Rob Herring 
>
> >>
> >> diff --git a/src/gallium/targets/dri/Android.mk
> b/src/gallium/targets/dri/Android.mk
> >> index 61b65769ff..3fa86a2d56 100644
> >> --- a/src/gallium/targets/dri/Android.mk
> >> +++ b/src/gallium/targets/dri/Android.mk
> >> @@ -70,8 +70,8 @@ LOCAL_SHARED_LIBRARIES += $(sort
> $(GALLIUM_SHARED_LIBS))
> >>  ifneq ($(filter 5 6 7, $(MESA_ANDROID_MAJOR_VERSION)),)
> >>  LOCAL_POST_INSTALL_CMD := \
> >> $(foreach l, lib $(if $(filter true,$(TARGET_IS_64_BIT)),lib64),
> \
> >> - mkdir -p $(TARGET_OUT)/$(l)/$(MESA_DRI_MODULE_REL_PATH); \
> >> - $(foreach d, $(GALLIUM_TARGET_DRIVERS), ln -sf gallium_dri.so
> $(TARGET_OUT)/$(l)/$(MESA_DRI_MODULE_REL_PATH)/$(d)_dri.so;) \
> >> + mkdir -p $(TARGET_OUT_VENDOR)/$(l)/$(MESA_DRI_MODULE_REL_PATH);
> \
> >> + $(foreach d, $(GALLIUM_TARGET_DRIVERS), ln -sf gallium_dri.so
> $(TARGET_OUT_VENDOR)/$(l)/$(MESA_DRI_MODULE_REL_PATH)/$(d)_dri.so;) \
> > Can we fold the long path into a variable and then reuse it?
> > This code will be around for a bit, so it might be worth it.
> >
> > foo=$(TARGET_OUT_VENDOR)/$(l)/$(MESA_DRI_MODULE_REL_PATH)
> > mkdir -p $(foo)
> > $(foreach d, $(GALLIUM_TARGET_DRIVERS), ln -sf gallium_dri.so
> > $(foo)/$(d)_dri.so;)
> >
> > -Emil
> > *Please use better variable name than foo
>
> bar?
>

...and the winner is ... MESA_DRI_MODULE_PATH

$(eval MESA_DRI_MODULE_PATH :=
$(TARGET_OUT_VENDOR)/$(l)/$(MESA_DRI_MODULE_REL_PATH))

Sending tested v2 patch soon
Mauro
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] meson: declare an empty xcb dependency

2017-10-27 Thread Erik Faye-Lund

On Fri, Oct 27, 2017 at 7:25 PM, Dylan Baker  wrote:
> This is needed in cases where xcb isn't actually needed as a dependency,
> but may still be included somewhere.
>
> cc: Erik Faye-Lund 
> cc: Eric Engestrom 
> Signed-off-by: Dylan Baker 

Tested-by: Erik Faye-Lund 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [AppVeyor] mesa master #5967 failed

2017-10-27 Thread AppVeyor




Build mesa 5967 failed


Commit f121a669c7 by Dylan Baker on 10/24/2017 10:52 PM:

meson: build gallium based osmesa\n\nThis has been tested with the osdemo from mesa-demos\n\nv2: - Add SELinux dependency\n- fix typo GALLIUM_LLVM -> GALLIUM_LLVMPIPE\n\nSigned-off-by: Dylan Baker \nReviewed-by: Eric Engestrom 


Configure your notification preferences

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] meson: Add threads dependencies to glsl_compiler executable

2017-10-27 Thread Dylan Baker

Fixes compiling the optional standalone glsl compiler.

Reported-by: DrNick (on irc)
Signed-off-by: Dylan Baker 
---

This is not compiled by default, but can be built by:
meson build
ninja -C build src/compiler/glsl/glsl_compiler

 src/compiler/glsl/meson.build | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/compiler/glsl/meson.build b/src/compiler/glsl/meson.build
index 76fcafb9910..aa0e7153f42 100644
--- a/src/compiler/glsl/meson.build
+++ b/src/compiler/glsl/meson.build
@@ -223,7 +223,7 @@ glsl_compiler = executable(
   'main.cpp',
   c_args : [c_vis_args, c_msvc_compat_args, no_override_init_args],
   cpp_args : [cpp_vis_args, cpp_msvc_compat_args],
-  dependencies : [dep_clock],
+  dependencies : [dep_clock, dep_thread],
   include_directories : [inc_common],
   link_with : [libglsl_standalone],
   build_by_default : false,
-- 
2.14.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] Mesa 17.2.4 release candidate

2017-10-27 Thread Ilia Mirkin

On Fri, Oct 27, 2017 at 1:43 PM, Andres Gomez  wrote:
> Rejected (6)
> 
>
> Ilia Mirkin (1):
>   glsl: fix derived cs variables
>
> Reason: Commit is too big for stable at this point.

The issue it fixes in regular compute shaders is slightly difficult to
hit (but there are piglits that do now), however the issue it hits
with ARB_compute_variable_group_size is fairly trivial to encounter.

It seems silly to put out releases with known bugs when a fix is
easily available and apply-able, with negligible risk of messing
things up.

Note that this all only affects nouveau and radeonsi, as those are the
only drivers that make use of the lowering.

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 9/9] meson: build gallium based osmesa

2017-10-27 Thread Dylan Baker

Quoting Eric Engestrom (2017-10-27 02:28:05)
> On Thursday, 2017-10-26 13:55:35 -0700, Dylan Baker wrote:
> > Quoting Eric Engestrom (2017-10-26 02:40:20)
> > > On Wednesday, 2017-10-25 15:58:23 -0700, Dylan Baker wrote:
> > > > +if with_shared_glapi
> > > > +  osmesa_link_with += libglapi
> > > > +endif
> > > > +if with_ld_version_script
> > > > +  osmesa_link_args += [
> > > > +'-Wl,--version-script', join_paths(meson.current_source_dir(), 
> > > > 'osmesa.sym')
> > > 
> > > files('osmesa.sym') ?
> > > I'm not sure if it would work here though, since it expects a string.
> > > /me hopes meson converts the file_array to a list of its filenames
> > > 
> > 
> > files() doesn't work here. Maybe it makes sense to extend meson to support 
> > that?
> 
> We're not the first ones to think about this :)
> There's already an issue open for this exact case [1], which doesn't really
> have a conclusion. Might be worth chiming in, if only to ping them?
> 
> [1] https://github.com/mesonbuild/meson/issues/1592
> 

I kinda hacked something up real quick. I don't think it would be hard to
implement, although I started wondering about it. Part of what makes File so
special is that it always works no matter what directory you're in. I don't
think that th right thing is to be able to us them in string.format, but to be
able to pass them as a linker or compiler argument, and have that do the right
thing with them. I'll look into that more.

Dylan


signature.asc
Description: signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/2] i965/gen10: Don't set Antialiasing Enable in 3DSTATE_RASTER if num_samples > 1

2017-10-27 Thread Anuj Phogat

On Fri, Oct 27, 2017 at 10:50 AM, Anuj Phogat  wrote:
> Signed-off-by: Anuj Phogat 
> ---
>  src/mesa/drivers/dri/i965/genX_state_upload.c | 10 ++
>  1 file changed, 10 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c 
> b/src/mesa/drivers/dri/i965/genX_state_upload.c
> index b6e800aa90..2fb6420cee 100644
> --- a/src/mesa/drivers/dri/i965/genX_state_upload.c
> +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
> @@ -4362,6 +4362,16 @@ genX(upload_raster)(struct brw_context *brw)
>/* _NEW_LINE */
>raster.AntialiasingEnable = ctx->Line.SmoothFlag;
>
> +#if GEN_GEN == 10
> +  /* _NEW_LINE | _NEW_MULTISAMPLE
_NEW_LINE is not required here. Fixed locally.
> +   * Antialiasing Enable bit MUST not be set when NUM_MULTISAMPLES > 1.
> +   */
> +  const bool multisampled_fbo =
> +  _mesa_geometric_samples(ctx->DrawBuffer) > 1;
> +  if (multisampled_fbo)
> + raster.AntialiasingEnable = false;
> +#endif
> +
>/* _NEW_SCISSOR */
>raster.ScissorRectangleEnable = ctx->Scissor.EnableFlags;
>
> --
> 2.13.5
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] meson: declare an empty xcb dependency

2017-10-27 Thread Eric Engestrom

On Friday, 2017-10-27 10:25:33 -0700, Dylan Baker wrote:
> This is needed in cases where xcb isn't actually needed as a dependency,
> but may still be included somewhere.
> 
> cc: Erik Faye-Lund 
> cc: Eric Engestrom 

better indeed :)
Reviewed-by: Eric Engestrom 

> Signed-off-by: Dylan Baker 
> ---
>  meson.build | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/meson.build b/meson.build
> index 875f9d4d294..102b75c2320 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -740,6 +740,7 @@ dep_xext = []
>  dep_xdamage = []
>  dep_xfixes = []
>  dep_x11_xcb = []
> +dep_xcb = []
>  dep_xcb_glx = []
>  dep_xcb_dri2 = []
>  dep_xcb_dri3 = []
> -- 
> 2.14.2
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/2] i965/gen10: Don't set Antialiasing Enable in 3DSTATE_RASTER if num_samples > 1

2017-10-27 Thread Anuj Phogat

Signed-off-by: Anuj Phogat 
---
 src/mesa/drivers/dri/i965/genX_state_upload.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c 
b/src/mesa/drivers/dri/i965/genX_state_upload.c
index b6e800aa90..2fb6420cee 100644
--- a/src/mesa/drivers/dri/i965/genX_state_upload.c
+++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
@@ -4362,6 +4362,16 @@ genX(upload_raster)(struct brw_context *brw)
   /* _NEW_LINE */
   raster.AntialiasingEnable = ctx->Line.SmoothFlag;
 
+#if GEN_GEN == 10
+  /* _NEW_LINE | _NEW_MULTISAMPLE
+   * Antialiasing Enable bit MUST not be set when NUM_MULTISAMPLES > 1.
+   */
+  const bool multisampled_fbo =
+  _mesa_geometric_samples(ctx->DrawBuffer) > 1;
+  if (multisampled_fbo)
+ raster.AntialiasingEnable = false;
+#endif
+
   /* _NEW_SCISSOR */
   raster.ScissorRectangleEnable = ctx->Scissor.EnableFlags;
 
-- 
2.13.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/2] i965/gen10: Don't set Smooth Point Enable in 3DSTATE_SF if num_samples > 1

2017-10-27 Thread Anuj Phogat

Signed-off-by: Anuj Phogat 
---
 src/mesa/drivers/dri/i965/genX_state_upload.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c 
b/src/mesa/drivers/dri/i965/genX_state_upload.c
index 4ccfd48919..b6e800aa90 100644
--- a/src/mesa/drivers/dri/i965/genX_state_upload.c
+++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
@@ -1626,6 +1626,16 @@ genX(upload_sf)(struct brw_context *brw)
  sf.SmoothPointEnable = true;
 #endif
 
+#if GEN_GEN == 10
+  /* _NEW_MULTISAMPLE
+   * Smooth Point Enable bit MUST not be set when NUM_MULTISAMPLES > 1.
+   */
+  const bool multisampled_fbo =
+  _mesa_geometric_samples(ctx->DrawBuffer) > 1;
+  if (multisampled_fbo)
+ sf.SmoothPointEnable = false;
+#endif
+
 #if GEN_IS_G4X || GEN_GEN >= 5
   sf.AALineDistanceMode = AALINEDISTANCE_TRUE;
 #endif
-- 
2.13.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/2] radv: Disallow indirect outputs for GS on GFX9 as well.

2017-10-27 Thread Bas Nieuwenhuizen

On Fri, Oct 27, 2017 at 5:03 PM, Andres Gomez  wrote:
> Bas, this patch looks like it should have been marked as fixing
> 6ce550453f1 instead of e38685cc62e (?).

Well my reasoning was that the bug got "visible" when we enabled Vega.
The patch you refer to fixed part of it, but not all of it, and then
this patch fixes the rest.

>
> In any case, I was wondering whether it would be interesting to bring
> them both to the 17.2 stable queue and whether we would also want
> Timothy's preceding patch:
>
> 087e010b2b3dd83a539f97203909d6c43b5da87c radv: copy indirect lowering 
> settings from radeonsi

Yes, that would be reasonable. Timothy's patch is an optimization
though, so I'd be happy to send a backport that only generates the
variable needed for the other two if you'd prefer that.

>
> Let me know what you think.
>
> On Sun, 2017-10-22 at 18:43 +0200, Bas Nieuwenhuizen wrote:
>> Since it also uses the output vector before writing to memory.
>>
>> Fixes: e38685cc62e 'Revert "radv: disable support for VEGA for now."'
>> ---
>>  src/amd/vulkan/radv_shader.c | 4 +---
>>  1 file changed, 1 insertion(+), 3 deletions(-)
>>
>> diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
>> index 07e68d6032b..6176a2e590d 100644
>> --- a/src/amd/vulkan/radv_shader.c
>> +++ b/src/amd/vulkan/radv_shader.c
>> @@ -265,9 +265,7 @@ radv_shader_compile_to_nir(struct radv_device *device,
>>   indirect_mask |= nir_var_shader_in;
>>   }
>>   if (!llvm_has_working_vgpr_indexing &&
>> - (nir->info.stage == MESA_SHADER_VERTEX ||
>> -  nir->info.stage == MESA_SHADER_TESS_EVAL ||
>> -  nir->info.stage == MESA_SHADER_FRAGMENT))
>> + nir->info.stage != MESA_SHADER_TESS_CTRL)
>>   indirect_mask |= nir_var_shader_out;
>>
>>   /* TODO: We shouldn't need to do this, however LLVM isn't currently
> --
> Br,
>
> Andres
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] Mesa 17.2.4 release candidate

2017-10-27 Thread Andres Gomez

Hello list,

The candidate for the Mesa 17.2.4 is now available. Currently we have:
 - 21 queued
 - 10 nominated (outstanding)
 - and 6 rejected patches


In the current queue we have:

In Mesa Core we have included a change to prevent KOTOR from breaking
when in combination with the ATI fragment shader extension. 
Additionally, NIR has also received a correction.

Mesa's state tracker has gotten a patch to avoid leaks in certain
situations like when resizing a window.

Intel drivers have received several fixes.  The compiler has gotten a
couple, while anv also received one.  i965 got a patch to avoid VA-API, 
Beignet and other contexts in the system to break when in combination
with previous versions of Mesa 17.2.x.

AMD's compiler has received a couple of fixes while radv has also
received another couple, including one to avoid a hang due to an
overflow on huge textures.

Broadcom's vc4 has corrected a problem when compiling with Android's
clang.

clover has also seen fixed a problem compiling after a specific clang
revision.

Vulkan's WSI has gotten plugged a memory leak for X11.

Take a look at section "Mesa stable queue" for more information.


Testing reports/general approval


Any testing reports (or general approval of the state of the branch)
will be greatly appreciated.

The plan is to have 17.2.4 next Sunday (29th of October), around or
shortly after 18:00 GMT.

If you have any questions or suggestions - be that about the current
patch queue or otherwise, please go ahead.


Trivial merge conflicts
---

commit 1eb4cbc934afd72ea45521eeac763e43564d972b
Author: Dave Airlie 

radv/image: bump all the offset to uint64_t.

(cherry picked from commit 35c66f3e40177a97d74e614e2a324a03e2149c73)

commit cbc081b8711a519dd9aa371debbf68aa0add4076
Author: Kenneth Graunke 

i965: Revert absolute mode for constant buffer pointers.

(cherry picked from commit 013d33122028f2492da90a03ae4bc1dab84c3ee9)

commit ce725baa7c90420b46e16087a4f201a7e62b23e5
Author: Matthew Nicholls 

ac/nir: generate correct instruction for atomic min/max on unsigned images

(cherry picked from commit 27a0b24bf238342031e0709584e4d71ab228f1ec)

commit 74f19032341748ad257a0c2527624b661e2d2e6f
Author: Jason Ekstrand 

anv/pipeline: Call nir_lower_system_valaues after brw_preprocess_nir

(cherry picked from commit 279f8fb69cf68d05287e14f60cf67fc025643bc4)


Cheers,
Andres


Mesa stable queue
-

Nominated (10)
=

Jason Ekstrand (4):
  spirv: Claim support for the simple memory model
  i965/blorp: Use blorp_to_isl_format for src_isl_format in blit_miptrees
  i965/blorp: Use more temporary isl_format variables
  i965/miptree: Take an isl_format in render_aux_usage

Kenneth Graunke (1):
  mesa: Accept GL_BACK in get_fb0_attachment with ARB_ES3_1_compatibility.

Leo Liu (1):
  radeon/video: add gfx9 offsets when rejoin the video surface

Marek Olšák (2):
  radeonsi: add a workaround for weird s_buffer_load_dword behavior on SI
  ac/surface/gfx9: don't allow DCC for the smallest mipmap levels

Nicolai Hähnle (1):
  amd/common/gfx9: workaround DCC corruption more conservatively

Tapani Pälli (1):
  i965: unref push_const_bo in intelDestroyContext


Queued (21)
===

Andres Gomez (7):
  cherry-ignore: configure.ac: rework llvm detection and handling
  cherry-ignore: glsl: fix derived cs variables
  cherry-ignore: added 17.3 nominations.
  cherry-ignore: radv: Don't use vgpr indexing for outputs on GFX9.
  cherry-ignore: radv: Disallow indirect outputs for GS on GFX9 as well.
  cherry-ignore: mesa/bufferobj: don't double negate the range
  cherry-ignore: broadcom/vc5: Propagate vc4 aliasing fix to vc5.

Bas Nieuwenhuizen (1):
  ac/nir: Fix nir_texop_lod on GFX for 1D arrays.

Dave Airlie (1):
  radv/image: bump all the offset to uint64_t.

Henri Verbeet (1):
  vulkan/wsi: Free the event in x11_manage_fifo_queues().

Jan Vesely (1):
  clover: Fix compilation after clang r315871

Jason Ekstrand (4):
  nir/intrinsics: Set the correct num_indices for load_output
  intel/fs: Handle flag read/write aliasing in needs_src_copy
  anv/pipeline: Call nir_lower_system_valaues after brw_preprocess_nir
  intel/eu: Use EXECUTE_1 for JMPI

Kenneth Graunke (1):
  i965: Revert absolute mode for constant buffer pointers.

Marek Olšák (1):
  Revert "mesa: fix texture updates for ATI_fragment_shader"

Matthew Nicholls (1):
  ac/nir: generate correct instruction for atomic min/max on unsigned images

Michel Dänzer (1):
  st/mesa: Initialize textures array in st_framebuffer_validate
Squashed with
  st/osmesa: include u_inlines.h for pipe_resource_reference

Samuel Pitoiset (1):
  radv: add the draw count buffer to the list of

Re: [Mesa-dev] [PATCH] meson: Add a dependency on nir_opcodes_h for freedreno

2017-10-27 Thread Mark Janes

Reviewed-by: Mark Janes 

Dylan Baker  writes:

> This fixes a race condition in the build.
>
> cc: Rob Clark 
> Signed-off-by: Dylan Baker 
> ---
>  src/gallium/drivers/freedreno/meson.build | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/freedreno/meson.build 
> b/src/gallium/drivers/freedreno/meson.build
> index 028be54ebac..6f7cf2b3bd4 100644
> --- a/src/gallium/drivers/freedreno/meson.build
> +++ b/src/gallium/drivers/freedreno/meson.build
> @@ -200,7 +200,7 @@ freedreno_includes = [
>  
>  libfreedreno = static_library(
>'freedreno',
> -  [files_libfreedreno, ir3_nir_trig_c],
> +  [files_libfreedreno, ir3_nir_trig_c, nir_opcodes_h],
>include_directories : freedreno_includes,
>c_args : [c_vis_args],
>cpp_args : [cpp_vis_args],
> -- 
> 2.14.2
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH mesa] meson: wire up egl/android

2017-10-27 Thread Dylan Baker

Whooo! Thanks for doing this!

Quoting Eric Engestrom (2017-10-27 07:40:17)
> Cc: Rob Herring 
> Cc: Tomasz Figa 
> Signed-off-by: Eric Engestrom 
> ---
> Completely untested!
> It's a step in the right direction though; doesn't hurt non-android,
> and gets android closer to building on meson :)
> ---
>  meson.build | 13 +++--
>  src/egl/meson.build |  5 -
>  2 files changed, 15 insertions(+), 3 deletions(-)
> 
> diff --git a/meson.build b/meson.build
> index 875f9d4d294d1911f239..761c33f4651ab37ab7b6 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -152,7 +152,7 @@ endif
>  # TODO: other OSes
>  with_dri_platform = 'drm'
>  
> -# TODO: android platform
> +with_platform_android = false
>  with_platform_wayland = false
>  with_platform_x11 = false
>  with_platform_drm = false
> @@ -161,6 +161,7 @@ egl_native_platform = ''
>  _platforms = get_option('platforms')
>  if _platforms != ''
>_split = _platforms.split(',')
> +  with_platform_android = _split.contains('android')
>with_platform_x11 = _split.contains('x11')
>with_platform_wayland = _split.contains('wayland')
>with_platform_drm = _split.contains('drm')
> @@ -252,7 +253,7 @@ if _vulkan_drivers != ''
>with_intel_vk = _split.contains('intel')
>with_amd_vk = _split.contains('amd')
>with_any_vk = with_amd_vk or with_intel_vk
> -  if not (with_platform_x11 or with_platform_wayland)
> +  if not (with_platform_x11 or with_platform_wayland or 
> with_platform_android)
>  error('Vulkan requires at least one platform (x11, wayland)')
>endif
>  endif
> @@ -330,6 +331,14 @@ endif
>  if with_platform_surfaceless
>pre_args += '-DHAVE_SURFACELESS_PLATFORM'
>  endif
> +if with_platform_android
> +  dep_android = [
> +  dependency('cutils'),
> +  dependency('hardware'),
> +  dependency('sync'),
> +  ]

The indent looks off here, it looks like 4 space instead of 2, and the closing
brace should be dedented.

Otherwise this looks good to me, though it would be great if one of the
ChromeOS guys could look at it (since I think that android builds are always
done with android.mk, and this would only be for the ChromeOS ARC++ container)

Reviewed-by: Dylan Baker 

> +  pre_args += '-DHAVE_ANDROID_PLATFORM'
> +endif
>  
>  prog_python2 = find_program('python2')
>  has_mako = run_command(prog_python2, '-c', 'import mako')
> diff --git a/src/egl/meson.build b/src/egl/meson.build
> index ea7ae06761f75c00a40c..cc51671f9d8f24708405 100644
> --- a/src/egl/meson.build
> +++ b/src/egl/meson.build
> @@ -129,7 +129,10 @@ if with_platform_wayland
>  'wayland/wayland-egl', 'wayland/wayland-drm',
>)
>  endif
> -# TODO: android
> +if with_platform_android
> +  deps_for_egl += dep_android
> +  files_egl += files('drivers/dri2/platform_android.c')
> +endif
>  
>  # TODO: glvnd
>  
> -- 
> Cheers,
>   Eric
> 


signature.asc
Description: signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] meson: drop vulkan if no drivers are built

2017-10-27 Thread Erik Faye-Lund

Yeah, I kinda got that feeling. Your approach seems better.

On Oct 27, 2017 19:22, "Dylan Baker"  wrote:

This just papers over the actual problem, that dep_xcb isn't declared as an
empty list with the other glx dependencies.

Dylan

Quoting Erik Faye-Lund (2017-10-27 04:55:25)
> This avoids the following build-error when building with emtpy
> vulkan-drivers and without glx=dri:
>
> Meson encountered an error in file src/vulkan/wsi/meson.build, line 30,
> column 2:
> Unknown variable "dep_xcb".
>
> Signed-off-by: Erik Faye-Lund 
> ---
>  src/meson.build | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/src/meson.build b/src/meson.build
> index 9b1b0ae594..4b00ab910c 100644
> --- a/src/meson.build
> +++ b/src/meson.build
> @@ -47,7 +47,9 @@ subdir('mapi')
>  # TODO: osmesa
>  subdir('compiler')
>  subdir('egl/wayland/wayland-drm')
> -subdir('vulkan')
> +if with_any_vk
> +  subdir('vulkan')
> +endif
>  subdir('amd')
>  if with_gallium_vc4
>subdir('broadcom')
> --
> 2.11.0
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] meson: declare an empty xcb dependency

2017-10-27 Thread Dylan Baker

This is needed in cases where xcb isn't actually needed as a dependency,
but may still be included somewhere.

cc: Erik Faye-Lund 
cc: Eric Engestrom 
Signed-off-by: Dylan Baker 
---
 meson.build | 1 +
 1 file changed, 1 insertion(+)

diff --git a/meson.build b/meson.build
index 875f9d4d294..102b75c2320 100644
--- a/meson.build
+++ b/meson.build
@@ -740,6 +740,7 @@ dep_xext = []
 dep_xdamage = []
 dep_xfixes = []
 dep_x11_xcb = []
+dep_xcb = []
 dep_xcb_glx = []
 dep_xcb_dri2 = []
 dep_xcb_dri3 = []
-- 
2.14.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v5 00/10] new series of Mesa for Tizen

2017-10-27 Thread Kenneth Graunke

On Thursday, October 26, 2017 11:16:06 AM PDT Eric Anholt wrote:
> Gwan-gyeong Mun  writes:
> 
> > Hi,
> >
> > These Patch v5 series modified with new helper function series [1]. 
> >
> > These series only have mesa for tizen feature.
> >
> > [1] https://patchwork.freedesktop.org/series/32577/
> 
> Rather than have another giant pile of window system code in the tree,
> I'd like to see a serious discussion of why you aren't using the
> existing wayland and gbm platforms and what you would need from them.
> The TPL stuff looks like abstraction for the sake of abstraction, to me.

I agree with Eric.  I've never understood why this is necessary.


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] meson: drop vulkan if no drivers are built

2017-10-27 Thread Dylan Baker

This just papers over the actual problem, that dep_xcb isn't declared as an
empty list with the other glx dependencies.

Dylan

Quoting Erik Faye-Lund (2017-10-27 04:55:25)
> This avoids the following build-error when building with emtpy
> vulkan-drivers and without glx=dri:
> 
> Meson encountered an error in file src/vulkan/wsi/meson.build, line 30,
> column 2:
> Unknown variable "dep_xcb".
> 
> Signed-off-by: Erik Faye-Lund 
> ---
>  src/meson.build | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/src/meson.build b/src/meson.build
> index 9b1b0ae594..4b00ab910c 100644
> --- a/src/meson.build
> +++ b/src/meson.build
> @@ -47,7 +47,9 @@ subdir('mapi')
>  # TODO: osmesa
>  subdir('compiler')
>  subdir('egl/wayland/wayland-drm')
> -subdir('vulkan')
> +if with_any_vk
> +  subdir('vulkan')
> +endif
>  subdir('amd')
>  if with_gallium_vc4
>subdir('broadcom')
> -- 
> 2.11.0
> 


signature.asc
Description: signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] meson: Add a dependency on nir_opcodes_h for freedreno

2017-10-27 Thread Dylan Baker

This fixes a race condition in the build.

cc: Rob Clark 
Signed-off-by: Dylan Baker 
---
 src/gallium/drivers/freedreno/meson.build | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/freedreno/meson.build 
b/src/gallium/drivers/freedreno/meson.build
index 028be54ebac..6f7cf2b3bd4 100644
--- a/src/gallium/drivers/freedreno/meson.build
+++ b/src/gallium/drivers/freedreno/meson.build
@@ -200,7 +200,7 @@ freedreno_includes = [
 
 libfreedreno = static_library(
   'freedreno',
-  [files_libfreedreno, ir3_nir_trig_c],
+  [files_libfreedreno, ir3_nir_trig_c, nir_opcodes_h],
   include_directories : freedreno_includes,
   c_args : [c_vis_args],
   cpp_args : [cpp_vis_args],
-- 
2.14.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] util: hashtable: make hashing prototypes match

2017-10-27 Thread Kenneth Graunke

On Friday, October 27, 2017 9:43:45 AM PDT Lionel Landwerlin wrote:
> It seems nobody's using the string hashing function. If you try to
> pass it directly to the hashtable creation function, you'll get
> compiler warning for non matching prototypes. Let's make them match.
> 
> Signed-off-by: Lionel Landwerlin 
> ---
>  src/util/hash_table.c | 3 ++-
>  src/util/hash_table.h | 2 +-
>  2 files changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/src/util/hash_table.c b/src/util/hash_table.c
> index 1bda2149b95..b7421a0144c 100644
> --- a/src/util/hash_table.c
> +++ b/src/util/hash_table.c
> @@ -476,9 +476,10 @@ _mesa_hash_data(const void *data, size_t size)
>  
>  /** FNV-1a string hash implementation */
>  uint32_t
> -_mesa_hash_string(const char *key)
> +_mesa_hash_string(const void *_key)
>  {
> uint32_t hash = _mesa_fnv32_1a_offset_bias;
> +   const char *key = _key;
>  
> while (*key != 0) {
>hash = _mesa_fnv32_1a_accumulate(hash, *key);
> diff --git a/src/util/hash_table.h b/src/util/hash_table.h
> index cf939130fcf..d3e0758b265 100644
> --- a/src/util/hash_table.h
> +++ b/src/util/hash_table.h
> @@ -94,7 +94,7 @@ _mesa_hash_table_random_entry(struct hash_table *ht,
>bool (*predicate)(struct hash_entry *entry));
>  
>  uint32_t _mesa_hash_data(const void *data, size_t size);
> -uint32_t _mesa_hash_string(const char *key);
> +uint32_t _mesa_hash_string(const void *key);
>  bool _mesa_key_string_equal(const void *a, const void *b);
>  bool _mesa_key_pointer_equal(const void *a, const void *b);
>  
> 

Reviewed-by: Kenneth Graunke 


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] radv: disable depth writes when depth test is not enabled

2017-10-27 Thread Samuel Pitoiset




On 10/27/2017 06:49 PM, Marek Olšák wrote:

The hw disables depth writes automatically.


Okay, good to know. Thanks!



Marek

On Fri, Oct 27, 2017 at 6:27 PM, Samuel Pitoiset
 wrote:

Found by inspection.

Signed-off-by: Samuel Pitoiset 
---
  src/amd/vulkan/radv_pipeline.c | 8 +++-
  1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index c25642c966..f5ebcda883 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -741,8 +741,14 @@ radv_pipeline_init_depth_stencil_state(struct 
radv_pipeline *pipeline,
 bool has_stencil_attachment = vk_format_is_stencil(attachment->format);

 if (has_depth_attachment) {
+   /* Depth writes are always disabled when depthTestEnable is
+* VK_FALSE.
+*/
+   bool z_write_enable =
+   vkds->depthTestEnable && vkds->depthWriteEnable;
+
 ds->db_depth_control = S_028800_Z_ENABLE(vkds->depthTestEnable 
? 1 : 0) |
-  
S_028800_Z_WRITE_ENABLE(vkds->depthWriteEnable ? 1 : 0) |
+  S_028800_Z_WRITE_ENABLE(z_write_enable ? 
1 : 0) |
S_028800_ZFUNC(vkds->depthCompareOp) |

S_028800_DEPTH_BOUNDS_ENABLE(vkds->depthBoundsTestEnable ? 1 : 0);
 }
--
2.14.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] radv: disable depth writes when depth test is not enabled

2017-10-27 Thread Marek Olšák

The hw disables depth writes automatically.

Marek

On Fri, Oct 27, 2017 at 6:27 PM, Samuel Pitoiset
 wrote:
> Found by inspection.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_pipeline.c | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
> index c25642c966..f5ebcda883 100644
> --- a/src/amd/vulkan/radv_pipeline.c
> +++ b/src/amd/vulkan/radv_pipeline.c
> @@ -741,8 +741,14 @@ radv_pipeline_init_depth_stencil_state(struct 
> radv_pipeline *pipeline,
> bool has_stencil_attachment = 
> vk_format_is_stencil(attachment->format);
>
> if (has_depth_attachment) {
> +   /* Depth writes are always disabled when depthTestEnable is
> +* VK_FALSE.
> +*/
> +   bool z_write_enable =
> +   vkds->depthTestEnable && vkds->depthWriteEnable;
> +
> ds->db_depth_control = 
> S_028800_Z_ENABLE(vkds->depthTestEnable ? 1 : 0) |
> -  
> S_028800_Z_WRITE_ENABLE(vkds->depthWriteEnable ? 1 : 0) |
> +  S_028800_Z_WRITE_ENABLE(z_write_enable 
> ? 1 : 0) |
>S_028800_ZFUNC(vkds->depthCompareOp) |
>
> S_028800_DEPTH_BOUNDS_ENABLE(vkds->depthBoundsTestEnable ? 1 : 0);
> }
> --
> 2.14.3
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] util: hashtable: make hashing prototypes match

2017-10-27 Thread Lionel Landwerlin

It seems nobody's using the string hashing function. If you try to
pass it directly to the hashtable creation function, you'll get
compiler warning for non matching prototypes. Let's make them match.

Signed-off-by: Lionel Landwerlin 
---
 src/util/hash_table.c | 3 ++-
 src/util/hash_table.h | 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/util/hash_table.c b/src/util/hash_table.c
index 1bda2149b95..b7421a0144c 100644
--- a/src/util/hash_table.c
+++ b/src/util/hash_table.c
@@ -476,9 +476,10 @@ _mesa_hash_data(const void *data, size_t size)
 
 /** FNV-1a string hash implementation */
 uint32_t
-_mesa_hash_string(const char *key)
+_mesa_hash_string(const void *_key)
 {
uint32_t hash = _mesa_fnv32_1a_offset_bias;
+   const char *key = _key;
 
while (*key != 0) {
   hash = _mesa_fnv32_1a_accumulate(hash, *key);
diff --git a/src/util/hash_table.h b/src/util/hash_table.h
index cf939130fcf..d3e0758b265 100644
--- a/src/util/hash_table.h
+++ b/src/util/hash_table.h
@@ -94,7 +94,7 @@ _mesa_hash_table_random_entry(struct hash_table *ht,
   bool (*predicate)(struct hash_entry *entry));
 
 uint32_t _mesa_hash_data(const void *data, size_t size);
-uint32_t _mesa_hash_string(const char *key);
+uint32_t _mesa_hash_string(const void *key);
 bool _mesa_key_string_equal(const void *a, const void *b);
 bool _mesa_key_pointer_equal(const void *a, const void *b);
 
-- 
2.15.0.rc2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] intel: common: silence compiler warning

2017-10-27 Thread Lionel Landwerlin

Signed-off-by: Lionel Landwerlin 
---
 src/intel/common/gen_decoder.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/common/gen_decoder.c b/src/intel/common/gen_decoder.c
index 395ff02908a..55e7305117c 100644
--- a/src/intel/common/gen_decoder.c
+++ b/src/intel/common/gen_decoder.c
@@ -556,7 +556,7 @@ gen_spec_load(const struct gen_device_info *devinfo)
 {
struct parser_context ctx;
void *buf;
-   uint8_t *text_data;
+   uint8_t *text_data = NULL;
uint32_t text_offset = 0, text_length = 0, total_length;
uint32_t gen_10 = devinfo_to_gen(devinfo);
 
-- 
2.15.0.rc2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/7] util: move futex helpers into futex.h

2017-10-27 Thread Marek Olšák

For patches 1-4, 6, 7, assuming you fix the double unlock in patch 7:

Reviewed-by: Marek Olšák 

Marek

On Sun, Oct 22, 2017 at 8:33 PM, Nicolai Hähnle  wrote:
> From: Nicolai Hähnle 
>
> ---
>  src/util/Makefile.sources |  1 +
>  src/util/futex.h  | 51 
> +++
>  src/util/meson.build  |  1 +
>  src/util/simple_mtx.h | 20 +--
>  4 files changed, 54 insertions(+), 19 deletions(-)
>  create mode 100644 src/util/futex.h
>
> diff --git a/src/util/Makefile.sources b/src/util/Makefile.sources
> index c7f6516a992..407b2825624 100644
> --- a/src/util/Makefile.sources
> +++ b/src/util/Makefile.sources
> @@ -6,20 +6,21 @@ MESA_UTIL_FILES := \
> build_id.h \
> crc32.c \
> crc32.h \
> debug.c \
> debug.h \
> disk_cache.c \
> disk_cache.h \
> format_r11g11b10f.h \
> format_rgb9e5.h \
> format_srgb.h \
> +   futex.h \
> half_float.c \
> half_float.h \
> hash_table.c \
> hash_table.h \
> list.h \
> macros.h \
> mesa-sha1.c \
> mesa-sha1.h \
> sha1/sha1.c \
> sha1/sha1.h \
> diff --git a/src/util/futex.h b/src/util/futex.h
> new file mode 100644
> index 000..a79daaf209b
> --- /dev/null
> +++ b/src/util/futex.h
> @@ -0,0 +1,51 @@
> +/*
> + * Copyright © 2015 Intel
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 
> DEALINGS
> + * IN THE SOFTWARE.
> + */
> +
> +#ifndef UTIL_FUTEX_H
> +#define UTIL_FUTEX_H
> +
> +#if defined(HAVE_FUTEX)
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +static inline long sys_futex(void *addr1, int op, int val1, struct timespec 
> *timeout, void *addr2, int val3)
> +{
> +   return syscall(SYS_futex, addr1, op, val1, timeout, addr2, val3);
> +}
> +
> +static inline int futex_wake(uint32_t *addr) {
> +   return sys_futex(addr, FUTEX_WAKE, 1, NULL, NULL, 0);
> +}
> +
> +static inline int futex_wait(uint32_t *addr, int32_t value) {
> +   return sys_futex(addr, FUTEX_WAIT, value, NULL, NULL, 0);
> +}
> +
> +#endif
> +
> +#endif /* UTIL_FUTEX_H */
> diff --git a/src/util/meson.build b/src/util/meson.build
> index c9cb3e861e9..0940e4d12a7 100644
> --- a/src/util/meson.build
> +++ b/src/util/meson.build
> @@ -30,20 +30,21 @@ files_mesa_util = files(
>'build_id.h',
>'crc32.c',
>'crc32.h',
>'debug.c',
>'debug.h',
>'disk_cache.c',
>'disk_cache.h',
>'format_r11g11b10f.h',
>'format_rgb9e5.h',
>'format_srgb.h',
> +  'futex.h',
>'half_float.c',
>'half_float.h',
>'hash_table.c',
>'hash_table.h',
>'list.h',
>'macros.h',
>'mesa-sha1.c',
>'mesa-sha1.h',
>'sha1/sha1.c',
>'sha1/sha1.h',
> diff --git a/src/util/simple_mtx.h b/src/util/simple_mtx.h
> index d23a4303c80..0c2602d03b6 100644
> --- a/src/util/simple_mtx.h
> +++ b/src/util/simple_mtx.h
> @@ -14,20 +14,21 @@
>   *
>   * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
>   * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>   * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
>   * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
>   * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
>   * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 
> DEALINGS
>   * IN THE SOFTWARE.
>   */
>
> +#include "util/futex.h"
>  #include "util/u_thread.h"
>
>  #if defined(__GNUC__) && defined(HAVE_FUTEX)
>
>  /* mtx_t - Fast, simple mutex
>   *
>   * While modern pthread mutexes are very fast (implemented using futex), they
>   * still incur a call to an external DSO and overhead of the generality and
>   * features of pthread mutexes.  Most

[Mesa-dev] [PATCH] radv: disable depth writes when depth test is not enabled

2017-10-27 Thread Samuel Pitoiset

Found by inspection.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_pipeline.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index c25642c966..f5ebcda883 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -741,8 +741,14 @@ radv_pipeline_init_depth_stencil_state(struct 
radv_pipeline *pipeline,
bool has_stencil_attachment = vk_format_is_stencil(attachment->format);
 
if (has_depth_attachment) {
+   /* Depth writes are always disabled when depthTestEnable is
+* VK_FALSE.
+*/
+   bool z_write_enable =
+   vkds->depthTestEnable && vkds->depthWriteEnable;
+
ds->db_depth_control = S_028800_Z_ENABLE(vkds->depthTestEnable 
? 1 : 0) |
-  
S_028800_Z_WRITE_ENABLE(vkds->depthWriteEnable ? 1 : 0) |
+  S_028800_Z_WRITE_ENABLE(z_write_enable ? 
1 : 0) |
   S_028800_ZFUNC(vkds->depthCompareOp) |
   
S_028800_DEPTH_BOUNDS_ENABLE(vkds->depthBoundsTestEnable ? 1 : 0);
}
-- 
2.14.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/7] svga: Use asm instead of asm

2017-10-27 Thread Brian Paul


On 10/27/2017 05:21 AM, Emil Velikov wrote:

On 27 October 2017 at 00:57, Dylan Baker  wrote:

Which allows the code to be compiled with c99 instead of gnu99.

A little history. This code is guarded by #ifdef __GNUC__, so it's only
compiled with autotools on *nix, SCons with MSVC wont hit that code.
However, meson is going to build both MSVC and GCC/Clang paths. As such
it makes sense to not have to override the std for gcc/clang, but ensure
that it's not set to gnu99 when building with MSVC when there's a
straightforward code change that allows removing the need for gnu99.


I'm afraid that most of the buildsystem details are off. Patch makes
sense regardless :-)

With a more generic commit message (one example below), the commit is
Reviewed-by: Emil Velikov 

Replace the GNU specific keyword asm with __asm_.
This allows us to remove the explicit request for GNU extensions aka -std=gnu99


Sounds good.

I tested with MinGW too.

Reviewed-by: Brian Paul 
Tested-by: Brian Paul 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] meson: drop vulkan if no drivers are built

2017-10-27 Thread Eric Engestrom

On Friday, 2017-10-27 13:55:25 +0200, Erik Faye-Lund wrote:
> This avoids the following build-error when building with emtpy
> vulkan-drivers and without glx=dri:
> 
> Meson encountered an error in file src/vulkan/wsi/meson.build, line 30,
> column 2:
> Unknown variable "dep_xcb".
> 
> Signed-off-by: Erik Faye-Lund 

Reviewed-by: Eric Engestrom 

> ---
>  src/meson.build | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/src/meson.build b/src/meson.build
> index 9b1b0ae594..4b00ab910c 100644
> --- a/src/meson.build
> +++ b/src/meson.build
> @@ -47,7 +47,9 @@ subdir('mapi')
>  # TODO: osmesa
>  subdir('compiler')
>  subdir('egl/wayland/wayland-drm')
> -subdir('vulkan')
> +if with_any_vk
> +  subdir('vulkan')
> +endif
>  subdir('amd')
>  if with_gallium_vc4
>subdir('broadcom')
> -- 
> 2.11.0
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] glsl: add varying resources for arrays of complex types

2017-10-27 Thread Ilia Mirkin

On Fri, Oct 27, 2017 at 11:07 AM, Juan A. Suarez Romero
 wrote:
> On Fri, 2017-10-27 at 10:27 -0400, Ilia Mirkin wrote:
>> On Fri, Oct 27, 2017 at 10:03 AM, Juan A. Suarez Romero
>>  wrote:
>> > This patch is mostly a patch done by Ilia Mirkin.
>> >
>> > It fixes KHR-GL45.enhanced_layouts.varying_structure_locations.
>> >
>> > CC: Ilia Mirkin 
>> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103098
>> > Signed-off-by: Juan A. Suarez Romero 
>> > ---
>> >  src/compiler/glsl/linker.cpp | 36 
>> >  1 file changed, 36 insertions(+)
>> >
>> > diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
>> > index f827b68555..7305b7740b 100644
>> > --- a/src/compiler/glsl/linker.cpp
>> > +++ b/src/compiler/glsl/linker.cpp
>> > @@ -3882,6 +3882,42 @@ add_shader_variable(const struct gl_context *ctx,
>> >return true;
>> > }
>> >
>> > +   case GLSL_TYPE_ARRAY: {
>> > +  /* The ARB_program_interface_query spec says:
>> > +   *
>> > +   * "For an active variable declared as an array of basic types, 
>> > a
>> > +   *  single entry will be generated, with its name string formed 
>> > by
>> > +   *  concatenating the name of the array and the string "[0]"."
>> > +   *
>> > +   * "For an active variable declared as an array of an aggregate 
>> > data
>> > +   *  type (structures or arrays), a separate entry will be 
>> > generated
>> > +   *  for each active array element, unless noted immediately 
>> > below.
>> > +   *  The name of each entry is formed by concatenating the name 
>> > of
>> > +   *  the array, the "[" character, an integer identifying the 
>> > element
>> > +   *  number, and the "]" character.  These enumeration rules are
>> > +   *  applied recursively, treating each enumerated array element 
>> > as a
>> > +   *  separate active variable."
>> > +   */
>>
>> So if you have a TCS or TES or GS with
>>
>> in struct { vec4 foo; } bar[4];
>>
>> Wouldn't this add bar[0].foo, bar[1].foo, bar[2].foo, and bar[3].foo?
>
> According to the spec, yes. Unless I'm missing something...
>
>
> Looking for other examples, in
>
> https://www.khronos.org/opengl/wiki/Program_Introspection#Arrays
>
> it shows a similar example, and also do as you say.
>
>> With the latter ones getting bogus locations? What is it supposed to
>> do in this case?
>
> Why it would get bogus locations?

Because it'll do elem_location += stride every time, but they each
should get the same location.

>
>
>>
>> > +  const struct glsl_type *array_type = type->fields.array;
>> > +  if (array_type->base_type == GLSL_TYPE_STRUCT ||
>> > +  array_type->base_type == GLSL_TYPE_ARRAY) {
>> > + unsigned elem_location = location;
>> > + unsigned stride = array_type->count_attribute_slots(false);
>> > + for (unsigned i = 0; i < type->length; i++) {
>> > +char *elem = ralloc_asprintf(shProg, "%s[%d]", name, i);
>> > +if (!add_shader_variable(ctx, shProg, resource_set,
>> > + stage_mask, programInterface,
>> > + var, elem, array_type,
>> > + use_implicit_location, elem_location,
>> > + outermost_struct_type))
>> > +   return false;
>> > +elem_location += stride;
>> > + }
>> > + return true;
>> > +  }
>> > +  /* fallthrough */
>> > +   }
>> > +
>> > default: {
>> >/* The ARB_program_interface_query spec says:
>> > *
>> > --
>> > 2.13.6
>> >
>>
>>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] glsl: add varying resources for arrays of complex types

2017-10-27 Thread Juan A. Suarez Romero

On Fri, 2017-10-27 at 10:27 -0400, Ilia Mirkin wrote:
> On Fri, Oct 27, 2017 at 10:03 AM, Juan A. Suarez Romero
>  wrote:
> > This patch is mostly a patch done by Ilia Mirkin.
> > 
> > It fixes KHR-GL45.enhanced_layouts.varying_structure_locations.
> > 
> > CC: Ilia Mirkin 
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103098
> > Signed-off-by: Juan A. Suarez Romero 
> > ---
> >  src/compiler/glsl/linker.cpp | 36 
> >  1 file changed, 36 insertions(+)
> > 
> > diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
> > index f827b68555..7305b7740b 100644
> > --- a/src/compiler/glsl/linker.cpp
> > +++ b/src/compiler/glsl/linker.cpp
> > @@ -3882,6 +3882,42 @@ add_shader_variable(const struct gl_context *ctx,
> >return true;
> > }
> > 
> > +   case GLSL_TYPE_ARRAY: {
> > +  /* The ARB_program_interface_query spec says:
> > +   *
> > +   * "For an active variable declared as an array of basic types, a
> > +   *  single entry will be generated, with its name string formed 
> > by
> > +   *  concatenating the name of the array and the string "[0]"."
> > +   *
> > +   * "For an active variable declared as an array of an aggregate 
> > data
> > +   *  type (structures or arrays), a separate entry will be 
> > generated
> > +   *  for each active array element, unless noted immediately 
> > below.
> > +   *  The name of each entry is formed by concatenating the name of
> > +   *  the array, the "[" character, an integer identifying the 
> > element
> > +   *  number, and the "]" character.  These enumeration rules are
> > +   *  applied recursively, treating each enumerated array element 
> > as a
> > +   *  separate active variable."
> > +   */
> 
> So if you have a TCS or TES or GS with
> 
> in struct { vec4 foo; } bar[4];
> 
> Wouldn't this add bar[0].foo, bar[1].foo, bar[2].foo, and bar[3].foo?

According to the spec, yes. Unless I'm missing something...


Looking for other examples, in

https://www.khronos.org/opengl/wiki/Program_Introspection#Arrays

it shows a similar example, and also do as you say.

> With the latter ones getting bogus locations? What is it supposed to
> do in this case?

Why it would get bogus locations?


> 
> > +  const struct glsl_type *array_type = type->fields.array;
> > +  if (array_type->base_type == GLSL_TYPE_STRUCT ||
> > +  array_type->base_type == GLSL_TYPE_ARRAY) {
> > + unsigned elem_location = location;
> > + unsigned stride = array_type->count_attribute_slots(false);
> > + for (unsigned i = 0; i < type->length; i++) {
> > +char *elem = ralloc_asprintf(shProg, "%s[%d]", name, i);
> > +if (!add_shader_variable(ctx, shProg, resource_set,
> > + stage_mask, programInterface,
> > + var, elem, array_type,
> > + use_implicit_location, elem_location,
> > + outermost_struct_type))
> > +   return false;
> > +elem_location += stride;
> > + }
> > + return true;
> > +  }
> > +  /* fallthrough */
> > +   }
> > +
> > default: {
> >/* The ARB_program_interface_query spec says:
> > *
> > --
> > 2.13.6
> > 
> 
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/2] radv: Disallow indirect outputs for GS on GFX9 as well.

2017-10-27 Thread Andres Gomez

Bas, this patch looks like it should have been marked as fixing
6ce550453f1 instead of e38685cc62e (?).

In any case, I was wondering whether it would be interesting to bring
them both to the 17.2 stable queue and whether we would also want
Timothy's preceding patch:

087e010b2b3dd83a539f97203909d6c43b5da87c radv: copy indirect lowering settings 
from radeonsi

Let me know what you think.

On Sun, 2017-10-22 at 18:43 +0200, Bas Nieuwenhuizen wrote:
> Since it also uses the output vector before writing to memory.
> 
> Fixes: e38685cc62e 'Revert "radv: disable support for VEGA for now."'
> ---
>  src/amd/vulkan/radv_shader.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
> index 07e68d6032b..6176a2e590d 100644
> --- a/src/amd/vulkan/radv_shader.c
> +++ b/src/amd/vulkan/radv_shader.c
> @@ -265,9 +265,7 @@ radv_shader_compile_to_nir(struct radv_device *device,
>   indirect_mask |= nir_var_shader_in;
>   }
>   if (!llvm_has_working_vgpr_indexing &&
> - (nir->info.stage == MESA_SHADER_VERTEX ||
> -  nir->info.stage == MESA_SHADER_TESS_EVAL ||
> -  nir->info.stage == MESA_SHADER_FRAGMENT))
> + nir->info.stage != MESA_SHADER_TESS_CTRL)
>   indirect_mask |= nir_var_shader_out;
>  
>   /* TODO: We shouldn't need to do this, however LLVM isn't currently
-- 
Br,

Andres
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH mesa] meson: wire up egl/android

2017-10-27 Thread Eric Engestrom

Cc: Rob Herring 
Cc: Tomasz Figa 
Signed-off-by: Eric Engestrom 
---
Completely untested!
It's a step in the right direction though; doesn't hurt non-android,
and gets android closer to building on meson :)
---
 meson.build | 13 +++--
 src/egl/meson.build |  5 -
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/meson.build b/meson.build
index 875f9d4d294d1911f239..761c33f4651ab37ab7b6 100644
--- a/meson.build
+++ b/meson.build
@@ -152,7 +152,7 @@ endif
 # TODO: other OSes
 with_dri_platform = 'drm'
 
-# TODO: android platform
+with_platform_android = false
 with_platform_wayland = false
 with_platform_x11 = false
 with_platform_drm = false
@@ -161,6 +161,7 @@ egl_native_platform = ''
 _platforms = get_option('platforms')
 if _platforms != ''
   _split = _platforms.split(',')
+  with_platform_android = _split.contains('android')
   with_platform_x11 = _split.contains('x11')
   with_platform_wayland = _split.contains('wayland')
   with_platform_drm = _split.contains('drm')
@@ -252,7 +253,7 @@ if _vulkan_drivers != ''
   with_intel_vk = _split.contains('intel')
   with_amd_vk = _split.contains('amd')
   with_any_vk = with_amd_vk or with_intel_vk
-  if not (with_platform_x11 or with_platform_wayland)
+  if not (with_platform_x11 or with_platform_wayland or with_platform_android)
 error('Vulkan requires at least one platform (x11, wayland)')
   endif
 endif
@@ -330,6 +331,14 @@ endif
 if with_platform_surfaceless
   pre_args += '-DHAVE_SURFACELESS_PLATFORM'
 endif
+if with_platform_android
+  dep_android = [
+  dependency('cutils'),
+  dependency('hardware'),
+  dependency('sync'),
+  ]
+  pre_args += '-DHAVE_ANDROID_PLATFORM'
+endif
 
 prog_python2 = find_program('python2')
 has_mako = run_command(prog_python2, '-c', 'import mako')
diff --git a/src/egl/meson.build b/src/egl/meson.build
index ea7ae06761f75c00a40c..cc51671f9d8f24708405 100644
--- a/src/egl/meson.build
+++ b/src/egl/meson.build
@@ -129,7 +129,10 @@ if with_platform_wayland
 'wayland/wayland-egl', 'wayland/wayland-drm',
   )
 endif
-# TODO: android
+if with_platform_android
+  deps_for_egl += dep_android
+  files_egl += files('drivers/dri2/platform_android.c')
+endif
 
 # TODO: glvnd
 
-- 
Cheers,
  Eric

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH mesa 1/2] buildsys: move file regeneration logic to the script itself

2017-10-27 Thread Eric Engestrom

On Thursday, 2017-10-26 10:10:19 -0700, Dylan Baker wrote:
> I have a few tiny nits, but otherwise this seems fine:
> Reviewed-by: Dylan Baker 
> 
> Quoting Eric Engestrom (2017-10-26 08:40:56)
> > Suggested-by: Jordan Justen 
> > Signed-off-by: Eric Engestrom 
> > ---
> >  bin/git_sha1_gen.py  | 13 -
> >  src/Makefile.am  | 15 ---
> >  src/SConscript   | 22 ++
> >  src/mesa/Android.libmesa_git_sha1.mk |  4 ++--
> >  4 files changed, 24 insertions(+), 30 deletions(-)
> > 
> > diff --git a/bin/git_sha1_gen.py b/bin/git_sha1_gen.py
> > index c75dba101acf4ffc85fb..7b9267b59e9d1d897cc4 100755
> > --- a/bin/git_sha1_gen.py
> > +++ b/bin/git_sha1_gen.py
> > @@ -6,6 +6,7 @@
> >  """
> >  
> >  
> > +import argparse
> >  import os
> >  import os.path
> >  import subprocess
> > @@ -27,10 +28,20 @@ def get_git_sha1():
> >  git_sha1 = ''
> >  return git_sha1
> >  
> > +parser = argparse.ArgumentParser()
> > +parser.add_argument('--output', help='File to write the #define in',
> > +required=True)
> > +args = parser.parse_args()
> >  
> >  git_sha1 = os.environ.get('MESA_GIT_SHA1_OVERRIDE', get_git_sha1())[:10]
> >  if git_sha1:
> >  git_sha1_h_in_path = os.path.join(os.path.dirname(sys.argv[0]),
> >  '..', 'src', 'git_sha1.h.in')
> >  with open(git_sha1_h_in_path , 'r') as git_sha1_h_in:
> > -sys.stdout.write(git_sha1_h_in.read().replace('@VCS_TAG@', 
> > git_sha1))
> > +new_sha1 = git_sha1_h_in.read().replace('@VCS_TAG@', git_sha1)
> > +if os.path.isfile(args.output):
> > +with open(args.output, 'r') as git_sha1_h:
> > +if git_sha1_h.read() == new_sha1:
> > +quit()
> > +with open(args.output, 'w') as git_sha1_h:
> > +git_sha1_h.write(new_sha1)
> > diff --git a/src/Makefile.am b/src/Makefile.am
> > index 5ef2d4f55eacc470a7a9..1de4fca6a1216e7662b1 100644
> > --- a/src/Makefile.am
> > +++ b/src/Makefile.am
> > @@ -19,17 +19,10 @@
> >  # FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 
> > DEALINGS
> >  # IN THE SOFTWARE.
> >  
> > -.PHONY: git_sha1.h.tmp
> > -git_sha1.h.tmp:
> > -   @$(PYTHON2) $(top_srcdir)/bin/git_sha1_gen.py > $@
> > -
> > -git_sha1.h: git_sha1.h.tmp
> > -   @echo "updating git_sha1.h"
> > -   @if ! cmp -s git_sha1.h.tmp git_sha1.h; then \
> > -   mv git_sha1.h.tmp git_sha1.h ;\
> > -   else \
> > -   rm git_sha1.h.tmp ;\
> > -   fi
> > +.PHONY: git_sha1.h
> > +git_sha1.h: $(top_srcdir)/src/git_sha1.h.in
> > +   @echo "updating $@"
> > +   @$(PYTHON2) $(top_srcdir)/bin/git_sha1_gen.py --output $@
> >  
> >  BUILT_SOURCES = git_sha1.h
> >  CLEANFILES = $(BUILT_SOURCES)
> > diff --git a/src/SConscript b/src/SConscript
> > index a277e8b79254588e8fd4..820c3c01289e7544a57e 100644
> > --- a/src/SConscript
> > +++ b/src/SConscript
> > @@ -24,22 +24,12 @@ def write_git_sha1_h_file(filename):
> >  to retrieve the git hashid and write the header file.  An empty file
> >  will be created if anything goes wrong."""
> >  
> > -tempfile = "git_sha1.h.tmp"
> > -with open(tempfile, "w") as f:
> > -args = [ python_cmd, Dir('#').abspath + '/bin/git_sha1_gen.py' ]
> > -try:
> > -subprocess.Popen(args, stdout=f).wait()
> > -except:
> > -print("Warning: exception in write_git_sha1_h_file()")
> > -return
> > -
> > -if not os.path.exists(filename) or not filecmp.cmp(tempfile, filename):
> > -# The filename does not exist or it's different from the new file,
> > -# so replace old file with new.
> > -if os.path.exists(filename):
> > -os.remove(filename)
> > -os.rename(tempfile, filename)
> > -return
> > +args = [ python_cmd, Dir('#').abspath + '/bin/git_sha1_gen.py', 
> > '--output', filename]
> 
> space after the opening brace but none before the closing. I don't konw what 
> the
> scons style is, but you should probably be consistent

scons style is to have spaces inside [], which I added back now.
thanks for spotting that :)

> 
> > +try:
> > +subprocess.Popen(args).wait()
> 
> how about subprocess.call() instead of Popen().wait()?

indeed, Popen() is not needed anymore.
replaced with call(), tested and pushed :)

> 
> > +except:
> > +print("Warning: exception in write_git_sha1_h_file()")
> > +return
> >  
> >  
> >  # Create the git_sha1.h header file
> > diff --git a/src/mesa/Android.libmesa_git_sha1.mk 
> > b/src/mesa/Android.libmesa_git_sha1.mk
> > index ddb30c428af739ee9fe5..d27923074dd289a115de 100644
> > --- a/src/mesa/Android.libmesa_git_sha1.mk
> > +++ b/src/mesa/Android.libmesa_git_sha1.mk
> > @@ -43,10 +43,10 @@ $(intermediates)/dummy.c:
> >  
> >  LOCAL_GENERATED_SOURCES +=

Re: [Mesa-dev] [PATCH] i965: fix blorp stage_prog_data->param leak

2017-10-27 Thread Emil Velikov

On 27 October 2017 at 14:53, Lionel Landwerlin
 wrote:
> On 27/10/17 14:46, Emil Velikov wrote:
>>
>> On 27 October 2017 at 13:55, Tapani Pälli  wrote:
>>>
>>> Patch uses mem_ctx for allocation to ensure param array gets freed
>>> later, in blorp clear case this happens in blorp_params_get_clear_kernel.
>>>
>>> ==6164== 48 bytes in 1 blocks are definitely lost in loss record 61 of
>>> 193
>>> ==6164==at 0x4C2EB6B: malloc (vg_replace_malloc.c:299)
>>> ==6164==by 0x12E31C6C: ralloc_size (ralloc.c:121)
>>> ==6164==by 0x130189F1: fs_visitor::assign_constant_locations()
>>> (brw_fs.cpp:2095)
>>> ==6164==by 0x13022D32: fs_visitor::optimize() (brw_fs.cpp:5715)
>>> ==6164==by 0x13024D5A: fs_visitor::run_fs(bool, bool)
>>> (brw_fs.cpp:6229)
>>> ==6164==by 0x1302549A: brw_compile_fs (brw_fs.cpp:6570)
>>> ==6164==by 0x130C4B07: blorp_compile_fs (blorp.c:194)
>>> ==6164==by 0x130D384B: blorp_params_get_clear_kernel
>>> (blorp_clear.c:79)
>>> ==6164==by 0x130D3C56: blorp_fast_clear (blorp_clear.c:332)
>>> ==6164==by 0x12EFA439: do_single_blorp_clear (brw_blorp.c:1261)
>>> ==6164==by 0x12EFC4AF: brw_blorp_clear_color (brw_blorp.c:1326)
>>> ==6164==by 0x12EFF72B: brw_clear (brw_clear.c:297)
>>>
>>> Signed-off-by: Tapani Pälli 
>>> ---
>>>   src/intel/compiler/brw_fs.cpp | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/src/intel/compiler/brw_fs.cpp
>>> b/src/intel/compiler/brw_fs.cpp
>>> index 4616529abc..6b27c38be7 100644
>>> --- a/src/intel/compiler/brw_fs.cpp
>>> +++ b/src/intel/compiler/brw_fs.cpp
>>> @@ -2092,7 +2092,7 @@ fs_visitor::assign_constant_locations()
>>>   */
>>>  uint32_t *param = stage_prog_data->param;
>>>  stage_prog_data->nr_params = num_push_constants;
>>> -   stage_prog_data->param = ralloc_array(NULL, uint32_t,
>>> num_push_constants);
>>> +   stage_prog_data->param = ralloc_array(mem_ctx, uint32_t,
>>> num_push_constants);
>>
>> I'm likely having a dull moment:
>> isn't param/stage_prog_data->param freed (via ralloc_free) at the end
>> of the function?
>
>
> The old one is freed, here we create a new one.
I'm a genius, indeed is. Thanks for Lionel.

-Emil
Note to self: Don't skip afternoon coffee :-\
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] glsl: add varying resources for arrays of complex types

2017-10-27 Thread Ilia Mirkin

On Fri, Oct 27, 2017 at 10:03 AM, Juan A. Suarez Romero
 wrote:
> This patch is mostly a patch done by Ilia Mirkin.
>
> It fixes KHR-GL45.enhanced_layouts.varying_structure_locations.
>
> CC: Ilia Mirkin 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103098
> Signed-off-by: Juan A. Suarez Romero 
> ---
>  src/compiler/glsl/linker.cpp | 36 
>  1 file changed, 36 insertions(+)
>
> diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
> index f827b68555..7305b7740b 100644
> --- a/src/compiler/glsl/linker.cpp
> +++ b/src/compiler/glsl/linker.cpp
> @@ -3882,6 +3882,42 @@ add_shader_variable(const struct gl_context *ctx,
>return true;
> }
>
> +   case GLSL_TYPE_ARRAY: {
> +  /* The ARB_program_interface_query spec says:
> +   *
> +   * "For an active variable declared as an array of basic types, a
> +   *  single entry will be generated, with its name string formed by
> +   *  concatenating the name of the array and the string "[0]"."
> +   *
> +   * "For an active variable declared as an array of an aggregate 
> data
> +   *  type (structures or arrays), a separate entry will be generated
> +   *  for each active array element, unless noted immediately below.
> +   *  The name of each entry is formed by concatenating the name of
> +   *  the array, the "[" character, an integer identifying the 
> element
> +   *  number, and the "]" character.  These enumeration rules are
> +   *  applied recursively, treating each enumerated array element as 
> a
> +   *  separate active variable."
> +   */

So if you have a TCS or TES or GS with

in struct { vec4 foo; } bar[4];

Wouldn't this add bar[0].foo, bar[1].foo, bar[2].foo, and bar[3].foo?
With the latter ones getting bogus locations? What is it supposed to
do in this case?

> +  const struct glsl_type *array_type = type->fields.array;
> +  if (array_type->base_type == GLSL_TYPE_STRUCT ||
> +  array_type->base_type == GLSL_TYPE_ARRAY) {
> + unsigned elem_location = location;
> + unsigned stride = array_type->count_attribute_slots(false);
> + for (unsigned i = 0; i < type->length; i++) {
> +char *elem = ralloc_asprintf(shProg, "%s[%d]", name, i);
> +if (!add_shader_variable(ctx, shProg, resource_set,
> + stage_mask, programInterface,
> + var, elem, array_type,
> + use_implicit_location, elem_location,
> + outermost_struct_type))
> +   return false;
> +elem_location += stride;
> + }
> + return true;
> +  }
> +  /* fallthrough */
> +   }
> +
> default: {
>/* The ARB_program_interface_query spec says:
> *
> --
> 2.13.6
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] glsl: add varying resources for arrays of complex types

2017-10-27 Thread Juan A. Suarez Romero

This patch is mostly a patch done by Ilia Mirkin.

It fixes KHR-GL45.enhanced_layouts.varying_structure_locations.

CC: Ilia Mirkin 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103098
Signed-off-by: Juan A. Suarez Romero 
---
 src/compiler/glsl/linker.cpp | 36 
 1 file changed, 36 insertions(+)

diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index f827b68555..7305b7740b 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -3882,6 +3882,42 @@ add_shader_variable(const struct gl_context *ctx,
   return true;
}
 
+   case GLSL_TYPE_ARRAY: {
+  /* The ARB_program_interface_query spec says:
+   *
+   * "For an active variable declared as an array of basic types, a
+   *  single entry will be generated, with its name string formed by
+   *  concatenating the name of the array and the string "[0]"."
+   *
+   * "For an active variable declared as an array of an aggregate data
+   *  type (structures or arrays), a separate entry will be generated
+   *  for each active array element, unless noted immediately below.
+   *  The name of each entry is formed by concatenating the name of
+   *  the array, the "[" character, an integer identifying the element
+   *  number, and the "]" character.  These enumeration rules are
+   *  applied recursively, treating each enumerated array element as a
+   *  separate active variable."
+   */
+  const struct glsl_type *array_type = type->fields.array;
+  if (array_type->base_type == GLSL_TYPE_STRUCT ||
+  array_type->base_type == GLSL_TYPE_ARRAY) {
+ unsigned elem_location = location;
+ unsigned stride = array_type->count_attribute_slots(false);
+ for (unsigned i = 0; i < type->length; i++) {
+char *elem = ralloc_asprintf(shProg, "%s[%d]", name, i);
+if (!add_shader_variable(ctx, shProg, resource_set,
+ stage_mask, programInterface,
+ var, elem, array_type,
+ use_implicit_location, elem_location,
+ outermost_struct_type))
+   return false;
+elem_location += stride;
+ }
+ return true;
+  }
+  /* fallthrough */
+   }
+
default: {
   /* The ARB_program_interface_query spec says:
*
-- 
2.13.6

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] i965: fix blorp stage_prog_data->param leak

2017-10-27 Thread Lionel Landwerlin


On 27/10/17 14:46, Emil Velikov wrote:

On 27 October 2017 at 13:55, Tapani Pälli  wrote:

Patch uses mem_ctx for allocation to ensure param array gets freed
later, in blorp clear case this happens in blorp_params_get_clear_kernel.

==6164== 48 bytes in 1 blocks are definitely lost in loss record 61 of 193
==6164==at 0x4C2EB6B: malloc (vg_replace_malloc.c:299)
==6164==by 0x12E31C6C: ralloc_size (ralloc.c:121)
==6164==by 0x130189F1: fs_visitor::assign_constant_locations() 
(brw_fs.cpp:2095)
==6164==by 0x13022D32: fs_visitor::optimize() (brw_fs.cpp:5715)
==6164==by 0x13024D5A: fs_visitor::run_fs(bool, bool) (brw_fs.cpp:6229)
==6164==by 0x1302549A: brw_compile_fs (brw_fs.cpp:6570)
==6164==by 0x130C4B07: blorp_compile_fs (blorp.c:194)
==6164==by 0x130D384B: blorp_params_get_clear_kernel (blorp_clear.c:79)
==6164==by 0x130D3C56: blorp_fast_clear (blorp_clear.c:332)
==6164==by 0x12EFA439: do_single_blorp_clear (brw_blorp.c:1261)
==6164==by 0x12EFC4AF: brw_blorp_clear_color (brw_blorp.c:1326)
==6164==by 0x12EFF72B: brw_clear (brw_clear.c:297)

Signed-off-by: Tapani Pälli 
---
  src/intel/compiler/brw_fs.cpp | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
index 4616529abc..6b27c38be7 100644
--- a/src/intel/compiler/brw_fs.cpp
+++ b/src/intel/compiler/brw_fs.cpp
@@ -2092,7 +2092,7 @@ fs_visitor::assign_constant_locations()
  */
 uint32_t *param = stage_prog_data->param;
 stage_prog_data->nr_params = num_push_constants;
-   stage_prog_data->param = ralloc_array(NULL, uint32_t, num_push_constants);
+   stage_prog_data->param = ralloc_array(mem_ctx, uint32_t, 
num_push_constants);

I'm likely having a dull moment:
isn't param/stage_prog_data->param freed (via ralloc_free) at the end
of the function?


The old one is freed, here we create a new one.


-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] i965: fix blorp stage_prog_data->param leak

2017-10-27 Thread Emil Velikov

On 27 October 2017 at 13:55, Tapani Pälli  wrote:
> Patch uses mem_ctx for allocation to ensure param array gets freed
> later, in blorp clear case this happens in blorp_params_get_clear_kernel.
>
> ==6164== 48 bytes in 1 blocks are definitely lost in loss record 61 of 193
> ==6164==at 0x4C2EB6B: malloc (vg_replace_malloc.c:299)
> ==6164==by 0x12E31C6C: ralloc_size (ralloc.c:121)
> ==6164==by 0x130189F1: fs_visitor::assign_constant_locations() 
> (brw_fs.cpp:2095)
> ==6164==by 0x13022D32: fs_visitor::optimize() (brw_fs.cpp:5715)
> ==6164==by 0x13024D5A: fs_visitor::run_fs(bool, bool) (brw_fs.cpp:6229)
> ==6164==by 0x1302549A: brw_compile_fs (brw_fs.cpp:6570)
> ==6164==by 0x130C4B07: blorp_compile_fs (blorp.c:194)
> ==6164==by 0x130D384B: blorp_params_get_clear_kernel (blorp_clear.c:79)
> ==6164==by 0x130D3C56: blorp_fast_clear (blorp_clear.c:332)
> ==6164==by 0x12EFA439: do_single_blorp_clear (brw_blorp.c:1261)
> ==6164==by 0x12EFC4AF: brw_blorp_clear_color (brw_blorp.c:1326)
> ==6164==by 0x12EFF72B: brw_clear (brw_clear.c:297)
>
> Signed-off-by: Tapani Pälli 
> ---
>  src/intel/compiler/brw_fs.cpp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
> index 4616529abc..6b27c38be7 100644
> --- a/src/intel/compiler/brw_fs.cpp
> +++ b/src/intel/compiler/brw_fs.cpp
> @@ -2092,7 +2092,7 @@ fs_visitor::assign_constant_locations()
>  */
> uint32_t *param = stage_prog_data->param;
> stage_prog_data->nr_params = num_push_constants;
> -   stage_prog_data->param = ralloc_array(NULL, uint32_t, num_push_constants);
> +   stage_prog_data->param = ralloc_array(mem_ctx, uint32_t, 
> num_push_constants);

I'm likely having a dull moment:
isn't param/stage_prog_data->param freed (via ralloc_free) at the end
of the function?

-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v3] radv: Implement VK_AMD_shader_info

2017-10-27 Thread Alex Smith

This allows an app to query shader statistics and get a disassembly of
a shader. RenderDoc git has support for it, so this allows you to view
shader disassembly from a capture.

When this extension is enabled on a device (or when tracing), we now
disable pipeline caching, since we don't get the shader debug info when
we retrieve cached shaders.

v2: Improvements to resource usage reporting
v3: Disassembly string must be null terminated (string_buffer's length
does not include the terminator)

Signed-off-by: Alex Smith 
---
 src/amd/vulkan/radv_device.c |   9 ++
 src/amd/vulkan/radv_extensions.py|   1 +
 src/amd/vulkan/radv_pipeline.c   |   2 +-
 src/amd/vulkan/radv_pipeline_cache.c |  11 ++-
 src/amd/vulkan/radv_private.h|   3 +
 src/amd/vulkan/radv_shader.c | 179 +--
 6 files changed, 170 insertions(+), 35 deletions(-)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index d25e9c97ba..0c2f6fa631 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -944,10 +944,15 @@ VkResult radv_CreateDevice(
VkResult result;
struct radv_device *device;
 
+   bool keep_shader_info = false;
+
for (uint32_t i = 0; i < pCreateInfo->enabledExtensionCount; i++) {
const char *ext_name = pCreateInfo->ppEnabledExtensionNames[i];
if (!radv_physical_device_extension_supported(physical_device, 
ext_name))
return vk_error(VK_ERROR_EXTENSION_NOT_PRESENT);
+
+   if (strcmp(ext_name, VK_AMD_SHADER_INFO_EXTENSION_NAME) == 0)
+   keep_shader_info = true;
}
 
/* Check enabled features */
@@ -1041,10 +1046,14 @@ VkResult radv_CreateDevice(
device->physical_device->rad_info.max_se >= 2;
 
if (getenv("RADV_TRACE_FILE")) {
+   keep_shader_info = true;
+
if (!radv_init_trace(device))
goto fail;
}
 
+   device->keep_shader_info = keep_shader_info;
+
result = radv_device_init_meta(device);
if (result != VK_SUCCESS)
goto fail;
diff --git a/src/amd/vulkan/radv_extensions.py 
b/src/amd/vulkan/radv_extensions.py
index dfeb2880fc..eeb679d65a 100644
--- a/src/amd/vulkan/radv_extensions.py
+++ b/src/amd/vulkan/radv_extensions.py
@@ -81,6 +81,7 @@ EXTENSIONS = [
 Extension('VK_EXT_global_priority',   1, 
'device->rad_info.has_ctx_priority'),
 Extension('VK_AMD_draw_indirect_count',   1, True),
 Extension('VK_AMD_rasterization_order',   1, 
'device->rad_info.chip_class >= VI && device->rad_info.max_se >= 2'),
+Extension('VK_AMD_shader_info',   1, True),
 ]
 
 class VkVersion:
diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index c25642c966..c6d9debc7e 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -1942,7 +1942,7 @@ void radv_create_shaders(struct radv_pipeline *pipeline,
 
for (int i = 0; i < MESA_SHADER_STAGES; ++i) {
free(codes[i]);
-   if (modules[i] && !pipeline->device->trace_bo)
+   if (modules[i] && !pipeline->device->keep_shader_info)
ralloc_free(nir[i]);
}
 
diff --git a/src/amd/vulkan/radv_pipeline_cache.c 
b/src/amd/vulkan/radv_pipeline_cache.c
index 5dee114749..441e2257d5 100644
--- a/src/amd/vulkan/radv_pipeline_cache.c
+++ b/src/amd/vulkan/radv_pipeline_cache.c
@@ -62,9 +62,11 @@ radv_pipeline_cache_init(struct radv_pipeline_cache *cache,
cache->hash_table = malloc(byte_size);
 
/* We don't consider allocation failure fatal, we just start with a 
0-sized
-* cache. */
+* cache. Disable caching when we want to keep shader debug info, since
+* we don't get the debug info on cached shaders. */
if (cache->hash_table == NULL ||
-   (device->instance->debug_flags & RADV_DEBUG_NO_CACHE))
+   (device->instance->debug_flags & RADV_DEBUG_NO_CACHE) ||
+   device->keep_shader_info)
cache->table_size = 0;
else
memset(cache->hash_table, 0, byte_size);
@@ -186,8 +188,11 @@ radv_create_shader_variants_from_pipeline_cache(struct 
radv_device *device,
entry = radv_pipeline_cache_search_unlocked(cache, sha1);
 
if (!entry) {
+   /* Again, don't cache when we want debug info, since this isn't
+* present in the cache. */
if (!device->physical_device->disk_cache ||
-   (device->instance->debug_flags & RADV_DEBUG_NO_CACHE)) {
+   (device->instance->debug_flags & RADV_DEBUG_NO_CACHE) ||
+   device->keep_shader_info) {
pthread_mutex_unlock(>mutex);
return false;
}
diff --git

Re: [Mesa-dev] [PATCH] i965: fix blorp stage_prog_data->param leak

2017-10-27 Thread Lionel Landwerlin


Reviewed-by: Lionel Landwerlin 

On 27/10/17 13:55, Tapani Pälli wrote:

Patch uses mem_ctx for allocation to ensure param array gets freed
later, in blorp clear case this happens in blorp_params_get_clear_kernel.

==6164== 48 bytes in 1 blocks are definitely lost in loss record 61 of 193
==6164==at 0x4C2EB6B: malloc (vg_replace_malloc.c:299)
==6164==by 0x12E31C6C: ralloc_size (ralloc.c:121)
==6164==by 0x130189F1: fs_visitor::assign_constant_locations() 
(brw_fs.cpp:2095)
==6164==by 0x13022D32: fs_visitor::optimize() (brw_fs.cpp:5715)
==6164==by 0x13024D5A: fs_visitor::run_fs(bool, bool) (brw_fs.cpp:6229)
==6164==by 0x1302549A: brw_compile_fs (brw_fs.cpp:6570)
==6164==by 0x130C4B07: blorp_compile_fs (blorp.c:194)
==6164==by 0x130D384B: blorp_params_get_clear_kernel (blorp_clear.c:79)
==6164==by 0x130D3C56: blorp_fast_clear (blorp_clear.c:332)
==6164==by 0x12EFA439: do_single_blorp_clear (brw_blorp.c:1261)
==6164==by 0x12EFC4AF: brw_blorp_clear_color (brw_blorp.c:1326)
==6164==by 0x12EFF72B: brw_clear (brw_clear.c:297)

Signed-off-by: Tapani Pälli 
---
  src/intel/compiler/brw_fs.cpp | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
index 4616529abc..6b27c38be7 100644
--- a/src/intel/compiler/brw_fs.cpp
+++ b/src/intel/compiler/brw_fs.cpp
@@ -2092,7 +2092,7 @@ fs_visitor::assign_constant_locations()
  */
 uint32_t *param = stage_prog_data->param;
 stage_prog_data->nr_params = num_push_constants;
-   stage_prog_data->param = ralloc_array(NULL, uint32_t, num_push_constants);
+   stage_prog_data->param = ralloc_array(mem_ctx, uint32_t, 
num_push_constants);
 if (num_pull_constants > 0) {
stage_prog_data->nr_pull_params = num_pull_constants;
stage_prog_data->pull_param = ralloc_array(NULL, uint32_t,



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] i965: fix blorp stage_prog_data->param leak

2017-10-27 Thread Tapani Pälli

Patch uses mem_ctx for allocation to ensure param array gets freed
later, in blorp clear case this happens in blorp_params_get_clear_kernel.

==6164== 48 bytes in 1 blocks are definitely lost in loss record 61 of 193
==6164==at 0x4C2EB6B: malloc (vg_replace_malloc.c:299)
==6164==by 0x12E31C6C: ralloc_size (ralloc.c:121)
==6164==by 0x130189F1: fs_visitor::assign_constant_locations() 
(brw_fs.cpp:2095)
==6164==by 0x13022D32: fs_visitor::optimize() (brw_fs.cpp:5715)
==6164==by 0x13024D5A: fs_visitor::run_fs(bool, bool) (brw_fs.cpp:6229)
==6164==by 0x1302549A: brw_compile_fs (brw_fs.cpp:6570)
==6164==by 0x130C4B07: blorp_compile_fs (blorp.c:194)
==6164==by 0x130D384B: blorp_params_get_clear_kernel (blorp_clear.c:79)
==6164==by 0x130D3C56: blorp_fast_clear (blorp_clear.c:332)
==6164==by 0x12EFA439: do_single_blorp_clear (brw_blorp.c:1261)
==6164==by 0x12EFC4AF: brw_blorp_clear_color (brw_blorp.c:1326)
==6164==by 0x12EFF72B: brw_clear (brw_clear.c:297)

Signed-off-by: Tapani Pälli 
---
 src/intel/compiler/brw_fs.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
index 4616529abc..6b27c38be7 100644
--- a/src/intel/compiler/brw_fs.cpp
+++ b/src/intel/compiler/brw_fs.cpp
@@ -2092,7 +2092,7 @@ fs_visitor::assign_constant_locations()
 */
uint32_t *param = stage_prog_data->param;
stage_prog_data->nr_params = num_push_constants;
-   stage_prog_data->param = ralloc_array(NULL, uint32_t, num_push_constants);
+   stage_prog_data->param = ralloc_array(mem_ctx, uint32_t, 
num_push_constants);
if (num_pull_constants > 0) {
   stage_prog_data->nr_pull_params = num_pull_constants;
   stage_prog_data->pull_param = ralloc_array(NULL, uint32_t,
-- 
2.13.6

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] formatquery: use correct target check for IMAGE_FORMAT_COMPATIBILITY_TYPE

2017-10-27 Thread Ilia Mirkin

On Fri, Oct 27, 2017 at 5:18 AM, Alejandro Piñeiro  wrote:
> From the spec:
>"IMAGE_FORMAT_COMPATIBILITY_TYPE: The matching criteria use for the
> resource when used as an image textures is returned in
> . This is equivalent to calling GetTexParameter"
>
> So we would need to return None for any target not supported by
> GetTexParameter. By mistake, we were using the target check for
> GetTexLevelParameter.
> ---
>  src/mesa/main/formatquery.c | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/main/formatquery.c b/src/mesa/main/formatquery.c
> index 77c7faa2251..39c628039b8 100644
> --- a/src/mesa/main/formatquery.c
> +++ b/src/mesa/main/formatquery.c
> @@ -1430,7 +1430,13 @@ _mesa_GetInternalformativ(GLenum target, GLenum 
> internalformat, GLenum pname,
>if (!_mesa_has_ARB_shader_image_load_store(ctx))
>   goto end;
>
> -  if (!_mesa_legal_get_tex_level_parameter_target(ctx, target, true))
> +  /* As pointed by the spec quote below, this pname query should return
> +   * the same value that GetTexParameter. So if the target is not valid
> +   * for GetTexParameter we return the unsupported value. The check below
> +   * is the same target check used by GetTextParameter.

GetTexParameter

> +   */
> +  int targetIndex = _mesa_tex_target_to_index(ctx, target);
> +  if (targetIndex < 0 || targetIndex == TEXTURE_BUFFER_INDEX)
>   goto end;
>
>/* From spec: "Equivalent to calling GetTexParameter with  set
> --
> 2.11.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] Android: egl: add dependency on libnativewindow

2017-10-27 Thread Emil Velikov

On 26 October 2017 at 20:18, Rob Herring  wrote:
> system/window.h is no longer available by default and is part of
> libnativewindow, so add it to the shared libraries. It has to be conditional
> because the library is only present in O and later.
>
> Really, we should only be depending on vndk/window.h now, but that's only
> in O and changing would be pretty invasive.
>
> Signed-off-by: Rob Herring 
> ---
>  src/egl/Android.mk | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/src/egl/Android.mk b/src/egl/Android.mk
> index 2de842ca4172..11818694f4f8 100644
> --- a/src/egl/Android.mk
> +++ b/src/egl/Android.mk
> @@ -60,6 +60,10 @@ LOCAL_SHARED_LIBRARIES := \
> libgralloc_drm \
> libsync
>
> +ifeq ($(filter $(MESA_ANDROID_MAJOR_VERSION), 4 5 6 7),)
Didn't we remove Android 4.x from the supported list?
Apart from that
Reviewed-by: Emil Velikov 

-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] formatquery: use correct target check for IMAGE_FORMAT_COMPATIBILITY_TYPE

2017-10-27 Thread Marek Olšák

Reviewed-by: Marek Olšák 

Marek

On Fri, Oct 27, 2017 at 11:18 AM, Alejandro Piñeiro
 wrote:
> From the spec:
>"IMAGE_FORMAT_COMPATIBILITY_TYPE: The matching criteria use for the
> resource when used as an image textures is returned in
> . This is equivalent to calling GetTexParameter"
>
> So we would need to return None for any target not supported by
> GetTexParameter. By mistake, we were using the target check for
> GetTexLevelParameter.
> ---
>  src/mesa/main/formatquery.c | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/main/formatquery.c b/src/mesa/main/formatquery.c
> index 77c7faa2251..39c628039b8 100644
> --- a/src/mesa/main/formatquery.c
> +++ b/src/mesa/main/formatquery.c
> @@ -1430,7 +1430,13 @@ _mesa_GetInternalformativ(GLenum target, GLenum 
> internalformat, GLenum pname,
>if (!_mesa_has_ARB_shader_image_load_store(ctx))
>   goto end;
>
> -  if (!_mesa_legal_get_tex_level_parameter_target(ctx, target, true))
> +  /* As pointed by the spec quote below, this pname query should return
> +   * the same value that GetTexParameter. So if the target is not valid
> +   * for GetTexParameter we return the unsupported value. The check below
> +   * is the same target check used by GetTextParameter.
> +   */
> +  int targetIndex = _mesa_tex_target_to_index(ctx, target);
> +  if (targetIndex < 0 || targetIndex == TEXTURE_BUFFER_INDEX)
>   goto end;
>
>/* From spec: "Equivalent to calling GetTexParameter with  set
> --
> 2.11.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] Android: move drivers' symlinks to /vendor

2017-10-27 Thread Rob Herring

On Fri, Oct 27, 2017 at 6:41 AM, Emil Velikov  wrote:
> On 26 October 2017 at 23:48, Mauro Rossi  wrote:
>> Having moved gallium_dri.so library to /vendor/lib/dri
>> also symlinks need to be coherently created using TARGET_OUT_VENDOR insted 
>> of TARGET_OUT
>> or all non Intel drivers will not be loaded with Android N and earlier,
>> thus causing SurfaceFlinger SIGABRT
>>
>> Fixes: c3f75d483c ("Android: move libraries to /vendor")
>>
>> Cc: 17.3 
>> ---
>>  src/gallium/targets/dri/Android.mk | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)

Reviewed-by: Rob Herring 

>>
>> diff --git a/src/gallium/targets/dri/Android.mk 
>> b/src/gallium/targets/dri/Android.mk
>> index 61b65769ff..3fa86a2d56 100644
>> --- a/src/gallium/targets/dri/Android.mk
>> +++ b/src/gallium/targets/dri/Android.mk
>> @@ -70,8 +70,8 @@ LOCAL_SHARED_LIBRARIES += $(sort $(GALLIUM_SHARED_LIBS))
>>  ifneq ($(filter 5 6 7, $(MESA_ANDROID_MAJOR_VERSION)),)
>>  LOCAL_POST_INSTALL_CMD := \
>> $(foreach l, lib $(if $(filter true,$(TARGET_IS_64_BIT)),lib64), \
>> - mkdir -p $(TARGET_OUT)/$(l)/$(MESA_DRI_MODULE_REL_PATH); \
>> - $(foreach d, $(GALLIUM_TARGET_DRIVERS), ln -sf gallium_dri.so 
>> $(TARGET_OUT)/$(l)/$(MESA_DRI_MODULE_REL_PATH)/$(d)_dri.so;) \
>> + mkdir -p $(TARGET_OUT_VENDOR)/$(l)/$(MESA_DRI_MODULE_REL_PATH); \
>> + $(foreach d, $(GALLIUM_TARGET_DRIVERS), ln -sf gallium_dri.so 
>> $(TARGET_OUT_VENDOR)/$(l)/$(MESA_DRI_MODULE_REL_PATH)/$(d)_dri.so;) \
> Can we fold the long path into a variable and then reuse it?
> This code will be around for a bit, so it might be worth it.
>
> foo=$(TARGET_OUT_VENDOR)/$(l)/$(MESA_DRI_MODULE_REL_PATH)
> mkdir -p $(foo)
> $(foreach d, $(GALLIUM_TARGET_DRIVERS), ln -sf gallium_dri.so
> $(foo)/$(d)_dri.so;)
>
> -Emil
> *Please use better variable name than foo

bar?
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v3 35/48] intel/eu: Make automatic exec sizes a configurable option

2017-10-27 Thread Iago Toral

On Wed, 2017-10-25 at 16:26 -0700, Jason Ekstrand wrote:
> We have had a feature in codegen for some time that tries to
> automatically infer the execution size of an instruction from the
> width
> of its destination.  For things such as fixed function GS, clipper,
> and
> SF programs, this is very useful because they tend to have lots of
> hand-rolled register setup and trying to specify the exec size all
> the
> time would be prohibitive.  For things that come from a higher-level
> IR,
> however, it's easier to just set the right size all the time and the
> automatic exec sizes can, in fact, cause problems.  This commit makes
> it
> optional while enabling it by default.
> ---
>  src/intel/compiler/brw_eu.c  |  1 +
>  src/intel/compiler/brw_eu.h  | 10 ++
>  src/intel/compiler/brw_eu_emit.c | 32 ++--
> 
>  3 files changed, 29 insertions(+), 14 deletions(-)
> 
> diff --git a/src/intel/compiler/brw_eu.c
> b/src/intel/compiler/brw_eu.c
> index b0bdc38..bc297a2 100644
> --- a/src/intel/compiler/brw_eu.c
> +++ b/src/intel/compiler/brw_eu.c
> @@ -296,6 +296,7 @@ brw_init_codegen(const struct gen_device_info
> *devinfo,
> memset(p, 0, sizeof(*p));
>  
> p->devinfo = devinfo;
> +   p->automatic_exec_sizes = true;
> /*
>  * Set the initial instruction store array size to 1024, if found
> that
>  * isn't enough, then it will double the store size at
> brw_next_insn()
> diff --git a/src/intel/compiler/brw_eu.h
> b/src/intel/compiler/brw_eu.h
> index 8e597b2..8abebeb 100644
> --- a/src/intel/compiler/brw_eu.h
> +++ b/src/intel/compiler/brw_eu.h
> @@ -65,6 +65,16 @@ struct brw_codegen {
> bool compressed_stack[BRW_EU_MAX_INSN_STACK];
> brw_inst *current;
>  
> +   /** Whether or not the user wants automatic exec sizes
> +*
> +* If true, codegen will try to automatically infer the exec size
> of an
> +* instruction from the width of the destination register.  If
> false, it
> +* will take whatever is set by brw_set_default_exec_size
> verbatim.
> +*
> +* This is set to true by default in brw_init_codegen.
> +*/
> +   bool automatic_exec_sizes;
> +
> bool single_program_flow;
> const struct gen_device_info *devinfo;
>  
> diff --git a/src/intel/compiler/brw_eu_emit.c
> b/src/intel/compiler/brw_eu_emit.c
> index fae74cf..902914f 100644
> --- a/src/intel/compiler/brw_eu_emit.c
> +++ b/src/intel/compiler/brw_eu_emit.c
> @@ -141,22 +141,26 @@ brw_set_dest(struct brw_codegen *p, brw_inst
> *inst, struct brw_reg dest)
>  
> /* Generators should set a default exec_size of either 8 (SIMD4x2
> or SIMD8)
>  * or 16 (SIMD16), as that's normally correct.  However, when
> dealing with
> -* small registers, we automatically reduce it to match the
> register size.
> -*
> -* In platforms that support fp64 we can emit instructions with a
> width of
> -* 4 that need two SIMD8 registers and an exec_size of 8 or 16.
> In these
> -* cases we need to make sure that these instructions have their
> exec sizes
> -* set properly when they are emitted and we can't rely on this
> code to fix
> -* it.
> +* small registers, it can be useful for us toautomatically 

to automatically

> reduce it to
> +* match the register size.
>  */
> -   bool fix_exec_size;
> -   if (devinfo->gen >= 6)
> -  fix_exec_size = dest.width < BRW_EXECUTE_4;
> -   else
> -  fix_exec_size = dest.width < BRW_EXECUTE_8;
> +   if (p->automatic_exec_sizes) {
> +  /*
> +   * In platforms that support fp64 we can emit instructions
> with a width
> +   * of 4 that need two SIMD8 registers and an exec_size of 8 or
> 16. In
> +   * these cases we need to make sure that these instructions
> have their
> +   * exec sizes set properly when they are emitted and we can't
> rely on
> +   * this code to fix it.
> +   */
> +  bool fix_exec_size;
> +  if (devinfo->gen >= 6)
> + fix_exec_size = dest.width < BRW_EXECUTE_4;
> +  else
> + fix_exec_size = dest.width < BRW_EXECUTE_8;
>  
> -   if (fix_exec_size)
> -  brw_inst_set_exec_size(devinfo, inst, dest.width);
> +  if (fix_exec_size)
> + brw_inst_set_exec_size(devinfo, inst, dest.width);
> +   }
>  }
>  
>  void
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v3 00/48] nir, intel: Prerequisites for subgroups

2017-10-27 Thread Iago Toral

I dropped a few more comments on patches 18-20, 23, 25, 29-31, 34 and
38 but nothing major in general.

I have doubts that patch 20 isn't working around some other bug and 38
scares me a little bit but I guess if Jenkins is happy I shouldn't
worry too much.

Otherwise, patches 16-38 are:

Reviewed-by: Iago Toral Quiroga 

(patch 15 already has your Rb)

Iago

On Thu, 2017-10-26 at 14:15 +0200, Iago Toral wrote:
> I left a few minor comments in patches 1, 2, 8 and 14. Otherwise
> patches 1-2, 4-5 and 7-14 (3 and 6 already have Rb) are:
> 
> Reviewed-by: Iago Toral Quiroga 
> 
> I feel like patches 10, 11 could maybe use another extra review if
> there is someone who wants to do it, since I am not very familiar
> with
> how all the indirect addressing stuff works and the restrictions of
> the
> hardware that affect this.
> 
> Patch 14 looks good, although the part where you locate the DO block
> for a matching WHILE could probably use the review of someone else
> more
> familiar with the CFG code than me.
> 
> Iago
> 
> On Wed, 2017-10-25 at 16:25 -0700, Jason Ekstrand wrote:
> > This series is a third respin of my subgroups prerequisites series
> > that
> > that I sent out a few weeks ago.  Not a whole lot has changed but
> > there are
> > some new patches.  Primarily,
> > 
> >  1) Some patches which were reviewed by Matt and Lionel were pushed
> > and are
> > no longer in the series.  Thanks guys!
> > 
> >  2) I've applied R-B tags from various people for patches which are
> > reviewed but depend on still unreviewed patches.
> > 
> >  3) A few patches to fix little-core.  In particular, the extra
> > little-core
> > EU restrictions cause problems for BROADCAST, MOV_INDIRECT, and
> > integer
> > MUL.
> > 
> > This series can be found on fd.o nere:
> > 
> > https://cgit.freedesktop.org/~jekstrand/mesa/log/?h=review/subgroup
> > -p
> > rereqs-v3
> > 
> > Happy reviewing!
> > 
> > 
> > Cc: Matt Turner 
> > Cc: Francisco Jerez 
> > Cc: Connor Abbott 
> > 
> > Francisco Jerez (1):
> >   intel/fs: Restrict live intervals to the subset possibly
> > reachable
> > from any definition.
> > 
> > Jason Ekstrand (47):
> >   intel/fs: Pass builders instead of blocks into emit_[un]zip
> >   intel/fs: Be more explicit about our placement of [un]zip
> >   intel/fs: Use ANY/ALL32 predicates in SIMD32
> >   intel/fs: Don't stomp f0.1 in SIMD16 ballot
> >   intel/fs: Use an explicit D type for vote any/all/eq intrinsics
> >   intel/fs: Use a pair of 1-wide MOVs instead of SEL for any/all
> >   intel/compiler: Add some restrictions to MOV_INDIRECT and
> > BROADCAST
> >   intel/eu: Just modify the offset in brw_broadcast
> >   intel/eu/reg: Add a subscript() helper
> >   intel/eu: Fix broadcast instruction for 64-bit values on little-
> > core
> >   intel/fs: Fix MOV_INDIRECT for 64-bit values on little-core
> >   intel/fs: Fix integer multiplication lowering for src/dst hazards
> >   intel/fs: Use the original destination region for int MUL
> > lowering
> >   i965/fs: Extend the live ranges of VGRFs which leave loops
> >   i965/fs/nir: Simplify 64-bit store_output
> >   i965/fs: Return a fs_reg from shuffle_64bit_data_for_32bit_write
> >   i965/fs/nir: Minor refactor of store_output
> >   i965/fs/nir: Don't stomp 64-bit values to D in get_nir_src
> >   intel/fs: Protect opt_algebraic from OOB BROADCAST indices
> >   intel/fs: Uniformize the index in readInvocation
> >   intel/fs: Retype dest to match value in read[First]Invocation
> >   intel/fs: Assign constant locations if they haven't been assigned
> >   intel/fs: Remove min_dispatch_width from fs_visitor
> >   intel/cs: Drop max_dispatch_width checks from compile_cs
> >   intel/cs: Stop setting dispatch_grf_start_reg
> >   intel/cs: Ignore runtime_check_aads_emit for CS
> >   intel/fs: Mark 64-bit values as being contiguous
> >   intel/cs: Rework the way thread local ID is handled
> >   intel/cs: Re-run final NIR optimizations for each SIMD size
> >   intel/cs: Re-run final NIR optimizations for each SIMD size
> >   intel/cs: Push subgroup ID instead of base thread ID
> >   intel/compiler/fs: Set up subgroup invocation as a system value
> >   intel/fs: Rework zero-length URB write handling
> >   intel/eu: Make automatic exec sizes a configurable option
> >   intel/eu: Explicitly set EXECUTE_1 where needed
> >   intel/fs: Explicitly set EXECUTE_1 where needed
> >   intel/fs: Don't use automatic exec size inference
> >   nir: Add a new subgroups lowering pass
> >   nir: Add a ssa_dest_init_for_type helper
> >   nir: Make ballot intrinsics variable-size
> >   nir/lower_system_values: Lower SUBGROUP_*_MASK based on type
> >   nir/lower_subgroups: Lower ballot intrinsics to the specified bit
> > size
> >   nir,intel/compiler: Use a fixed subgroup size
> >   spirv: Add a vtn_constant_value helper
> >   spirv: Rework barriers
> >   nir: Validate base types on

1 2 >

1 - 100 of 133 matches

Mail list logo