Re: [Mesa-dev] [PATCH 3/4] i965: For color clears, only disable writes to components that exist.

2014-03-21 Thread Paul Berry
On 21 March 2014 04:47, Kenneth Graunke kenn...@whitecape.org wrote:

 The SIMD16 replicated FB write message only works if we don't need the
 color calculator to mask our framebuffer writes.  Previously, we bailed
 on it if color_mask wasn't true, true, true, true.  However, this was
 needlessly strict for formats with fewer than four components - only the
 components that actually exist matter.

 WebGL Aquarium attempts to clear a BGRX texture with the ColorMask set
 to true, true, true, false.  This will work perfectly fine with the
 replicated data message; we just bailed unnecessarily.

 Improves performance of WebGL Aquarium on Iris Pro (at 1920x1080) and
 Bay Trail (at 2048x1152) by about 40% (using Chrome 24).

 Signed-off-by: Kenneth Graunke kenn...@whitecape.org
 Cc: Dylan Baker baker.dyla...@gmail.com
 Cc: Keith Packard kei...@keithp.com
 Cc: Eric Anholt e...@anholt.net
 ---
  src/mesa/drivers/dri/i965/brw_blorp_clear.cpp | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)


Nice find!  Series is:

Reviewed-by: Paul Berry stereotype...@gmail.com



 diff --git a/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp
 b/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp
 index 45b9fa0..7e19c2c 100644
 --- a/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp
 @@ -231,7 +231,7 @@ brw_blorp_clear_params::brw_blorp_clear_params(struct
 brw_context *brw,
 /* Constant color writes ignore everyting in blend and color calculator
  * state.  This is not documented.
  */
 -   for (int i = 0; i  4; i++) {
 +   for (int i = 0; i  _mesa_format_num_components(irb-mt-format); i++)
 {
if (!color_mask[i]) {
   color_write_disable[i] = true;
   wm_prog_key.use_simd16_replicated_data = false;
 --
 1.9.0

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Drop special case for edgeflag thanks to Marek's change to core.

2014-03-17 Thread Paul Berry
On 17 March 2014 14:54, Eric Anholt e...@anholt.net wrote:

 ---
  src/mesa/drivers/dri/i965/brw_draw_upload.c | 9 -
  1 file changed, 9 deletions(-)


For future cross-referencing it would be nice to mention the SHA of Marek's
change in the commit message.



 diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c
 b/src/mesa/drivers/dri/i965/brw_draw_upload.c
 index d42c074..d08e56a 100644
 --- a/src/mesa/drivers/dri/i965/brw_draw_upload.c
 +++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c
 @@ -717,12 +717,6 @@ static void brw_emit_vertices(struct brw_context *brw)
uint32_t comp2 = BRW_VE1_COMPONENT_STORE_SRC;
uint32_t comp3 = BRW_VE1_COMPONENT_STORE_SRC;

 -  /* The gen4 driver expects edgeflag to come in as a float, and
 passes
 -   * that float on to the tests in the clipper.  Mesa's current vertex
 -   * attribute value for EdgeFlag is stored as a float, which works
 out.
 -   * glEdgeFlagPointer, on the other hand, gives us an unnormalized
 -   * integer ubyte.  Just rewrite that to convert to a float.
 -   */
if (input-attrib == VERT_ATTRIB_EDGEFLAG) {
   /* Gen6+ passes edgeflag as sideband along with the vertex,
 instead
* of in the VUE.  We have to upload it sideband as the last
 vertex
 @@ -732,9 +726,6 @@ static void brw_emit_vertices(struct brw_context *brw)
  gen6_edgeflag_input = input;
  continue;
   }
 -
 - if (format == BRW_SURFACEFORMAT_R8_UINT)
 -format = BRW_SURFACEFORMAT_R8_SSCALED;
}

switch (input-glarray-Size) {
 --
 1.9.0

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/6] glsl: Optimize pow(x, 2) into x * x.

2014-03-11 Thread Paul Berry
On 10 March 2014 17:23, Ian Romanick i...@freedesktop.org wrote:

 I had a pretty similar patch on the top of my pow-optimization branch.
 I also expand x**3 and x**4.  I had hoped that would enable some cases
 to expand then merge to MADs.  It should also be faster on older GENs
 where POW perf sucks.  I didn't send it out because I wanted to add a
 similar optimization in the back end that would turn x*x*x*x back into
 x**4 on GPUs where the POW would be faster.


Be careful with that one, though: pow is undefined when x  0, so x*x*x*x
= pow(x, 4) isn't a valid conversion in general--it only works if you know
the operation started off as a pow in the first place.  (Note: you can do
x*x*x*x = pow(abs(x), 4), but that doesn't generalize to odd powers).



 I also didn't have anything in shader-db that benefitted from x**2 or
 x**3.  It seems like there were a couple that would be modified by a
 x**5 flattening, but I think that would universally be slower

 On 03/10/2014 03:54 PM, Matt Turner wrote:
  Cuts two instructions out of SynMark's Gl32VSInstancing benchmark.
  ---
   src/glsl/opt_algebraic.cpp | 8 
   1 file changed, 8 insertions(+)
 
  diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp
  index 5c49a78..8494bd9 100644
  --- a/src/glsl/opt_algebraic.cpp
  +++ b/src/glsl/opt_algebraic.cpp
  @@ -528,6 +528,14 @@
 ir_algebraic_visitor::handle_expression(ir_expression *ir)
 if (is_vec_two(op_const[0]))
return expr(ir_unop_exp2, ir-operands[1]);
 
  +  if (is_vec_two(op_const[1])) {
  + ir_variable *x = new(ir) ir_variable(ir-operands[1]-type,
 x,
  +  ir_var_temporary);
  + base_ir-insert_before(x);
  + base_ir-insert_before(assign(x, ir-operands[0]));
  + return mul(x, x);
  +  }
  +
 break;
 
  case ir_unop_rcp:
 

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC PATCH] i965/gs: add snb support

2014-02-24 Thread Paul Berry
On 24 February 2014 00:36, Ilia Mirkin imir...@alum.mit.edu wrote:

 Before you read any further, this is nowhere close to working. However
 it's in
 a state where I think most of the structure is there, albeit with a lot of
 XXX
 comments. And I haven't actually implemented the new opcodes I've added.

 I was hoping one or two Intel people could take a look at this and let me
 know
 of any pitfalls I'm likely to run into. I've already gotten a lot of help
 and
 advice from Ken, but wanted to put something out publicly.

 Any and all feedback much appreciated!

 Not-Signed-off-by: Ilia Mirkin imir...@alum.mit.edu


I'm really excited to see you're working on this!  A lot of people are
going to be really happy to see geometry shader support added to Gen6.

AFAICT, your approach is good.  I plan to look at it in more detail later
today or sometime tomorrow.  But before I do, here are some general
comments:

1. For long-term readability, it might be useful to have 3 classes:
vec4_gs_visitor (for stuff that's common to all gs visitors),
gen6_gs_visitor (for the stuff that's specific to gen6 geometry shaders),
and gen7plus_gs_visitor (for the stuff that's specific to gen7+ geometry
shaders).  What you currently have is a little harder to read, because it
means that to understand gen6 geometry shaders, you have to keep in mind
that some of the code in vec4_gs_visitor is never going to be executed
because it will always be overridden by gen6_gs_visitor.

2. Stream output (a.k.a. transform feedback): As you've probably already
figured out, transform feedback on gen6 is tricky because the hardware
doesn't have a dedicated pipeline stage for it; you have to emit code in
the geometry shader to do it.  This is a problem because the GL spec says
that two things are supposed to happen in between the GS stage and
transform feedback: dangling vertex removal (ignoring line strips
containing just a single vertex and ignoring triangle strips containing
less than 2 vertices), and expanding line/triangle strips into individual
line segments/triangles.  In Gen6, these operations happen after the GS
stage, so the code you emit in the geometry shader to do transform feedback
needs to simulate them.

Back when I first implemented transform feedback for Gen6, supporting
user-defined geometry shaders was the furthest thing from my mind, so I
implemented the transform feedback code in raw assembly rather than
re-using the vec4_visitor/vec4_generator infrastructure (see
gen6_sol_program()).  Unfortunately that's going to make it hard to re-use
that code in user-defined geometry shaders.  Also, when I implemented that
code, I was able to make some major simplifying assumptions because there
was no user-defined geometry shader--that meant that the number of vertices
to process was known at compile time, and I didn't have to worry about
dangling vertex removal or expanding line/triangle strips into individual
line segments/triangles (because the hardware automatically performs those
operations before invoking the GS).  You're not going to have that luxury.
My recommendation is: feel free to use gen6_sol_program() as a reference
point, but don't bother trying to re-use it--the simplifying assumptions
are just going to make it too different from what you want.  I'd recommend
leaving gen6_sol_program() around to handle the case where there's no
user-defined geometry shader, and then you can write fresh code for
handling transform feedback when there *is* a user-defined geometry shader.

Before you go too far on the implementation, I'd also recommend writing
some new piglit tests in tests/spec/glsl-1.50/execution/geometry/ to verify
that dangling vertex removal and strip expansion are handled properly for
transform feedback data.  If you don't have gs-supporting hardware at your
disposal to validate those tests against, I (or one of the Intel folks)
would be happy to validate the tests for you.  Note in particular that in
order to expand a triangle strip to triangles, you have to be careful with
the order to make sure that both the provoking vertex and the triangle
orientation are preserved (it's not necessary to follow
ARB_provoking_vertex, though; you can just assume the GL default of
LAST_VERTEX_CONVENTION applies).  For example, if the pre-expansion vertex
order is ABCDEF, then the only correct post-expansion order is
ABC,CBD,CDE,EDF.  I'm not sure whether Gen7+ hardware gets this order
exactly right in all circumstances, but it seems like it should be easy to
get it right on Gen6, because the order is entirely determined by software
:)

I'd also recommend adding some piglit tests to make sure that transform
feedback overflow conditions are handled properly when a geometry shader is
in use, since overflow handling on Gen6 is (partially) the responsibility
of the shader code you generate, and you're not going to be able to
piggy-back on the overflow handling in gen6_sol_program().

One other note: check out this text from the Gen6 

Re: [Mesa-dev] [PATCH] i965: Don't try to use the hardware blitter for multisampled miptrees.

2014-02-21 Thread Paul Berry
On 21 February 2014 19:15, Kenneth Graunke kenn...@whitecape.org wrote:

 The blitter is completely ignorant of MSAA buffer layouts, so any
 attempt to use BLT paths with MSAA buffers is likely to break
 spectacularly.

 In most cases, BLORP handles MSAA blits, so we never hit this bug.
 Until recently, it also wasn't worth fixing, since Meta couldn't handle
 MSAA either, so there was nothing to fall back to.  But now there is.

 +143 piglit tests on Broadwell (which doesn't have BLORP support).
 Surprisingly, three also start failing.  No changes on Ivybridge.


That actually doesn't surprise me too much.  Since color buffers use an
array-like sample layout (sample N in the Nth layer), trying to use the
blitter on a multisampled buffer just gets you sample 0 from each pixel.
 Which for some tests is good enough to pass (even though the blit is
actually doing the wrong thing).  So it's possible that a few of these
tests that were passing by luck with the blitter don't yet have proper Meta
support :)



 Signed-off-by: Kenneth Graunke kenn...@whitecape.org
 ---
  src/mesa/drivers/dri/i965/intel_blit.c | 4 
  1 file changed, 4 insertions(+)

 diff --git a/src/mesa/drivers/dri/i965/intel_blit.c
 b/src/mesa/drivers/dri/i965/intel_blit.c
 index df85dc9..cedb230 100644
 --- a/src/mesa/drivers/dri/i965/intel_blit.c
 +++ b/src/mesa/drivers/dri/i965/intel_blit.c
 @@ -165,6 +165,10 @@ intel_miptree_blit(struct brw_context *brw,
 mesa_format src_format = _mesa_get_srgb_format_linear(src_mt-format);
 mesa_format dst_format = _mesa_get_srgb_format_linear(dst_mt-format);

 +   /* The blitter doesn't understand multisampling at all. */
 +   if (src_mt-num_samples  0 || dst_mt-num_samples  0)
 +  return false;
 +


You might consider moving this check up to the top of the function, since
it doesn't depend on src_format or dst_format.  But it's not a big deal.
 Either way, the patch is:

Reviewed-by: Paul Berry stereotype...@gmail.com
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v3 8/9] i965: support gl_InvocationID for gen7

2014-02-06 Thread Paul Berry
On 4 February 2014 15:07, Jordan Justen jordan.l.jus...@intel.com wrote:

 v2:
  * Make gl_InstanceID a system value

 v3:
  * Properly shift from R0.1 into DST.4 by adding
GS_OPCODE_GET_INSTANCE_ID

 Signed-off-by: Jordan Justen jordan.l.jus...@intel.com
 ---
  src/mesa/drivers/dri/i965/brw_defines.h   | 12 
  src/mesa/drivers/dri/i965/brw_shader.cpp  |  2 ++
  src/mesa/drivers/dri/i965/brw_vec4.h  |  1 +
  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp  | 20
 
  src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 16 +---
  5 files changed, 48 insertions(+), 3 deletions(-)


Hopefully I will get a chance to look at this in more detail soon, but for
now, consider this:

Acked-by: Paul Berry stereotype...@gmail.com



 diff --git a/src/mesa/drivers/dri/i965/brw_defines.h
 b/src/mesa/drivers/dri/i965/brw_defines.h
 index 75d09fc..199a699 100644
 --- a/src/mesa/drivers/dri/i965/brw_defines.h
 +++ b/src/mesa/drivers/dri/i965/brw_defines.h
 @@ -902,6 +902,13 @@ enum opcode {
  *   form the final channel mask.
  */
 GS_OPCODE_SET_CHANNEL_MASKS,
 +
 +   /**
 +* Get the Instance ID fields from the payload.
 +*
 +* - dst is the GRF for gl_InvocationID.
 +*/
 +   GS_OPCODE_GET_INSTANCE_ID,
  };

  enum brw_urb_write_flags {
 @@ -1538,6 +1545,11 @@ enum brw_message_target {
  # define BRW_GS_EDGE_INDICATOR_0   (1  8)
  # define BRW_GS_EDGE_INDICATOR_1   (1  9)

 +/* GS Thread Payload
 + */
 +/* R0 */
 +# define GEN7_GS_PAYLOAD_INSTANCE_ID_SHIFT 27
 +
  /* 3DSTATE_GS Output Vertex Size has an effective maximum of 62.  It's
   * counted in multiples of 16 bytes.
   */
 diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp
 b/src/mesa/drivers/dri/i965/brw_shader.cpp
 index 6cc2595..5e3005a 100644
 --- a/src/mesa/drivers/dri/i965/brw_shader.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
 @@ -525,6 +525,8 @@ brw_instruction_name(enum opcode op)
return prepare_channel_masks;
 case GS_OPCODE_SET_CHANNEL_MASKS:
return set_channel_masks;
 +   case GS_OPCODE_GET_INSTANCE_ID:
 +  return get_instance_id;

 default:
/* Yes, this leaks.  It's in debug code, it should never occur, and
 if
 diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h
 b/src/mesa/drivers/dri/i965/brw_vec4.h
 index e17b5cd..182a1e1 100644
 --- a/src/mesa/drivers/dri/i965/brw_vec4.h
 +++ b/src/mesa/drivers/dri/i965/brw_vec4.h
 @@ -616,6 +616,7 @@ private:
 void generate_gs_set_dword_2_immed(struct brw_reg dst, struct brw_reg
 src);
 void generate_gs_prepare_channel_masks(struct brw_reg dst);
 void generate_gs_set_channel_masks(struct brw_reg dst, struct brw_reg
 src);
 +   void generate_gs_get_instance_id(struct brw_reg dst);
 void generate_oword_dual_block_offsets(struct brw_reg m1,
   struct brw_reg index);
 void generate_scratch_write(vec4_instruction *inst,
 diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
 b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
 index 94d1e79..a48d829 100644
 --- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
 @@ -639,6 +639,22 @@ vec4_generator::generate_gs_set_channel_masks(struct
 brw_reg dst,
  }

  void
 +vec4_generator::generate_gs_get_instance_id(struct brw_reg dst)
 +{
 +   /* We want to right shift R0.0  R0.1 by
 GEN7_GS_PAYLOAD_INSTANCE_ID_SHIFT
 +* and store into dst.0  dst.4. So generate the instruction:
 +*
 +* shr(8) dst1 R01,4,0 GEN7_GS_PAYLOAD_INSTANCE_ID_SHIFT {
 align1 WE_all }
 +*/
 +   brw_push_insn_state(p);
 +   brw_set_access_mode(p, BRW_ALIGN_1);
 +   dst = retype(dst, BRW_REGISTER_TYPE_UD);
 +   struct brw_reg r0(retype(brw_vec8_grf(0, 0), BRW_REGISTER_TYPE_UD));
 +   brw_SHR(p, dst, stride(r0, 1, 4, 0),
 brw_imm_ud(GEN7_GS_PAYLOAD_INSTANCE_ID_SHIFT));
 +   brw_pop_insn_state(p);
 +}
 +
 +void
  vec4_generator::generate_oword_dual_block_offsets(struct brw_reg m1,
struct brw_reg index)
  {
 @@ -1218,6 +1234,10 @@
 vec4_generator::generate_vec4_instruction(vec4_instruction *instruction,
generate_gs_set_channel_masks(dst, src[0]);
break;

 +   case GS_OPCODE_GET_INSTANCE_ID:
 +  generate_gs_get_instance_id(dst);
 +  break;
 +
 case SHADER_OPCODE_SHADER_TIME_ADD:
brw_shader_time_add(p, src[0],

  prog_data-base.binding_table.shader_time_start);
 diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
 b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
 index 40743cc..360703b 100644
 --- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
 @@ -51,9 +51,19 @@ vec4_gs_visitor::vec4_gs_visitor(struct brw_context
 *brw,
  dst_reg *
  vec4_gs_visitor::make_reg_for_system_value(ir_variable *ir)
  {
 -   /* Geometry

Re: [Mesa-dev] [PATCH 30/30] i965/cs: Allow ARB_compute_shader to be enabled via env var.

2014-02-05 Thread Paul Berry
On 4 February 2014 10:03, Jordan Justen jljus...@gmail.com wrote:

 
  What about trying to make use of
  MESA_EXTENSION_OVERRIDE=GL_ARB_compute_shader?
 
  We could add
  extensions.c:bool _mesa_is_extension_override_enabled(char *)
 
  And then
  if (_mesa_is_extension_override_enabled(GL_ARB_compute_shader))
 
  Or, similarly, get overrides shoved into ctx-Extensions to allow
  checking for overrides at this early stage.
 
  -Jordan
 
 
  I looked into what would be necessary to do this, and it's unfortunately
  more complicated than it should be due to some flukes about
 initialization
  order (perhaps this is what you were alluding to when you said get
  overrides shoved into ctx-Extensions to allow checking for overrides at
  this early stage).

 Yeah, I'll admit I looked at it a bit, and decided to keep my feedback
 'hand wavy'. :)

   Currently, our initialization order is:
 
  1. brwCreateContext() calls brw_initialize_context_constants(), which
 needs
  to know whether compute shaders are supported in order to set constants
 like
  MAX_COMPUTE_TEXTURE_IMAGE_UNITS and MAX_UNIFORM_BUFFER_BINDINGS
 correctly.
 
  2. Later, brwCreateContext() calls intelInitExtensions(), which is the
  function where we'll set ctx-Extensions.ARB_compute_shader to true once
  compute shaders are fully supported.
 
  3. Much later in initialization, when the context is being made current
 for
  the first time, core Mesa calls _mesa_make_extension_string(), which
 calls
  get_extension_override() to modify the the ctx-Extensions table based on
  the environment variable override.
 
  To do what you suggested, I would either have to:
 
  A. change 1 to call _mesa_is_extension_override_enabled(); that
 function, in
  turn, would have to parse the MESA_EXTENSION_OVERRIDE string, but we
  couldn't reuse the parsing code in _mesa_make_extension_string(), since
 that
  parsing code updates ctx-Extensions as a side effect, and it's not time
 to
  update ctx-Extensions yet.  In addition to the code duplication, we
 would
  have a danger of bugs, since the override takes effect so late in
  initialization--if we ever accidentally introduced some code in between
  steps 2 and 3 that checked the value of
 ctx-Extensions.ARB_compute_shader,
  it would see false even if compute shaders were enabled by override.
 
  Or:
 
  B. change the order of initialization so that 2 happens first, followed
 by 3
  and then 1.  In the long run I think this would be a good thing, but
 it's a
  big change (since it affects all back ends) and I'm not sure it would be
 a
  good idea to hold up this patch series waiting for it.
 
 
  How would you feel if I landed the patch series as is, and then worked
 on a
  follow up patch to do B?

 Sure, you can have my r-b for this then.

 -Jordan


Great, thanks.  I've pushed the patches.  I'll work on B as soon as I can.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 30/30] i965/cs: Allow ARB_compute_shader to be enabled via env var.

2014-02-04 Thread Paul Berry
On 1 February 2014 23:21, Jordan Justen jljus...@gmail.com wrote:

 On Thu, Jan 9, 2014 at 6:19 PM, Paul Berry stereotype...@gmail.com
 wrote:
  This will allow testing of compute shader functionality before it is
  completed.
 
  To enable ARB_compute_shader functionality in the i965 driver, set
  INTEL_COMPUTE_SHADER=1.
  ---
   src/mesa/drivers/dri/i965/brw_context.c  | 11 ++-
   src/mesa/drivers/dri/i965/intel_extensions.c |  2 ++
   2 files changed, 12 insertions(+), 1 deletion(-)
 
  diff --git a/src/mesa/drivers/dri/i965/brw_context.c
 b/src/mesa/drivers/dri/i965/brw_context.c
  index 1b42751..76dd9be 100644
  --- a/src/mesa/drivers/dri/i965/brw_context.c
  +++ b/src/mesa/drivers/dri/i965/brw_context.c
  @@ -298,10 +298,17 @@ brw_initialize_context_constants(struct
 brw_context *brw)
 ctx-Const.Program[MESA_SHADER_GEOMETRY].MaxTextureImageUnits =
 BRW_MAX_TEX_UNIT;
  else
 ctx-Const.Program[MESA_SHADER_GEOMETRY].MaxTextureImageUnits = 0;
  +   if (getenv(INTEL_COMPUTE_SHADER)) {

 What about trying to make use of
 MESA_EXTENSION_OVERRIDE=GL_ARB_compute_shader?

 We could add
 extensions.c:bool _mesa_is_extension_override_enabled(char *)

 And then
 if (_mesa_is_extension_override_enabled(GL_ARB_compute_shader))

 Or, similarly, get overrides shoved into ctx-Extensions to allow
 checking for overrides at this early stage.

 -Jordan


I looked into what would be necessary to do this, and it's unfortunately
more complicated than it should be due to some flukes about initialization
order (perhaps this is what you were alluding to when you said get
overrides shoved into ctx-Extensions to allow checking for overrides at
this early stage).  Currently, our initialization order is:

1. brwCreateContext() calls brw_initialize_context_constants(), which needs
to know whether compute shaders are supported in order to set constants
like MAX_COMPUTE_TEXTURE_IMAGE_UNITS and MAX_UNIFORM_BUFFER_BINDINGS
correctly.

2. Later, brwCreateContext() calls intelInitExtensions(), which is the
function where we'll set ctx-Extensions.ARB_compute_shader to true once
compute shaders are fully supported.

3. Much later in initialization, when the context is being made current for
the first time, core Mesa calls _mesa_make_extension_string(), which calls
get_extension_override() to modify the the ctx-Extensions table based on
the environment variable override.

To do what you suggested, I would either have to:

A. change 1 to call _mesa_is_extension_override_enabled(); that function,
in turn, would have to parse the MESA_EXTENSION_OVERRIDE string, but we
couldn't reuse the parsing code in _mesa_make_extension_string(), since
that parsing code updates ctx-Extensions as a side effect, and it's not
time to update ctx-Extensions yet.  In addition to the code duplication,
we would have a danger of bugs, since the override takes effect so late in
initialization--if we ever accidentally introduced some code in between
steps 2 and 3 that checked the value of ctx-Extensions.ARB_compute_shader,
it would see false even if compute shaders were enabled by override.

Or:

B. change the order of initialization so that 2 happens first, followed by
3 and then 1.  In the long run I think this would be a good thing, but it's
a big change (since it affects all back ends) and I'm not sure it would be
a good idea to hold up this patch series waiting for it.


How would you feel if I landed the patch series as is, and then worked on a
follow up patch to do B?



  +  ctx-Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits =
 BRW_MAX_TEX_UNIT;
  +  ctx-Const.MaxUniformBufferBindings += 12;
  +   } else {
  +  ctx-Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits = 0;
  +   }
  ctx-Const.MaxCombinedTextureImageUnits =
 ctx-Const.Program[MESA_SHADER_VERTEX].MaxTextureImageUnits +
 ctx-Const.Program[MESA_SHADER_FRAGMENT].MaxTextureImageUnits +
  -  ctx-Const.Program[MESA_SHADER_GEOMETRY].MaxTextureImageUnits;
  +  ctx-Const.Program[MESA_SHADER_GEOMETRY].MaxTextureImageUnits +
  +  ctx-Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits;
 
  ctx-Const.MaxTextureLevels = 14; /* 8192 */
  if (ctx-Const.MaxTextureLevels  MAX_TEXTURE_LEVELS)
  @@ -425,9 +432,11 @@ brw_initialize_context_constants(struct brw_context
 *brw)
 ctx-Const.Program[MESA_SHADER_FRAGMENT].MaxAtomicCounters =
 MAX_ATOMIC_COUNTERS;
 ctx-Const.Program[MESA_SHADER_VERTEX].MaxAtomicCounters =
 MAX_ATOMIC_COUNTERS;
 ctx-Const.Program[MESA_SHADER_GEOMETRY].MaxAtomicCounters =
 MAX_ATOMIC_COUNTERS;
  +  ctx-Const.Program[MESA_SHADER_COMPUTE].MaxAtomicCounters =
 MAX_ATOMIC_COUNTERS;
 ctx-Const.Program[MESA_SHADER_FRAGMENT].MaxAtomicBuffers =
 BRW_MAX_ABO;
 ctx-Const.Program[MESA_SHADER_VERTEX].MaxAtomicBuffers =
 BRW_MAX_ABO;
 ctx-Const.Program[MESA_SHADER_GEOMETRY].MaxAtomicBuffers =
 BRW_MAX_ABO;
  +  ctx-Const.Program

Re: [Mesa-dev] [PATCH 01/10] glsl: memory_writer helper class for data serialization

2014-02-04 Thread Paul Berry
 on the output file.  That *might* be benign for
now (I haven't looked through all the callers to write_string to see
whether the distinction between  and NULL matters) but even so, this is
the sort of thing that can lead to subtle bugs in the future.

I'd suggest that instead, we store strlen + 1, so that the reader can tell
the difference between NULL and .


 +
 +  if (str)
 + write(str, len + 1);


It's surprising to see you writing len + 1 bytes (so surprising that at
first I thought it must be a mistake).  Reading ahead to patch 3/10, it
looks like you did this so that you could call ralloc_strdup() in
read_string().  But that seems a little wasteful to me, not because we're
wasting a byte but because it means that read_string() traverses the string
twice:

- First it reads the length
- Then it checks that a null terminator exists in the expected place
- Then it calls ralloc_strdup(), which walks the string to see how long it
is (totally unnecessary, since the length is known)
- ralloc_strdup() calls ralloc_array() to allocate memory for the result
- ralloc_strdup() calls memcpy() to make a copy of the string
- ralloc_strdup() stores a null terminator at the end of the copied string.

Personally, I would just call ralloc_array() directly from read_string(),
then memcpy() the contents of the string and store the null terminator.  In
which case there is no need to store the null terminator when writing the
string.

But I'm nit picking at this point, so I'm not going to be a stickler about
it.  If you would prefer to just add a comment explaining why you're
serializing the null terminator, I would be ok with that too.


 +   }
 +
 +   unsigned position()
 +   {
 +  return pos;
 +   }
 +
 +   /**
 +* check if some data was written
 +*/
 +   bool data_was_written(void *data, uint32_t id)


For functions that ask questions, I recommend using a word order that's
unambiguously a question (e.g. was_data_written).  data_was_written
sounds more like a statement, so someone might misinterpret this function
as a synonym for mark_data_written().

But I'm nit picking again here.  With the other issues addressed, this
patch is:

Reviewed-by: Paul Berry stereotype...@gmail.com
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 01/10] glsl: memory_writer helper class for data serialization

2014-02-04 Thread Paul Berry
Whoops, I discovered another issue:


On 29 January 2014 01:24, Tapani Pälli tapani.pa...@intel.com wrote:

 Class will be used by the shader binary cache implementation.

 Signed-off-by: Tapani Pälli tapani.pa...@intel.com
 ---
  src/glsl/memory_writer.h | 188
 +++
  1 file changed, 188 insertions(+)
  create mode 100644 src/glsl/memory_writer.h

 diff --git a/src/glsl/memory_writer.h b/src/glsl/memory_writer.h
 new file mode 100644
 index 000..979169f
 --- /dev/null
 +++ b/src/glsl/memory_writer.h
 @@ -0,0 +1,188 @@
 +/* -*- c++ -*- */
 +/*
 + * Copyright © 2013 Intel Corporation
 + *
 + * Permission is hereby granted, free of charge, to any person obtaining a
 + * copy of this software and associated documentation files (the
 Software),
 + * to deal in the Software without restriction, including without
 limitation
 + * the rights to use, copy, modify, merge, publish, distribute,
 sublicense,
 + * and/or sell copies of the Software, and to permit persons to whom the
 + * Software is furnished to do so, subject to the following conditions:
 + *
 + * The above copyright notice and this permission notice (including the
 next
 + * paragraph) shall be included in all copies or substantial portions of
 the
 + * Software.
 + *
 + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND,
 EXPRESS OR
 + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 MERCHANTABILITY,
 + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT
 SHALL
 + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
 OTHER
 + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
 + * DEALINGS IN THE SOFTWARE.
 + */
 +
 +#pragma once
 +#ifndef MEMORY_WRITER_H
 +#define MEMORY_WRITER_H
 +
 +#include stdlib.h
 +#include unistd.h
 +#include string.h
 +
 +#include main/hash_table.h
 +
 +#ifdef __cplusplus
 +/**
 + * Helper class for writing data to memory
 + *
 + * This class maintains a dynamically-sized memory buffer and allows
 + * for data to be efficiently appended to it with automatic resizing.
 + */
 +class memory_writer
 +{
 +public:
 +   memory_writer() :
 +  memory(NULL),
 +  curr_size(0),
 +  pos(0)
 +   {
 +  data_hash = _mesa_hash_table_create(0, int_equal);
 +  hash_value = _mesa_hash_data(this, sizeof(memory_writer));
 +   }
 +
 +   ~memory_writer()
 +   {
 +  free(memory);
 +  _mesa_hash_table_destroy(data_hash, NULL);
 +   }
 +
 +   /* user wants to claim the memory */
 +   char *release_memory(size_t *size)
 +   {
 +  /* final realloc to free allocated but unused memory */
 +  char *result = (char *) realloc(memory, pos);
 +  *size = pos;
 +  memory = NULL;
 +  curr_size = 0;
 +  pos = 0;
 +  return result;
 +   }
 +
 +/**
 + * write functions per type
 + */
 +#define DECL_WRITER(type) void write_ ##type (const type data) {\
 +   write(data, sizeof(type));\
 +}
 +
 +   DECL_WRITER(int32_t);
 +   DECL_WRITER(int64_t);
 +   DECL_WRITER(uint8_t);
 +   DECL_WRITER(uint32_t);
 +
 +   void write_bool(bool data)
 +   {
 +  uint8_t val = data;
 +  write_uint8_t(val);
 +   }
 +
 +   /* write function that reallocates more memory if required */
 +   void write(const void *data, int size)
 +   {
 +  if (!memory || pos  (curr_size - size))
 + if (!grow(size))
 +return;
 +
 +  memcpy(memory + pos, data, size);
 +
 +  pos += size;
 +   }
 +
 +   void overwrite(const void *data, int size, int offset)
 +   {
 +  if (offset  0 || offset + size  pos)
 + return;
 +  memcpy(memory + offset, data, size);
 +   }
 +
 +   /* length is written to make reading safe */
 +   void write_string(const char *str)
 +   {
 +  uint32_t len = str ? strlen(str) : 0;
 +  write_uint32_t(len);
 +
 +  if (str)
 + write(str, len + 1);
 +   }
 +
 +   unsigned position()
 +   {
 +  return pos;
 +   }
 +
 +   /**
 +* check if some data was written
 +*/
 +   bool data_was_written(void *data, uint32_t id)
 +   {
 +  hash_entry *entry =
 + _mesa_hash_table_search(data_hash, hash_value,
 +(void*) (intptr_t) id);


This isn't an efficient way to use a hashtable.  You're always passing the
same hash_value (the one computed in the constructor), which means that
every single hash entry will be stored in the same chain, and lookups will
be really slow.

Since you're only using this hashtable to detect duplicate glsl_types,
there's a much easier approach: take advantage of the fact that glsl_type
is a flyweight class (in other words, code elsewhere in Mesa ensures that
no duplicate glsl_type objects exist anywhere in memory, so any two
glsl_type pointers can be compared simply using pointer equality).  Because
of that, you should be able to just use a pointer-based hashtable, like
ir_variable_refcount_visitor does.  In short, that 

Re: [Mesa-dev] [PATCH 01/10] glsl: memory_writer helper class for data serialization

2014-02-04 Thread Paul Berry
On 4 February 2014 19:52, Paul Berry stereotype...@gmail.com wrote:

 Whoops, I discovered another issue:


 On 29 January 2014 01:24, Tapani Pälli tapani.pa...@intel.com wrote:

 Class will be used by the shader binary cache implementation.

 Signed-off-by: Tapani Pälli tapani.pa...@intel.com
 ---
  src/glsl/memory_writer.h | 188
 +++
  1 file changed, 188 insertions(+)
  create mode 100644 src/glsl/memory_writer.h

 diff --git a/src/glsl/memory_writer.h b/src/glsl/memory_writer.h
 new file mode 100644
 index 000..979169f
 --- /dev/null
 +++ b/src/glsl/memory_writer.h
 @@ -0,0 +1,188 @@
 +/* -*- c++ -*- */
 +/*
 + * Copyright © 2013 Intel Corporation
 + *
 + * Permission is hereby granted, free of charge, to any person obtaining
 a
 + * copy of this software and associated documentation files (the
 Software),
 + * to deal in the Software without restriction, including without
 limitation
 + * the rights to use, copy, modify, merge, publish, distribute,
 sublicense,
 + * and/or sell copies of the Software, and to permit persons to whom the
 + * Software is furnished to do so, subject to the following conditions:
 + *
 + * The above copyright notice and this permission notice (including the
 next
 + * paragraph) shall be included in all copies or substantial portions of
 the
 + * Software.
 + *
 + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND,
 EXPRESS OR
 + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 MERCHANTABILITY,
 + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT
 SHALL
 + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
 OTHER
 + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
 ARISING
 + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
 + * DEALINGS IN THE SOFTWARE.
 + */
 +
 +#pragma once
 +#ifndef MEMORY_WRITER_H
 +#define MEMORY_WRITER_H
 +
 +#include stdlib.h
 +#include unistd.h
 +#include string.h
 +
 +#include main/hash_table.h
 +
 +#ifdef __cplusplus
 +/**
 + * Helper class for writing data to memory
 + *
 + * This class maintains a dynamically-sized memory buffer and allows
 + * for data to be efficiently appended to it with automatic resizing.
 + */
 +class memory_writer
 +{
 +public:
 +   memory_writer() :
 +  memory(NULL),
 +  curr_size(0),
 +  pos(0)
 +   {
 +  data_hash = _mesa_hash_table_create(0, int_equal);
 +  hash_value = _mesa_hash_data(this, sizeof(memory_writer));
 +   }
 +
 +   ~memory_writer()
 +   {
 +  free(memory);
 +  _mesa_hash_table_destroy(data_hash, NULL);
 +   }
 +
 +   /* user wants to claim the memory */
 +   char *release_memory(size_t *size)
 +   {
 +  /* final realloc to free allocated but unused memory */
 +  char *result = (char *) realloc(memory, pos);
 +  *size = pos;
 +  memory = NULL;
 +  curr_size = 0;
 +  pos = 0;
 +  return result;
 +   }
 +
 +/**
 + * write functions per type
 + */
 +#define DECL_WRITER(type) void write_ ##type (const type data) {\
 +   write(data, sizeof(type));\
 +}
 +
 +   DECL_WRITER(int32_t);
 +   DECL_WRITER(int64_t);
 +   DECL_WRITER(uint8_t);
 +   DECL_WRITER(uint32_t);
 +
 +   void write_bool(bool data)
 +   {
 +  uint8_t val = data;
 +  write_uint8_t(val);
 +   }
 +
 +   /* write function that reallocates more memory if required */
 +   void write(const void *data, int size)
 +   {
 +  if (!memory || pos  (curr_size - size))
 + if (!grow(size))
 +return;
 +
 +  memcpy(memory + pos, data, size);
 +
 +  pos += size;
 +   }
 +
 +   void overwrite(const void *data, int size, int offset)
 +   {
 +  if (offset  0 || offset + size  pos)
 + return;
 +  memcpy(memory + offset, data, size);
 +   }
 +
 +   /* length is written to make reading safe */
 +   void write_string(const char *str)
 +   {
 +  uint32_t len = str ? strlen(str) : 0;
 +  write_uint32_t(len);
 +
 +  if (str)
 + write(str, len + 1);
 +   }
 +
 +   unsigned position()
 +   {
 +  return pos;
 +   }
 +
 +   /**
 +* check if some data was written
 +*/
 +   bool data_was_written(void *data, uint32_t id)
 +   {
 +  hash_entry *entry =
 + _mesa_hash_table_search(data_hash, hash_value,
 +(void*) (intptr_t) id);


 This isn't an efficient way to use a hashtable.  You're always passing the
 same hash_value (the one computed in the constructor), which means that
 every single hash entry will be stored in the same chain, and lookups will
 be really slow.

 Since you're only using this hashtable to detect duplicate glsl_types,
 there's a much easier approach: take advantage of the fact that glsl_type
 is a flyweight class (in other words, code elsewhere in Mesa ensures that
 no duplicate glsl_type objects exist anywhere in memory, so any two
 glsl_type pointers can be compared simply using pointer equality).  Because
 of that, you should be able to just use

Re: [Mesa-dev] [PATCH 02/10] glsl: serialize methods for IR instructions

2014-02-04 Thread Paul Berry
On 29 January 2014 01:24, Tapani Pälli tapani.pa...@intel.com wrote:

 +
 +
 +/**
 + * Serialization of exec_list, writes length of the list +
 + * calls serialize_data for each instruction
 + */
 +void
 +serialize_list(exec_list *list, memory_writer mem)
 +{
 +   uint32_t list_len = 0;
 +   foreach_list(node, list)
 +  list_len++;
 +
 +   mem.write_uint32_t(list_len);
 +
 +   foreach_list(node, list)
 +  ((ir_instruction *)node)-serialize(mem);


Nit pick: we could avoid having to walk the list twice by using
mem.overwrite() to write out the length once we've written all the list
elements.  A similar comment applies to ir_call::serialize_data().


 +}
 +
 +
 +static void
 +serialize_glsl_type(const glsl_type *type, memory_writer mem)
 +{
 +   uint32_t data_len = 0;
 +
 +   mem.write_string(type-name);
 +
 +   unsigned start_pos = mem.position();
 +   mem.write_uint32_t(data_len);
 +
 +   uint32_t type_hash;
 +
 +   /**
 +* notify reader if a user defined type exists already
 +* (has been serialized before)
 +*/
 +   uint8_t user_type_exists = 0;
 +
 +   /* serialize only user defined types */
 +   switch (type-base_type) {
 +   case GLSL_TYPE_ARRAY:
 +   case GLSL_TYPE_STRUCT:
 +   case GLSL_TYPE_INTERFACE:
 +  break;
 +   default:
 +  goto glsl_type_serilization_epilogue;
 +   }
 +


With the changes I recommended in the last patch, I believe we can replace
everything from here...


 +   type_hash = _mesa_hash_data(type, sizeof(glsl_type));

+
 +   /* check if this type has been written before */
 +   if (mem.data_was_written((void *)type, type_hash))
 +  user_type_exists = 1;
 +   else
 +  mem.mark_data_written((void *)type, type_hash);


...to here with:

   user_type_exists = mem.make_unique_id(type, type_hash);

Although I'd recommend renaming type_hash to type_id so that there's no
confusion about the fact that it's a unique ID and not a hash (hashes
aren't guaranteed to be unique).


 +
 +   mem.write_uint8_t(user_type_exists);
 +   mem.write_uint32_t(type_hash);
 +
 +   /* no need to write again ... */
 +   if (user_type_exists)
 +  goto glsl_type_serilization_epilogue;


Nit pick: how about instead of the goto, just do:

if (!user_type_exists) {
   /* glsl type data */
   mem.write_uint32_t((uint32_t) type-length);
   ...
}

data_len = mem.position() - start_pos - sizeof(data_len);
...



 +
 +   /* glsl type data */
 +   mem.write_uint32_t((uint32_t)type-length);
 +   mem.write_uint8_t((uint8_t)type-base_type);
 +   mem.write_uint8_t((uint8_t)type-interface_packing);
 +
 +   if (type-base_type == GLSL_TYPE_ARRAY)
 +  serialize_glsl_type(type-element_type(), mem);
 +   else if (type-base_type == GLSL_TYPE_STRUCT ||
 +  type-base_type == GLSL_TYPE_INTERFACE) {
 +  glsl_struct_field *field = type-fields.structure;
 +  for (unsigned k = 0; k  type-length; k++, field++) {
 + mem.write_string(field-name);
 + serialize_glsl_type(field-type, mem);
 + mem.write_uint8_t((uint8_t)field-row_major);
 + mem.write_int32_t(field-location);
 + mem.write_uint8_t((uint8_t)field-interpolation);
 + mem.write_uint8_t((uint8_t)field-centroid);
 +  }
 +   }
 +
 +glsl_type_serilization_epilogue:
 +
 +   data_len = mem.position() - start_pos - sizeof(data_len);
 +   mem.overwrite(data_len, sizeof(data_len), start_pos);
 +}
 +
 +
 +void
 +ir_variable::serialize_data(memory_writer mem)
 +{
 +   /* name can be NULL, see ir_print_visitor for explanation */
 +   const char *non_null_name = name ? name : parameter;


Since mem.write_string handles NULL strings, can we get rid of this?


 +   int64_t unique_id = (int64_t) (intptr_t) this;
 +   uint8_t mode = data.mode;
 +   uint8_t has_constant_value = constant_value ? 1 : 0;
 +   uint8_t has_constant_initializer = constant_initializer ? 1 : 0;
 +
 +   serialize_glsl_type(type, mem);
 +
 +   mem.write_string(non_null_name);
 +   mem.write_int64_t(unique_id);
 +   mem.write_uint8_t(mode);
 +
 +   mem.write(data, sizeof(data));
 +
 +   mem.write_uint32_t(num_state_slots);
 +   mem.write_uint8_t(has_constant_value);
 +   mem.write_uint8_t(has_constant_initializer);
 +
 +   for (unsigned i = 0; i  num_state_slots; i++) {
 +  mem.write_int32_t(state_slots[i].swizzle);
 +  for (unsigned j = 0; j  5; j++) {
 + mem.write_int32_t(state_slots[i].tokens[j]);
 +  }
 +   }
 +
 +   if (constant_value)
 +  constant_value-serialize(mem);
 +
 +   if (constant_initializer)
 +  constant_initializer-serialize(mem);
 +
 +   uint8_t has_interface_type = get_interface_type() ? 1 : 0;
 +
 +   mem.write_uint8_t(has_interface_type);
 +   if (has_interface_type)
 +  serialize_glsl_type(get_interface_type(), mem);
 +}


Everything's a nit pick except for the suggestion to use make_unique_id().
With that fixed, this patch is:

Reviewed-by: Paul Berry stereotype...@gmail.com
___
mesa-dev mailing list
mesa-dev

Re: [Mesa-dev] [PATCH 22/30] mesa/cs: Implement MAX_COMPUTE_WORK_GROUP_INVOCATIONS constant.

2014-02-03 Thread Paul Berry
On 1 February 2014 21:25, Jordan Justen jljus...@gmail.com wrote:

 On Thu, Jan 9, 2014 at 6:19 PM, Paul Berry stereotype...@gmail.com
 wrote:
  ---
   src/glsl/main.cpp   | 1 +
   src/glsl/standalone_scaffolding.cpp | 1 +
   src/mesa/main/context.c | 1 +
   src/mesa/main/get.c | 1 +
   src/mesa/main/get_hash_params.py| 3 +++
   src/mesa/main/mtypes.h  | 1 +
   6 files changed, 8 insertions(+)
 
  diff --git a/src/glsl/main.cpp b/src/glsl/main.cpp
  index bb2054f..94bc1cc 100644
  --- a/src/glsl/main.cpp
  +++ b/src/glsl/main.cpp
  @@ -53,6 +53,7 @@ initialize_context(struct gl_context *ctx, gl_api api)
  ctx-Const.MaxComputeWorkGroupSize[0] = 1024;
  ctx-Const.MaxComputeWorkGroupSize[1] = 1024;
  ctx-Const.MaxComputeWorkGroupSize[2] = 64;
  +   ctx-Const.MaxComputeWorkGroupInvocations = 1024;
  ctx-Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits = 16;
  ctx-Const.Program[MESA_SHADER_COMPUTE].MaxUniformComponents = 1024;
  ctx-Const.Program[MESA_SHADER_COMPUTE].MaxInputComponents = 0; /*
 not used */
  diff --git a/src/glsl/standalone_scaffolding.cpp
 b/src/glsl/standalone_scaffolding.cpp
  index e8eb529..0c83ea3 100644
  --- a/src/glsl/standalone_scaffolding.cpp
  +++ b/src/glsl/standalone_scaffolding.cpp
  @@ -143,6 +143,7 @@ void initialize_context_to_defaults(struct
 gl_context *ctx, gl_api api)
  ctx-Const.MaxComputeWorkGroupSize[0] = 1024;
  ctx-Const.MaxComputeWorkGroupSize[1] = 1024;
  ctx-Const.MaxComputeWorkGroupSize[2] = 64;
  +   ctx-Const.MaxComputeWorkGroupInvocations = 1024;
  ctx-Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits = 16;
  ctx-Const.Program[MESA_SHADER_COMPUTE].MaxUniformComponents = 1024;
  ctx-Const.Program[MESA_SHADER_COMPUTE].MaxInputComponents = 0; /*
 not used */
  diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
  index ebe27b4..942f247 100644
  --- a/src/mesa/main/context.c
  +++ b/src/mesa/main/context.c
  @@ -695,6 +695,7 @@ _mesa_init_constants(struct gl_context *ctx)
  ctx-Const.MaxComputeWorkGroupSize[0] = 1024;
  ctx-Const.MaxComputeWorkGroupSize[1] = 1024;
  ctx-Const.MaxComputeWorkGroupSize[2] = 64;
  +   ctx-Const.MaxComputeWorkGroupInvocations = 1024;
   }
 
 
  diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
  index 6b914f4..bcbb5d5 100644
  --- a/src/mesa/main/get.c
  +++ b/src/mesa/main/get.c
  @@ -382,6 +382,7 @@ EXTRA_EXT(ARB_texture_multisample);
   EXTRA_EXT(ARB_texture_gather);
   EXTRA_EXT(ARB_shader_atomic_counters);
   EXTRA_EXT(ARB_draw_indirect);
  +EXTRA_EXT(ARB_compute_shader);
 
   static const int
   extra_ARB_color_buffer_float_or_glcore[] = {
  diff --git a/src/mesa/main/get_hash_params.py
 b/src/mesa/main/get_hash_params.py
  index 7f025a9..c18e848 100644
  --- a/src/mesa/main/get_hash_params.py
  +++ b/src/mesa/main/get_hash_params.py
  @@ -742,6 +742,9 @@ descriptor=[
   # GL_ARB_vertex_attrib_binding
 [ MAX_VERTEX_ATTRIB_RELATIVE_OFFSET,
 CONTEXT_ENUM(Const.MaxVertexAttribRelativeOffset), NO_EXTRA ],
 [ MAX_VERTEX_ATTRIB_BINDINGS,
 CONTEXT_ENUM(Const.MaxVertexAttribBindings), NO_EXTRA ],
  +
  +# GL_ARB_compute_shader
  +  [ MAX_COMPUTE_WORK_GROUP_INVOCATIONS,
 CONTEXT_ENUM(Const.MaxComputeWorkGroupInvocations),
 extra_ARB_compute_shader ],

 CONTEXT_ENUM and CONTEXT_INT seem to follow the same code paths, but
 would CONTEXT_INT be a better fit here?


You're right, CONTEXT_INT makes more sense here.  I've changed it.



 Also, how do we decide between adding to gl_constants and just using
 CONST(1024)? My thought would be to use CONST unless we are pretty
 sure drivers will want to vary the value.


My reasoning for using gl_constants is that the value of the constant is
determined by a hardware limitation rather than a core Mesa software
limitation, hence it's likely to vary from one back end to the next.  I had
a quick look through our uses of CONST and it looks like most of the time
we use CONST it's when we're talking about a core Mesa software limitation
(e.g. MAX_MODELVIEW_STACK_DEPTH is CONST(MAX_MODELVIEW_STACK_DEPTH) because
that's the size of the per-context data structure we use to keep track of
the modelview matrix stack.

I admit that we haven't historically been terribly consistent about this
decision, though :)



 21-23 Reviewed-by: Jordan Justen jordan.l.jus...@intel.com

   ]},
 
   # Enums restricted to OpenGL Core profile
  diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
  index d3f3a30..6481dc1 100644
  --- a/src/mesa/main/mtypes.h
  +++ b/src/mesa/main/mtypes.h
  @@ -3323,6 +3323,7 @@ struct gl_constants
 
  /** GL_ARB_compute_shader */
  GLuint MaxComputeWorkGroupSize[3];
  +   GLuint MaxComputeWorkGroupInvocations;
   };
 
 
  --
  1.8.5.2
 
  ___
  mesa-dev mailing list
  mesa-dev@lists.freedesktop.org
  http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 26/30] main/cs: Implement query for COMPUTE_WORK_GROUP_SIZE.

2014-02-03 Thread Paul Berry
On 1 February 2014 22:28, Jordan Justen jljus...@gmail.com wrote:

 On Thu, Jan 9, 2014 at 6:19 PM, Paul Berry stereotype...@gmail.com
 wrote:
  ---
   src/mesa/main/shaderapi.c | 18 ++
   1 file changed, 18 insertions(+)
 
  diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
  index 053f27b..680d449 100644
  --- a/src/mesa/main/shaderapi.c
  +++ b/src/mesa/main/shaderapi.c
  @@ -663,6 +663,24 @@ get_programiv(struct gl_context *ctx, GLuint
 program, GLenum pname, GLint *param
 
 *params = shProg-NumAtomicBuffers;
 return;
  +   case GL_COMPUTE_WORK_GROUP_SIZE: {
  +  int i;
  +  if (!_mesa_is_desktop_gl(ctx) ||
 !ctx-Extensions.ARB_compute_shader)
  + break;
  +  if (!shProg-LinkStatus) {
  + _mesa_error(ctx, GL_INVALID_OPERATION, glGetProgramiv(program
 not 
  + linked successfully));

 Nit, but dropping  successfully sounds better to me. It seems more
 likely that they tried to get the value before linking than that they
 ignored a link error.


Sure, I can go along with that.  I've changed it to just
glGetProgramiv(program not linked).



 24-26 Reviewed-by: Jordan Justen jordan.l.jus...@intel.com

  + return;
  +  }
  +  if (shProg-_LinkedShaders[MESA_SHADER_COMPUTE] == NULL) {
  + _mesa_error(ctx, GL_INVALID_OPERATION, glGetProgramiv(no
 compute 
  + shaders));
  + return;
  +  }
  +  for (i = 0; i  3; i++)
  + params[i] = shProg-Comp.LocalSize[i];
  +  return;
  +   }
  default:
 break;
  }
  --
  1.8.5.2
 
  ___
  mesa-dev mailing list
  mesa-dev@lists.freedesktop.org
  http://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 29/30] i965/cs: Create the brw_compute_program struct, and the code to initialize it.

2014-02-03 Thread Paul Berry
On 1 February 2014 22:37, Jordan Justen jljus...@gmail.com wrote:

 On Thu, Jan 9, 2014 at 6:19 PM, Paul Berry stereotype...@gmail.com
 wrote:
  ---
   src/mesa/drivers/dri/i965/brw_context.h |  8 
   src/mesa/drivers/dri/i965/brw_program.c | 11 +++
   2 files changed, 19 insertions(+)
 
  diff --git a/src/mesa/drivers/dri/i965/brw_context.h
 b/src/mesa/drivers/dri/i965/brw_context.h
  index df32ccb..abc1783 100644
  --- a/src/mesa/drivers/dri/i965/brw_context.h
  +++ b/src/mesa/drivers/dri/i965/brw_context.h
  @@ -316,6 +316,14 @@ struct brw_fragment_program {
  GLuint id;  /** serial no. to identify frag progs, never re-used */
   };
 
  +
  +/** Subclass of Mesa compute program */
  +struct brw_compute_program {
  +   struct gl_compute_program program;
  +   unsigned id;  /** serial no. to identify frag progs, never re-used
 */

 frag in comment


Whoops, thanks.  I've changed it to serial no. to identify compute
progs



 Reviewed-by: Jordan Justen jordan.l.jus...@intel.com

  +};
  +
  +
   struct brw_shader {
  struct gl_shader base;
 
  diff --git a/src/mesa/drivers/dri/i965/brw_program.c
 b/src/mesa/drivers/dri/i965/brw_program.c
  index 90844e5..2d92acb 100644
  --- a/src/mesa/drivers/dri/i965/brw_program.c
  +++ b/src/mesa/drivers/dri/i965/brw_program.c
  @@ -113,6 +113,17 @@ static struct gl_program *brwNewProgram( struct
 gl_context *ctx,
 }
  }
 
  +   case GL_COMPUTE_PROGRAM_NV: {
  +  struct brw_compute_program *prog =
 CALLOC_STRUCT(brw_compute_program);
  +  if (prog) {
  + prog-id = get_new_program_id(brw-intelScreen);
  +
  + return _mesa_init_compute_program(ctx, prog-program, target,
 id);
  +  } else {
  + return NULL;
  +  }
  +   }
  +
  default:
 assert(!Unsupported target in brwNewProgram());
 return NULL;
  --
  1.8.5.2
 
  ___
  mesa-dev mailing list
  mesa-dev@lists.freedesktop.org
  http://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH RFC 10/11] glsl: add a pass to convert out of SSA form

2014-01-31 Thread Paul Berry
-else_instructions.push_tail(assign(var, phi-else_src));
 +
 +   phi-remove();
 +}
 +
 +static void
 +eliminate_phi_loop_begin(ir_phi_loop_begin *phi, ir_loop *ir, exec_list
 *instrs)
 +{
 +   ir_variable *var = insert_decl(instrs, phi-dest-type,
 ralloc_parent(ir));
 +   ir-body_instructions.push_head(phi-dest);
 +   phi-dest-insert_after(assign(phi-dest, var));
 +   phi-dest-data.mode = ir_var_temporary;
 +
 +   if (phi-enter_src != NULL)
 +  ir-insert_before(assign(var, phi-enter_src));
 +
 +   if (phi-repeat_src != NULL)
 +  ir-body_instructions.push_tail(assign(var, phi-repeat_src));
 +
 +   foreach_list(n, phi-continue_srcs) {
 +  ir_phi_jump_src *src = (ir_phi_jump_src *) n;
 +
 +  if (src-src != NULL)
 +src-jump-insert_before(assign(var, src-src));
 +   }
 +
 +   phi-remove();
 +}
 +
 +static void
 +eliminate_phi_loop_end(ir_phi_loop_end *phi, ir_loop *ir, exec_list
 *instrs)
 +{
 +   ir_variable *var = insert_decl(instrs, phi-dest-type,
 ralloc_parent(ir));
 +   ir-insert_after(phi-dest);
 +   phi-dest-insert_after(assign(phi-dest, var));
 +   phi-dest-data.mode = ir_var_temporary;
 +
 +   foreach_list(n, phi-break_srcs) {
 +  ir_phi_jump_src *src = (ir_phi_jump_src *) n;
 +
 +  if (src-src != NULL)
 +src-jump-insert_before(assign(var, src-src));
 +   }
 +
 +   phi-remove();
 +}
 +
 +namespace {
 +
 +class ir_from_ssa_visitor : public ir_hierarchical_visitor
 +{
 +public:
 +   ir_from_ssa_visitor(exec_list *base_instrs) : base_instrs(base_instrs)
 +   {
 +   }
 +
 +   virtual ir_visitor_status visit_leave(ir_if *);
 +   virtual ir_visitor_status visit_enter(ir_loop *);
 +   virtual ir_visitor_status visit_leave(ir_loop *);
 +   virtual ir_visitor_status visit(ir_dereference_variable *);
 +
 +private:
 +   exec_list *base_instrs;
 +};
 +
 +};
 +
 +ir_visitor_status
 +ir_from_ssa_visitor::visit_leave(ir_if *ir)
 +{
 +   foreach_list_safe(n, ir-phi_nodes) {
 +  eliminate_phi_if((ir_phi_if *) n, ir, this-base_instrs);
 +   }
 +
 +   return visit_continue;
 +}
 +
 +ir_visitor_status
 +ir_from_ssa_visitor::visit_enter(ir_loop *ir)
 +{
 +   foreach_list_safe(n, ir-begin_phi_nodes) {
 +  eliminate_phi_loop_begin((ir_phi_loop_begin *) n, ir,
 this-base_instrs);
 +   }
 +
 +   return visit_continue;
 +}
 +
 +ir_visitor_status
 +ir_from_ssa_visitor::visit_leave(ir_loop *ir)
 +{
 +   foreach_list_safe(n, ir-end_phi_nodes) {
 +  eliminate_phi_loop_end((ir_phi_loop_end *) n, ir,
 this-base_instrs);
 +   }
 +
 +   return visit_continue;
 +}
 +
 +ir_visitor_status
 +ir_from_ssa_visitor::visit(ir_dereference_variable *ir)
 +{
 +   if (this-in_assignee  ir-var-data.mode == ir_var_temporary_ssa) {
 +  this-base_ir-insert_before(ir-var);
 +  ir-var-data.mode = ir_var_temporary;
 +   }
 +
 +   return visit_continue;
 +}
 +
 +static void
 +convert_from_ssa_function(exec_list *instructions)
 +{
 +   ir_from_ssa_visitor v(instructions);
 +   v.run(instructions);
 +}
 +
 +void
 +convert_from_ssa(exec_list *instructions)
 +{
 +   foreach_list(node, instructions) {
 +  ir_instruction *ir = (ir_instruction *) node;
 +  ir_function *f = ir-as_function();
 +  if (f) {
 +foreach_list(sig_node, f-signatures) {
 +   ir_function_signature *sig = (ir_function_signature *)
 sig_node;
 +
 +   convert_from_ssa_function(sig-body);
 +}
 +  }
 +   }
 +}


It looks like the only reason you need to manually walk through the
functions and signatures is so that you can make sure that
ir_from_ssa_visitor::base_instrs always points to the correct function's
instruction list.  Normally the way we achieve that sort of thing is to do:

ir_visitor_status
ir_from_ssa_visitor::visit_enter(ir_function_signature *ir)
{
   this-base_instrs = ir-body;
   return visit_continue;
}

ir_visitor_status
ir_from_ssa_visitor::visit_leave(ir_function_signature *ir)
{
   this-base_instrs = NULL;
   return visit_continue;
}

That way you could get rid of convert_from_ssa_function() and
convert_from_ssa() would just become:

void
convert_from_ssa(exec_list *instructions)
{
   ir_from_ssa_visitor v;
   v.run(instructions);
}


I don't have really strong feelings about it, though.  Provided that the
confusing comment in eliminate_phi_if() is fixed, this patch is:

Reviewed-by: Paul Berry stereotype...@gmail.com
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH RFC 00/11] glsl: add Single Static Assignment (SSA)

2014-01-31 Thread Paul Berry
On 22 January 2014 09:16, Connor Abbott cwabbo...@gmail.com wrote:

 This series enables GLSL IR support for SSA, including passes to convert
 to and from SSA form. SSA is a form of the intermediate representation
 of a compiler in which each variable is assigned exactly once. SSA form
 makes many optimizations faster and easier to write, and enables other
 more powerful optimizations. SSA is used in GCC [1] and LLVM [2] as well
 as various compiler backends within Mesa itself, such as r600g-sb and
 Nouveau. Adding support for SSA will allow the various optimizations
 these backends perform to be implemented in one place, instead of
 making each driver reinvent the wheel (as several have already done).
 Additionally, all new backends would recieve these optimizations,
 reducing the burden of writing a compiler backend for a new driver.

 Even though no optimization passes are now implemented, I am putting out
 this series to solicit feedback on the design, to make sure I don't have
 to rewrite things before I go ahead and write these new passes.

 There are no piglit regressions on Softpipe, except for the
 spec/OpenGL 2.0/max-samplers test, which only passed before because the
 compiler happened to unroll the loop; the extra copies caused by the
 conversion to and from SSA stop the compiler from unrolling, meaning
 that the resulting GLSL IR code contains an indirect sampler index which
 glsl-to-tgsi can't handle.


I had a detailed look at your patches and I really like where you're going
with this.  I sent a lot of feedback in response to patches 4, 5, 7, 9, and
10; I don't think any of my feedback would require a major change to your
overall plan.  Nice work!  Consider patches 1, 2, 3, 6, 8, and 11:

Reviewed-by: Paul Berry stereotype...@gmail.com

Note, however, that probably patch 11 should be postponed until we've
written something that makes use of the SSA form (i.e. the GVN-GCM
algorithm you mentioned, or a direct conversion from GLSL IR to an SSA
backend).
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] glsl: Fix continue statements in do-while loops.

2014-01-31 Thread Paul Berry
From the GLSL 4.40 spec, section 6.4 (Jumps):

The continue jump is used only in loops. It skips the remainder of
the body of the inner most loop of which it is inside. For while
and do-while loops, this jump is to the next evaluation of the
loop condition-expression from which the loop continues as
previously defined.

Previously, we incorrectly treated a continue statement as jumping
to the top of a do-while loop.

This patch fixes the problem by replicating the loop condition when
converting the continue statement to IR.  (We already do a similar
thing in for loops, to ensure that continue causes the loop
expression to be executed).

Fixes piglit tests:
- glsl-fs-continue-inside-do-while.shader_test
- glsl-vs-continue-inside-do-while.shader_test
- glsl-fs-continue-in-switch-in-do-while.shader_test
- glsl-vs-continue-in-switch-in-do-while.shader_test

Cc: mesa-sta...@lists.freedesktop.org
---
 src/glsl/ast_to_hir.cpp | 23 ++-
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 950f513..8d096ad 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -4029,17 +4029,22 @@ ast_jump_statement::hir(exec_list *instructions,
 _mesa_glsl_error( loc, state,
  break may only appear in a loop or a switch);
   } else {
-/* For a loop, inline the for loop expression again,
- * since we don't know where near the end of
- * the loop body the normal copy of it
- * is going to be placed.
+/* For a loop, inline the for loop expression again, since we don't
+ * know where near the end of the loop body the normal copy of it is
+ * going to be placed.  Same goes for the condition for a do-while
+ * loop.
  */
 if (state-loop_nesting_ast != NULL 
-mode == ast_continue 
-state-loop_nesting_ast-rest_expression) {
-   state-loop_nesting_ast-rest_expression-hir(instructions,
- state);
-}
+mode == ast_continue) {
+if (state-loop_nesting_ast-rest_expression) {
+   state-loop_nesting_ast-rest_expression-hir(instructions,
+ state);
+}
+if (state-loop_nesting_ast-mode ==
+ast_iteration_statement::ast_do_while) {
+   state-loop_nesting_ast-condition_to_hir(instructions, state);
+}
+ }
 
 if (state-switch_state.is_switch_innermost 
 mode == ast_break) {
-- 
1.8.5.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] glsl: Make condition_to_hir() callable from outside ast_iteration_statement.

2014-01-31 Thread Paul Berry
In addition to making it public, we also need to change its first
argument from an ir_loop * to an exec_list *, so that it can be used
to insert the condition anywhere in the IR (rather than just in the
body of the loop).

This will be necessary in order to make continue statements work
properly in do-while loops.

Cc: mesa-sta...@lists.freedesktop.org
---
 src/glsl/ast.h  |  3 +--
 src/glsl/ast_to_hir.cpp | 10 +-
 2 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/src/glsl/ast.h b/src/glsl/ast.h
index 0bda28d..2d6f3a2 100644
--- a/src/glsl/ast.h
+++ b/src/glsl/ast.h
@@ -888,14 +888,13 @@ public:
 
ast_node *body;
 
-private:
/**
 * Generate IR from the condition of a loop
 *
 * This is factored out of ::hir because some loops have the condition
 * test at the top (for and while), and others have it at the end 
(do-while).
 */
-   void condition_to_hir(class ir_loop *, struct _mesa_glsl_parse_state *);
+   void condition_to_hir(exec_list *, struct _mesa_glsl_parse_state *);
 };
 
 
diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 1bfb4e5..950f513 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -4369,14 +4369,14 @@ ast_case_label::hir(exec_list *instructions,
 }
 
 void
-ast_iteration_statement::condition_to_hir(ir_loop *stmt,
+ast_iteration_statement::condition_to_hir(exec_list *instructions,
  struct _mesa_glsl_parse_state *state)
 {
void *ctx = state;
 
if (condition != NULL) {
   ir_rvalue *const cond =
-condition-hir( stmt-body_instructions, state);
+condition-hir(instructions, state);
 
   if ((cond == NULL)
  || !cond-type-is_boolean() || !cond-type-is_scalar()) {
@@ -4397,7 +4397,7 @@ ast_iteration_statement::condition_to_hir(ir_loop *stmt,
new(ctx) ir_loop_jump(ir_loop_jump::jump_break);
 
 if_stmt-then_instructions.push_tail(break_stmt);
-stmt-body_instructions.push_tail(if_stmt);
+instructions-push_tail(if_stmt);
   }
}
 }
@@ -4432,7 +4432,7 @@ ast_iteration_statement::hir(exec_list *instructions,
state-switch_state.is_switch_innermost = false;
 
if (mode != ast_do_while)
-  condition_to_hir(stmt, state);
+  condition_to_hir(stmt-body_instructions, state);
 
if (body != NULL)
   body-hir( stmt-body_instructions, state);
@@ -4441,7 +4441,7 @@ ast_iteration_statement::hir(exec_list *instructions,
   rest_expression-hir( stmt-body_instructions, state);
 
if (mode == ast_do_while)
-  condition_to_hir(stmt, state);
+  condition_to_hir(stmt-body_instructions, state);
 
if (mode != ast_do_while)
   state-symbols-pop_scope();
-- 
1.8.5.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 1/7] glsl: parse invocations layout qualifier for ARB_gpu_shader5

2014-01-31 Thread Paul Berry
On 28 January 2014 11:22, Jordan Justen jordan.l.jus...@intel.com wrote:

 _mesa_glsl_parse_state::gs_invocations will store the
 invocation count.

 Signed-off-by: Jordan Justen jordan.l.jus...@intel.com


This looks like it will work if the shader contains a line like this:

   layout(triangles, invocations=6) in;

But it won't handle this:

   layout(triangles) in;
   layout(invocations=6) in;

Although it's not explicitly mentioned in the GLSL spec, I believe that
this should be allowed too.  (The spec does explicitly allow similar
splitting of geometry shader output layout qualifiers.)

Note that the way we handle geometry shader input primitive types is
unusual, so it might not have been the best to imitate.  For other types of
shader-global layout qualifiers (i.e. geometry shader output primitive
count and geometry shader max_vertices), the code in glsl_parser.yy simply
merges the qualifiers into the global state-out_qualifier object at the
time that they are parsed.  The reason we handle geometry shader input
primitive types specially is because they have additional semantics (they
cause geometry shader input arrays to be resized), so we can't process them
until ast-to-hir time.  But that doesn't apply to the invocation count.

I'd recommend an in_qualifier field to _mesa_glsl_parse_state, and handling
invocations the same way that geometry shader output qualifiers are handled
(grep for -out_qualifier to how this works).  Unfortuneatly, because of
the extra semantics associated with input primitive types, we'll still have
to process them specially.

So I think the code in glsl_parser.yy will need to look something like this:

   | layout_qualifier IN_TOK ';'
   {
  void *ctx = state;
  $$ = NULL;
  if (state-stage != MESA_SHADER_GEOMETRY) {
 _mesa_glsl_error( @1, state,
  input layout qualifiers only valid in 
  geometry shaders);
  } else {
 if ($1.flags.q.prim_type) {
/* Make sure this is a valid input primitive type. */
switch ($1.prim_type) {
case GL_POINTS:
case GL_LINES:
case GL_LINES_ADJACENCY:
case GL_TRIANGLES:
case GL_TRIANGLES_ADJACENCY:
   $$ = new(ctx) ast_gs_input_layout(@1, $1);
   break;
default:
   _mesa_glsl_error(@1, state,
invalid geometry shader input primitive
type);
   break;
}
 }
 if (!state-in_qualifier-merge_qualifier( @1, state, $1))
YYERROR;
  }
   }
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 2/7] glsl/linker: produce gl_shader_program Geom.Invocations

2014-01-31 Thread Paul Berry
On 28 January 2014 11:22, Jordan Justen jordan.l.jus...@intel.com wrote:

 Grab the parsed invocation count, check for consistency
 during linking, and finally save the result in
 gl_shader_program Geom.Invocations.

 Signed-off-by: Jordan Justen jordan.l.jus...@intel.com
 ---
  src/glsl/glsl_parser_extras.cpp |  2 ++
  src/glsl/linker.cpp | 18 ++
  src/mesa/main/mtypes.h  |  2 ++
  3 files changed, 22 insertions(+)

 diff --git a/src/glsl/glsl_parser_extras.cpp
 b/src/glsl/glsl_parser_extras.cpp
 index 87784ed..3e98966 100644
 --- a/src/glsl/glsl_parser_extras.cpp
 +++ b/src/glsl/glsl_parser_extras.cpp
 @@ -1339,6 +1339,8 @@ set_shader_inout_layout(struct gl_shader *shader,
 if (state-out_qualifier-flags.q.max_vertices)
shader-Geom.VerticesOut = state-out_qualifier-max_vertices;

 +   shader-Geom.Invocations = state-gs_invocations;
 +


Considering my comment on patch 1, I think this will change to:

shader-Geom.Invocations = 0;
if (state-in_qualifier-flags.q.invocations)
   shader-Geom.Invocations = state-in_qualifier-invocations;



 if (state-gs_input_prim_type_specified) {
shader-Geom.InputType = state-gs_input_prim_type;
 } else {
 diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
 index 93b4754..800de0b 100644
 --- a/src/glsl/linker.cpp
 +++ b/src/glsl/linker.cpp
 @@ -1206,6 +1206,7 @@ link_gs_inout_layout_qualifiers(struct
 gl_shader_program *prog,
 unsigned num_shaders)
  {
 linked_shader-Geom.VerticesOut = 0;
 +   linked_shader-Geom.Invocations = 0;
 linked_shader-Geom.InputType = PRIM_UNKNOWN;
 linked_shader-Geom.OutputType = PRIM_UNKNOWN;

 @@ -1259,6 +1260,18 @@ link_gs_inout_layout_qualifiers(struct
 gl_shader_program *prog,
  }
  linked_shader-Geom.VerticesOut = shader-Geom.VerticesOut;
}
 +
 +  if (shader-Geom.Invocations != 0) {
 +if (linked_shader-Geom.Invocations != 0 
 +linked_shader-Geom.Invocations != shader-Geom.Invocations) {
 +   linker_error(prog, geometry shader defined with conflicting 
 +invocation count (%d and %d)\n,
 +linked_shader-Geom.Invocations,
 +shader-Geom.Invocations);
 +   return;
 +}
 +linked_shader-Geom.Invocations = shader-Geom.Invocations;
 +  }
 }

 /* Just do the intrastage - interstage propagation right now,
 @@ -1285,6 +1298,11 @@ link_gs_inout_layout_qualifiers(struct
 gl_shader_program *prog,
return;
 }
 prog-Geom.VerticesOut = linked_shader-Geom.VerticesOut;
 +
 +   if (linked_shader-Geom.Invocations == 0)
 +  linked_shader-Geom.Invocations = 1;


I think it would be equivalent if we dropped these two lines and simply
initialized linked_shader-Geom.Invocations to 1 at the top of the
function--is that right?


 +
 +   prog-Geom.Invocations = linked_shader-Geom.Invocations;
  }

  /**
 diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
 index 3d42a21..2d90b0a 100644
 --- a/src/mesa/main/mtypes.h
 +++ b/src/mesa/main/mtypes.h
 @@ -2401,6 +2401,7 @@ struct gl_shader
  */
 struct {
GLint VerticesOut;
 +  GLint Invocations;


It would be nice to have a comment above this declaration explaining that
if the shader didn't specify an invocation count, this value will be 1 (so
that back-end implementors know they don't have to coerce 0 to 1).


/**
 * GL_POINTS, GL_LINES, GL_LINES_ADJACENCY, GL_TRIANGLES, or
 * GL_TRIANGLES_ADJACENCY, or PRIM_UNKNOWN if it's not set in this
 @@ -2599,6 +2600,7 @@ struct gl_shader_program
 struct {
GLint VerticesIn;
GLint VerticesOut;
 +  GLint Invocations;


Same comment applies here.


GLenum InputType;  /** GL_POINTS, GL_LINES, GL_LINES_ADJACENCY_ARB,
GL_TRIANGLES, or GL_TRIANGLES_ADJACENCY_ARB
 */
GLenum OutputType; /** GL_POINTS, GL_LINE_STRIP or
 GL_TRIANGLE_STRIP */
 --
 1.8.5.3


With those minor issues addressed, this patch is:

Reviewed-by: Paul Berry stereotype...@gmail.com
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 4/7] main/shaderapi: GL_GEOMETRY_SHADER_INVOCATIONS GetProgramiv support

2014-01-31 Thread Paul Berry
On 28 January 2014 11:22, Jordan Justen jordan.l.jus...@intel.com wrote:

 Signed-off-by: Jordan Justen jordan.l.jus...@intel.com
 ---
  src/mesa/main/shaderapi.c | 6 ++
  1 file changed, 6 insertions(+)

 diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
 index a8336c9..fb107d5 100644
 --- a/src/mesa/main/shaderapi.c
 +++ b/src/mesa/main/shaderapi.c
 @@ -603,6 +603,12 @@ get_programiv(struct gl_context *ctx, GLuint program,
 GLenum pname, GLint *param
if (check_gs_query(ctx, shProg))
   *params = shProg-Geom.VerticesOut;
return;
 +   case GL_GEOMETRY_SHADER_INVOCATIONS:
 +  if (!has_core_gs)


This needs to be:

   if (!has_core_gs || !ctx-Extensions.ARB_gpu_shader5)

so that the query will generate the proper error message on platforms that
don't support ARB_gpu_shader5.


 + break;
 +  if (check_gs_query(ctx, shProg))
 + *params = shProg-Geom.Invocations;
 +  return;
 case GL_GEOMETRY_INPUT_TYPE:
if (!has_core_gs)
   break;
 --
 1.8.5.3


With that fixed, the patch is:

Reviewed-by: Paul Berry stereotype...@gmail.com
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 6/7] i965: support gl_InvocationID for gen7

2014-01-31 Thread Paul Berry
On 28 January 2014 11:22, Jordan Justen jordan.l.jus...@intel.com wrote:

 Signed-off-by: Jordan Justen jordan.l.jus...@intel.com
 ---
  src/mesa/drivers/dri/i965/brw_defines.h   |  5 +
  src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 24
 ---
  2 files changed, 26 insertions(+), 3 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_defines.h
 b/src/mesa/drivers/dri/i965/brw_defines.h
 index 7f4cd10..5fe1aba 100644
 --- a/src/mesa/drivers/dri/i965/brw_defines.h
 +++ b/src/mesa/drivers/dri/i965/brw_defines.h
 @@ -1500,6 +1500,11 @@ enum brw_message_target {
  # define BRW_GS_EDGE_INDICATOR_0   (1  8)
  # define BRW_GS_EDGE_INDICATOR_1   (1  9)

 +/* GS Thread Payload
 + */
 +/* R0 */
 +# define GEN7_GS_PAYLOAD_INSTANCE_ID_SHIFT 27
 +
  /* 3DSTATE_GS Output Vertex Size has an effective maximum of 62.  It's
   * counted in multiples of 16 bytes.
   */
 diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
 b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
 index 40743cc..12e137c 100644
 --- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
 @@ -51,9 +51,27 @@ vec4_gs_visitor::vec4_gs_visitor(struct brw_context
 *brw,
  dst_reg *
  vec4_gs_visitor::make_reg_for_system_value(ir_variable *ir)
  {
 -   /* Geometry shaders don't use any system values. */
 -   assert(!Unreached);
 -   return NULL;
 +   dst_reg *reg;
 +   src_reg r0(retype(brw_vec4_grf(0, 0), BRW_REGISTER_TYPE_UD));
 +
 +   switch (ir-data.location) {
 +   case SYSTEM_VALUE_INVOCATION_ID:
 +  this-current_annotation = initialize gl_InvocationID;
 +
 +  reg = new(mem_ctx) dst_reg(this, ir-type);
 +
 +  /* Copy and shift gen7 instance id from R0 into the
 +   * gl_InvocationID register.
 +   */
 +  emit(SHR(*reg, src_reg(r0),
 +   (uint32_t) GEN7_GS_PAYLOAD_INSTANCE_ID_SHIFT));
 +  break;
 +   default:
 +  assert(!not reached);
 +  break;
 +   }
 +
 +   return reg;
  }


This would do what you want if invocation ID 0 was delivered in the upper
bits of R0.0 and invocation ID 1 was delivered in the upper bits of R0.4.
But that's not the case.  They are delivered in R0.0 and R0.1
respectively.  I think you need to generate this instruction:

SHR (8) dst1 R01;4,0 GEN7_GS_PAYLOAD_INSTANCE_ID_SHIFT

Note that this instruction requires ALIGN1 mode and a goofy register
region, so you can't generate it directly from the visitor; what you'll
have to do instead is create a new opcode to do it.  If you want to see
examples where we've done similar things, grep for
GS_OPCODE_SET_WRITE_OFFSET, GS_OPCODE_SET_VERTEX_COUNT,
GS_OPCODE_SET_DWORD_2_IMMED, GS_OPCODE_PREPARE_CHANNEL_MASKS, or
GS_OPCODE_SET_CHANNEL_MASKS.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 7/7] i965: support instanced GS on gen7

2014-01-31 Thread Paul Berry
On 28 January 2014 11:22, Jordan Justen jordan.l.jus...@intel.com wrote:

 Signed-off-by: Jordan Justen jordan.l.jus...@intel.com
 ---
  src/mesa/drivers/dri/i965/brw_context.h   | 2 ++
  src/mesa/drivers/dri/i965/brw_defines.h   | 1 +
  src/mesa/drivers/dri/i965/brw_vec4_gs.c   | 2 ++
  src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 6 --
  src/mesa/drivers/dri/i965/gen7_gs_state.c | 2 ++
  5 files changed, 11 insertions(+), 2 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_context.h
 b/src/mesa/drivers/dri/i965/brw_context.h
 index 8d098e6..ade4458 100644
 --- a/src/mesa/drivers/dri/i965/brw_context.h
 +++ b/src/mesa/drivers/dri/i965/brw_context.h
 @@ -639,6 +639,8 @@ struct brw_gs_prog_data

 bool include_primitive_id;

 +   int invocations;
 +
 /**
  * True if the thread should be dispatched in DUAL_INSTANCE mode,
 false if
  * it should be dispatched in DUAL_OBJECT mode.
 diff --git a/src/mesa/drivers/dri/i965/brw_defines.h
 b/src/mesa/drivers/dri/i965/brw_defines.h
 index 5fe1aba..eaf6e8f 100644
 --- a/src/mesa/drivers/dri/i965/brw_defines.h
 +++ b/src/mesa/drivers/dri/i965/brw_defines.h
 @@ -1478,6 +1478,7 @@ enum brw_message_target {
  # define GEN7_GS_CONTROL_DATA_FORMAT_GSCTL_CUT 0
  # define GEN7_GS_CONTROL_DATA_FORMAT_GSCTL_SID 1
  # define GEN7_GS_CONTROL_DATA_HEADER_SIZE_SHIFT20
 +# define GEN7_GS_INSTANCE_CONTROL_SHIFT15
  # define GEN7_GS_DISPATCH_MODE_SINGLE  (0  11)
  # define GEN7_GS_DISPATCH_MODE_DUAL_INSTANCE   (1  11)
  # define GEN7_GS_DISPATCH_MODE_DUAL_OBJECT (2  11)
 diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs.c
 b/src/mesa/drivers/dri/i965/brw_vec4_gs.c
 index abc181b..3c6393f 100644
 --- a/src/mesa/drivers/dri/i965/brw_vec4_gs.c
 +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs.c
 @@ -48,6 +48,8 @@ do_gs_prog(struct brw_context *brw,
 c.prog_data.include_primitive_id =
(gp-program.Base.InputsRead  VARYING_BIT_PRIMITIVE_ID) != 0;

 +   c.prog_data.invocations = gp-program.Invocations;
 +
 /* Allocate the references to the uniforms that will end up in the
  * prog_data associated with the compiled program, and which will be
 freed
  * by the state cache.
 diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
 b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
 index 12e137c..21dbc29 100644
 --- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
 @@ -596,9 +596,11 @@ brw_gs_emit(struct brw_context *brw,
 }

 /* Compile the geometry shader in DUAL_OBJECT dispatch mode, if we can
 do
 -* so without spilling.
 +* so without spilling. If the GS invocations count  1, then we can't
 use
 +* dual object mode.
  */
 -   if (likely(!(INTEL_DEBUG  DEBUG_NO_DUAL_OBJECT_GS))) {
 +   if (c-prog_data.invocations = 1 ||
 +   likely(!(INTEL_DEBUG  DEBUG_NO_DUAL_OBJECT_GS))) {


This should be , not ||.

With that fixed, this patch is:

Reviewed-by: Paul Berry stereotype...@gmail.com

I sent comments on patches 1, 2, 4, 6, and 7.  Patches 3 and 5 are:

Reviewed-by: Paul Berry stereotype...@gmail.com
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH RFC 07/11] glsl: add SSA infrastructure

2014-01-28 Thread Paul Berry
On 22 January 2014 09:16, Connor Abbott cwabbo...@gmail.com wrote:

 This patch introduces all the changes to the IR that are necessary for
 representing programs in the SSA form. This consists of a new variable
 mode, the SSA temporary, which is guarenteed to be written to exactly
 once, and classes to represent phi nodes in the IR.

 In the current code, variables are first declared using an ir_variable
 instruction inserted into the instruction stream, and then every
 dereference will point to the ir_variable declared earlier. SSA
 temporaries, however, do not work this way. Instead, the variable
 is declared when it is assigned. That is, the variable is owned by
 the one and only instruction where it is defined.

 In SSA, phi nodes may exist at the beginning of any join nodes, or
 basic blocks with more than one predecessor. In our IR, this can happen
 in one of three places:

 - After an if statement, where the then branch and the else branch
 converge.
 - At the beginning of a loop, which can be reached from either before
 the loop (on the first iteration), the end of the loop (when we get to
 the end of the loop and jump back to the beginning), or any continue
 statement.
 - At the end of a loop, which can be reached from any break statement
 within the loop.

 Accordingly, there are three different types of phi nodes: if phi nodes,
 phi nodes at the beginning of a loop, and phi nodes at the end of a
 loop, all of which derive from the ir_phi base class.
 ---
  src/glsl/ir.cpp|  56 +++
  src/glsl/ir.h  | 196
 -
  src/glsl/ir_clone.cpp  | 147 ---
  src/glsl/ir_hierarchical_visitor.cpp   |  36 +
  src/glsl/ir_hierarchical_visitor.h |  11 ++
  src/glsl/ir_hv_accept.cpp  |  55 ++-
  src/glsl/ir_print_visitor.cpp  | 196
 -
  src/glsl/ir_print_visitor.h|  15 ++
  src/glsl/ir_validate.cpp   | 158 +++-
  src/glsl/ir_visitor.h  |   8 +
  src/mesa/drivers/dri/i965/brw_fs.h |   4 +
  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp   |  28 
  src/mesa/drivers/dri/i965/brw_vec4.h   |   4 +
  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp |  24 +++
  src/mesa/program/ir_to_mesa.cpp|  28 
  src/mesa/state_tracker/st_glsl_to_tgsi.cpp |  29 
  16 files changed, 956 insertions(+), 39 deletions(-)

 diff --git a/src/glsl/ir.cpp b/src/glsl/ir.cpp
 index 1a36bd6..f1ded80 100644
 --- a/src/glsl/ir.cpp
 +++ b/src/glsl/ir.cpp
 @@ -1229,6 +1229,37 @@ ir_loop::ir_loop()
  }


 +ir_phi::ir_phi()
 +{
 +   this-dest = NULL;
 +}


Rather than have a no-argument constructor for ir_phi that sets this-dest
to NULL, and then have the derived classes set dest to the appropriate
value, let's just do:

ir_phi::ir_phi(ir_variable *dest)
   : dest(dest)
{
}

Then in the derived classes we can just pass dest on through, e.g.:

ir_phi_if::ir_phi_if(ir_variable *dest, ir_variable *if_src,
 ir_variable *else_src)
   : ir_phi(dest), if_src(if_src), else_src(else_src)
{
   this-ir_type = ir_type_phi_if;
}



 +
 +
 +ir_phi_if::ir_phi_if(ir_variable *dest, ir_variable *if_src,
 +ir_variable *else_src)
 +   : if_src(if_src), else_src(else_src)
 +{
 +   this-ir_type = ir_type_phi_if;
 +   this-dest = dest;
 +}
 +
 +
 +ir_phi_loop_begin::ir_phi_loop_begin(ir_variable* dest, ir_variable*
 enter_src,
 +ir_variable* repeat_src)
 +   : enter_src(enter_src), repeat_src(repeat_src)
 +{
 +   this-ir_type = ir_type_phi_loop_begin;
 +   this-dest = dest;
 +}
 +
 +
 +ir_phi_loop_end::ir_phi_loop_end(ir_variable *dest)
 +{
 +   this-ir_type = ir_type_phi_loop_end;
 +   this-dest = dest;
 +}
 +
 +
  ir_dereference_variable::ir_dereference_variable(ir_variable *var)
  {
 assert(var != NULL);
 @@ -1554,6 +1585,9 @@ ir_variable::ir_variable(const struct glsl_type
 *type, const char *name,
 this-data.max_array_access = 0;
 this-data.atomic.buffer_index = 0;
 this-data.atomic.offset = 0;
 +   this-ssa_assignment = NULL;
 +   this-ssa_phi = NULL;
 +   this-ssa_call = NULL;

 if (type != NULL) {
if (type-base_type == GLSL_TYPE_SAMPLER)
 @@ -1722,12 +1756,19 @@ steal_memory(ir_instruction *ir, void *new_ctx)
  {
 ir_variable *var = ir-as_variable();
 ir_constant *constant = ir-as_constant();
 +   ir_dereference_variable *deref = ir-as_dereference_variable();
 +   ir_phi *phi = ir-as_phi();
 +   ir_phi_loop_begin *phi_loop_begin = ir-as_phi_loop_begin();
 +   ir_phi_loop_end *phi_loop_end = ir-as_phi_loop_end();
 if (var != NULL  var-constant_value != NULL)
steal_memory(var-constant_value, ir);

 if (var != NULL  var-constant_initializer != NULL)
steal_memory(var-constant_initializer, ir);

 +   if 

Re: [Mesa-dev] [PATCH RFC 07/11] glsl: add SSA infrastructure

2014-01-28 Thread Paul Berry
On 22 January 2014 09:16, Connor Abbott cwabbo...@gmail.com wrote:

 diff --git a/src/glsl/ir_clone.cpp b/src/glsl/ir_clone.cpp
 index cb732a5..7075579 100644
 --- a/src/glsl/ir_clone.cpp
 +++ b/src/glsl/ir_clone.cpp
 @@ -40,7 +40,15 @@ ir_rvalue::clone(void *mem_ctx, struct hash_table *ht)
 const
  ir_variable *
  ir_variable::clone(void *mem_ctx, struct hash_table *ht) const
  {
 -   ir_variable *var = new(mem_ctx) ir_variable(this-type, this-name,
 +   ir_variable *var;
 +
 +   if (ht) {
 +  var = (ir_variable *)hash_table_find(ht, this);
 +  if (var)
 +return var;
 +   }


Can you add a comment explaining why it's necessary to do a hashtable
lookup now?  (I'm guessing it's because a phi node can refer to an SSA
temporary that occurs later in the instruction stream, and therefore when
we go to clone an SSA temporary variable, it's possible that we've already
cloned it, but I'm not 100% certain about that).


 +
 +   var = new(mem_ctx) ir_variable(this-type, this-name,
(ir_variable_mode)
 this-data.mode);


Let's fix up this line to align the opening paren with the initial t in
this-type.


 +
 +ir_phi *
 +ir_phi::clone(void *mem_ctx, struct hash_table *ht) const
 +{
 +   /* shouldn't get here */
 +   assert(0);
 +   return new(mem_ctx) ir_phi();
 +}


There's no need to implement this function.  If we leave it out (and leave
out the corresponding declaration of clone() in ir.h's declaration of the
ir_phi class), then ir_phi will be an abstract base class, and the compiler
will make sure we never instantiate it.


 +static ir_phi_jump_src *
 +clone_phi_jump_src(ir_phi_jump_src *src, void *mem_ctx, struct hash_table
 *ht)
 +{
 +   ir_phi_jump_src *new_src = new(mem_ctx) ir_phi_jump_src();
 +   new_src-src = src-src-clone(mem_ctx, ht);
 +   new_src-jump = src-jump-clone(mem_ctx, ht);
 +   return new_src;


Let's make a constructor for ir_phi_jump_src that takes src and jump as
arguments, so that we can just do:

   return new(mem_ctx) ir_phi_jump_src(src-src-clone(mem_ctx, ht),
   src-jump-clone(mem_ctx, ht));

+
 +
 +void
 +ir_print_visitor::visit(ir_phi *ir)
 +{
 +   printf(error);
 +}


I think if we make ir_phi an abstract base class (like I suggested earlier)
we can drop this function entirely, as well as all the other visit(class
ir_phi *) functions introduced in this patch.


 +
 +
 +void
 +ir_print_visitor::visit(ir_phi_if *ir)
 +{
 +   printf((phi\n);


For consistency with the class names, I'd prefer for this to be:

   printf((phi_if\n);


 +
 +   indentation++;
 +   indent();
 +   ir-dest-accept(this);
 +   printf(\n);
 +
 +   indent();
 +   if (ir-if_src) {
 +  printf(%s , unique_name(ir-if_src));
 +   } else {
 +  printf((undefined) );
 +   }
 +   if (ir-else_src) {
 +  printf(%s), unique_name(ir-else_src));
 +   } else {
 +  printf((undefined)));
 +   }


Would it be worth modifying unique_name() so that it returns (undefined)
if it's passed a null pointer?  That would allow us to simplify the 10
lines above to:

   printf(%s %s, unique_name(ir-if_src), unique_name(ir-else_src));

As well as similar simplifications in the functions below.


 +   indentation--;
 +}
 +
 +void
 +ir_print_visitor::print_phi_jump_src(ir_phi_jump_src *src)
 +{
 +   printf((%s , unique_name(src-jump));


I think this is going to make the output IR difficult to follow, since it
won't be obvious that (break@1 ...) represents a phi node rather than a
break instruction.  I'd recommend changing this to:

   printf((phi_jump_src %s , unique_name(src-jump));


 +   if (src-src) {
 +  printf( %s)\n, unique_name(src-src));
 +   } else {
 +  printf( (undefined))\n);
 +   }
 +}
 +
 +void
 +ir_print_visitor::visit(ir_phi_loop_begin *ir)
 +{
 +   printf((phi\n);


As with ir_phi_if, I'd recommend changing this to:

   printf((phi_loop_begin\n);

A similar comment applies to ir_print_visitor::visit(ir_phi_loop_end *).


 +
 +   indentation++;
 +   indent();
 +   ir-dest-accept(this);
 +   printf(\n);
 +
 +   indent();
 +   if (ir-enter_src) {
 +  printf(%s , unique_name(ir-enter_src));
 +   } else {
 +  printf((undefined) );
 +   }
 +   if (ir-repeat_src) {
 +  printf(%s\n, unique_name(ir-repeat_src));
 +   } else {
 +  printf((undefined)\n);
 +   }
 +
 +   foreach_list(n, ir-continue_srcs) {
 +  ir_phi_jump_src *src = (ir_phi_jump_src *) n;
 +  indent();
 +  print_phi_jump_src(src);
 +   }
 +
 +   indentation--;
 +   indent();
 +   printf());
 +}
 +
 +
 +
 +
 +ir_visitor_status
 +ir_validate::visit(ir_phi_loop_begin *ir)
 +{
 +   validate_phi_dest(ir, this-ht);
 +
 +   /*
 +* note: ir_phi_loop_begin is a special case in that variables may be,
 and in
 +* fact normally are, defined *after* they are used in everything
 except
 +* enter_src
 +*/
 +
 +   if (ir-enter_src) {
 +  validate_var_use(ir-enter_src, this-ht, (void *) ir,
 ir_phi_loop_begin);
 +   }
 +
 

Re: [Mesa-dev] [PATCH 1/2] mesa: Ensure that transform feedback refers to the correct program.

2014-01-28 Thread Paul Berry
On 28 January 2014 13:15, Carl Worth cwo...@cworth.org wrote:

 Paul Berry stereotype...@gmail.com writes:
  Previous to this patch, the _mesa_{Begin,Resume}TransformFeedback
  functions were using ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX]

 I see that this commit was nominated for the stable branch when it was
 committed.

 However, it appears to depend at least on commit b22146dc714 which is on


I think you mean 3b22146dc714 (mesa: Replace
ctx-Shader.Current{Vertex,Fragment,Geometry}Program with an array.).


 master, but is not on the 10.0 branch. So this patch cannot be applied
 as-is to 10.0.

 My current inclination is to simply drop this patch from the list of
 candidates for 10.0. If you think differently, please recommend an
 alternate plan.

 -Carl


(Also from IRC: cworth stereotype441: In my recent mail, I should have
included 43e77215b13 as well. (It's another patch I plan to drop from
stable unless you guide me differently.))

In principle we could make versions of these patches that don't depend on
commit 3b22146dc714, but I don't think it's worth it--the bugs I've fixed
in these patches are unlikely to arise until ARB_separate_shader_objects is
supported.

So yeah, I'm ok with dropping these two patches from the list of candidates
for 10.0.

I believe it was Ken who initially suggested marking these patches as
candidates for stable.  Cc'ing him in case he wants to express a dissenting
opinion.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH RFC 04/11] glsl: add dead branch analysis

2014-01-27 Thread Paul Berry
 conditions:
 + *
 + * The above copyright notice and this permission notice (including the
 next
 + * paragraph) shall be included in all copies or substantial portions of
 the
 + * Software.
 + *
 + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND,
 EXPRESS OR
 + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 MERCHANTABILITY,
 + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT
 SHALL
 + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
 OTHER
 + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
 + * DEALINGS IN THE SOFTWARE.
 + */
 +
 +/**
 + * \file ir_dead_branches.h
 + *
 + * Provides a visitor which determines, for each if instruction, whether
 + * control will never flow the from the then-block or else-block
 + * to the next instruction due to jump statements (break, continue,
 return,
 + * discard).
 + */
 +
 +#include ir.h
 +#include ir_visitor.h
 +
 +class ir_dead_branches


It would be nice to have a comment above this class indicating that one
ir_dead_branches object is created for each ir_if.  (It's clear from
context, but for a casual user of this visitor it might not be obvious).


 +{
 +public:
 +   ir_dead_branches(ir_if *ir);
 +
 +   ir_if *ir;
 +
 +   /**
 +* whether a jump statement is guarenteed to be hit when the
 +* then_instructions are run, making the branch from the
 then_instructions
 +* dead
 +*/
 +   bool then_dead;
 +   bool else_dead; /**  ditto for the else_instructions */
 +
 +   /** whether the then branch is dead due to a return or discard */


This comment is ambiguous.  Some people interpret a boolean which is
documented as whether A or B as meaning A if the bool is true; B if the
bool is false.  I'd recommend changing this comment to something like
True if the then branch is dead due to a return or discard; false if the
then branch is dead due to a continue.


 +   bool then_dead_return;
 +   bool else_dead_return; /**  ditto for else branch */
 +};
 +
 +class ir_dead_branches_visitor : public ir_hierarchical_visitor


It would be nice to have a short comment above this class giving a brief
overview of how the class is intended to be used (i.e. construct it, visit
the IR using it, then call get_dead_branches() to retrieve the dead branch
information corresponding to a given ir_if block).


 +{
 +public:
 +   ir_dead_branches_visitor();
 +   ~ir_dead_branches_visitor();
 +
 +   virtual ir_visitor_status visit_enter(ir_if *);
 +   virtual ir_visitor_status visit_enter(ir_loop *);
 +   virtual ir_visitor_status visit(ir_loop_jump *);
 +   virtual ir_visitor_status visit_enter(ir_return *);
 +   virtual ir_visitor_status visit_enter(ir_discard *);
 +
 +   ir_dead_branches *get_dead_branches(ir_if *ir);
 +
 +private:
 +   void visit_return();
 +
 +   ir_if *outer_if;
 +   bool in_loop;


It would be nice to have a comment above in_loop explaining that it is
only true if we're visiting code inside a loop that is contained within
outer_if.  (That is, when we visit an if that's nested inside a loop, we
reset in_loop to false).  Without such a comment, someone might erroneously
think that in_loop is true whenever we're visiting code that is contained
within a loop (to arbitrarily deep nesting), which isn't true.


 +   bool in_then;


Similarly, it would be nice to have a comment above in_loop explaining
that it indicates whether we are traversing the then branch or the else
branch of the if statement indicated by outer_if.  Without this comment,
someone might think that in_then is true whenever we're visiting code that
is contained within the then branch of any if statement, to arbitrarily
deep nesting, which isn't true.


 +
 +   struct hash_table *ht;


It would be nice to have a comment above this field explaining what the
keys and the values in the hash table are.


 +};
 --
 1.8.3.1


One final comment: I notice that the only way in which
ir_dead_branches_visitor::outer_if is used (other than comparing it to
NULL) is to pass it to get_dead_branches().  How about if instead of
storing outer_if in the ir_dead_branches_visitor class, we simply store a
pointer to the associated ir_dead_branches object.  That way we would avoid
a lot of unnecessary hashtable lookups.

But I don't have a terribly strong feeling about that.

With all the comment changes, this patch is:

Reviewed-by: Paul Berry stereotype...@gmail.com
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH RFC 05/11] glsl: add loop jump visitor

2014-01-27 Thread Paul Berry
On 22 January 2014 09:16, Connor Abbott cwabbo...@gmail.com wrote:

 +class ir_loop_jumps_visitor : public ir_hierarchical_visitor
 +{
 +public:
 +   ir_loop_jumps_visitor();
 +   ~ir_loop_jumps_visitor();
 +
 +   virtual ir_visitor_status visit_enter(ir_loop *);
 +   virtual ir_visitor_status visit(ir_loop_jump *);
 +
 +   ir_loop_jumps *get_loop_jumps(ir_loop *ir);
 +
 +private:
 +   ir_loop *outer_loop;


As with outer_if in the previous patch, I think we could save some
unnecessary hashtable lookups if we simply stored the ir_loop_jumps * here
instead of storing the ir_loop *.  But I don't feel terribly strongly about
it.

Either way, the patch is:

Reviewed-by: Paul Berry stereotype...@gmail.com
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: Add image type to the GLSL IR.

2014-01-22 Thread Paul Berry
On 22 January 2014 04:50, Francisco Jerez curroje...@riseup.net wrote:

 Paul Berry stereotype...@gmail.com writes:

  On 15 January 2014 11:42, Francisco Jerez curroje...@riseup.net wrote:
 
  v2: Reuse the glsl_sampler_dim enum for images.  Reuse the
  glsl_type::sampler_* fields instead of creating new ones specific
  to image types.  Reuse the same constructor as for samplers adding
  a new 'base_type' argument.
  ---
  Is this what you had in mind Paul?
 
 
  Yes, that seems reasonable.  Thanks.

 Does your previous reviewed-by still apply for v2 of this patch?

 Thanks.


Yes.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] mesa: Ensure that transform feedback refers to the correct program.

2014-01-22 Thread Paul Berry
Previous to this patch, the _mesa_{Begin,Resume}TransformFeedback
functions were using ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX] to
find the program that would be the source of transform feedback data.
This isn't correct--if there's a geometry shader present it should be
ctx-Shader.CurrentProgram[MESA_SHADER_GEOMETRY].  (These might be
different if separate shader objects are in use).

This patch creates a function get_xfb_source(), which figures out the
correct program to use based on GL state, and updates
_mesa_{Begin,Resume}TransformFeedback to call it.  get_xfb_source() is
written in terms of the gl_shader_stage enum, so it should not need
modification when we add tessellation shaders in the future.  It also
creates a new driver flag, NewTransformFeedbackProg, which is flagged
whenever this program changes.

To reduce future confusion, this patch also rewords some comments and
error message text to avoid referring to vertex shaders.
---
 src/mesa/main/mtypes.h|  8 --
 src/mesa/main/transformfeedback.c | 52 +--
 2 files changed, 45 insertions(+), 15 deletions(-)

diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 3dd9678..7fd3298 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -1815,8 +1815,9 @@ struct gl_transform_feedback_object
 
/**
 * The shader program active when BeginTransformFeedback() was called.
-* When active and unpaused, this equals
-* ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX].
+* When active and unpaused, this equals ctx-Shader.CurrentProgram[stage],
+* where stage is the pipeline stage that is the source of data for
+* transform feedback.
 */
struct gl_shader_program *shader_program;
 
@@ -3779,6 +3780,9 @@ struct gl_driver_flags
/** gl_context::TransformFeedback::CurrentObject */
GLbitfield NewTransformFeedback;
 
+   /** gl_context::TransformFeedback::CurrentObject::shader_program */
+   GLbitfield NewTransformFeedbackProg;
+
/** gl_context::RasterDiscard */
GLbitfield NewRasterizerDiscard;
 
diff --git a/src/mesa/main/transformfeedback.c 
b/src/mesa/main/transformfeedback.c
index 74897ba..9376a9e 100644
--- a/src/mesa/main/transformfeedback.c
+++ b/src/mesa/main/transformfeedback.c
@@ -24,7 +24,7 @@
 
 
 /*
- * Vertex transform feedback support.
+ * Transform feedback support.
  *
  * Authors:
  *   Brian Paul
@@ -376,25 +376,48 @@ _mesa_compute_max_transform_feedback_vertices(
  **/
 
 
+/**
+ * Figure out which stage of the pipeline is the source of transform feedback
+ * data given the current context state, and return its gl_shader_program.
+ *
+ * If no active program can generate transform feedback data (i.e. no vertex
+ * shader is active), returns NULL.
+ */
+static struct gl_shader_program *
+get_xfb_source(struct gl_context *ctx)
+{
+   int i;
+   for (i = MESA_SHADER_FRAGMENT - 1; i = MESA_SHADER_VERTEX; i--) {
+  if (ctx-Shader.CurrentProgram[i] != NULL)
+ return ctx-Shader.CurrentProgram[i];
+   }
+   return NULL;
+}
+
+
 void GLAPIENTRY
 _mesa_BeginTransformFeedback(GLenum mode)
 {
struct gl_transform_feedback_object *obj;
-   struct gl_transform_feedback_info *info;
+   struct gl_transform_feedback_info *info = NULL;
+   struct gl_shader_program *source;
GLuint i;
unsigned vertices_per_prim;
GET_CURRENT_CONTEXT(ctx);
 
obj = ctx-TransformFeedback.CurrentObject;
 
-   if (ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX] == NULL) {
+   /* Figure out what pipeline stage is the source of data for transform
+* feedback.
+*/
+   source = get_xfb_source(ctx);
+   if (source == NULL) {
   _mesa_error(ctx, GL_INVALID_OPERATION,
   glBeginTransformFeedback(no program active));
   return;
}
 
-   info =
-  ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX]-LinkedTransformFeedback;
+   info = source-LinkedTransformFeedback;
 
if (info-NumOutputs == 0) {
   _mesa_error(ctx, GL_INVALID_OPERATION,
@@ -452,7 +475,10 @@ _mesa_BeginTransformFeedback(GLenum mode)
   obj-GlesRemainingPrims = max_vertices / vertices_per_prim;
}
 
-   obj-shader_program = ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX];
+   if (obj-shader_program != source) {
+  ctx-NewDriverState |= ctx-DriverFlags.NewTransformFeedbackProg;
+  obj-shader_program = source;
+   }
 
assert(ctx-Driver.BeginTransformFeedback);
ctx-Driver.BeginTransformFeedback(ctx, mode, obj);
@@ -519,7 +545,7 @@ bind_buffer_range(struct gl_context *ctx, GLuint index,
 
 
 /**
- * Specify a buffer object to receive vertex shader results.  Plus,
+ * Specify a buffer object to receive transform feedback results.  Plus,
  * specify the starting offset to place the results, and max size.
  * Called from the glBindBufferRange() function.
  */
@@ -563,7 +589,7 @@ _mesa_bind_buffer_range_transform_feedback(struct 
gl_context *ctx,
 
 
 /**
- * Specify a buffer object to receive vertex shader results.
+ * Specify a 

[Mesa-dev] [PATCH 2/2] i965/gen7: Use to the correct program when uploading transform feedback state.

2014-01-22 Thread Paul Berry
Transform feedback may come from either the geometry shader or the
vertex shader, so we can't use
ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX] to find the current
post-link transform feedback information.  Fortunately we can use
ctx-TransformFeedback.CurrentObject-shader_program.
---
 src/mesa/drivers/dri/i965/brw_state_upload.c |  1 +
 src/mesa/drivers/dri/i965/gen7_sol_state.c   | 15 +--
 2 files changed, 6 insertions(+), 10 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c 
b/src/mesa/drivers/dri/i965/brw_state_upload.c
index af554eb..ca256c2 100644
--- a/src/mesa/drivers/dri/i965/brw_state_upload.c
+++ b/src/mesa/drivers/dri/i965/brw_state_upload.c
@@ -305,6 +305,7 @@ void brw_init_state( struct brw_context *brw )
STATIC_ASSERT(BRW_NUM_STATE_BITS = 8 * sizeof(brw-state.dirty.brw));
 
ctx-DriverFlags.NewTransformFeedback = BRW_NEW_TRANSFORM_FEEDBACK;
+   ctx-DriverFlags.NewTransformFeedbackProg = BRW_NEW_TRANSFORM_FEEDBACK;
ctx-DriverFlags.NewRasterizerDiscard = BRW_NEW_RASTERIZER_DISCARD;
ctx-DriverFlags.NewUniformBuffer = BRW_NEW_UNIFORM_BUFFER;
ctx-DriverFlags.NewAtomicBuffer = BRW_NEW_ATOMIC_BUFFER;
diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c 
b/src/mesa/drivers/dri/i965/gen7_sol_state.c
index 1ca2b74..5d2f243 100644
--- a/src/mesa/drivers/dri/i965/gen7_sol_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c
@@ -39,14 +39,11 @@ static void
 upload_3dstate_so_buffers(struct brw_context *brw)
 {
struct gl_context *ctx = brw-ctx;
-   /* BRW_NEW_VERTEX_PROGRAM */
-   const struct gl_shader_program *vs_prog =
-  ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX];
-   const struct gl_transform_feedback_info *linked_xfb_info =
-  vs_prog-LinkedTransformFeedback;
/* BRW_NEW_TRANSFORM_FEEDBACK */
struct gl_transform_feedback_object *xfb_obj =
   ctx-TransformFeedback.CurrentObject;
+   const struct gl_transform_feedback_info *linked_xfb_info =
+  xfb_obj-shader_program-LinkedTransformFeedback;
int i;
 
/* Set up the up to 4 output buffers.  These are the ranges defined in the
@@ -102,12 +99,11 @@ gen7_upload_3dstate_so_decl_list(struct brw_context *brw,
  const struct brw_vue_map *vue_map)
 {
struct gl_context *ctx = brw-ctx;
-   /* BRW_NEW_VERTEX_PROGRAM */
-   const struct gl_shader_program *vs_prog =
-  ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX];
/* BRW_NEW_TRANSFORM_FEEDBACK */
+   struct gl_transform_feedback_object *xfb_obj =
+  ctx-TransformFeedback.CurrentObject;
const struct gl_transform_feedback_info *linked_xfb_info =
-  vs_prog-LinkedTransformFeedback;
+  xfb_obj-shader_program-LinkedTransformFeedback;
uint16_t so_decl[128];
int buffer_mask = 0;
int next_offset[4] = {0, 0, 0, 0};
@@ -260,7 +256,6 @@ const struct brw_tracked_state gen7_sol_state = {
.dirty = {
   .mesa  = (_NEW_LIGHT),
   .brw   = (BRW_NEW_BATCH |
-   BRW_NEW_VERTEX_PROGRAM |
 BRW_NEW_VUE_MAP_GEOM_OUT |
 BRW_NEW_TRANSFORM_FEEDBACK)
},
-- 
1.8.5.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] mesa: Ensure that transform feedback refers to the correct program.

2014-01-22 Thread Paul Berry
On 22 January 2014 08:20, Kenneth Graunke kenn...@whitecape.org wrote:

 On 01/22/2014 06:07 AM, Paul Berry wrote:
  @@ -376,25 +376,48 @@ _mesa_compute_max_transform_feedback_vertices(
**/
 
 
  +/**
  + * Figure out which stage of the pipeline is the source of transform
 feedback
  + * data given the current context state, and return its
 gl_shader_program.
  + *
  + * If no active program can generate transform feedback data (i.e. no
 vertex
  + * shader is active), returns NULL.
  + */
  +static struct gl_shader_program *
  +get_xfb_source(struct gl_context *ctx)
  +{
  +   int i;
  +   for (i = MESA_SHADER_FRAGMENT - 1; i = MESA_SHADER_VERTEX; i--) {

 I think this would be clearer as:

 for (i = MESA_SHADER_GEOMETRY; i = MESA_SHADER_VERTEX; i--) {
...
 }

 Note that the pipeline ordering is:
 Vertex - Tess. Control - Tess. Eval - Geometry - Transform Feedback
 (http://www.opengl.org/wiki/Rendering_Pipeline_Overview)

 So either implementation would work even with tessellation shaders.

 Either way, this series is:
 Reviewed-by: Kenneth Graunke kenn...@whitecape.org
 Cc: 10.0 mesa-sta...@lists.freedesktop.org


That's a good point--I like your suggestion.  Thanks for the review!
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [v2 23/23] i965/blorp: switch eu-emitter to use FS IR and fs_generator

2014-01-22 Thread Paul Berry
On 22 January 2014 09:17, Topi Pohjolainen topi.pohjolai...@intel.comwrote:

 No regressions on IVB (piglit quick + unit tests).

 v2 (Paul):
   - no need to patch the unit tests anymore. Original logic
 was altered and unit tests updated to match the
 fs-generator
   - lrp emission moves from the blorp compiler core into the
 emitter here (previously there was a separate refactoring
 patch which is not really needed anymore as the lrp logic
 got refactored when the original lrp logic got fixed).
   - pass 'BRW_BLORP_RENDERBUFFER_BINDING_TABLE_INDEX' to the
 generator in fs_inst::target instead of hardcoding it

 CC: Paul Berry stereotype...@gmail.com
 Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com


Thanks, Topi.  The whole series is now:

Reviewed-by: Paul Berry stereotype...@gmail.com
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH V3 7/8] glsl: remove remaining is_array variables

2014-01-22 Thread Paul Berry
On 22 January 2014 03:33, Timothy Arceri t_arc...@yahoo.com.au wrote:

 Previously the reason we needed is_array was because we used array_size ==
 NULL to
 represent both non-arrays and unsized arrays.  Now that we use a non-NULL
 array_specifier to represent an unsized array, is_array is redundant.

 Signed-off-by: Timothy Arceri t_arc...@yahoo.com.au
 ---
  src/glsl/ast.h  | 32 +++-
  src/glsl/ast_to_hir.cpp | 10 +-
  src/glsl/ast_type.cpp   |  6 ++
  src/glsl/glsl_parser.yy | 30 +++---
  src/glsl/glsl_parser_extras.cpp | 15 ++-
  5 files changed, 35 insertions(+), 58 deletions(-)


When I try to build this patch I get the compile error:

  CXX  ast_to_hir.lo
../../src/glsl/ast_to_hir.cpp: In member function 'virtual ir_rvalue*
ast_declarator_list::hir(exec_list*, _mesa_glsl_parse_state*)':
../../src/glsl/ast_to_hir.cpp:2877:12: error: 'class ast_declaration' has
no member named 'is_array'
   assert(!decl-is_array);
^

I'm guessing you missed this because you are building in release mode, so
asserts didn't get compiled.  If that's the case, I'd like to encourage you
to do a debug build and double-check that piglit tests still pass.  You can
do that by adding the --enable-debug option when you run ./autogen.sh.


With that fixed, the series is:

Reviewed-by: Paul Berry stereotype...@gmail.com

I'll try to push my patch glsl: Simplify aggregate type inference to
prepare for ARB_arrays_of_arrays. within the next hour.  Do you have
commit access to Mesa or would you like me to push the series for you?
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965: Replace *_generator::shader with is_glsl boolean.

2014-01-22 Thread Paul Berry
The shader field in fs_generator, vec4_generator, and gen8_generator
was only used for one purpose; to figure out if we were compiling an
assembly shader or a GLSL shader.  And it wasn't being used properly:
in vec4 shaders we were always initializing it based on
prog-_LinkedShaders[MESA_SHADER_FRAGMENT], regardless of whether we
were compiling a geometry shader or a vertex shader.

This was a fairly benign problem, since it's unlikely that a
real-world program will try to mix and match GLSL and assembly shaders
using separate shader objects.  But it seems worth fixing.

This patch replaces the shader field with a new is_glsl boolean, and
initializes it based on information from the caller, so that it always
refers to the correct shader stage.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp  |  6 --
 src/mesa/drivers/dri/i965/brw_fs.h|  8 +---
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp| 12 ++--
 src/mesa/drivers/dri/i965/brw_vec4.cpp|  4 ++--
 src/mesa/drivers/dri/i965/brw_vec4.h  |  8 +---
 src/mesa/drivers/dri/i965/brw_vec4_generator.cpp  | 11 +--
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp |  7 +--
 src/mesa/drivers/dri/i965/gen8_fs_generator.cpp   | 13 ++---
 src/mesa/drivers/dri/i965/gen8_generator.cpp  |  6 --
 src/mesa/drivers/dri/i965/gen8_generator.h|  5 +++--
 src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp | 10 +-
 11 files changed, 50 insertions(+), 40 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index a0e4830..c0d65d5 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -3512,11 +3512,13 @@ brw_wm_fs_emit(struct brw_context *brw, struct 
brw_wm_compile *c,
 
const unsigned *assembly = NULL;
if (brw-gen = 8) {
-  gen8_fs_generator g(brw, c, prog, fp, v.dual_src_output.file != 
BAD_FILE);
+  gen8_fs_generator g(brw, c, prog, fp, v.dual_src_output.file != BAD_FILE,
+  shader != NULL);
   assembly = g.generate_assembly(v.instructions, simd16_instructions,
  final_assembly_size);
} else {
-  fs_generator g(brw, c, prog, fp, v.dual_src_output.file != BAD_FILE);
+  fs_generator g(brw, c, prog, fp, v.dual_src_output.file != BAD_FILE,
+ shader != NULL);
   assembly = g.generate_assembly(v.instructions, simd16_instructions,
  final_assembly_size);
}
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index a903908..ad0aa99 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -512,7 +512,8 @@ public:
 struct brw_wm_compile *c,
 struct gl_shader_program *prog,
 struct gl_fragment_program *fp,
-bool dual_source_output);
+bool dual_source_output,
+bool is_glsl);
~fs_generator();
 
const unsigned *generate_assembly(exec_list *simd8_instructions,
@@ -615,7 +616,6 @@ private:
struct brw_wm_compile *c;
 
struct gl_shader_program *prog;
-   struct gl_shader *shader;
const struct gl_fragment_program *fp;
 
unsigned dispatch_width; /** 8 or 16 */
@@ -623,6 +623,7 @@ private:
exec_list discard_halt_patches;
bool dual_source_output;
void *mem_ctx;
+   const bool is_glsl;
 };
 
 /**
@@ -637,7 +638,8 @@ public:
  struct brw_wm_compile *c,
  struct gl_shader_program *prog,
  struct gl_fragment_program *fp,
- bool dual_source_output);
+ bool dual_source_output,
+ bool is_glsl);
~gen8_fs_generator();
 
const unsigned *generate_assembly(exec_list *simd8_instructions,
diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index e701fc5..a8e81b8 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
@@ -40,14 +40,14 @@ fs_generator::fs_generator(struct brw_context *brw,
struct brw_wm_compile *c,
struct gl_shader_program *prog,
struct gl_fragment_program *fp,
-   bool dual_source_output)
+   bool dual_source_output,
+   bool is_glsl)
 
-   : brw(brw), c(c), prog(prog), fp(fp), dual_source_output(dual_source_output)
+   : brw(brw), c(c), prog(prog), fp(fp),
+ dual_source_output(dual_source_output), is_glsl(is_glsl)
 {
ctx = brw-ctx;
 
-   shader = prog ? prog-_LinkedShaders[MESA_SHADER_FRAGMENT] : NULL;
-
mem_ctx = c;
 
p = rzalloc(mem_ctx, struct brw_compile);
@@ -1301,7 +1301,7 @@ fs_generator::generate_code(exec_list *instructions)

Re: [Mesa-dev] Two small blorp fixes found during refactoring

2014-01-21 Thread Paul Berry
On 21 January 2014 08:07, Topi Pohjolainen topi.pohjolai...@intel.comwrote:

 These are not known to have caused any problems but they would get
 fixed when switching into fs-generator. Both fixes need
 corresponding patching of the unit tests and fixing them aforehand
 makes the switch possible without any changes in the tests.

 Topi Pohjolainen (2):
   i965/blorp: patch jump counters also for endif
   i965/blorp: use BRW_COMPRESSION_2NDHALF for second half LPR

  src/mesa/drivers/dri/i965/brw_blorp_blit.cpp   | 48
 ++
  .../drivers/dri/i965/test_blorp_blit_eu_gen.cpp| 30 +++---
  2 files changed, 47 insertions(+), 31 deletions(-)


Series is:

Reviewed-by: Paul Berry stereotype...@gmail.com
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH V2 1/8] mesa: Add ARB_arrays_of_arrays

2014-01-21 Thread Paul Berry
On 21 January 2014 04:19, Timothy Arceri t_arc...@yahoo.com.au wrote:

 Signed-off-by: Timothy Arceri t_arc...@yahoo.com.au
 ---
  src/mesa/main/extensions.c | 1 +
  src/mesa/main/mtypes.h | 1 +
  2 files changed, 2 insertions(+)


Since this is a GLSL extension, you also need to update
glcpp_parser_create() (in glcpp-parse.y) to call add_builtin_define(parser,
GL_ARB_arrays_of_arrays) if extensions-ARB_arrays_of_arrays is set.

With that fixed, this patch is:

Reviewed-by: Paul Berry stereotype...@gmail.com

Personally I would also add the extension to the
_mesa_glsl_supported_extensions[] table in this patch, as well as add the
bools ARB_arrays_of_arrays_enable and ARB_arrays_of_arrays_warn to
_mesa_glsl_parse_state.  I see you've done that as part of patch 3/8
instead.  I suppose that's a matter of personal preference though.



 diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c
 index 2e0ccc3..95eb7ca 100644
 --- a/src/mesa/main/extensions.c
 +++ b/src/mesa/main/extensions.c
 @@ -80,6 +80,7 @@ static const struct extension extension_table[] = {
 /* ARB Extensions */
 { GL_ARB_ES2_compatibility,
 o(ARB_ES2_compatibility),   GL, 2009 },
 { GL_ARB_ES3_compatibility,
 o(ARB_ES3_compatibility),   GL, 2012 },
 +   { GL_ARB_arrays_of_arrays,
  o(ARB_arrays_of_arrays),GL, 2012 },
 { GL_ARB_base_instance,   o(ARB_base_instance),
   GL, 2011 },
 { GL_ARB_blend_func_extended,
 o(ARB_blend_func_extended), GL, 2009 },
 { GL_ARB_clear_buffer_object, o(dummy_true),
  GL, 2012 },
 diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
 index 33df682..eef67a8 100644
 --- a/src/mesa/main/mtypes.h
 +++ b/src/mesa/main/mtypes.h
 @@ -3366,6 +3366,7 @@ struct gl_extensions
 GLboolean ANGLE_texture_compression_dxt;
 GLboolean ARB_ES2_compatibility;
 GLboolean ARB_ES3_compatibility;
 +   GLboolean ARB_arrays_of_arrays;
 GLboolean ARB_base_instance;
 GLboolean ARB_blend_func_extended;
 GLboolean ARB_color_buffer_float;
 --
 1.8.3.1

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH V2 2/8] glsl: Add arrays_of_arrays to yacc definition

2014-01-21 Thread Paul Berry
On 21 January 2014 04:19, Timothy Arceri t_arc...@yahoo.com.au wrote:

 Signed-off-by: Timothy Arceri t_arc...@yahoo.com.au
 ---
  src/glsl/glsl_parser.yy | 153
 +---
  1 file changed, 66 insertions(+), 87 deletions(-)

 diff --git a/src/glsl/glsl_parser.yy b/src/glsl/glsl_parser.yy
 index 1c56d6f..6d63668 100644
 --- a/src/glsl/glsl_parser.yy
 +++ b/src/glsl/glsl_parser.yy
 @@ -1611,19 +1565,55 @@ storage_qualifier:
 }
 ;

 -type_specifier:
 -   type_specifier_nonarray
 -   | type_specifier_nonarray '[' ']'
 +array_specifier:
 +   '[' ']'
 {
 -  $$ = $1;
 -  $$-is_array = true;
 -  $$-array_size = NULL;
 +  void *ctx = state;
 +  $$ = new(ctx) ast_array_specifier();
 +  $$-is_unsized_array = true;
 +  $$-set_location(yylloc);
 +   }
 +   | '[' constant_expression ']'
 +   {
 +  void *ctx = state;
 +  $$ = new(ctx) ast_array_specifier();
 +  $$-array_dimensions.push_tail( $2-link);
 +  $$-is_unsized_array = false;
 +  $$-set_location(yylloc);
 +   }
 +   | array_specifier '[' ']'
 +   {
 +  if (!state-ARB_arrays_of_arrays_enable) {
 + _mesa_glsl_error( @1, state,
 +  #version 120 / GL_ARB_arrays_of_arrays 
 +  required for defining arrays of arrays);


This error message is confusing.  It could be reasonably interpreted as
saying that either version 120 or GL_ARB_arrays_of_arrays is sufficient to
allow arrays of arrays, which isn't true.  I would just drop the mention of
version 120 entirely--the only reason it's significant at all is that
version 120 is the first version of GLSL to mention that arrays of arrays
aren't allowed, and that was just because it was left out of version 110 as
an oversight.



 +  } else {
 + _mesa_glsl_error( @1, state,
 +  only the outermost array dimension can 
 +  be unsized);
 +  }


To prevent uninitialized data in the error condition, I'd recommend doing

$$ = $1;

here.  Otherwse we might crash before the user can see the error message.


 }
 -   | type_specifier_nonarray '[' constant_expression ']'
 +   | array_specifier '[' constant_expression ']'
 +   {
 +  if (!state-ARB_arrays_of_arrays_enable) {
 + _mesa_glsl_error( @1, state,
 +  #version 120 / GL_ARB_arrays_of_arrays 
 +  required for defining arrays of arrays);


Similar comment about dropping #version 120 from the error message here.


 +  }
 +


I think we need to add:

$$ = $1;

here.


 +  $$-array_dimensions.push_tail( $3-link);
 +  $$-dimension_count++;
 +  $$-set_location(yylloc);


Assuming I'm right about adding $$ = $1; above, there's no need to call
set_location() here.  The location was already set when the
ast_array_specifier was created.


 +   }
 +   ;
 +
 +type_specifier:
 +   type_specifier_nonarray
 +   | type_specifier_nonarray array_specifier
 {
$$ = $1;
$$-is_array = true;
 -  $$-array_size = $3;
 +  $$-array_specifier = $2;
 }
 ;


With those changes, this patch is:

Reviewed-by: Paul Berry stereotype...@gmail.com

Note, however, that some of my review comments on patch 3/8 will probably
affect the content of this patch.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH V2 4/8] glsl: only call mark_max_array if we are assigning an array

2014-01-21 Thread Paul Berry
On 21 January 2014 04:19, Timothy Arceri t_arc...@yahoo.com.au wrote:

 Signed-off-by: Timothy Arceri t_arc...@yahoo.com.au
 ---
  src/glsl/ast_to_hir.cpp | 6 --
  1 file changed, 4 insertions(+), 2 deletions(-)


Can you say more in the commit message about why you're making this
change?  Is this fixing an existing bug?  Is it preparing for a later
change (which would otherwise create a bug if the change weren't made)?  Or
are you doing it just because it seems reasonable to do?



 diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
 index 326aa58..d02c9ff 100644
 --- a/src/glsl/ast_to_hir.cpp
 +++ b/src/glsl/ast_to_hir.cpp
 @@ -830,8 +830,10 @@ do_assignment(exec_list *instructions, struct
 _mesa_glsl_parse_state *state,

  rhs-type-array_size());
  d-type = var-type;
}
 -  mark_whole_array_access(rhs);
 -  mark_whole_array_access(lhs);
 +  if (lhs-type-is_array()) {
 + mark_whole_array_access(rhs);
 + mark_whole_array_access(lhs);
 +  }
 }

 /* Most callers of do_assignment (assign, add_assign, pre_inc/dec,
 --
 1.8.3.1

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] glsl: Simplify aggregate type inference to prepare for ARB_arrays_of_arrays.

2014-01-21 Thread Paul Berry
Most of the time it is not necessary to perform type inference to
compile GLSL; the type of every expression can be inferred from the
contents of the expression itself (and previous type declarations).
The exception is aggregate initializers: their type is determined by
the LHS of the variable being assigned to.  For example, in the
statement:

   mat2 foo = { { 1, 2 }, { 3, 4 } };

the type of { 1, 2 } is only known to be vec2 (as opposed to, say,
ivec2, uvec2, int[2], or a struct) because of the fact that the result
is being assigned to a mat2.

Previous to this patch, we handled this situation by doing some type
inference during parsing: when parsing a declaration like the one
above, we would call _mesa_set_aggregate_type(), which would infer the
type of each aggregate initializer and store it in the corresponding
ast_aggregate_initializer::constructor_type field.  Since this
happened at parse time, we couldn't do the type inference using
glsl_type objects; we had to use ast_type_specifiers, which are much
more awkward to work with.  Things are about to get more complicated
when we add support for ARB_arrays_of_arrays.

This patch simplifies things by postponing the call to
_mesa_set_aggregate_type() until ast-to-hir time, when we have access
to glsl_type objects.  As a side benefit, we only need to have one
call to _mesa_set_aggregate_type() now, instead of six.
---

Timothy: I was inspired to write this patch by the complexities you
encountered during [PATCH V2 5/8] glsl: Aggregate initializer support
for arrays of array.  Can you try rebasing your series on top of this
patch to see if it simplifies things?  I believe that with these
changes, you should be able to drop patch 5/8 entirely.

 src/glsl/ast.h  |  16 +++--
 src/glsl/ast_function.cpp   |   8 +--
 src/glsl/ast_to_hir.cpp |   7 +++
 src/glsl/glsl_parser.yy |  27 
 src/glsl/glsl_parser_extras.cpp | 136 +++-
 5 files changed, 46 insertions(+), 148 deletions(-)

diff --git a/src/glsl/ast.h b/src/glsl/ast.h
index 76911f0..b24052b 100644
--- a/src/glsl/ast.h
+++ b/src/glsl/ast.h
@@ -296,7 +296,16 @@ public:
   /* empty */
}
 
-   ast_type_specifier *constructor_type;
+   /**
+* glsl_type of the aggregate, which is inferred from the LHS of whatever
+* the aggregate is being used to initialize.  This can't be inferred at
+* parse time (since the parser deals with ast_type_specifiers, not
+* glsl_types), so the parser leaves it NULL.  However, the ast-to-hir
+* conversion code makes sure to fill it in with the appropriate type
+* before hir() is called.
+*/
+   const glsl_type *constructor_type;
+
virtual ir_rvalue *hir(exec_list *instructions,
   struct _mesa_glsl_parse_state *state);
 };
@@ -978,9 +987,8 @@ _mesa_ast_array_index_to_hir(void *mem_ctx,
 YYLTYPE loc, YYLTYPE idx_loc);
 
 extern void
-_mesa_ast_set_aggregate_type(const ast_type_specifier *type,
- ast_expression *expr,
- _mesa_glsl_parse_state *state);
+_mesa_ast_set_aggregate_type(const glsl_type *type,
+ ast_expression *expr);
 
 void
 emit_function(_mesa_glsl_parse_state *state, ir_function *f);
diff --git a/src/glsl/ast_function.cpp b/src/glsl/ast_function.cpp
index 2d05d07..4c5b0e4 100644
--- a/src/glsl/ast_function.cpp
+++ b/src/glsl/ast_function.cpp
@@ -1687,14 +1687,12 @@ ast_aggregate_initializer::hir(exec_list *instructions,
 {
void *ctx = state;
YYLTYPE loc = this-get_location();
-   const char *name;
 
if (!this-constructor_type) {
   _mesa_glsl_error(loc, state, type of C-style initializer unknown);
   return ir_rvalue::error_value(ctx);
}
-   const glsl_type *const constructor_type =
-  this-constructor_type-glsl_type(name, state);
+   const glsl_type *const constructor_type = this-constructor_type;
 
if (!state-ARB_shading_language_420pack_enable) {
   _mesa_glsl_error(loc, state, C-style initialization requires the 
@@ -1702,12 +1700,12 @@ ast_aggregate_initializer::hir(exec_list *instructions,
   return ir_rvalue::error_value(ctx);
}
 
-   if (this-constructor_type-is_array) {
+   if (constructor_type-is_array()) {
   return process_array_constructor(instructions, constructor_type, loc,
this-expressions, state);
}
 
-   if (this-constructor_type-structure) {
+   if (constructor_type-is_record()) {
   return process_record_constructor(instructions, constructor_type, loc,
 this-expressions, state);
}
diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 4cc8eb1..8d13610 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -2593,6 +2593,13 @@ process_initializer(ir_variable *var, ast_declaration 
*decl,
   ? attribute : varying);
   

Re: [Mesa-dev] [PATCH V2 5/8] glsl: Aggregate initializer support for arrays of array

2014-01-21 Thread Paul Berry
On 21 January 2014 04:19, Timothy Arceri t_arc...@yahoo.com.au wrote:

 Signed-off-by: Timothy Arceri t_arc...@yahoo.com.au
 ---
  src/glsl/ast.h  | 19 +++-
  src/glsl/ast_function.cpp   | 14 +++--
  src/glsl/ast_to_hir.cpp | 29 +-
  src/glsl/glsl_parser.yy | 36 +-
  src/glsl/glsl_parser_extras.cpp | 68
 +++--
  5 files changed, 139 insertions(+), 27 deletions(-)


I believe I've found an easier solution to this: please see [PATCH] glsl:
Simplify aggregate type inference to prepare for ARB_arrays_of_arrays.,
which I believe makes this patch unnecessary.



 diff --git a/src/glsl/ast.h b/src/glsl/ast.h
 index 4dda32e..8bccca7 100644
 --- a/src/glsl/ast.h
 +++ b/src/glsl/ast.h
 @@ -308,10 +308,16 @@ public:
: ast_expression(ast_aggregate, NULL, NULL, NULL),
  constructor_type(NULL)
 {
 -  /* empty */
 +  array_dimension = 1;
 +  is_array = false;
 }

 ast_type_specifier *constructor_type;
 +   unsigned array_dimension;
 +
 +   ast_array_specifier *array_specifier;
 +   bool is_array;
 +
 virtual ir_rvalue *hir(exec_list *instructions,
struct _mesa_glsl_parse_state *state);
  };
 @@ -588,6 +594,11 @@ public:
  struct _mesa_glsl_parse_state *state)
const;

 +   const struct glsl_type *glsl_type(const char **name,
 + struct _mesa_glsl_parse_state *state,
 + bool skip_outer_dimension)
 +  const;
 +
 virtual void print(void) const;

 ir_rvalue *hir(exec_list *, struct _mesa_glsl_parse_state *);
 @@ -1005,4 +1016,10 @@ extern void
  check_builtin_array_max_size(const char *name, unsigned size,
   YYLTYPE loc, struct _mesa_glsl_parse_state
 *state);

 +extern const glsl_type *
 +process_array_type(YYLTYPE *loc, const glsl_type *base,
 +   ast_array_specifier *array_specifier,
 +  struct _mesa_glsl_parse_state *state,
 +   bool skip_first_dim);
 +
  #endif /* AST_H */
 diff --git a/src/glsl/ast_function.cpp b/src/glsl/ast_function.cpp
 index 2d05d07..3b38cb6 100644
 --- a/src/glsl/ast_function.cpp
 +++ b/src/glsl/ast_function.cpp
 @@ -1693,8 +1693,18 @@ ast_aggregate_initializer::hir(exec_list
 *instructions,
_mesa_glsl_error(loc, state, type of C-style initializer
 unknown);
return ir_rvalue::error_value(ctx);
 }
 -   const glsl_type *const constructor_type =
 -  this-constructor_type-glsl_type(name, state);
 +   const glsl_type *constructor_type =
 +  this-constructor_type-glsl_type(name, state,
 +this-array_dimension  1 
 !this-is_array);
 +
 +   /* This only handles vec4 foo[..].  The earlier
 constructor_type-glsl_type(...)
 +* call already handled the vec4[..] foo case.
 +*/
 +   if (this-is_array) {
 +  constructor_type =
 + process_array_type(loc, constructor_type, this-array_specifier,
 +state, this-array_dimension  1);
 +   }

 if (!state-ARB_shading_language_420pack_enable) {
_mesa_glsl_error(loc, state, C-style initialization requires the 
 diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
 index d02c9ff..226d128 100644
 --- a/src/glsl/ast_to_hir.cpp
 +++ b/src/glsl/ast_to_hir.cpp
 @@ -1817,10 +1817,11 @@ process_array_size(exec_node *node,
 return result;
  }

 -static const glsl_type *
 +const glsl_type *
  process_array_type(YYLTYPE *loc, const glsl_type *base,
 ast_array_specifier *array_specifier,
 -   struct _mesa_glsl_parse_state *state)
 +   struct _mesa_glsl_parse_state *state,
 +   bool skip_first_dim)
  {
 const glsl_type *array_type = NULL;
 const glsl_type *element_type = base;
 @@ -1865,6 +1866,8 @@ process_array_type(YYLTYPE *loc, const glsl_type
 *base,

unsigned array_size;
for (/* nothing */; !node-is_head_sentinel(); node = node-prev) {
 + if (skip_first_dim  node-prev-is_head_sentinel())
 +break;
   array_size = process_array_size(node, state);
   array_type_temp = glsl_type::get_array_instance(array_type_temp,
   array_size);
 @@ -1891,6 +1894,14 @@ const glsl_type *
  ast_type_specifier::glsl_type(const char **name,
   struct _mesa_glsl_parse_state *state) const
  {
 +   return this-glsl_type(name, state, false);
 +}
 +
 +const glsl_type *
 +ast_type_specifier::glsl_type(const char **name,
 + struct _mesa_glsl_parse_state *state,
 +  bool skip_outer_dim) const
 +{
 const struct glsl_type *type;

 type = state-symbols-get_type(this-type_name);
 @@ -1898,7 +1909,8 @@ 

Re: [Mesa-dev] [PATCH V2 6/8] glsl: Allow arrays of arrays as input to vertex shader

2014-01-21 Thread Paul Berry
On 21 January 2014 04:19, Timothy Arceri t_arc...@yahoo.com.au wrote:

 Signed-off-by: Timothy Arceri t_arc...@yahoo.com.au
 ---
  src/glsl/ast_to_hir.cpp | 4 
  1 file changed, 4 insertions(+)

 diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
 index 226d128..62b7ec2 100644
 --- a/src/glsl/ast_to_hir.cpp
 +++ b/src/glsl/ast_to_hir.cpp
 @@ -3176,6 +3176,10 @@ ast_declarator_list::hir(exec_list *instructions,
if (state-is_version(120, 300))
   break;
/* FALLTHROUGH */
 +case GLSL_TYPE_ARRAY:
 +   if (state-ARB_arrays_of_arrays_enable)
 +  break;
 +   /* FALLTHROUGH */
 default:
_mesa_glsl_error( loc, state,
 vertex shader input / attribute cannot
 have 


I see two problems with this:

1. Since the case above has a fall-through, this has an unintended side
effect: if ARB_arrays_of_arrays is enabled, then integer vertex attributes
will be allowed prior to GLSL 1.20.  Although it's unlikely that anyone
will ever try to write a GLSL 1.10 shader using ARB_arrays_of_arrays, it
would be nice to get the logic right.

2. When the type of the input is an array of arrays, we need to recurse
through it to make sure it doesn't contain any prohibited types (e.g. an
array of arrays of structs should still be prohibited).

I think the easiest solution to this would be to replace the line:

const glsl_type *check_type = var-type-is_array()
   ? var-type-fields.array : var-type;

with something like this:

const glsl_type *check_type = var-type;
while (check_type-is_array())
   check_type = check_type-element_type();

Then we can leave the switch statement alone.

With that fixed, this patch is:

Reviewed-by: Paul Berry stereotype...@gmail.com
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH V2 6/8] glsl: Allow arrays of arrays as input to vertex shader

2014-01-21 Thread Paul Berry
On 21 January 2014 18:23, Paul Berry stereotype...@gmail.com wrote:

 On 21 January 2014 04:19, Timothy Arceri t_arc...@yahoo.com.au wrote:

 Signed-off-by: Timothy Arceri t_arc...@yahoo.com.au
 ---
  src/glsl/ast_to_hir.cpp | 4 
  1 file changed, 4 insertions(+)

 diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
 index 226d128..62b7ec2 100644
 --- a/src/glsl/ast_to_hir.cpp
 +++ b/src/glsl/ast_to_hir.cpp
 @@ -3176,6 +3176,10 @@ ast_declarator_list::hir(exec_list *instructions,
if (state-is_version(120, 300))
   break;
/* FALLTHROUGH */
 +case GLSL_TYPE_ARRAY:
 +   if (state-ARB_arrays_of_arrays_enable)
 +  break;
 +   /* FALLTHROUGH */
 default:
_mesa_glsl_error( loc, state,
 vertex shader input / attribute cannot
 have 


 I see two problems with this:

 1. Since the case above has a fall-through, this has an unintended side
 effect: if ARB_arrays_of_arrays is enabled, then integer vertex attributes
 will be allowed prior to GLSL 1.20.  Although it's unlikely that anyone
 will ever try to write a GLSL 1.10 shader using ARB_arrays_of_arrays, it
 would be nice to get the logic right.

 2. When the type of the input is an array of arrays, we need to recurse
 through it to make sure it doesn't contain any prohibited types (e.g. an
 array of arrays of structs should still be prohibited).

 I think the easiest solution to this would be to replace the line:

 const glsl_type *check_type = var-type-is_array()
? var-type-fields.array : var-type;

 with something like this:

 const glsl_type *check_type = var-type;
 while (check_type-is_array())
check_type = check_type-element_type();

 Then we can leave the switch statement alone.

 With that fixed, this patch is:

 Reviewed-by: Paul Berry stereotype...@gmail.com


As an aside, this is just one example of what I suspect are many places
where the GLSL compiler makes the implicit assumption that an array can't
contain an array.  Once we think ARB_arrays_of_arrays support is complete,
it would probably be worth it for someone with a lot of familiarity with
the GLSL front end to grep for is_array() to make sure we haven't missed
anything :)
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH V2 7/8] glsl: create type name for arrays of arrays

2014-01-21 Thread Paul Berry
On 21 January 2014 04:19, Timothy Arceri t_arc...@yahoo.com.au wrote:

 We need to insert outermost dimensions in the correct spot otherwise
 the dimension order will be backwards

 Signed-off-by: Timothy Arceri t_arc...@yahoo.com.au
 ---
  src/glsl/glsl_types.cpp | 14 --
  1 file changed, 12 insertions(+), 2 deletions(-)

 diff --git a/src/glsl/glsl_types.cpp b/src/glsl/glsl_types.cpp
 index 12d4ac0..f7f3117 100644
 --- a/src/glsl/glsl_types.cpp
 +++ b/src/glsl/glsl_types.cpp
 @@ -300,8 +300,18 @@ glsl_type::glsl_type(const glsl_type *array, unsigned
 length) :

 if (length == 0)
snprintf(n, name_length, %s[], array-name);
 -   else
 -  snprintf(n, name_length, %s[%u], array-name, length);
 +   else {
 +  /* insert outermost dimensions in the correct spot */
 +  const char *pos = strchr(array-name, '[');
 +  if (pos) {
 + int idx = pos - array-name;
 + snprintf(n, idx+1, %s, array-name);
 + snprintf(n + idx, name_length, [%u], length);
 + snprintf(n + strlen(n), name_length, %s, array-name + idx);


Technically the last two snprintf's are unsafe, since the second argument
is a length, so we need to subtract the offset we supplied to the first
argument.

Also, there's no need to split the second snprintf from the third; both of
them can be combined to:

snprintf(n + idx, name_length - idx, [%u]%s, length, array-name + idx);

Good catch on noticing that it's necessary to reverse the order of the
dimensions.  With that fixed, this patch is:

Reviewed-by: Paul Berry stereotype...@gmail.com


 +  } else {
 + snprintf(n, name_length, %s[%u], array-name, length);
 +  }
 +   }

 this-name = n;
  }
 --
 1.8.3.1

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH V2 8/8] docs: Mark ARB_arrays_of_arrays as started

2014-01-21 Thread Paul Berry
On 21 January 2014 04:19, Timothy Arceri t_arc...@yahoo.com.au wrote:

 Signed-off-by: Timothy Arceri t_arc...@yahoo.com.au
 ---
  docs/GL3.txt | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)


Reviewed-by: Paul Berry stereotype...@gmail.com
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH V2 3/8] glsl: Add array specifier to ast code

2014-01-21 Thread Paul Berry
On 21 January 2014 14:54, Timothy Arceri t_arc...@yahoo.com.au wrote:

 Hi Paul

 Thanks for the thorough review and taking the time to explain how to make
 improvements. I can already see a number of improvements that can be made
 to patch 5 based on the suggestions you have already made so you could hold
 off reviewing that one if you havent started already and I've post a new
 series in a couple of days.


No problem.  As you can see, I already sent out my own ideas about patch
5/8.  Sorry about the mid-air collision on that--I didn't notice your reply
until I had already finished my review of the whole series.  Hopefully my
patch will do what you need, and you won't need to work on patch 5 any
further.

Thanks for working on this!
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: Add image type to the GLSL IR.

2014-01-21 Thread Paul Berry
On 15 January 2014 11:42, Francisco Jerez curroje...@riseup.net wrote:

 v2: Reuse the glsl_sampler_dim enum for images.  Reuse the
 glsl_type::sampler_* fields instead of creating new ones specific
 to image types.  Reuse the same constructor as for samplers adding
 a new 'base_type' argument.
 ---
 Is this what you had in mind Paul?


Yes, that seems reasonable.  Thanks.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 10/23] i965/fs: Remove fs_reg::sechalf.

2014-01-21 Thread Paul Berry
On 15 January 2014 13:53, Francisco Jerez curroje...@riseup.net wrote:

 Paul Berry stereotype...@gmail.com writes:

  On 2 December 2013 11:31, Francisco Jerez curroje...@riseup.net wrote:
 
  +/**
  + * Get either of the 8-component halves of a 16-component register.
  + */
  +static inline fs_reg
  +half(const fs_reg reg, unsigned idx)
  +{
  +   assert(idx == 0 || (reg.file != HW_REG  reg.file != IMM));
  +   return byte_offset(reg, 8 * idx * reg.stride * type_sz(reg.type));
  +}
  +
 
 
  I'd like to see a comment explaining that it's safe to call this function
  on a register with 32 bits per component, since brw_reg.h's byte_offset()
  handles byte offsets greater than 32 by incrementing the register number.
 
 I've added a comment on the allowed range of fs_reg::subreg_offset here
 [1].  Do you think that's clear enough?


The comment you added is good, but it doesn't address my concern.  My
concern is that callers of half() might see the phrase Get either of the
8-component halves of a 16-component register and reasonably think this
means get either of the 8-components halves of a register that stores 16
components in its 256 bits.  Since most 16-wide FS registers are
actually register pairs that store 16 components in 512 bits, they might be
misled into thinking that half() doesn't work on these register pairs.
Adding a comment to fs_reg::subreg_offset doesn't help that situation
because it doesn't change the misleading text above the half() function.

How about changing the comment above half() to say this:

/**
 * Get either of the 8-component halves of a 16-component register.
 *
 * Note: this also works if \c reg represents a SIMD16 pair of registers.
 */


not realize that is is safe to call it on a register with 32 bits per
component, because they might erroneously think oh, this function only
works on 16-component registers; that is, registers that



 Thanks.

  Also, for sanity's sake, it would be nice for this function to include
 the
  assertion:
 
 assert(idx  2);
 
  With those changes, this patch is:
 
  Reviewed-by: Paul Berry stereotype...@gmail.com

 [1]
 http://cgit.freedesktop.org/~currojerez/mesa/commit/?h=image-load-storeid=14ec9dfd42fec1ee5e5e7f48f98a36fe620cf4e6

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 10/23] i965/fs: Remove fs_reg::sechalf.

2014-01-21 Thread Paul Berry
On 21 January 2014 19:21, Paul Berry stereotype...@gmail.com wrote:


 not realize that is is safe to call it on a register with 32 bits per
 component, because they might erroneously think oh, this function only
 works on 16-component registers; that is, registers that


Oops, sorry about the extraneous sentence fragment there.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 13/23] i965/fs: Take into account reg_offset consistently for MRF regs.

2014-01-21 Thread Paul Berry
On 15 January 2014 14:01, Francisco Jerez curroje...@riseup.net wrote:

 Paul Berry stereotype...@gmail.com writes:

  On 2 December 2013 11:31, Francisco Jerez curroje...@riseup.net wrote:
 
  Until now it was only being taken into account in the VEC4 back-end
  but not in the FS back-end.  Do it in both cases.
  ---
   src/mesa/drivers/dri/i965/brw_fs.h |  2 +-
   src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 10 ++
   src/mesa/drivers/dri/i965/brw_shader.h |  7 ---
   3 files changed, 11 insertions(+), 8 deletions(-)
 
  diff --git a/src/mesa/drivers/dri/i965/brw_fs.h
  b/src/mesa/drivers/dri/i965/brw_fs.h
  index 2c36d9f..f918f7e 100644
  --- a/src/mesa/drivers/dri/i965/brw_fs.h
  +++ b/src/mesa/drivers/dri/i965/brw_fs.h
  @@ -615,4 +615,4 @@ bool brw_do_channel_expressions(struct exec_list
  *instructions);
   bool brw_do_vector_splitting(struct exec_list *instructions);
   bool brw_fs_precompile(struct gl_context *ctx, struct gl_shader_program
  *prog);
 
  -struct brw_reg brw_reg_from_fs_reg(fs_reg *reg);
  +struct brw_reg brw_reg_from_fs_reg(fs_reg *reg, unsigned
 dispatch_width);
  diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
  b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
  index 8d310a1..1de59eb 100644
  --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
  +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
  @@ -981,8 +981,9 @@ static uint32_t brw_file_from_reg(fs_reg *reg)
   }
 
   struct brw_reg
  -brw_reg_from_fs_reg(fs_reg *reg)
  +brw_reg_from_fs_reg(fs_reg *reg, unsigned dispatch_width)
   {
  +   const int reg_size = 4 * dispatch_width;
 
 
  What happens when reg.type is UW and dispatch_width is 16?  In that case,
  we would compute reg_size == 64, but the correct value seems like it's
  actually 32 in this case.
 
  Are we perhaps relying on reg.type being a 32-bit type?  If so, maybe we
  should add an assertion:
 
 assert(type_sz(reg.type) == 4);
 

 Nope, reg_size is supposed to be the size in bytes of a ::reg_offset
 unit, i.e. one hardware register in SIMD8 mode and two hardware
 registers in SIMD16 mode as the comment at the definition of
 ::reg_offset explains.  The fixed factor of four is intentional and
 correct no matter what the register type is.

 Thanks.


Ok, I see.  It appears that both this function *and* the comment above
reg_offset are assuming that the data type is 32 bit.  The comment above
reg_offset says:

* For pre-register-allocation GRFs and MRFs, this is in units of a
* float per pixel (1 hardware register for SIMD8 or SIMD4x2 mode,
* or 2 registers for SIMD16 mode).  For uniforms, this is in units
* of 1 float.

but when we get around to adding support for double-precision floats (a
feature of GL 4.0), this will no longer work; for double precision types
we'll need reg_offset to be measured in units of at least 4 hardware
registers in SIMD16 mode to avoid overlap.

Similarly, for types that are 16 bits, if we consider reg_offset to be
measured in units of 2 hardware registers in SIMD16 mode, we're actually
wasting registers, since all 16 values actually fit in a single hardware
register.  That's not really a big deal right now, since we use 16-bit
types so rarely in the FS back-end.  In fact, I bet we never use a nonzero
reg_offset on a 16-bit type.

It still seems to me that it would be worth putting an assertion here to
help alert us to what needs to change when we add double precision support
(or if someday we have hardware that supports half float computation).  I'm
not 100% sure what the assertion could be.  assert(type_sz(reg-type) ==
4); was an optimistic guess.  We might have to do
assert(type_sz(reg-type) == 4 || reg-reg_offset == 0); in order to
avoid tripping on the rare cases where we currently use 16-bit types in
fragment shaders.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 24/42] i965/blorp: move emission of sample combining into eu-emitter

2014-01-20 Thread Paul Berry
On 20 December 2013 06:38, Topi Pohjolainen topi.pohjolai...@intel.comwrote:

 diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp
 b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp
 index b189aa2..dcfd82b 100644
 --- a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp
 @@ -148,3 +148,15 @@ brw_blorp_eu_emitter::emit_render_target_write(const
 struct brw_reg src0,
  true /* eot */,
  use_header);
  }
 +
 +void
 +brw_blorp_eu_emitter::emit_combine(unsigned texture_data_type,
 +   const struct brw_reg dst,
 +   const struct brw_reg src_1,
 +   const struct brw_reg src_2)
 +{
 +   if (texture_data_type == BRW_REGISTER_TYPE_F)
 +  brw_ADD(func, dst, src_1, src_2);
 +   else
 +  brw_AVG(func, dst, src_1, src_2);
 +}


It's a bit of an awkward split to have most of the algorithm for combining
samples in brw_blorp_blit_program::manual_blend_average(), but the choice
of whether to use ADD or AVG is here in
brw_blorp_eu_emitter::emit_combine().  How about if we replace
texture_data_type with a bool called combine_using_add?  That way someone
reading manual_blend_average() won't have to refer to emit_combine() to
understand what the algorithm does; and similarly someone reading
emt_combine() won't have to look at manual_blend_average() to understand
why we use ADD for floats and AVG for ints.

With that change, this patch is:

Reviewed-by: Paul Berry stereotype...@gmail.com
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 27/42] i965/blorp: wrap LRP

2014-01-20 Thread Paul Berry
On 20 December 2013 06:38, Topi Pohjolainen topi.pohjolai...@intel.comwrote:

 The split of the emission of the two halfs into single emission
 call prapares for fs_generator support that already does similar
 thing. No regressions seen on IVB (unit tests and piglit quick).

 Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com
 ---
  src/mesa/drivers/dri/i965/brw_blorp_blit.cpp  | 26
 ++
  src/mesa/drivers/dri/i965/brw_blorp_blit_eu.h | 13 +
  2 files changed, 23 insertions(+), 16 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
 b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
 index 1b7310b..b95104e 100644
 --- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
 @@ -1679,29 +1679,23 @@
 brw_blorp_blit_program::manual_blend_bilinear(unsigned num_samples)
 }

  #define SAMPLE(x, y) offset(texture_data[x], y)
 -   brw_set_access_mode(func, BRW_ALIGN_16);
 -   brw_set_compression_control(func, BRW_COMPRESSION_NONE);
 for (int index = 3; index  0; ) {
/* Since we're doing SIMD16, 4 color channels fits in to 8
 registers.
 * Counter value of 8 in 'for' loop below is used to interpolate all
 * the color components.
 */
 -  for (int k = 0; k  8; ++k)
 - brw_LRP(func,
 - vec8(SAMPLE(index - 1, k)),
 - offset(x_frac, k  1),
 - SAMPLE(index, k),
 - SAMPLE(index - 1, k));
 +  for (int k = 0; k  8; k += 2)
 + emit_lrp(vec8(SAMPLE(index - 1, k)),
 +  offset(x_frac, k  1),


This line can just be x_frac now, because k is always even, so (k  1) is
always zero.


 +  SAMPLE(index, k),
 +  SAMPLE(index - 1, k));
index -= 2;
 }
 -   for (int k = 0; k  8; ++k)
 -  brw_LRP(func,
 -  vec8(SAMPLE(0, k)),
 -  offset(y_frac, k  1),
 -  vec8(SAMPLE(2, k)),
 -  vec8(SAMPLE(0, k)));
 -   brw_set_compression_control(func, BRW_COMPRESSION_COMPRESSED);
 -   brw_set_access_mode(func, BRW_ALIGN_1);
 +   for (int k = 0; k  8; k += 2)
 +  emit_lrp(vec8(SAMPLE(0, k)),
 +   offset(y_frac, k  1),


Same comment applies here.


 +   vec8(SAMPLE(2, k)),
 +   vec8(SAMPLE(0, k)));
  #undef SAMPLE
  }


With those two things fixed, this patch is:

Reviewed-by: Paul Berry stereotype...@gmail.com
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 37/42] i965/blorp: wrap brw_IF/ELSE/ENDIF() into eu-emitter

2014-01-20 Thread Paul Berry
On 20 December 2013 06:39, Topi Pohjolainen topi.pohjolai...@intel.comwrote:

 diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.h
 b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.h
 index 1ecf076..3f2301c 100644
 --- a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.h
 +++ b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.h
 @@ -162,6 +162,24 @@ protected:
brw_RNDD(func, dst, src);
 }

 +   inline void emit_if(int op,
 +   const struct brw_reg x,
 +   const struct brw_reg y)
 +   {
 +  brw_CMP(func, vec16(brw_null_reg()), op, x, y);
 +  brw_IF(func, BRW_EXECUTE_16);
 +   }


Can we maybe call tihs emit_cmp_if(), so that it's clear when looking at
the caller that it performs both a CMP and an IF?

With that change, this patch is:

Reviewed-by: Paul Berry stereotype...@gmail.com
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 38/42] i965/fs: allow unit tests to dump the final patched assembly

2014-01-20 Thread Paul Berry
On 20 December 2013 06:39, Topi Pohjolainen topi.pohjolai...@intel.comwrote:

 Unit tests comparing generated blorp programs to known good need
 to have the dump in designated file instead of in default
 standard output. The comparison also expects the jump counters
 of if-else-instructions to be correctly set and hence the dump
 needs to be taken _after_ 'patch_IF_ELSE()' is run (the default
 dump of the fs_generator does this before).

 Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com
 ---
  src/mesa/drivers/dri/i965/brw_fs.h |  7 +--
  src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 15 +--
  2 files changed, 14 insertions(+), 8 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs.h
 b/src/mesa/drivers/dri/i965/brw_fs.h
 index 9bef07c..d40d0a8 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs.h
 +++ b/src/mesa/drivers/dri/i965/brw_fs.h
 @@ -510,10 +510,13 @@ public:

 const unsigned *generate_assembly(exec_list *simd8_instructions,
   exec_list *simd16_instructions,
 - unsigned *assembly_size);
 + unsigned *assembly_size,
 + bool dump_enabled = false,
 + FILE *dump_file = stdout);


Nit pick: rather than add 2 args, wouldn't it be easier to just add the
dump_file arg, with a default value of NULL, and only dump instructions if
it is non-NULL?

Either way, the patch is:

Reviewed-by: Paul Berry stereotype...@gmail.com
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 39/42] i965/fs: introduce blorp specific rt-write for fs_generator

2014-01-20 Thread Paul Berry
On 20 December 2013 06:39, Topi Pohjolainen topi.pohjolai...@intel.comwrote:

 diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
 b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
 index df91235..4c159e6 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
 @@ -190,6 +190,21 @@ fs_generator::generate_fb_write(fs_inst *inst)
 mark_surface_used(surf_index);
  }

 +void
 +fs_generator::generate_blorp_fb_write(fs_inst *inst)
 +{
 +   brw_fb_WRITE(p,
 +16 /* dispatch_width */,
 +inst-base_mrf,
 +brw_reg_from_fs_reg(inst-src[0]),
 +BRW_DATAPORT_RENDER_TARGET_WRITE_SIMD16_SINGLE_SOURCE,
 +1 /* BRW_BLORP_RENDERBUFFER_BINDING_TABLE_INDEX */,


This seems like it would lead to a maintenance burden if anyone ever
decides to change BRW_BLORP_RENDERBUFFER_BINDING_TABLE_INDEX to a number
other than 1.  Can we move the declaration of
BRW_BLORP_RENDERBUFFER_BINDING_TABLE_INDEX to somewhere that's accessible
to this file so that we can just pass it in directly?  Alternatively, we
could store it in inst-target.

With that issue addressed, this patch is:

Reviewed-by: Paul Berry stereotype...@gmail.com


 +inst-mlen,
 +0,
 +true,
 +inst-header_present);
 +}
 +

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 41/42] i965/eu: introduce blorp specific flavour of lrp

2014-01-20 Thread Paul Berry
On 20 December 2013 06:39, Topi Pohjolainen topi.pohjolai...@intel.comwrote:

 This is rather ugly but as I couldn't think of anything better
 for now and wanted to get the rest of the series under review,
 I left it as it is.
 Even though immediately surrounding code has tabs this piece is
 written space-indented.

 Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com


This patch is unnecessary.  The blorp specific flavour of lrp is actually
a previously undiscovered bug in blorp: blorp should be using
BRW_COMPRESSION_2NDHALF for its second half LRPs.  I believe the bug may be
benign, but we should fix it anyhow, and I think we can fix it by just
dropping this patch entirely.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 24/42] i965/blorp: move emission of sample combining into eu-emitter

2014-01-20 Thread Paul Berry
On 20 January 2014 19:36, Paul Berry stereotype...@gmail.com wrote:

 On 20 December 2013 06:38, Topi Pohjolainen topi.pohjolai...@intel.comwrote:

 diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp
 b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp
 index b189aa2..dcfd82b 100644
 --- a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp
 @@ -148,3 +148,15 @@ brw_blorp_eu_emitter::emit_render_target_write(const
 struct brw_reg src0,
  true /* eot */,
  use_header);
  }
 +
 +void
 +brw_blorp_eu_emitter::emit_combine(unsigned texture_data_type,
 +   const struct brw_reg dst,
 +   const struct brw_reg src_1,
 +   const struct brw_reg src_2)
 +{
 +   if (texture_data_type == BRW_REGISTER_TYPE_F)
 +  brw_ADD(func, dst, src_1, src_2);
 +   else
 +  brw_AVG(func, dst, src_1, src_2);
 +}


 It's a bit of an awkward split to have most of the algorithm for combining
 samples in brw_blorp_blit_program::manual_blend_average(), but the choice
 of whether to use ADD or AVG is here in
 brw_blorp_eu_emitter::emit_combine().  How about if we replace
 texture_data_type with a bool called combine_using_add?  That way someone
 reading manual_blend_average() won't have to refer to emit_combine() to
 understand what the algorithm does; and similarly someone reading
 emt_combine() won't have to look at manual_blend_average() to understand
 why we use ADD for floats and AVG for ints.


On further reflection, I think it would be even better to replace the
texture_data_type argument with an opcode argument--that way the caller can
pass in BRW_OPCODE_ADD or BRW_OPCODE_AVG.  Once we get to patch 42/42, this
function can be changed to just plumb the opcode straight through into the
fs_inst constructor.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 42/42] i965/blorp: switch eu-emitter to use FS IR and fs_generator

2014-01-20 Thread Paul Berry
On 20 December 2013 06:39, Topi Pohjolainen topi.pohjolai...@intel.comwrote:

 Unfortunately the unit tests need to be patched as well. This is
 because the direct eu-emitter only patches jump counters for
 if-else (see patch_IF_ELSE()) while the fs_generator patches the
 endif as well (see brw_set_uip_jip()).


I think the fact that blorp wasn't patching jump counters for if-else was a
previously undiscovered bug in blorp.  I suspect that the reason we never
discovered it was because blorp doesn't contain complex enough flow control
to cause an endif to ever execute a jump.

I'd recommend splitting this patch into two patches:

1. fix the previously undiscovered blorp bug (by having blorp call
brw_set_uip_jip() at the appropriate time) and update the unit tests.

2. switch eu-emitter to use FS IR and fs_generator.

That will make it a lot easier to reassure ourselves that the changes are
correct, and in the unlikely event that we later discover that they aren't,
it'll make it easier to bisect to the problem.

With this patch split, it is:

Reviewed-by: Paul Berry stereotype...@gmail.com

I've sent comments on patches 24, 27, 37, 38, 39, and 41.  All the others
are:

Reviewed-by: Paul Berry stereotype...@gmail.com


I really like how you structured this series, Topi.  It was easy to follow
what you did and to reassure myself that it was correct.  And I'm really
glad you did it, since it led us to discover two preexisting bugs!  Nice
work.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Reserve space for Vertex Count in GS outputs.

2014-01-20 Thread Paul Berry
On 20 January 2014 22:56, Kenneth Graunke kenn...@whitecape.org wrote:

 v2: Also increment ir-offset in the GS visitor, rather than at the
 final assembly generation stage (requested by Paul).

 Signed-off-by: Kenneth Graunke kenn...@whitecape.org
 ---
  src/mesa/drivers/dri/i965/brw_vec4_gs.c   | 6 ++
  src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 7 +++
  2 files changed, 13 insertions(+)

 Hey Paul,

 I think I implemented all of your feedback from last time.  (Previously,
 I'd done this in gen8_vec4_generator, and you asked me to move it to the
 visitor.  This patch does that - and it is indeed much nicer.)

  --Ken


Looks good, thanks!

Reviewed-by: Paul Berry stereotype...@gmail.com



 diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs.c
 b/src/mesa/drivers/dri/i965/brw_vec4_gs.c
 index 018b0b6..c40112b 100644
 --- a/src/mesa/drivers/dri/i965/brw_vec4_gs.c
 +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs.c
 @@ -194,6 +194,12 @@ do_gs_prog(struct brw_context *brw,
c.prog_data.output_vertex_size_hwords * 32 *
 gp-program.VerticesOut;
 output_size_bytes += 32 * c.prog_data.control_data_header_size_hwords;

 +   /* Broadwell stores Vertex Count as a full 8 DWord (32 byte) URB
 output,
 +* which comes before the control header.
 +*/
 +   if (brw-gen = 8)
 +  output_size_bytes += 32;
 +
 assert(output_size_bytes = 1);
 if (output_size_bytes  GEN7_MAX_GS_URB_ENTRY_SIZE_BYTES)
return false;
 diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
 b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
 index f33c80d..f0351ea 100644
 --- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
 @@ -267,6 +267,13 @@ vec4_gs_visitor::emit_urb_write_opcode(bool complete)

 vec4_instruction *inst = emit(GS_OPCODE_URB_WRITE);
 inst-offset = c-prog_data.control_data_header_size_hwords;
 +
 +   /* We need to increment Global Offset by 1 to make room for Broadwell's
 +* extra Vertex Count payload at the beginning of the URB entry.
 +*/
 +   if (brw-gen = 8)
 +  inst-offset++;
 +
 inst-urb_write_flags = BRW_URB_WRITE_PER_SLOT_OFFSET;
 return inst;
  }
 --
 1.8.5.2


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [wip 9/9] mesa: OES_get_program_binary extension functionality

2014-01-17 Thread Paul Berry
On 16 January 2014 05:37, Tapani Pälli tapani.pa...@intel.com wrote:

  On 01/15/2014 06:13 PM, Paul Berry wrote:

 On 2 January 2014 03:58, Tapani Pälli tapani.pa...@intel.com wrote:

 Signed-off-by: Tapani Pälli tapani.pa...@intel.com
 ---
  src/mesa/main/shaderapi.c | 44
 ++--
  1 file changed, 38 insertions(+), 6 deletions(-)

 +   char *data = mesa_program_serialize(shProg, size);
 +
 +   /* we have more data that can fit to user given buffer */
 +   if (size  bufSize) {
 +  _mesa_error(ctx, GL_INVALID_OPERATION, __FUNCTION__);
 +  if (data)
 + free(data);


  Why would we ever expect mesa_program_serialize to set size to a nonzero
 value but return NULL?  It seems like this could only happen if there's a
 bug, in which case this really ought to be

  assert(data !=NULL);

  Also, it's safe to call free() on a NULL pointer.  According to the C
 standard, freeing a NULL pointer does nothing.


 sure, will fix




 +  return;
 +   }
 +
 +   if (data) {


  Similarly, this if-statement is unnecessary.


 it is required for memcpy but not for free


Oops, you're right.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [wip 8/9] glsl: functions to serialize gl_shader and gl_shader_program

2014-01-17 Thread Paul Berry
On 15 January 2014 02:46, Tapani tapani.pa...@intel.com wrote:

  On 01/14/2014 07:53 PM, Paul Berry wrote:

 On 2 January 2014 03:58, Tapani Pälli tapani.pa...@intel.com wrote:

   +static void
 +serialize_uniform_storage(gl_uniform_storage *uni, memory_writer blob)


  I don't think this is right.  The ARB_get_program_binary spec says:

 8. How does ProgramBinary interact with uniform values, including
shader-specified defaults?

 RESOLVED: All uniforms specified in the binary are reset to their
 shader-
 specified defaults, or to zero if they aren't specified, when the
 program
 binary is loaded. The spec language has been updated to specify this
 behavior.

  The OES_get_program_binary spec doesn't mention uniforms at all, but I
 believe this is not intended to indicate a behavioural difference--it's
 just a consequence of the fact that ARB_get_program_binary was written
 later, so they had a chance to clarify things.  In fact, issue #7 in
 ARB_get_program_binary specifically says:

 7. How does this spec differ from the OES_get_program_binary and why?

 ...

 There are other areas where language was clarified, but this is meant
 to
 be consistent with the intent of the original specification and not to
 alter previously established behavior.

  So I believe we shouldn't be serializing uniform values.


 For me this seemed much easier way to serialize than recreate it though.
 Would it be enough if I reset the default values in place?


Yes, I think that would be reasonable.  Thanks!
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [wip 1/9] glsl: memory_writer helper class for data serialization

2014-01-17 Thread Paul Berry
On 14 January 2014 02:35, Tapani Pälli tapani.pa...@intel.com wrote:

  On 01/13/2014 08:27 PM, Paul Berry wrote:

 On 2 January 2014 03:58, Tapani Pälli tapani.pa...@intel.com wrote:

 Class will be used by the shader binary cache implementation.

 Signed-off-by: Tapani Pälli tapani.pa...@intel.com
 ---
  src/glsl/memory_writer.h | 147
 +++
  1 file changed, 147 insertions(+)
  create mode 100644 src/glsl/memory_writer.h

 diff --git a/src/glsl/memory_writer.h b/src/glsl/memory_writer.h
 new file mode 100644
 index 000..a6c6b55
 --- /dev/null
 +++ b/src/glsl/memory_writer.h
 @@ -0,0 +1,147 @@
 +/* -*- c++ -*- */
 +/*
 + * Copyright © 2013 Intel Corporation
 + *
 + * Permission is hereby granted, free of charge, to any person obtaining
 a
 + * copy of this software and associated documentation files (the
 Software),
 + * to deal in the Software without restriction, including without
 limitation
 + * the rights to use, copy, modify, merge, publish, distribute,
 sublicense,
 + * and/or sell copies of the Software, and to permit persons to whom the
 + * Software is furnished to do so, subject to the following conditions:
 + *
 + * The above copyright notice and this permission notice (including the
 next
 + * paragraph) shall be included in all copies or substantial portions of
 the
 + * Software.
 + *
 + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND,
 EXPRESS OR
 + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 MERCHANTABILITY,
 + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT
 SHALL
 + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
 OTHER
 + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
 ARISING
 + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
 + * DEALINGS IN THE SOFTWARE.
 + */
 +
 +#pragma once
 +#ifndef MEMORY_WRITER_H
 +#define MEMORY_WRITER_H
 +
 +#include stdlib.h
 +#include unistd.h
 +#include string.h
 +
 +#ifdef __cplusplus
 +/**
 + * Helper class for writing data to memory
 + *
 + * This class maintains a dynamically-sized memory buffer and allows
 + * for data to be efficiently appended to it with automatic resizing.
 + */
 +class memory_writer
 +{
 +public:
 +   memory_writer() :
 +  memory(NULL),
 +  curr_size(0),
 +  pos(0) {}
 +
 +   ~memory_writer()
 +   {
 +  free(memory);
 +   }
 +
 +   /* user wants to claim the memory */
 +   char *release_memory(size_t *size)
 +   {
 +  /* final realloc to free allocated but unused memory */
 +  char *result = (char *) realloc(memory, pos);
 +  *size = pos;
 +  memory = NULL;
 +  curr_size = 0;
 +  pos = 0;
 +  return result;
 +   }
 +
 +/**
 + * write functions per type
 + */
 +#define DECL_WRITER(type) int write_ ##type (const type data) {\
 +   return write(data, sizeof(type));\
 +}
 +
 +   DECL_WRITER(int32_t);
 +   DECL_WRITER(int64_t);
 +   DECL_WRITER(uint8_t);
 +   DECL_WRITER(uint32_t);
 +
 +   int write_bool(bool data)
 +   {
 +  uint8_t val = data;
 +  return write_uint8_t(val);
 +   }
 +
 +   /* write function that reallocates more memory if required */
 +   int write(const void *data, int32_t size)
 +   {
 +  if (!memory || pos  (int32_t)(curr_size - size))
 + if (grow(size))
 +return -1;
 +
 +  memcpy(memory + pos, data, size);
 +
 +  pos += size;
 +  return 0;
 +   }
 +
 +   int overwrite(const void *data, int32_t size, int32_t offset)
 +   {
 +  if (offset  0 || offset + size  pos)
 + return -1;
 +  memcpy(memory + offset, data, size);
 +  return 0;
 +   }
 +
 +   int write_string(const char *str)
 +   {
 +  if (!str)
 + return -1;
 +  char terminator = '\0';
 +  write(str, strlen(str));
 +  write(terminator, 1);


  C strings include a terminator, so there's no reason to write out the
 string contents and the terminator separtely.  You can just do:

  write(str, strlen(str) + 1);

 Also, don't forget to propagate the return code to the caller:

  return write(str, strlen(str) + 1);


 Ah right, will do. It could be that this needs to be modified back to how
 it was though (to write to length of the string too). I think it is
 otherwise impossible for the reader side to know if terminator really
 exists and it might continue to read forever (or call strlen for
 unterminated string).


I don't know of any C library function that does exactly what you want.  Of
course you could always code up your own loop.  But I think I agree that
storing the length first is the best approach.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [wip 6/9] glsl: ir_deserializer class for the binary shader cache

2014-01-17 Thread Paul Berry
On 14 January 2014 03:35, Tapani Pälli tapani.pa...@intel.com wrote:

  On 01/14/2014 01:24 AM, Paul Berry wrote:

 On 2 January 2014 03:58, Tapani Pälli tapani.pa...@intel.com wrote:

   +   var-state_slots = NULL;
 +
 +   if (var-num_state_slots  0) {
 +  var-state_slots = ralloc_array(var, ir_state_slot,
 + var-num_state_slots);


  Security issue: if the shader blob is corrupted, num_state_slots could
 be outrageously large, causing this memory allocation to fail (which would
 then cause the loop below to perform illegal memory accesses).  We should
 validate that num_state_slots is within a reasonable range, and if it
 isn't, abort reading the binary blob.


 What is the expected reasonable range for state_slots?


I believe the uniform with the most state slots is gl_LightSource, which
has 96.

What I'd suggest doing (in a separate patch) is: add a

#define MAX_NUM_STATE_SLOTS 96

somewhere, and then in builtin_variable_generator::add_uniform(), right
after setting uni-num_state_slots, add:

assert(uni-num_state_slots = MAX_NUM_STATE_SLOTS);

That way, in the unlikely event that we ever add a GL feature that requires
more than this number of state slots, the assertion failure will alert us
that we need to increase MAX_NUM_STATE_SLOTS.




+   /* fill parse state from shader header information */
 +   switch (shader-Type) {
 +   case GL_VERTEX_SHADER:
 +  state-target = MESA_SHADER_VERTEX;
 +  break;
 +   case GL_FRAGMENT_SHADER:
 +  state-target = MESA_SHADER_FRAGMENT;
 +  break;
 +   case GL_GEOMETRY_SHADER_ARB:
 +  state-target = MESA_SHADER_GEOMETRY;
 +  break;


  There should be a default case here to handle corrupted blobs with an
 illegal type.


 I'm now doing _mesa_shader_enum_to_shader_stage(shader-Type) here which
 will assert if type is illegal.


An assertion isn't good enough, because assertions don't get checked in
release builds.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965: Add GS support to INTEL_DEBUG=shader_time.

2014-01-17 Thread Paul Berry
Previously, time spent in geometry shaders would be counted as part of
the vertex shader time.
---
 src/mesa/drivers/dri/i965/brw_context.h   |  3 +++
 src/mesa/drivers/dri/i965/brw_program.c   | 10 +-
 src/mesa/drivers/dri/i965/brw_vec4.cpp|  6 +++---
 src/mesa/drivers/dri/i965/brw_vec4.h  |  9 -
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp |  3 ++-
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp| 10 --
 src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp |  3 ++-
 src/mesa/drivers/dri/i965/test_vec4_register_coalesce.cpp |  3 ++-
 8 files changed, 37 insertions(+), 10 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 9c51646..972eb9f 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -783,6 +783,9 @@ enum shader_time_shader_type {
ST_VS,
ST_VS_WRITTEN,
ST_VS_RESET,
+   ST_GS,
+   ST_GS_WRITTEN,
+   ST_GS_RESET,
ST_FS8,
ST_FS8_WRITTEN,
ST_FS8_RESET,
diff --git a/src/mesa/drivers/dri/i965/brw_program.c 
b/src/mesa/drivers/dri/i965/brw_program.c
index abfc921..a6a2403 100644
--- a/src/mesa/drivers/dri/i965/brw_program.c
+++ b/src/mesa/drivers/dri/i965/brw_program.c
@@ -288,7 +288,7 @@ get_written_and_reset(struct brw_context *brw, int i,
   uint64_t *written, uint64_t *reset)
 {
enum shader_time_shader_type type = brw-shader_time.types[i];
-   assert(type == ST_VS || type == ST_FS8 || type == ST_FS16);
+   assert(type == ST_VS || type == ST_GS || type == ST_FS8 || type == ST_FS16);
 
/* Find where we recorded written and reset. */
int wi, ri;
@@ -340,6 +340,8 @@ brw_report_shader_time(struct brw_context *brw)
   switch (type) {
   case ST_VS_WRITTEN:
   case ST_VS_RESET:
+  case ST_GS_WRITTEN:
+  case ST_GS_RESET:
   case ST_FS8_WRITTEN:
   case ST_FS8_RESET:
   case ST_FS16_WRITTEN:
@@ -349,6 +351,7 @@ brw_report_shader_time(struct brw_context *brw)
  continue;
 
   case ST_VS:
+  case ST_GS:
   case ST_FS8:
   case ST_FS16:
  get_written_and_reset(brw, i, written, reset);
@@ -372,6 +375,7 @@ brw_report_shader_time(struct brw_context *brw)
 
   switch (type) {
   case ST_VS:
+  case ST_GS:
   case ST_FS8:
   case ST_FS16:
  total_by_type[type] += scaled[i];
@@ -432,6 +436,9 @@ brw_report_shader_time(struct brw_context *brw)
   case ST_VS:
  stage = vs;
  break;
+  case ST_GS:
+ stage = gs;
+ break;
   case ST_FS8:
  stage = fs8;
  break;
@@ -449,6 +456,7 @@ brw_report_shader_time(struct brw_context *brw)
 
printf(\n);
print_shader_time_line(total, vs, -1, total_by_type[ST_VS], total);
+   print_shader_time_line(total, gs, -1, total_by_type[ST_GS], total);
print_shader_time_line(total, fs8, -1, total_by_type[ST_FS8], total);
print_shader_time_line(total, fs16, -1, total_by_type[ST_FS16], total);
 }
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index 9d3735a..2e319e38 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -1494,10 +1494,10 @@ vec4_visitor::emit_shader_time_end()
 */
emit(ADD(diff, src_reg(diff), src_reg(-2u)));
 
-   emit_shader_time_write(ST_VS, src_reg(diff));
-   emit_shader_time_write(ST_VS_WRITTEN, src_reg(1u));
+   emit_shader_time_write(st_base, src_reg(diff));
+   emit_shader_time_write(st_written, src_reg(1u));
emit(BRW_OPCODE_ELSE);
-   emit_shader_time_write(ST_VS_RESET, src_reg(1u));
+   emit_shader_time_write(st_reset, src_reg(1u));
emit(BRW_OPCODE_ENDIF);
 }
 
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
b/src/mesa/drivers/dri/i965/brw_vec4.h
index d4029d8..f2ecb2b 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -231,7 +231,10 @@ public:
struct brw_shader *shader,
void *mem_ctx,
 bool debug_flag,
-bool no_spills);
+bool no_spills,
+shader_time_shader_type st_base,
+shader_time_shader_type st_written,
+shader_time_shader_type st_reset);
~vec4_visitor();
 
dst_reg dst_null_f()
@@ -547,6 +550,10 @@ private:
 * If true, then register allocation should fail instead of spilling.
 */
const bool no_spills;
+
+   const shader_time_shader_type st_base;
+   const shader_time_shader_type st_written;
+   const shader_time_shader_type st_reset;
 };
 
 
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
index 5d5b169..8d8db5e 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
@@ -41,7 +41,8 @@ 

Re: [Mesa-dev] [wip 9/9] mesa: OES_get_program_binary extension functionality

2014-01-15 Thread Paul Berry
On 2 January 2014 03:58, Tapani Pälli tapani.pa...@intel.com wrote:

 Signed-off-by: Tapani Pälli tapani.pa...@intel.com
 ---
  src/mesa/main/shaderapi.c | 44
 ++--
  1 file changed, 38 insertions(+), 6 deletions(-)

 diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
 index 57511e8..c07b226 100644
 --- a/src/mesa/main/shaderapi.c
 +++ b/src/mesa/main/shaderapi.c
 @@ -57,6 +57,7 @@
  #include ../glsl/ir.h
  #include ../glsl/ir_uniform.h
  #include ../glsl/program.h
 +#include ../glsl/shader_cache.h

  /** Define this to enable shader substitution (see below) */
  #define SHADER_SUBST 0
 @@ -1632,8 +1633,26 @@ _mesa_GetProgramBinary(GLuint program, GLsizei
 bufSize, GLsizei *length,
 if (length != NULL)
*length = 0;

 -   (void) binaryFormat;
 -   (void) binary;
 +   size_t size = 0;
 +   char *data = mesa_program_serialize(shProg, size);
 +
 +   /* we have more data that can fit to user given buffer */
 +   if (size  bufSize) {
 +  _mesa_error(ctx, GL_INVALID_OPERATION, __FUNCTION__);
 +  if (data)
 + free(data);


Why would we ever expect mesa_program_serialize to set size to a nonzero
value but return NULL?  It seems like this could only happen if there's a
bug, in which case this really ought to be

assert(data !=NULL);

Also, it's safe to call free() on a NULL pointer.  According to the C
standard, freeing a NULL pointer does nothing.


 +  return;
 +   }
 +
 +   if (data) {


Similarly, this if-statement is unnecessary.


 +  memcpy(binary, data, size);
 +  free(data);
 +   }
 +
 +   if (length != NULL)
 +  *length = size;
 +
 +   *binaryFormat = 0;
  }

  void GLAPIENTRY
 @@ -1647,10 +1666,23 @@ _mesa_ProgramBinary(GLuint program, GLenum
 binaryFormat,
 if (!shProg)
return;

 -   (void) binaryFormat;
 -   (void) binary;
 -   (void) length;
 -   _mesa_error(ctx, GL_INVALID_OPERATION, __FUNCTION__);
 +   if (length = 0)
 +  return;


In the case of an invalid length, we need to make sure to set the program's
LinkStatus to false.  Also, the information log needs to be cleared, in
accordance with this text from OES_get_program_binary:

If ProgramBinaryOES failed, any information about a previous link or
load of
that program object is lost.  Thus, a failed load does not restore the
old
state of program.



 +
 +   /* free possible existing data and initialize structure */
 +   _mesa_free_shader_program_data(ctx, shProg);
 +   _mesa_init_shader_program(ctx, shProg);
 +
 +   /* fill structure from a binary blob */
 +   if (mesa_program_deserialize(shProg, binary, length)) {
 +  _mesa_error(ctx, GL_INVALID_VALUE, glProgramBinary(binary
 incompatible));


This seems wrong to me.  From the OES_get_program_binary spec:

... binaryFormat and binary must be
those returned by a previous call to GetProgramBinaryOES, and length
must
be the length of the program binary as returned by GetProgramBinaryOES
or
GetProgramiv with pname PROGRAM_BINARY_LENGTH_OES.  The program binary
will fail to load if these conditions are not met.

...

A program object's program binary is replaced by calls to LinkProgram or
ProgramBinaryOES.  Either command sets the program object's LINK_STATUS
to
TRUE or FALSE, as queried with GetProgramiv, to reflect success or
failure.
Either command also updates its information log, queried with
GetProgramInfoLog, to provide details about warnings or errors.

I believe this means that if deserialization fails, it should not be a GL
error.  It should simply be treated as a link failure.


 +  return;
 +   }
 +
 +   /* driver specific link, optimizations and what not */
 +   ctx-Driver.LinkShader(ctx, shProg);


Now I'm confused.  I thought a major part of the purpose of this extension
was that it would store the post-link program, so that not only does
glProgramBinary() avoid the runtime penalty of parsing and compiling, it
also avoids the runtime penalty of link-time optimizations.  Calling
ctx-Driver.LinkShader seems like it defeats that purpose.  It seems like
what we ought to be doing is store the data that ctx-Driver.LinkShader
*produces* in the binary blob, so that once it's loaded there's no further
linking necessary.  If there is a small amount of driver-specific hook
necessary, that should be placed in a new ctx-Driver function rather than
incurring the overhead of another link.


 +
 +   _mesa_ValidateProgram(program);


I don't think this is correct.  ValidateProgram() doesn't mean check that
the program is well-formed.  It means check whether the program can
execute given the current GL state.  A lot of GL apps compile all their
shaders during startup, long before they've established the correct GL
state for running those programs.  It's reasonable to assume that apps
using this extension will make their calls to glProgramBinary() during
startup as well.  So calling ValidateProgram() from glProgramBinary() will

Re: [Mesa-dev] [wip 0/9] GL_OES_get_program_binary extension

2014-01-15 Thread Paul Berry
On 2 January 2014 03:58, Tapani Pälli tapani.pa...@intel.com wrote:

 Hi;

 Here's another take on the get_program_binary extension. I've rewritten big
 chunks of it based on Paul's review and comments earlier. Here's a brief
 list
 out of my head of things changed:

 - uses mesa compilation time as verification method for cache, not git sha
 - much smaller serialization code size, now as part of IR classes
 - smaller binary blob size
 - does not dump separate 'prototypes block' for functions but instead
  deserialization iterates over the whole blob for functions first
 - hashtables written faster than previously (iterated only once)
 - no 'unique_id' for ir_variables but using the address as identifier
 - patch set broken down to more individual patches/changes

 Mesa branch with these patches applied on top:
 http://cgit.freedesktop.org/~tpalli/mesa/log/?h=oes_get_program_binary

 I have plans for the addition of driver backend data but I was hoping this
 to
 be as a separate addition on top. Basically I extend gl_shader_program
 struct
 and offer API for drivers to dump their binary data there, some proto here:

 http://cgit.freedesktop.org/~tpalli/mesa/log/?h=driver_interface
 (in order to work more data needed like the keys but this is just test)

 I have also 'automatic' cache implementation here that can be used to
 verify
 the cache functionality with any application (without extension):
 http://cgit.freedesktop.org/~tpalli/mesa/log/?h=new_serialization

 Any comments appreciated;

 Tapani Pälli (9):
   glsl: memory_writer helper class for data serialization
   glsl: serialize methods for IR instructions
   glsl: memory_map helper class for data deserialization
   glsl: add MESA_SHADER_CACHE_MAGIC string for shader binary cache
   glsl: export populate_symbol_table function
   glsl: ir_deserializer class for the binary shader cache
   mesa: iterate method for string_to_uint_map
   glsl: functions to serialize gl_shader and gl_shader_program
   mesa: OES_get_program_binary extension functionality


I sent comments on all patches but 4, 5, and 7.  Those patches are:

Reviewed-by: Paul Berry stereotype...@gmail.com

I see that you've already responded to some of my comments.  I'm on a
business trip this week so I have limited time to answer email, but I'll
try to respond to your responses as soon as I can.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [wip 8/9] glsl: functions to serialize gl_shader and gl_shader_program

2014-01-14 Thread Paul Berry
On 2 January 2014 03:58, Tapani Pälli tapani.pa...@intel.com wrote:

 diff --git a/src/glsl/shader_cache.cpp b/src/glsl/shader_cache.cpp
 new file mode 100644
 index 000..4b5de45
 --- /dev/null
 +++ b/src/glsl/shader_cache.cpp
 @@ -0,0 +1,489 @@
 +/* -*- c++ -*- */
 +/*
 + * Copyright © 2013 Intel Corporation
 + *
 + * Permission is hereby granted, free of charge, to any person obtaining a
 + * copy of this software and associated documentation files (the
 Software),
 + * to deal in the Software without restriction, including without
 limitation
 + * the rights to use, copy, modify, merge, publish, distribute,
 sublicense,
 + * and/or sell copies of the Software, and to permit persons to whom the
 + * Software is furnished to do so, subject to the following conditions:
 + *
 + * The above copyright notice and this permission notice (including the
 next
 + * paragraph) shall be included in all copies or substantial portions of
 the
 + * Software.
 + *
 + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND,
 EXPRESS OR
 + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 MERCHANTABILITY,
 + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT
 SHALL
 + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
 OTHER
 + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
 + * DEALINGS IN THE SOFTWARE.
 + */
 +
 +#include main/shaderobj.h
 +#include main/uniforms.h
 +#include main/macros.h
 +#include program/hash_table.h
 +#include ir_serialize.h
 +#include ir_deserializer.h
 +#include shader_cache_magic.h
 +
 +/**
 + * It is currently unknown if we need these to be available.
 + * There can be calls in mesa/main (like glGetAttachedShaders) that use
 + * the Shaders list.
 + */
 +const bool STORE_UNLINKED_SHADERS = false;


I think it really exaggerates our level of uncertainty to say that it is
currently unknown whether we need unlinked shaders to be available.
Speaking for myself at least, I'm quite convinced that we don't.  AFAICT,
OES_get_program_binary was purposefully architected so taht we don't need
to store the unlinked shaders.  GL has always maintained two independent
collections of state about any given program: pre-link state and post-link
state.  There's a one-way flow of information from the pre-link state to
the post-link state--once you've linked a program using glLinkShader(),
there's no way to get the unlinked information back.  This isn't a problem
for most users of GL because most people don't try to modify the pre-link
state after linking the program.  However it's perfectly permissible for
someone to create a program, attach shaders, link, and then detatch all the
shaders.  Since detatching the shaders affects pre-link state, the program
would still work after this point (since the post-link information would
not have changed since the link) but it would be impossible to query
information about the individual shaders anymore.

A similar situation exists when the client uses OES_get_attached_shaders.
glProgramBinaryOES() bypasses the pre-link state completely, and drops the
compiled binary directly into the post-link state.  From the
OES_get_program_binary spec:

Note that ProgramBinaryOES disregards any shader objects attached to the
program object, as these shader objects are used only by LinkProgram.

And later:

7. Can BindAttribLocation be called after ProgramBinaryOES to remap an
   attribute location used by the program binary?

RESOLVED: No.  BindAttribLocation only affects the result of a
subsequent
call to LinkProgram.  LinkProgram operates on the attached shader
objects
and replaces any program binary loaded prior to LinkProgram.  So there
is no
mechanism to remap an attribute location after loading a program binary.

However, an application is free to remap an attribute location prior to
retrieving the program binary.  By calling BindAttribLocation followed
by
LinkProgram, an application can remap the attribute location.  If this
is
followed by a call to GetProgramBinaryOES, the retrieved program binary
will
include the desired attribute location assignment.

So if the user creates a program and calls glProgramBinaryOES() followed by
glGetAttachedShaders(), they will see no shaders, since the pre-link state
is still in its initial configuration of having no shaders attached.

IMHO, trying to plan for the contingency case where we're wrong about this
is just going to lead to confusion and make the code hard to maintain.  I
think we should drop this const, and remove the code that would have been
executed if STORE_UNLINKED_SHADERS were true.  In the unlikely event that
it turns out we were wrong about this (or Khronos makes a change to the
spec that makes it necessary to store unlinked shaders) we can always
change the code in a future patch.

Also, incidentally, the way you've 

Re: [Mesa-dev] [wip 1/9] glsl: memory_writer helper class for data serialization

2014-01-13 Thread Paul Berry
On 2 January 2014 03:58, Tapani Pälli tapani.pa...@intel.com wrote:

 Class will be used by the shader binary cache implementation.

 Signed-off-by: Tapani Pälli tapani.pa...@intel.com
 ---
  src/glsl/memory_writer.h | 147
 +++
  1 file changed, 147 insertions(+)
  create mode 100644 src/glsl/memory_writer.h

 diff --git a/src/glsl/memory_writer.h b/src/glsl/memory_writer.h
 new file mode 100644
 index 000..a6c6b55
 --- /dev/null
 +++ b/src/glsl/memory_writer.h
 @@ -0,0 +1,147 @@
 +/* -*- c++ -*- */
 +/*
 + * Copyright © 2013 Intel Corporation
 + *
 + * Permission is hereby granted, free of charge, to any person obtaining a
 + * copy of this software and associated documentation files (the
 Software),
 + * to deal in the Software without restriction, including without
 limitation
 + * the rights to use, copy, modify, merge, publish, distribute,
 sublicense,
 + * and/or sell copies of the Software, and to permit persons to whom the
 + * Software is furnished to do so, subject to the following conditions:
 + *
 + * The above copyright notice and this permission notice (including the
 next
 + * paragraph) shall be included in all copies or substantial portions of
 the
 + * Software.
 + *
 + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND,
 EXPRESS OR
 + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 MERCHANTABILITY,
 + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT
 SHALL
 + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
 OTHER
 + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
 + * DEALINGS IN THE SOFTWARE.
 + */
 +
 +#pragma once
 +#ifndef MEMORY_WRITER_H
 +#define MEMORY_WRITER_H
 +
 +#include stdlib.h
 +#include unistd.h
 +#include string.h
 +
 +#ifdef __cplusplus
 +/**
 + * Helper class for writing data to memory
 + *
 + * This class maintains a dynamically-sized memory buffer and allows
 + * for data to be efficiently appended to it with automatic resizing.
 + */
 +class memory_writer
 +{
 +public:
 +   memory_writer() :
 +  memory(NULL),
 +  curr_size(0),
 +  pos(0) {}
 +
 +   ~memory_writer()
 +   {
 +  free(memory);
 +   }
 +
 +   /* user wants to claim the memory */
 +   char *release_memory(size_t *size)
 +   {
 +  /* final realloc to free allocated but unused memory */
 +  char *result = (char *) realloc(memory, pos);
 +  *size = pos;
 +  memory = NULL;
 +  curr_size = 0;
 +  pos = 0;
 +  return result;
 +   }
 +
 +/**
 + * write functions per type
 + */
 +#define DECL_WRITER(type) int write_ ##type (const type data) {\
 +   return write(data, sizeof(type));\
 +}
 +
 +   DECL_WRITER(int32_t);
 +   DECL_WRITER(int64_t);
 +   DECL_WRITER(uint8_t);
 +   DECL_WRITER(uint32_t);
 +
 +   int write_bool(bool data)
 +   {
 +  uint8_t val = data;
 +  return write_uint8_t(val);
 +   }
 +
 +   /* write function that reallocates more memory if required */
 +   int write(const void *data, int32_t size)
 +   {
 +  if (!memory || pos  (int32_t)(curr_size - size))
 + if (grow(size))
 +return -1;
 +
 +  memcpy(memory + pos, data, size);
 +
 +  pos += size;
 +  return 0;
 +   }
 +
 +   int overwrite(const void *data, int32_t size, int32_t offset)
 +   {
 +  if (offset  0 || offset + size  pos)
 + return -1;
 +  memcpy(memory + offset, data, size);
 +  return 0;
 +   }
 +
 +   int write_string(const char *str)
 +   {
 +  if (!str)
 + return -1;
 +  char terminator = '\0';
 +  write(str, strlen(str));
 +  write(terminator, 1);


C strings include a terminator, so there's no reason to write out the
string contents and the terminator separtely.  You can just do:

write(str, strlen(str) + 1);

Also, don't forget to propagate the return code to the caller:

return write(str, strlen(str) + 1);


 +  return 0;
 +   }
 +
 +   inline int32_t position() { return pos; }
 +
 +
 +private:
 +
 +   /* reallocate more memory */
 +   int grow(int32_t size)
 +   {
 +  int32_t new_size = 2 * (curr_size + size);
 +  char *more_mem = (char *) realloc(memory, new_size);
 +  if (more_mem == NULL) {
 + free(memory);
 + memory = NULL;
 + return -1;
 +  } else {
 + memory = more_mem;
 + curr_size = new_size;
 + return 0;
 +  }
 +   }
 +
 +   /* allocated memory */
 +   char *memory;
 +
 +   /* current size of the whole allocation */
 +   int32_t curr_size;
 +
 +   /* write position / size of the data written */
 +   int32_t pos;
 +};
 +
 +#endif /* ifdef __cplusplus */
 +
 +#endif /* MEMORY_WRITER_H */
 --
 1.8.3.1


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [wip 2/9] glsl: serialize methods for IR instructions

2014-01-13 Thread Paul Berry
On 2 January 2014 03:58, Tapani Pälli tapani.pa...@intel.com wrote:

 diff --git a/src/glsl/ir_serialize.cpp b/src/glsl/ir_serialize.cpp
 new file mode 100644
 index 000..30ca018
 --- /dev/null
 +++ b/src/glsl/ir_serialize.cpp
 @@ -0,0 +1,392 @@
 +/* -*- c++ -*- */
 +/*
 + * Copyright © 2013 Intel Corporation
 + *
 + * Permission is hereby granted, free of charge, to any person obtaining a
 + * copy of this software and associated documentation files (the
 Software),
 + * to deal in the Software without restriction, including without
 limitation
 + * the rights to use, copy, modify, merge, publish, distribute,
 sublicense,
 + * and/or sell copies of the Software, and to permit persons to whom the
 + * Software is furnished to do so, subject to the following conditions:
 + *
 + * The above copyright notice and this permission notice (including the
 next
 + * paragraph) shall be included in all copies or substantial portions of
 the
 + * Software.
 + *
 + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND,
 EXPRESS OR
 + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 MERCHANTABILITY,
 + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT
 SHALL
 + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
 OTHER
 + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
 + * DEALINGS IN THE SOFTWARE.
 + */
 +
 +#include ir_serialize.h
 +
 +
 +/**
 + * Wraps serialization of an ir instruction, writes ir_type
 + * and length of each instruction package as a header for it
 + */
 +void
 +ir_instruction::serialize(memory_writer mem)
 +{
 +   uint32_t data_len = 0;
 +   uint8_t ir_type = this-ir_type;
 +   mem.write_uint8_t(ir_type);
 +
 +   int32_t start_pos = mem.position();
 +   mem.write_uint32_t(data_len);
 +
 +   this-serialize_data(mem);
 +
 +   data_len = mem.position() - start_pos - sizeof(data_len);
 +   mem.overwrite(data_len, sizeof(data_len), start_pos);


This function isn't checking the return values from mem.write_*(), so
there's no way for it to detect failure.  Also, since this function returns
void, there's no way for it to notify the caller of failure.  A similar
comment applies to all of the other serialize*() functions in this patch.
(Of course, considering our previous discussion about potentially removing
these int return values, this issue may be moot).


 +}
 +
 +
 +
 +
 +static void
 +serialize_glsl_type(const glsl_type *type, memory_writer mem)


The last time I reviewed this series, I mentioned the idea of making a
hashtable that maps each glsl_type to a small integer, so that we could
serialize each type just once (see
http://lists.freedesktop.org/archives/mesa-dev/2013-November/047740.html).
At the time, it sounded like you liked that idea.  Have you made that
change?  It looks to me like you've stopped serializing the built-in types,
but user-defined types are still serialized each time they occur.

With those two issues addressed, the patch is:

Reviewed-by: Paul Berry stereotype...@gmail.com
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [wip 3/9] glsl: memory_map helper class for data deserialization

2014-01-13 Thread Paul Berry
On 2 January 2014 03:58, Tapani Pälli tapani.pa...@intel.com wrote:

 Class will be used by the shader binary cache implementation.

 Signed-off-by: Tapani Pälli tapani.pa...@intel.com
 ---
  src/glsl/memory_map.h | 174
 ++
  1 file changed, 174 insertions(+)
  create mode 100644 src/glsl/memory_map.h

 diff --git a/src/glsl/memory_map.h b/src/glsl/memory_map.h
 new file mode 100644
 index 000..1b68b72
 --- /dev/null
 +++ b/src/glsl/memory_map.h
 @@ -0,0 +1,174 @@
 +/* -*- c++ -*- */
 +/*
 + * Copyright © 2013 Intel Corporation
 + *
 + * Permission is hereby granted, free of charge, to any person obtaining a
 + * copy of this software and associated documentation files (the
 Software),
 + * to deal in the Software without restriction, including without
 limitation
 + * the rights to use, copy, modify, merge, publish, distribute,
 sublicense,
 + * and/or sell copies of the Software, and to permit persons to whom the
 + * Software is furnished to do so, subject to the following conditions:
 + *
 + * The above copyright notice and this permission notice (including the
 next
 + * paragraph) shall be included in all copies or substantial portions of
 the
 + * Software.
 + *
 + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND,
 EXPRESS OR
 + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 MERCHANTABILITY,
 + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT
 SHALL
 + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
 OTHER
 + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
 + * DEALINGS IN THE SOFTWARE.
 + */
 +
 +#pragma once
 +#ifndef MEMORY_MAP_H
 +#define MEMORY_MAP_H
 +
 +#include fcntl.h
 +#include unistd.h
 +#include sys/mman.h
 +#include sys/stat.h
 +
 +#ifdef __cplusplus
 +
 +/**
 + * Helper class to read data
 + *
 + * Class can read either from user given memory or from a file. On Linux
 + * file reading wraps around the Posix functions for mapping a file into
 + * the process's address space. Other OS may need different
 implementation.
 + */
 +class memory_map
 +{
 +public:
 +   memory_map() :
 +  mode(memory_map::READ_MEM),
 +  fd(0),
 +  cache_size(0),
 +  cache_mmap(NULL),
 +  cache_mmap_p(NULL)
 +   {
 +  /* only used by read_string() */
 +  mem_ctx = ralloc_context(NULL);
 +   }
 +
 +   /* read from disk */
 +   int map(const char *path)
 +   {
 +  struct stat stat_info;
 +  if (stat(path, stat_info) != 0)
 + return -1;


As before, I'm not thrilled with the use of -1 to mean failure and 0 to
mean success, because it forces the caller to use counterintuitive if
statements.  I'd prefer for map() to return a bool with true meaning
success and false meaning failure.


 +
 +  mode = memory_map::READ_MAP;
 +  cache_size = stat_info.st_size;
 +
 +  fd = open(path, O_RDONLY);
 +  if (fd) {
 + cache_mmap_p = cache_mmap = (char *)
 +mmap(NULL, cache_size, PROT_READ, MAP_PRIVATE, fd, 0);
 + return (cache_mmap == MAP_FAILED) ? -1 : 0;


MAP_FAILED is a nonzero value, so if this error condition ever occurs, the
destructor will errneously try to call munmap().

What I'd recommend doing instead is:

void *mmap_result = mmap(...);
if (mmap_result == MAP_FAILED) {
   close(fd);
   return -1;
}
cache_mmap_p = cache_mmap = (char *) mmap_result;
return 0;


 +  }
 +  return -1;
 +   }
 +
 +   /* read from memory */
 +   int map(const void *memory, size_t size)
 +   {
 +  cache_mmap_p = cache_mmap = (char *) memory;
 +  cache_size = size;
 +  return 0;
 +   }


IMHO, functions that cannot fail should return void.


 +
 +   /* wrap a portion from another map */
 +   int map(memory_map map, size_t size)
 +   {
 +  cache_mmap_p = cache_mmap = map.cache_mmap_p;
 +  cache_size = size;
 +  map.ffwd(size);
 +  return 0;
 +   }
 +
 +   ~memory_map() {
 +  if (cache_mmap  mode == READ_MAP) {
 + munmap(cache_mmap, cache_size);
 + close(fd);
 +  }
 +  ralloc_free(mem_ctx);
 +   }
 +
 +   /* move read pointer forward */
 +   inline void ffwd(int len)
 +   {
 +  cache_mmap_p += len;
 +   }
 +
 +   inline void jump(unsigned pos)
 +   {
 +  cache_mmap_p = cache_mmap + pos;
 +   }
 +
 +
 +   /* position of read pointer */
 +   inline uint32_t position()
 +   {
 +  return cache_mmap_p - cache_mmap;
 +   }
 +
 +   inline char *read_string()
 +   {
 +  char *str = ralloc_strdup(mem_ctx, cache_mmap_p);
 +  ffwd(strlen(str)+1);
 +  return str;


This is problematic from a security perspective.  If the client provides
corrupted data that ends in a truncated string (lacking a null terminator)
that could cause ralloc_strdup() to try to read beyond the end of the
file.  We need to make sure the code doesn't try to read beyond the end of
file, even if 

Re: [Mesa-dev] [wip 6/9] glsl: ir_deserializer class for the binary shader cache

2014-01-13 Thread Paul Berry
On 2 January 2014 03:58, Tapani Pälli tapani.pa...@intel.com wrote:

 +
 +
 +/**
 + * Reads header part of the binary blob. Main purpose of this header is to
 + * validate that cached shader was produced with same Mesa driver version.
 + */
 +int
 +ir_deserializer::read_header(struct gl_shader *shader)
 +{
 +   char *cache_magic_id = map-read_string();
 +   char *driver_vendor = map-read_string();
 +   char *driver_renderer = map-read_string();
 +
 +   /* only used or debug output, silence compiler warning */
 +   (void) driver_vendor;
 +   (void) driver_renderer;


A single version of Mesa potentially supports many different hardware
types, and those different hardware types may define different values of
GLSL built-in constants.  They also may require core Mesa to do different
sets of lowering passes during compilation.  So we can't just ignore
driver_vendor and driver_renderer.  We need to reject the binary blob if
they don't match.


 +
 +   shader-Version = map-read_uint32_t();
 +   shader-Type = map-read_uint32_t();
 +   shader-IsES = map-read_uint8_t();
 +
 +   CACHE_DEBUG(%s: version %d, type 0x%x, %s (mesa %s)\n[%s %s]\n,
 +   __func__,  shader-Version, shader-Type,
 +   (shader-IsES) ? glsl es : desktop glsl,
 +   cache_magic_id, driver_vendor, driver_renderer);
 +
 +   const char *magic = mesa_get_shader_cache_magic();
 +
 +   if (memcmp(cache_magic_id, magic, strlen(magic)))
 +  return DIFFERENT_MESA_VER;


If cache_magic_id is foobar and magic is foo, this will erroneusly
consider them equal.  The correct way to do this is to use strcmp().


 +
 +   /* post-link data */
 +   shader-num_samplers = map-read_uint32_t();
 +   shader-active_samplers = map-read_uint32_t();
 +   shader-shadow_samplers = map-read_uint32_t();
 +   shader-num_uniform_components = map-read_uint32_t();
 +   shader-num_combined_uniform_components = map-read_uint32_t();
 +   shader-uses_builtin_functions = map-read_uint8_t();
 +
 +   map-read(shader-Geom, sizeof(shader-Geom));
 +
 +   for (unsigned i = 0; i  MAX_SAMPLERS; i++)
 +  shader-SamplerUnits[i] = map-read_uint8_t();
 +
 +   for (unsigned i = 0; i  MAX_SAMPLERS; i++)
 +  shader-SamplerTargets[i] = (gl_texture_index) map-read_int32_t();
 +
 +   return 0;
 +}
 +
 +
 +const glsl_type *
 +ir_deserializer::read_glsl_type()
 +{
 +   char *name = map-read_string();
 +   uint32_t type_size = map-read_uint32_t();
 +
 +   const glsl_type *existing_type =
 +  state-symbols-get_type(name);
 +
 +   /* if type exists, move read pointer forward and return type */
 +   if (existing_type) {
 +  map-ffwd(type_size);
 +  return existing_type;
 +   }
 +
 +   uint8_t base_type = map-read_uint8_t();
 +   uint32_t length = map-read_uint32_t();
 +   uint8_t vector_elms = map-read_uint8_t();
 +   uint8_t matrix_cols = map-read_uint8_t();
 +   uint8_t interface_packing = map-read_uint8_t();
 +
 +   /* array type has additional element_type information */
 +   if (base_type == GLSL_TYPE_ARRAY) {
 +  const glsl_type *element_type = read_glsl_type();
 +  if (!element_type) {
 + CACHE_DEBUG(error reading array element type\n);
 + return NULL;
 +  }
 +  return glsl_type::get_array_instance(element_type, length);
 +   }
 +
 +   /* structures have fields containing of names and types */
 +   else if (base_type == GLSL_TYPE_STRUCT ||
 +  base_type == GLSL_TYPE_INTERFACE) {
 +  glsl_struct_field *fields = ralloc_array(mem_ctx,
 + glsl_struct_field, length);
 +
 +  if (!fields)
 + return glsl_type::error_type;
 +
 +  for (unsigned k = 0; k  length; k++) {
 + uint8_t row_major, interpolation, centroid;
 + int32_t location;
 + char *field_name = map-read_string();
 + fields[k].name = _mesa_strdup(field_name);
 + fields[k].type = read_glsl_type();
 + row_major = map-read_uint8_t();
 + location = map-read_int32_t();
 + interpolation = map-read_uint8_t();
 + centroid = map-read_uint8_t();
 + fields[k].row_major = row_major;
 + fields[k].location = location;
 + fields[k].interpolation = interpolation;
 + fields[k].centroid = centroid;


Another security issue: if the binary blob is corrupted, length may be
outrageously large (e.g. 0x).  We need a way for this loop to bail
out and exit if it tries to read past the end of the binary blob.


 +  }
 +
 +  const glsl_type *ret_type = NULL;
 +
 +  if (base_type == GLSL_TYPE_STRUCT)
 + ret_type = glsl_type::get_record_instance(fields, length, name);
 +  else if (base_type == GLSL_TYPE_INTERFACE)
 + ret_type = glsl_type::get_interface_instance(fields,
 +length, (glsl_interface_packing) interface_packing, name);
 +
 +  /* free allocated memory */
 +  for (unsigned k = 0; k  length; k++)
 + free((void *)fields[k].name);
 +  ralloc_free(fields);
 +
 +  return ret_type;

Re: [Mesa-dev] [PATCH 05/30] glsl/linker: Refactor in preparation for adding more shader stages.

2014-01-10 Thread Paul Berry
On 9 January 2014 22:17, Chris Forbes chr...@ijw.co.nz wrote:

 This is a nice cleanup; I like that this brings both writes to
 prog-LastClipDistanceArraySize together -- but it looks like the
 behavior changes slightly.

 Previously, if there was no VS and no GS, then we would never write
 prog-LastClipDistanceArraySize. Now we'll read an old junk value
 (potentially from a previous linking of the same program object with
 different shaders attached) from prog-Vert.ClipDistanceArraySize
 (since we never called validate_vertex_shader_executable) -- but we'll
 never end up actually using it, since it's only used for transform
 feedback of gl_ClipDistance.


Yeah, that's a good point.  I agree that it's completely benign, but I
think I could have made things clearer.  I've changed it to this:

   if (num_shaders[MESA_SHADER_GEOMETRY]  0)
  prog-LastClipDistanceArraySize = prog-Geom.ClipDistanceArraySize;
   else if (num_shaders[MESA_SHADER_VERTEX]  0)
  prog-LastClipDistanceArraySize = prog-Vert.ClipDistanceArraySize;
   else
  prog-LastClipDistanceArraySize = 0; /* Not used */
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 11/30] mesa: Change redundant code into loops in shaderapi.c.

2014-01-10 Thread Paul Berry
On 9 January 2014 20:03, Chris Forbes chr...@ijw.co.nz wrote:

 This is a slightly odd construction (although copied from the existing
 code):

   +  if ((shProg == NULL) || (shProg-_LinkedShaders[stage] == NULL))
   +shProg = NULL;


You're right.  This would be much better as:

   if ((shProg != NULL)  (shProg-_LinkedShaders[stage] == NULL))
  shProg = NULL;

I'll update it in a follow-up patch.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 19/30] mesa/cs: Handle compute shaders in _mesa_use_program().

2014-01-10 Thread Paul Berry
On 9 January 2014 22:32, Chris Forbes chr...@ijw.co.nz wrote:

 Minor nit, but could CS be done after the ordered pipeline stages, for
 consistency?


Sure, no problem.

In fact, I just realized that if I change the type parameter of
use_shader_program to gl_shader_stage, I can call it in a loop.  I'll do
that in a follow-up patch.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 12/30] mesa: Change redundant code into loops in texstate.c.

2014-01-10 Thread Paul Berry
On 10 January 2014 11:41, Kenneth Graunke kenn...@whitecape.org wrote:

 On 01/09/2014 06:19 PM, Paul Berry wrote:
  This is possible now that ctx-Shader.CurrentProgram is an array.
  ---
   src/mesa/main/texstate.c | 75
 +++-
   1 file changed, 29 insertions(+), 46 deletions(-)
 
  diff --git a/src/mesa/main/texstate.c b/src/mesa/main/texstate.c
  index b9c76da..905a9d5 100644
  --- a/src/mesa/main/texstate.c
  +++ b/src/mesa/main/texstate.c
  @@ -526,27 +526,20 @@ static void
   update_texture_state( struct gl_context *ctx )
   {
  GLuint unit;
  -   struct gl_program *fprog = NULL;
  -   struct gl_program *vprog = NULL;
  -   struct gl_program *gprog = NULL;
  +   struct gl_program *prog[MESA_SHADER_STAGES];
  GLbitfield enabledFragUnits = 0x0;
  -
  -   if (ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX] 
  -   ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX]-LinkStatus) {
  -  vprog =
 ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX]-_LinkedShaders[MESA_SHADER_VERTEX]-Program;
  -   }
  -
  -   if (ctx-Shader.CurrentProgram[MESA_SHADER_GEOMETRY] 
  -   ctx-Shader.CurrentProgram[MESA_SHADER_GEOMETRY]-LinkStatus) {
  -  gprog =
 ctx-Shader.CurrentProgram[MESA_SHADER_GEOMETRY]-_LinkedShaders[MESA_SHADER_GEOMETRY]-Program;
  -   }
  -
  -   if (ctx-Shader.CurrentProgram[MESA_SHADER_FRAGMENT] 
  -   ctx-Shader.CurrentProgram[MESA_SHADER_FRAGMENT]-LinkStatus) {
  -  fprog =
 ctx-Shader.CurrentProgram[MESA_SHADER_FRAGMENT]-_LinkedShaders[MESA_SHADER_FRAGMENT]-Program;
  -   }
  -   else if (ctx-FragmentProgram._Enabled) {
  -  fprog = ctx-FragmentProgram.Current-Base;
  +   int i;
  +
  +   for (i = 0; i  MESA_SHADER_STAGES; i++) {
  +  if (ctx-Shader.CurrentProgram[i] 
  +  ctx-Shader.CurrentProgram[i]-LinkStatus) {
  + prog[i] =
 ctx-Shader.CurrentProgram[i]-_LinkedShaders[i]-Program;
  +  } else {
  + if (i == MESA_SHADER_FRAGMENT  ctx-FragmentProgram._Enabled)
  +prog[i] = ctx-FragmentProgram.Current-Base;
  + else
  +prog[i] = NULL;
  +  }
  }
 
  /* TODO: only set this if there are actual changes */
  @@ -562,9 +555,7 @@ update_texture_state( struct gl_context *ctx )
   */
  for (unit = 0; unit  ctx-Const.MaxCombinedTextureImageUnits;
 unit++) {
 struct gl_texture_unit *texUnit = ctx-Texture.Unit[unit];
  -  GLbitfield enabledVertTargets = 0x0;
  -  GLbitfield enabledFragTargets = 0x0;
  -  GLbitfield enabledGeomTargets = 0x0;
  +  GLbitfield enabledTargetsByStage[MESA_SHADER_STAGES];
 GLbitfield enabledTargets = 0x0;
 GLuint texIndex;
 
  @@ -574,25 +565,16 @@ update_texture_state( struct gl_context *ctx )
  * by a fragment program/program.  When multiple flags are set,
 we'll
  * settle on the one with highest priority (see below).
  */
  -  if (vprog) {
  - enabledVertTargets |= vprog-TexturesUsed[unit];
  +  for (i = 0; i  MESA_SHADER_STAGES; i++) {
  + if (prog[i])
  +enabledTargetsByStage[i] = prog[i]-TexturesUsed[unit];
  + else if (i == MESA_SHADER_FRAGMENT)
  +enabledTargetsByStage[i] = texUnit-Enabled;
  + else
  +enabledTargetsByStage[i] = 0;
  + enabledTargets |= enabledTargetsByStage[i];

 This doesn't look equivalent.  Previously,
 enabled{Vert,Frag,Geom}Targets would accumulate bits over every
 iteration through units.  Now, enabledTargetsByStage only reflects the
 most recent unit.

 enabledTargets still accumulates properly, but you use
 enabledTargetsByStage below, so I think it needs to as well.

 You just need to use |= in both places, like the old code.


It's definitely possible that I got something wrong in this patch--I didn't
completely grok what the code was doing and instead strived to just
preserve the existing semantics.

But I'm pretty sure I got the enabledTargetsByStage part right.  Here's
what the code used to look like--the enabled{Vert,Frag,Geom}Targets
variables were all reset to 0 every time through the loop:

   for (unit = 0; unit  ctx-Const.MaxCombinedTextureImageUnits; unit++) {
  struct gl_texture_unit *texUnit = ctx-Texture.Unit[unit];
  GLbitfield enabledVertTargets = 0x0;
  GLbitfield enabledFragTargets = 0x0;
  GLbitfield enabledGeomTargets = 0x0;
  GLbitfield enabledTargets = 0x0;
  GLuint texIndex;

  /* Get the bitmask of texture target enables.
   * enableBits will be a mask of the TEXTURE_*_BIT flags indicating
   * which texture targets are enabled (fixed function) or referenced
   * by a fragment program/program.  When multiple flags are set, we'll
   * settle on the one with highest priority (see below).
   */
  if (vprog) {
 enabledVertTargets |= vprog-TexturesUsed[unit];
  }

  if (gprog) {
 enabledGeomTargets |= gprog-TexturesUsed[unit];
  }

  if (fprog

[Mesa-dev] [PATCH] i965: Ensure that all necessary state is re-emitted if we run out of aperture.

2014-01-10 Thread Paul Berry
Prior to this patch, if we ran out of aperture space during
brw_try_draw_prims(), we would rewind the batch buffer pointer
(potentially throwing some state that may have been emitted by
brw_upload_state()), flush the batch, and then try again.  However, we
wouldn't reset the dirty bits to the state they had before the call to
brw_upload_state().  As a result, when we tried again, there was a
danger that we wouldn't re-emit all the necessary state.  (Note: prior
to the introduction of hardware contexts, this wasn't a problem
because flushing the batch forced all state to be re-emitted).

This patch fixes the problem by leaving the dirty bits set at the end
of brw_upload_state(); we only clear them after we have determined
that we don't need to rewind the batch buffer.

Cc: 10.0 9.2 mesa-sta...@lists.freedesktop.org
---
 src/mesa/drivers/dri/i965/brw_draw.c |  6 ++
 src/mesa/drivers/dri/i965/brw_state.h|  1 +
 src/mesa/drivers/dri/i965/brw_state_upload.c | 14 ++
 3 files changed, 21 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_draw.c 
b/src/mesa/drivers/dri/i965/brw_draw.c
index b898cd3..ec56a21 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -499,6 +499,12 @@ retry:
}
 }
   }
+
+  /* Now that we know we haven't run out of aperture space, we can safely
+   * reset the dirty bits.
+   */
+  if (brw-state.dirty.brw)
+ brw_clear_dirty_bits(brw);
}
 
if (brw-always_flush_batch)
diff --git a/src/mesa/drivers/dri/i965/brw_state.h 
b/src/mesa/drivers/dri/i965/brw_state.h
index a148125..e9428fb 100644
--- a/src/mesa/drivers/dri/i965/brw_state.h
+++ b/src/mesa/drivers/dri/i965/brw_state.h
@@ -143,6 +143,7 @@ brw_depthbuffer_format(struct brw_context *brw);
  * brw_state.c
  */
 void brw_upload_state(struct brw_context *brw);
+void brw_clear_dirty_bits(struct brw_context *brw);
 void brw_init_state(struct brw_context *brw);
 void brw_destroy_state(struct brw_context *brw);
 
diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c 
b/src/mesa/drivers/dri/i965/brw_state_upload.c
index 1eb3a79..7490df4 100644
--- a/src/mesa/drivers/dri/i965/brw_state_upload.c
+++ b/src/mesa/drivers/dri/i965/brw_state_upload.c
@@ -573,6 +573,20 @@ void brw_upload_state(struct brw_context *brw)
 fprintf(stderr, \n);
   }
}
+}
+
 
+/**
+ * Clear dirty bits to account for the fact that the state emitted by
+ * brw_upload_state() has been committed to the hardware.  This is a separate
+ * call from brw_upload_state() because it's possible that after the call to
+ * brw_upload_state(), we will discover that we've run out of aperture space,
+ * and need to rewind the batch buffer to the state it had before the
+ * brw_upload_state() call.
+ */
+void
+brw_clear_dirty_bits(struct brw_context *brw)
+{
+   struct brw_state_flags *state = brw-state.dirty;
memset(state, 0, sizeof(*state));
 }
-- 
1.8.5.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 5/5] mesa: Use functions to convert gl_shader_stage to PROGRAM enum or pipe target.

2014-01-09 Thread Paul Berry
On 8 January 2014 12:21, Emil Velikov emil.l.veli...@gmail.com wrote:

 On 08/01/14 19:20, Paul Berry wrote:
  diff --git a/src/mesa/program/program.h b/src/mesa/program/program.h
  index 4015b4c..648233c 100644
  --- a/src/mesa/program/program.h
  +++ b/src/mesa/program/program.h
  @@ -207,6 +207,24 @@ _mesa_program_enum_to_shader_stage(GLenum v)
  }
   }
 
  +
  +static inline GLenum
  +_mesa_shader_stage_to_program(gl_shader_stage stage)
  +{
  +   switch (stage) {
  +   case MESA_SHADER_VERTEX:
  +  return GL_VERTEX_PROGRAM_ARB;
  +   case MESA_SHADER_FRAGMENT:
  +  return GL_FRAGMENT_PROGRAM_ARB;
  +   case MESA_SHADER_GEOMETRY:
  +  return GL_GEOMETRY_PROGRAM_NV;
  +   }
  +
  +   ASSERT(0);
 Hi Paul

 Can you use a normal assert that prints a somewhat informative message -
 similar to what you did in shader_stage_to_ptarget()?

 Thanks
 Emil


Sure, no problem.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 01/30] mesa: Replace _mesa_program_index_to_target with _mesa_shader_stage_to_program.

2014-01-09 Thread Paul Berry
In my recent zeal to refactor Mesa's handling of the gl_shader_stage
enum, I accidentally wound up with two functions that do the same
thing: _mesa_program_index_to_target(), and
_mesa_shader_stage_to_program().

This patch keeps _mesa_shader_stage_to_program(), since its name is
more consistent with other related functions.  However, it changes the
signature so that it accepts an unsigned integer instead of a
gl_shader_stage--this avoids awkward casts when the function is called
from C++ code.
---
 src/mesa/drivers/dri/i965/brw_shader.cpp   |  2 +-
 src/mesa/program/ir_to_mesa.cpp|  2 +-
 src/mesa/program/program.h | 19 +--
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp |  2 +-
 4 files changed, 4 insertions(+), 21 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp 
b/src/mesa/drivers/dri/i965/brw_shader.cpp
index cf9ca4b..141f8a4 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.cpp
+++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
@@ -127,7 +127,7 @@ brw_link_shader(struct gl_context *ctx, struct 
gl_shader_program *shProg)
 continue;
 
   struct gl_program *prog =
-ctx-Driver.NewProgram(ctx, _mesa_program_index_to_target(stage),
+ctx-Driver.NewProgram(ctx, _mesa_shader_stage_to_program(stage),
 shader-base.Name);
   if (!prog)
return false;
diff --git a/src/mesa/program/ir_to_mesa.cpp b/src/mesa/program/ir_to_mesa.cpp
index f6c229c..af6f59f 100644
--- a/src/mesa/program/ir_to_mesa.cpp
+++ b/src/mesa/program/ir_to_mesa.cpp
@@ -3053,7 +3053,7 @@ _mesa_ir_link_shader(struct gl_context *ctx, struct 
gl_shader_program *prog)
 _mesa_reference_program(ctx, prog-_LinkedShaders[i]-Program,
 linked_prog);
  if (!ctx-Driver.ProgramStringNotify(ctx,
-  _mesa_program_index_to_target(i),
+  _mesa_shader_stage_to_program(i),
   linked_prog)) {
 return GL_FALSE;
  }
diff --git a/src/mesa/program/program.h b/src/mesa/program/program.h
index 0e350cd..f666e30 100644
--- a/src/mesa/program/program.h
+++ b/src/mesa/program/program.h
@@ -209,7 +209,7 @@ _mesa_program_enum_to_shader_stage(GLenum v)
 
 
 static inline GLenum
-_mesa_shader_stage_to_program(gl_shader_stage stage)
+_mesa_shader_stage_to_program(unsigned stage)
 {
switch (stage) {
case MESA_SHADER_VERTEX:
@@ -225,23 +225,6 @@ _mesa_shader_stage_to_program(gl_shader_stage stage)
 }
 
 
-static inline GLenum
-_mesa_program_index_to_target(GLuint i)
-{
-   static const GLenum enums[] = {
-  GL_VERTEX_PROGRAM_ARB,
-  GL_GEOMETRY_PROGRAM_NV,
-  GL_FRAGMENT_PROGRAM_ARB
-   };
-   STATIC_ASSERT(Elements(enums) == MESA_SHADER_STAGES);
-   if(i = MESA_SHADER_STAGES) {
-  assert(!Unexpected program index);
-  return 0;
-   } else
-  return enums[i];
-}
-
-
 /* Cast wrappers from gl_program to gl_vertex/geometry/fragment_program */
 
 static inline struct gl_fragment_program *
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 73c39eb..a7dfa67 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -5319,7 +5319,7 @@ st_link_shader(struct gl_context *ctx, struct 
gl_shader_program *prog)
 _mesa_reference_program(ctx, prog-_LinkedShaders[i]-Program,
 linked_prog);
  if (!ctx-Driver.ProgramStringNotify(ctx,
-  _mesa_program_index_to_target(i),
+  _mesa_shader_stage_to_program(i),
   linked_prog)) {
_mesa_reference_program(ctx, prog-_LinkedShaders[i]-Program,
NULL);
-- 
1.8.5.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 03/30] main: Allow ctx == NULL in _mesa_validate_shader_target().

2014-01-09 Thread Paul Berry
This will allow this function to be used in circumstances where there
is no context available, such as when building built-in GLSL
functions.
---
 src/mesa/main/shaderapi.c | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
index 716e659..2ab0a0c 100644
--- a/src/mesa/main/shaderapi.c
+++ b/src/mesa/main/shaderapi.c
@@ -174,13 +174,20 @@ _mesa_copy_string(GLchar *dst, GLsizei maxLength,
 bool
 _mesa_validate_shader_target(const struct gl_context *ctx, GLenum type)
 {
+   /* Note: when building built-in GLSL functions, this function may be
+* invoked with ctx == NULL.  In that case, we can only validate that it's
+* a shader target we recognize, not that it's supported in the current
+* context.  But that's fine--we don't need any further validation than
+* that when building built-in GLSL functions.
+*/
+
switch (type) {
case GL_FRAGMENT_SHADER:
-  return ctx-Extensions.ARB_fragment_shader;
+  return ctx == NULL || ctx-Extensions.ARB_fragment_shader;
case GL_VERTEX_SHADER:
-  return ctx-Extensions.ARB_vertex_shader;
+  return ctx == NULL || ctx-Extensions.ARB_vertex_shader;
case GL_GEOMETRY_SHADER_ARB:
-  return _mesa_has_geometry_shaders(ctx);
+  return ctx == NULL || _mesa_has_geometry_shaders(ctx);
default:
   return false;
}
-- 
1.8.5.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 02/30] mesa: Make validate_shader_target() non-static.

2014-01-09 Thread Paul Berry
---
 src/mesa/main/shaderapi.c | 8 
 src/mesa/main/shaderapi.h | 3 +++
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
index 6042fa8..716e659 100644
--- a/src/mesa/main/shaderapi.c
+++ b/src/mesa/main/shaderapi.c
@@ -171,8 +171,8 @@ _mesa_copy_string(GLchar *dst, GLsizei maxLength,
  * \param type  Shader target
  *
  */
-static bool
-validate_shader_target(const struct gl_context *ctx, GLenum type)
+bool
+_mesa_validate_shader_target(const struct gl_context *ctx, GLenum type)
 {
switch (type) {
case GL_FRAGMENT_SHADER:
@@ -273,7 +273,7 @@ create_shader(struct gl_context *ctx, GLenum type)
struct gl_shader *sh;
GLuint name;
 
-   if (!validate_shader_target(ctx, type)) {
+   if (!_mesa_validate_shader_target(ctx, type)) {
   _mesa_error(ctx, GL_INVALID_ENUM, CreateShader(type));
   return 0;
}
@@ -1739,7 +1739,7 @@ _mesa_UseShaderProgramEXT(GLenum type, GLuint program)
GET_CURRENT_CONTEXT(ctx);
struct gl_shader_program *shProg = NULL;
 
-   if (!validate_shader_target(ctx, type)) {
+   if (!_mesa_validate_shader_target(ctx, type)) {
   _mesa_error(ctx, GL_INVALID_ENUM, glUseShaderProgramEXT(type));
   return;
}
diff --git a/src/mesa/main/shaderapi.h b/src/mesa/main/shaderapi.h
index 4822e32..10f810c 100644
--- a/src/mesa/main/shaderapi.h
+++ b/src/mesa/main/shaderapi.h
@@ -215,6 +215,9 @@ _mesa_copy_linked_program_data(gl_shader_stage type,
const struct gl_shader_program *src,
struct gl_program *dst);
 
+extern bool
+_mesa_validate_shader_target(const struct gl_context *ctx, GLenum type);
+
 
 #ifdef __cplusplus
 }
-- 
1.8.5.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 10/30] mesa: Remove ad-hoc arrays of gl_shader_program.

2014-01-09 Thread Paul Berry
Now that we have a ctx-Shader.CurrentProgram array, we can just use
it directly.
---
 src/mesa/main/context.c | 6 +-
 src/mesa/state_tracker/st_draw.c| 6 +-
 src/mesa/state_tracker/st_program.c | 6 +-
 3 files changed, 3 insertions(+), 15 deletions(-)

diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
index 026d7aa..5855f15 100644
--- a/src/mesa/main/context.c
+++ b/src/mesa/main/context.c
@@ -1853,13 +1853,9 @@ _mesa_valid_to_render(struct gl_context *ctx, const char 
*where)
 
 #ifdef DEBUG
if (ctx-Shader.Flags  GLSL_LOG) {
-  struct gl_shader_program *shProg[MESA_SHADER_STAGES];
+  struct gl_shader_program **shProg = ctx-Shader.CurrentProgram;
   gl_shader_stage i;
 
-  shProg[MESA_SHADER_VERTEX] = 
ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX];
-  shProg[MESA_SHADER_GEOMETRY] = 
ctx-Shader.CurrentProgram[MESA_SHADER_GEOMETRY];
-  shProg[MESA_SHADER_FRAGMENT] = 
ctx-Shader.CurrentProgram[MESA_SHADER_FRAGMENT];
-
   for (i = 0; i  MESA_SHADER_STAGES; i++) {
 if (shProg[i] == NULL || shProg[i]-_Used
 || shProg[i]-_LinkedShaders[i] == NULL)
diff --git a/src/mesa/state_tracker/st_draw.c b/src/mesa/state_tracker/st_draw.c
index 75a71f1..85677c3 100644
--- a/src/mesa/state_tracker/st_draw.c
+++ b/src/mesa/state_tracker/st_draw.c
@@ -131,11 +131,7 @@ setup_index_buffer(struct st_context *st,
 static void
 check_uniforms(struct gl_context *ctx)
 {
-   struct gl_shader_program *shProg[3] = {
-  ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX],
-  ctx-Shader.CurrentProgram[MESA_SHADER_GEOMETRY],
-  ctx-Shader.CurrentProgram[MESA_SHADER_FRAGMENT],
-   };
+   struct gl_shader_program **shProg = ctx-Shader.CurrentProgram;
unsigned j;
 
for (j = 0; j  3; j++) {
diff --git a/src/mesa/state_tracker/st_program.c 
b/src/mesa/state_tracker/st_program.c
index fe3dafe..ef2abdb 100644
--- a/src/mesa/state_tracker/st_program.c
+++ b/src/mesa/state_tracker/st_program.c
@@ -1195,11 +1195,7 @@ st_get_gp_variant(struct st_context *st,
 void
 st_print_shaders(struct gl_context *ctx)
 {
-   struct gl_shader_program *shProg[3] = {
-  ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX],
-  ctx-Shader.CurrentProgram[MESA_SHADER_GEOMETRY],
-  ctx-Shader.CurrentProgram[MESA_SHADER_FRAGMENT],
-   };
+   struct gl_shader_program **shProg = ctx-Shader.CurrentProgram;
unsigned j;
 
for (j = 0; j  3; j++) {
-- 
1.8.5.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 08/30] i965: Fix comments to refer to the new ctx-Shader.CurrentProgram array.

2014-01-09 Thread Paul Berry
---
 src/mesa/drivers/dri/i965/brw_wm_state.c  | 4 ++--
 src/mesa/drivers/dri/i965/gen6_wm_state.c | 4 ++--
 src/mesa/drivers/dri/i965/gen7_wm_state.c | 4 ++--
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_state.c
index 303a2eb..514dfb2 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_state.c
@@ -112,8 +112,8 @@ brw_upload_wm_unit(struct brw_context *brw)
wm-thread1.depth_coef_urb_read_offset = 1;
/* Use ALT floating point mode for ARB fragment programs, because they
 * require 0^0 == 1.  Even though _CurrentFragmentProgram is used for
-* rendering, CurrentFragmentProgram is used for this check to
-* differentiate between the GLSL and non-GLSL cases.
+* rendering, CurrentProgram[MESA_SHADER_FRAGMENT] is used for this check
+* to differentiate between the GLSL and non-GLSL cases.
 */
if (ctx-Shader.CurrentProgram[MESA_SHADER_FRAGMENT] == NULL)
   wm-thread1.floating_point_mode = BRW_FLOATING_POINT_NON_IEEE_754;
diff --git a/src/mesa/drivers/dri/i965/gen6_wm_state.c 
b/src/mesa/drivers/dri/i965/gen6_wm_state.c
index 5188aa8..585c0c5 100644
--- a/src/mesa/drivers/dri/i965/gen6_wm_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_wm_state.c
@@ -137,8 +137,8 @@ upload_wm_state(struct brw_context *brw)
 
/* Use ALT floating point mode for ARB fragment programs, because they
 * require 0^0 == 1.  Even though _CurrentFragmentProgram is used for
-* rendering, CurrentFragmentProgram is used for this check to
-* differentiate between the GLSL and non-GLSL cases.
+* rendering, CurrentProgram[MESA_SHADER_FRAGMENT] is used for this check
+* to differentiate between the GLSL and non-GLSL cases.
 */
if (ctx-Shader.CurrentProgram[MESA_SHADER_FRAGMENT] == NULL)
   dw2 |= GEN6_WM_FLOATING_POINT_MODE_ALT;
diff --git a/src/mesa/drivers/dri/i965/gen7_wm_state.c 
b/src/mesa/drivers/dri/i965/gen7_wm_state.c
index 7f2a50c..284f8b6 100644
--- a/src/mesa/drivers/dri/i965/gen7_wm_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_wm_state.c
@@ -170,8 +170,8 @@ upload_ps_state(struct brw_context *brw)
 
/* Use ALT floating point mode for ARB fragment programs, because they
 * require 0^0 == 1.  Even though _CurrentFragmentProgram is used for
-* rendering, CurrentFragmentProgram is used for this check to
-* differentiate between the GLSL and non-GLSL cases.
+* rendering, CurrentProgram[MESA_SHADER_FRAGMENT] is used for this check
+* to differentiate between the GLSL and non-GLSL cases.
 */
/* BRW_NEW_FRAGMENT_PROGRAM */
if (ctx-Shader.CurrentProgram[MESA_SHADER_FRAGMENT] == NULL)
-- 
1.8.5.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 15/30] glsl/cs: Change some linker loops to use MESA_SHADER_FRAGMENT as a bound.

2014-01-09 Thread Paul Berry
Linker loops that iterate through all the stages in the pipeline need
to use MESA_SHADER_FRAGMENT as a bound, so that we can add an
additional MESA_SHADER_COMPUTE stage, without it being erroneously
included in the pipeline.
---
 src/glsl/linker.cpp | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
index f3fd66f..7461b17 100644
--- a/src/glsl/linker.cpp
+++ b/src/glsl/linker.cpp
@@ -2094,7 +2094,7 @@ link_shaders(struct gl_context *ctx, struct 
gl_shader_program *prog)
 
unsigned prev;
 
-   for (prev = 0; prev  MESA_SHADER_STAGES; prev++) {
+   for (prev = 0; prev = MESA_SHADER_FRAGMENT; prev++) {
   if (prog-_LinkedShaders[prev] != NULL)
  break;
}
@@ -2102,7 +2102,7 @@ link_shaders(struct gl_context *ctx, struct 
gl_shader_program *prog)
/* Validate the inputs of each stage with the output of the preceding
 * stage.
 */
-   for (unsigned i = prev + 1; i  MESA_SHADER_STAGES; i++) {
+   for (unsigned i = prev + 1; i = MESA_SHADER_FRAGMENT; i++) {
   if (prog-_LinkedShaders[i] == NULL)
  continue;
 
@@ -2197,7 +2197,7 @@ link_shaders(struct gl_context *ctx, struct 
gl_shader_program *prog)
}
 
unsigned first;
-   for (first = 0; first  MESA_SHADER_STAGES; first++) {
+   for (first = 0; first = MESA_SHADER_FRAGMENT; first++) {
   if (prog-_LinkedShaders[first] != NULL)
 break;
}
@@ -2229,7 +2229,7 @@ link_shaders(struct gl_context *ctx, struct 
gl_shader_program *prog)
 * eliminated if they are (transitively) not used in a later stage.
 */
int last, next;
-   for (last = MESA_SHADER_STAGES-1; last = 0; last--) {
+   for (last = MESA_SHADER_FRAGMENT; last = 0; last--) {
   if (prog-_LinkedShaders[last] != NULL)
  break;
}
-- 
1.8.5.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 05/30] glsl/linker: Refactor in preparation for adding more shader stages.

2014-01-09 Thread Paul Berry
Rather than maintain separately named arrays and counts for vertex,
geometry, and fragment shaders, just maintain these as arrays indexed
by the gl_shader_type enum.
---
 src/glsl/linker.cpp | 114 ++--
 1 file changed, 39 insertions(+), 75 deletions(-)

diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
index e820f0f..f3fd66f 100644
--- a/src/glsl/linker.cpp
+++ b/src/glsl/linker.cpp
@@ -1994,19 +1994,14 @@ link_shaders(struct gl_context *ctx, struct 
gl_shader_program *prog)
 
/* Separate the shaders into groups based on their type.
 */
-   struct gl_shader **vert_shader_list;
-   unsigned num_vert_shaders = 0;
-   struct gl_shader **frag_shader_list;
-   unsigned num_frag_shaders = 0;
-   struct gl_shader **geom_shader_list;
-   unsigned num_geom_shaders = 0;
-
-   vert_shader_list = (struct gl_shader **)
-  calloc(prog-NumShaders, sizeof(struct gl_shader *));
-   frag_shader_list = (struct gl_shader **)
-  calloc(prog-NumShaders, sizeof(struct gl_shader *));
-   geom_shader_list = (struct gl_shader **)
-  calloc(prog-NumShaders, sizeof(struct gl_shader *));
+   struct gl_shader **shader_list[MESA_SHADER_STAGES];
+   unsigned num_shaders[MESA_SHADER_STAGES];
+
+   for (int i = 0; i  MESA_SHADER_STAGES; i++) {
+  shader_list[i] = (struct gl_shader **)
+ calloc(prog-NumShaders, sizeof(struct gl_shader *));
+  num_shaders[i] = 0;
+   }
 
unsigned min_version = UINT_MAX;
unsigned max_version = 0;
@@ -2022,20 +2017,9 @@ link_shaders(struct gl_context *ctx, struct 
gl_shader_program *prog)
 goto done;
   }
 
-  switch (prog-Shaders[i]-Stage) {
-  case MESA_SHADER_VERTEX:
-vert_shader_list[num_vert_shaders] = prog-Shaders[i];
-num_vert_shaders++;
-break;
-  case MESA_SHADER_FRAGMENT:
-frag_shader_list[num_frag_shaders] = prog-Shaders[i];
-num_frag_shaders++;
-break;
-  case MESA_SHADER_GEOMETRY:
-geom_shader_list[num_geom_shaders] = prog-Shaders[i];
-num_geom_shaders++;
-break;
-  }
+  gl_shader_stage shader_type = prog-Shaders[i]-Stage;
+  shader_list[shader_type][num_shaders[shader_type]] = prog-Shaders[i];
+  num_shaders[shader_type]++;
}
 
/* In desktop GLSL, different shader versions may be linked together.  In
@@ -2052,7 +2036,8 @@ link_shaders(struct gl_context *ctx, struct 
gl_shader_program *prog)
 
/* Geometry shaders have to be linked with vertex shaders.
 */
-   if (num_geom_shaders  0  num_vert_shaders == 0) {
+   if (num_shaders[MESA_SHADER_GEOMETRY]  0 
+   num_shaders[MESA_SHADER_VERTEX] == 0) {
   linker_error(prog, Geometry shader must be linked with 
   vertex shader\n);
   goto done;
@@ -2067,55 +2052,37 @@ link_shaders(struct gl_context *ctx, struct 
gl_shader_program *prog)
 
/* Link all shaders for a particular stage and validate the result.
 */
-   if (num_vert_shaders  0) {
-  gl_shader *const sh =
-link_intrastage_shaders(mem_ctx, ctx, prog, vert_shader_list,
-num_vert_shaders);
-
-  if (!prog-LinkStatus)
-goto done;
-
-  validate_vertex_shader_executable(prog, sh);
-  if (!prog-LinkStatus)
-goto done;
-  prog-LastClipDistanceArraySize = prog-Vert.ClipDistanceArraySize;
+   for (int stage = 0; stage  MESA_SHADER_STAGES; stage++) {
+  if (num_shaders[stage]  0) {
+ gl_shader *const sh =
+link_intrastage_shaders(mem_ctx, ctx, prog, shader_list[stage],
+num_shaders[stage]);
 
-  _mesa_reference_shader(ctx, prog-_LinkedShaders[MESA_SHADER_VERTEX],
-sh);
-   }
-
-   if (num_frag_shaders  0) {
-  gl_shader *const sh =
-link_intrastage_shaders(mem_ctx, ctx, prog, frag_shader_list,
-num_frag_shaders);
-
-  if (!prog-LinkStatus)
-goto done;
+ if (!prog-LinkStatus)
+goto done;
 
-  validate_fragment_shader_executable(prog, sh);
-  if (!prog-LinkStatus)
-goto done;
+ switch (stage) {
+ case MESA_SHADER_VERTEX:
+validate_vertex_shader_executable(prog, sh);
+break;
+ case MESA_SHADER_GEOMETRY:
+validate_geometry_shader_executable(prog, sh);
+break;
+ case MESA_SHADER_FRAGMENT:
+validate_fragment_shader_executable(prog, sh);
+break;
+ }
+ if (!prog-LinkStatus)
+goto done;
 
-  _mesa_reference_shader(ctx, prog-_LinkedShaders[MESA_SHADER_FRAGMENT],
-sh);
+ _mesa_reference_shader(ctx, prog-_LinkedShaders[stage], sh);
+  }
}
 
-   if (num_geom_shaders  0) {
-  gl_shader *const sh =
-link_intrastage_shaders(mem_ctx, ctx, prog, geom_shader_list,
-num_geom_shaders);
-
-   

[Mesa-dev] [PATCH 07/30] mesa: Fold long lines introduced by the previous patch.

2014-01-09 Thread Paul Berry
---
 src/mesa/drivers/dri/i965/brw_gs_surface_state.c | 6 --
 src/mesa/drivers/dri/i965/brw_vec4_gs.c  | 5 +++--
 src/mesa/drivers/dri/i965/brw_vs.c   | 5 +++--
 src/mesa/drivers/dri/i965/brw_vs_surface_state.c | 6 --
 src/mesa/main/context.c  | 6 --
 src/mesa/main/state.c| 9 ++---
 src/mesa/main/transformfeedback.c| 3 ++-
 src/mesa/state_tracker/st_atom_constbuf.c| 9 ++---
 8 files changed, 32 insertions(+), 17 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_gs_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_gs_surface_state.c
index 1ccf2e2..0795e56 100644
--- a/src/mesa/drivers/dri/i965/brw_gs_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_gs_surface_state.c
@@ -69,7 +69,8 @@ brw_upload_gs_ubo_surfaces(struct brw_context *brw)
struct gl_context *ctx = brw-ctx;
 
/* _NEW_PROGRAM */
-   struct gl_shader_program *prog = 
ctx-Shader.CurrentProgram[MESA_SHADER_GEOMETRY];
+   struct gl_shader_program *prog =
+  ctx-Shader.CurrentProgram[MESA_SHADER_GEOMETRY];
 
if (!prog)
   return;
@@ -93,7 +94,8 @@ brw_upload_gs_abo_surfaces(struct brw_context *brw)
 {
struct gl_context *ctx = brw-ctx;
/* _NEW_PROGRAM */
-   struct gl_shader_program *prog = 
ctx-Shader.CurrentProgram[MESA_SHADER_GEOMETRY];
+   struct gl_shader_program *prog =
+  ctx-Shader.CurrentProgram[MESA_SHADER_GEOMETRY];
 
if (prog) {
   /* CACHE_NEW_GS_PROG */
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs.c 
b/src/mesa/drivers/dri/i965/brw_vec4_gs.c
index 7a803cb..e2a4a38 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_gs.c
+++ b/src/mesa/drivers/dri/i965/brw_vec4_gs.c
@@ -290,8 +290,9 @@ brw_upload_gs_prog(struct brw_context *brw)
if (!brw_search_cache(brw-cache, BRW_GS_PROG,
  key, sizeof(key),
  stage_state-prog_offset, brw-gs.prog_data)) {
-  bool success = do_gs_prog(brw, 
ctx-Shader.CurrentProgram[MESA_SHADER_GEOMETRY],
-gp, key);
+  bool success =
+ do_gs_prog(brw, ctx-Shader.CurrentProgram[MESA_SHADER_GEOMETRY], gp,
+key);
   assert(success);
}
brw-gs.base.prog_data = brw-gs.prog_data-base.base;
diff --git a/src/mesa/drivers/dri/i965/brw_vs.c 
b/src/mesa/drivers/dri/i965/brw_vs.c
index 351a83e..971b2d1 100644
--- a/src/mesa/drivers/dri/i965/brw_vs.c
+++ b/src/mesa/drivers/dri/i965/brw_vs.c
@@ -481,8 +481,9 @@ static void brw_upload_vs_prog(struct brw_context *brw)
if (!brw_search_cache(brw-cache, BRW_VS_PROG,
 key, sizeof(key),
 brw-vs.base.prog_offset, brw-vs.prog_data)) {
-  bool success = do_vs_prog(brw, 
ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX],
-   vp, key);
+  bool success =
+ do_vs_prog(brw, ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX], vp,
+key);
   (void) success;
   assert(success);
}
diff --git a/src/mesa/drivers/dri/i965/brw_vs_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_vs_surface_state.c
index 2fbcdf9..e707d32 100644
--- a/src/mesa/drivers/dri/i965/brw_vs_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_vs_surface_state.c
@@ -130,7 +130,8 @@ brw_upload_vs_ubo_surfaces(struct brw_context *brw)
 {
struct gl_context *ctx = brw-ctx;
/* _NEW_PROGRAM */
-   struct gl_shader_program *prog = 
ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX];
+   struct gl_shader_program *prog =
+  ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX];
 
if (!prog)
   return;
@@ -154,7 +155,8 @@ brw_upload_vs_abo_surfaces(struct brw_context *brw)
 {
struct gl_context *ctx = brw-ctx;
/* _NEW_PROGRAM */
-   struct gl_shader_program *prog = 
ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX];
+   struct gl_shader_program *prog =
+  ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX];
 
if (prog) {
   /* CACHE_NEW_VS_PROG */
diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
index 8e978eb..026d7aa 100644
--- a/src/mesa/main/context.c
+++ b/src/mesa/main/context.c
@@ -1780,7 +1780,8 @@ _mesa_valid_to_render(struct gl_context *ctx, const char 
*where)

ctx-Shader.CurrentProgram[MESA_SHADER_GEOMETRY],
 errMsg)) {
 _mesa_warning(ctx, Shader program %u is invalid: %s,
-  
ctx-Shader.CurrentProgram[MESA_SHADER_GEOMETRY]-Name, errMsg);
+  
ctx-Shader.CurrentProgram[MESA_SHADER_GEOMETRY]-Name,
+  errMsg);
  }
   }
 #endif
@@ -1801,7 +1802,8 @@ _mesa_valid_to_render(struct gl_context *ctx, const char 
*where)

ctx-Shader.CurrentProgram[MESA_SHADER_FRAGMENT],
 errMsg)) {
 _mesa_warning(ctx, Shader program %u is 

[Mesa-dev] [PATCH 18/30] glsl/cs: update main.cpp to use the .comp extension for compute shaders.

2014-01-09 Thread Paul Berry
---
 src/glsl/main.cpp | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/glsl/main.cpp b/src/glsl/main.cpp
index afc15cb..864c929 100644
--- a/src/glsl/main.cpp
+++ b/src/glsl/main.cpp
@@ -364,6 +364,8 @@ main(int argc, char **argv)
 shader-Type = GL_GEOMETRY_SHADER;
   else if (strncmp(.frag, ext, 5) == 0)
 shader-Type = GL_FRAGMENT_SHADER;
+  else if (strncmp(.comp, ext, 5) == 0)
+ shader-Type = GL_COMPUTE_SHADER;
   else
 usage_fail(argv[0]);
   shader-Stage = _mesa_shader_enum_to_shader_stage(shader-Type);
-- 
1.8.5.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 06/30] mesa: Replace ctx-Shader.Current{Vertex, Fragment, Geometry}Program with an array.

2014-01-09 Thread Paul Berry
These are replaced with
ctx-Shader.CurrentProgram[MESA_SHADER_{VERTEX,FRAGMENT,GEOMETRY}].
In patches to follow, this will allow us to replace a lot of ad-hoc
logic with a variable index into the array.

With the exception of the changes to mtypes.h, this patch was
generated entirely by the command:

find src -type f '(' -iname '*.c' -o -iname '*.cpp' ')' \
-print0 | xargs -0 sed -i \
-e 's/\.CurrentVertexProgram/.CurrentProgram[MESA_SHADER_VERTEX]/g' \
-e 's/\.CurrentGeometryProgram/.CurrentProgram[MESA_SHADER_GEOMETRY]/g' \
-e 's/\.CurrentFragmentProgram/.CurrentProgram[MESA_SHADER_FRAGMENT]/g'
---
 src/mesa/drivers/common/meta.c   |  6 ++---
 src/mesa/drivers/dri/i965/brw_gs.c   |  2 +-
 src/mesa/drivers/dri/i965/brw_gs_surface_state.c |  4 ++--
 src/mesa/drivers/dri/i965/brw_vec4_gs.c  |  2 +-
 src/mesa/drivers/dri/i965/brw_vs.c   |  4 ++--
 src/mesa/drivers/dri/i965/brw_vs_state.c |  2 +-
 src/mesa/drivers/dri/i965/brw_vs_surface_state.c |  4 ++--
 src/mesa/drivers/dri/i965/brw_wm_state.c |  2 +-
 src/mesa/drivers/dri/i965/gen6_sol.c |  6 ++---
 src/mesa/drivers/dri/i965/gen6_vs_state.c|  2 +-
 src/mesa/drivers/dri/i965/gen6_wm_state.c|  2 +-
 src/mesa/drivers/dri/i965/gen7_sol_state.c   |  4 ++--
 src/mesa/drivers/dri/i965/gen7_vs_state.c|  2 +-
 src/mesa/drivers/dri/i965/gen7_wm_state.c|  2 +-
 src/mesa/main/api_validate.c | 10 
 src/mesa/main/context.c  | 30 
 src/mesa/main/ff_fragment_shader.cpp |  8 +++
 src/mesa/main/mtypes.h   |  7 +++---
 src/mesa/main/shaderapi.c| 12 +-
 src/mesa/main/state.c|  8 +++
 src/mesa/main/texstate.c | 18 +++---
 src/mesa/main/transformfeedback.c|  8 +++
 src/mesa/state_tracker/st_atom_clip.c|  2 +-
 src/mesa/state_tracker/st_atom_constbuf.c|  6 ++---
 src/mesa/state_tracker/st_cb_drawpixels.c|  2 +-
 src/mesa/state_tracker/st_draw.c |  6 ++---
 src/mesa/state_tracker/st_program.c  |  6 ++---
 src/mesa/swrast/s_fragprog.c |  2 +-
 28 files changed, 84 insertions(+), 85 deletions(-)

diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
index 7b41876..5643e3c 100644
--- a/src/mesa/drivers/common/meta.c
+++ b/src/mesa/drivers/common/meta.c
@@ -618,11 +618,11 @@ _mesa_meta_begin(struct gl_context *ctx, GLbitfield state)
   }
 
   _mesa_reference_shader_program(ctx, save-VertexShader,
- ctx-Shader.CurrentVertexProgram);
+ 
ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX]);
   _mesa_reference_shader_program(ctx, save-GeometryShader,
- ctx-Shader.CurrentGeometryProgram);
+ 
ctx-Shader.CurrentProgram[MESA_SHADER_GEOMETRY]);
   _mesa_reference_shader_program(ctx, save-FragmentShader,
- ctx-Shader.CurrentFragmentProgram);
+ 
ctx-Shader.CurrentProgram[MESA_SHADER_FRAGMENT]);
   _mesa_reference_shader_program(ctx, save-ActiveShader,
  ctx-Shader.ActiveProgram);
 
diff --git a/src/mesa/drivers/dri/i965/brw_gs.c 
b/src/mesa/drivers/dri/i965/brw_gs.c
index faa8f94..1ba26de 100644
--- a/src/mesa/drivers/dri/i965/brw_gs.c
+++ b/src/mesa/drivers/dri/i965/brw_gs.c
@@ -187,7 +187,7 @@ static void populate_key(struct brw_context *brw,
   /* BRW_NEW_TRANSFORM_FEEDBACK */
   if (_mesa_is_xfb_active_and_unpaused(ctx)) {
  const struct gl_shader_program *shaderprog =
-ctx-Shader.CurrentVertexProgram;
+ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX];
  const struct gl_transform_feedback_info *linked_xfb_info =
 shaderprog-LinkedTransformFeedback;
  int i;
diff --git a/src/mesa/drivers/dri/i965/brw_gs_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_gs_surface_state.c
index 5661941..1ccf2e2 100644
--- a/src/mesa/drivers/dri/i965/brw_gs_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_gs_surface_state.c
@@ -69,7 +69,7 @@ brw_upload_gs_ubo_surfaces(struct brw_context *brw)
struct gl_context *ctx = brw-ctx;
 
/* _NEW_PROGRAM */
-   struct gl_shader_program *prog = ctx-Shader.CurrentGeometryProgram;
+   struct gl_shader_program *prog = 
ctx-Shader.CurrentProgram[MESA_SHADER_GEOMETRY];
 
if (!prog)
   return;
@@ -93,7 +93,7 @@ brw_upload_gs_abo_surfaces(struct brw_context *brw)
 {
struct gl_context *ctx = brw-ctx;
/* _NEW_PROGRAM */
-   struct gl_shader_program *prog = ctx-Shader.CurrentGeometryProgram;
+   struct gl_shader_program *prog = 

[Mesa-dev] [PATCH 20/30] mesa/cs: Create the gl_compute_program struct, and the code to initialize it.

2014-01-09 Thread Paul Berry
---
 src/mesa/main/mtypes.h |  7 +++
 src/mesa/program/program.c | 20 
 src/mesa/program/program.h |  5 +
 3 files changed, 32 insertions(+)

diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 8b88d75..e6c3a22 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2167,6 +2167,13 @@ struct gl_fragment_program
 };
 
 
+/** Compute program object */
+struct gl_compute_program
+{
+   struct gl_program Base;   /** base class */
+};
+
+
 /**
  * State common to vertex and fragment programs.
  */
diff --git a/src/mesa/program/program.c b/src/mesa/program/program.c
index 3c19e8c..d21bfa0 100644
--- a/src/mesa/program/program.c
+++ b/src/mesa/program/program.c
@@ -279,6 +279,21 @@ _mesa_init_vertex_program( struct gl_context *ctx, struct 
gl_vertex_program *pro
 
 
 /**
+ * Initialize a new compute program object.
+ */
+struct gl_program *
+_mesa_init_compute_program(struct gl_context *ctx,
+   struct gl_compute_program *prog, GLenum target,
+   GLuint id)
+{
+   if (prog)
+  return _mesa_init_program_struct( ctx, prog-Base, target, id );
+   else
+  return NULL;
+}
+
+
+/**
  * Initialize a new geometry program object.
  */
 struct gl_program *
@@ -324,6 +339,11 @@ _mesa_new_program(struct gl_context *ctx, GLenum target, 
GLuint id)
  CALLOC_STRUCT(gl_geometry_program),
  target, id);
   break;
+   case GL_COMPUTE_PROGRAM_NV:
+  prog = _mesa_init_compute_program(ctx,
+CALLOC_STRUCT(gl_compute_program),
+target, id);
+  break;
default:
   _mesa_problem(ctx, bad target in _mesa_new_program);
   prog = NULL;
diff --git a/src/mesa/program/program.h b/src/mesa/program/program.h
index 84aa8cb..cab7d71 100644
--- a/src/mesa/program/program.h
+++ b/src/mesa/program/program.h
@@ -84,6 +84,11 @@ _mesa_init_geometry_program(struct gl_context *ctx,
 GLenum target, GLuint id);
 
 extern struct gl_program *
+_mesa_init_compute_program(struct gl_context *ctx,
+   struct gl_compute_program *prog,
+   GLenum target, GLuint id);
+
+extern struct gl_program *
 _mesa_new_program(struct gl_context *ctx, GLenum target, GLuint id);
 
 extern void
-- 
1.8.5.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 12/30] mesa: Change redundant code into loops in texstate.c.

2014-01-09 Thread Paul Berry
This is possible now that ctx-Shader.CurrentProgram is an array.
---
 src/mesa/main/texstate.c | 75 +++-
 1 file changed, 29 insertions(+), 46 deletions(-)

diff --git a/src/mesa/main/texstate.c b/src/mesa/main/texstate.c
index b9c76da..905a9d5 100644
--- a/src/mesa/main/texstate.c
+++ b/src/mesa/main/texstate.c
@@ -526,27 +526,20 @@ static void
 update_texture_state( struct gl_context *ctx )
 {
GLuint unit;
-   struct gl_program *fprog = NULL;
-   struct gl_program *vprog = NULL;
-   struct gl_program *gprog = NULL;
+   struct gl_program *prog[MESA_SHADER_STAGES];
GLbitfield enabledFragUnits = 0x0;
-
-   if (ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX] 
-   ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX]-LinkStatus) {
-  vprog = 
ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX]-_LinkedShaders[MESA_SHADER_VERTEX]-Program;
-   }
-
-   if (ctx-Shader.CurrentProgram[MESA_SHADER_GEOMETRY] 
-   ctx-Shader.CurrentProgram[MESA_SHADER_GEOMETRY]-LinkStatus) {
-  gprog = 
ctx-Shader.CurrentProgram[MESA_SHADER_GEOMETRY]-_LinkedShaders[MESA_SHADER_GEOMETRY]-Program;
-   }
-
-   if (ctx-Shader.CurrentProgram[MESA_SHADER_FRAGMENT] 
-   ctx-Shader.CurrentProgram[MESA_SHADER_FRAGMENT]-LinkStatus) {
-  fprog = 
ctx-Shader.CurrentProgram[MESA_SHADER_FRAGMENT]-_LinkedShaders[MESA_SHADER_FRAGMENT]-Program;
-   }
-   else if (ctx-FragmentProgram._Enabled) {
-  fprog = ctx-FragmentProgram.Current-Base;
+   int i;
+
+   for (i = 0; i  MESA_SHADER_STAGES; i++) {
+  if (ctx-Shader.CurrentProgram[i] 
+  ctx-Shader.CurrentProgram[i]-LinkStatus) {
+ prog[i] = ctx-Shader.CurrentProgram[i]-_LinkedShaders[i]-Program;
+  } else {
+ if (i == MESA_SHADER_FRAGMENT  ctx-FragmentProgram._Enabled)
+prog[i] = ctx-FragmentProgram.Current-Base;
+ else
+prog[i] = NULL;
+  }
}
 
/* TODO: only set this if there are actual changes */
@@ -562,9 +555,7 @@ update_texture_state( struct gl_context *ctx )
 */
for (unit = 0; unit  ctx-Const.MaxCombinedTextureImageUnits; unit++) {
   struct gl_texture_unit *texUnit = ctx-Texture.Unit[unit];
-  GLbitfield enabledVertTargets = 0x0;
-  GLbitfield enabledFragTargets = 0x0;
-  GLbitfield enabledGeomTargets = 0x0;
+  GLbitfield enabledTargetsByStage[MESA_SHADER_STAGES];
   GLbitfield enabledTargets = 0x0;
   GLuint texIndex;
 
@@ -574,25 +565,16 @@ update_texture_state( struct gl_context *ctx )
* by a fragment program/program.  When multiple flags are set, we'll
* settle on the one with highest priority (see below).
*/
-  if (vprog) {
- enabledVertTargets |= vprog-TexturesUsed[unit];
+  for (i = 0; i  MESA_SHADER_STAGES; i++) {
+ if (prog[i])
+enabledTargetsByStage[i] = prog[i]-TexturesUsed[unit];
+ else if (i == MESA_SHADER_FRAGMENT)
+enabledTargetsByStage[i] = texUnit-Enabled;
+ else
+enabledTargetsByStage[i] = 0;
+ enabledTargets |= enabledTargetsByStage[i];
   }
 
-  if (gprog) {
- enabledGeomTargets |= gprog-TexturesUsed[unit];
-  }
-
-  if (fprog) {
- enabledFragTargets |= fprog-TexturesUsed[unit];
-  }
-  else {
- /* fixed-function fragment program */
- enabledFragTargets |= texUnit-Enabled;
-  }
-
-  enabledTargets = enabledVertTargets | enabledFragTargets |
-   enabledGeomTargets;
-
   texUnit-_ReallyEnabled = 0x0;
 
   if (enabledTargets == 0x0) {
@@ -624,7 +606,7 @@ update_texture_state( struct gl_context *ctx )
   }
 
   if (!texUnit-_ReallyEnabled) {
- if (fprog) {
+ if (prog[MESA_SHADER_FRAGMENT]) {
 /* If we get here it means the shader is expecting a texture
  * object, but there isn't one (or it's incomplete).  Use the
  * fallback texture.
@@ -654,25 +636,26 @@ update_texture_state( struct gl_context *ctx )
 
   ctx-Texture._EnabledUnits |= (1  unit);
 
-  if (enabledFragTargets)
+  if (enabledTargetsByStage[MESA_SHADER_FRAGMENT])
  enabledFragUnits |= (1  unit);
 
-  if (!fprog)
+  if (!prog[MESA_SHADER_FRAGMENT])
  update_tex_combine(ctx, texUnit);
}
 
 
/* Determine which texture coordinate sets are actually needed */
-   if (fprog) {
+   if (prog[MESA_SHADER_FRAGMENT]) {
   const GLuint coordMask = (1  MAX_TEXTURE_COORD_UNITS) - 1;
   ctx-Texture._EnabledCoordUnits
- = (fprog-InputsRead  VARYING_SLOT_TEX0)  coordMask;
+ = (prog[MESA_SHADER_FRAGMENT]-InputsRead  VARYING_SLOT_TEX0) 
+ coordMask;
}
else {
   ctx-Texture._EnabledCoordUnits = enabledFragUnits;
}
 
-   if (!fprog || !vprog)
+   if (!prog[MESA_SHADER_FRAGMENT] || !prog[MESA_SHADER_VERTEX])
   update_texgen(ctx);
 }
 
-- 
1.8.5.2

___
mesa-dev mailing list

[Mesa-dev] [PATCH 16/30] mesa/cs: Add a MESA_SHADER_COMPUTE stage and update switch statements.

2014-01-09 Thread Paul Berry
This patch adds MESA_SHADER_COMPUTE to the gl_shader_stage enum.
Also, where it is trivial to do so, it adds a compute shader case to
switch statements that switch based on the type of shader.  This
avoids unhandled switch case compiler warnings.
---
 src/glsl/ast_to_hir.cpp   |  9 +
 src/glsl/builtin_variables.cpp| 17 +
 src/glsl/standalone_scaffolding.h |  2 ++
 src/mesa/main/context.c   |  8 
 src/mesa/main/mtypes.h|  3 ++-
 src/mesa/main/shaderapi.c |  5 +
 src/mesa/main/shaderobj.h |  2 ++
 src/mesa/program/prog_print.c |  3 +++
 src/mesa/program/program.h|  4 
 9 files changed, 52 insertions(+), 1 deletion(-)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index ef8e699..a26745d 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -2093,6 +2093,12 @@ validate_explicit_location(const struct 
ast_type_qualifier *qual,
 
   fail = true;
   break;
+
+   case MESA_SHADER_COMPUTE:
+  _mesa_glsl_error(loc, state,
+   compute shader variables cannot be given 
+   explicit locations);
+  return;
};
 
if (fail) {
@@ -2275,6 +2281,9 @@ apply_type_qualifier_to_variable(const struct 
ast_type_qualifier *qual,
 if (var-data.mode == ir_var_shader_in)
var-data.invariant = true;
 break;
+  case MESA_SHADER_COMPUTE:
+ /* Invariance isn't meaningful in compute shaders. */
+ break;
   }
}
 
diff --git a/src/glsl/builtin_variables.cpp b/src/glsl/builtin_variables.cpp
index f630923..17ae087 100644
--- a/src/glsl/builtin_variables.cpp
+++ b/src/glsl/builtin_variables.cpp
@@ -356,6 +356,7 @@ public:
void generate_vs_special_vars();
void generate_gs_special_vars();
void generate_fs_special_vars();
+   void generate_cs_special_vars();
void generate_varyings();
 
 private:
@@ -866,6 +867,16 @@ builtin_variable_generator::generate_fs_special_vars()
 
 
 /**
+ * Generate variables which only exist in compute shaders.
+ */
+void
+builtin_variable_generator::generate_cs_special_vars()
+{
+   /* TODO: finish this. */
+}
+
+
+/**
  * Add a single varying variable.  The variable's type and direction (input
  * or output) are adjusted as appropriate for the type of shader being
  * compiled.  For geometry shaders using {ARB,EXT}_geometry_shader4,
@@ -886,6 +897,9 @@ builtin_variable_generator::add_varying(int slot, const 
glsl_type *type,
case MESA_SHADER_FRAGMENT:
   add_input(slot, type, name);
   break;
+   case MESA_SHADER_COMPUTE:
+  /* Compute shaders don't have varyings. */
+  break;
}
 }
 
@@ -973,5 +987,8 @@ _mesa_glsl_initialize_variables(exec_list *instructions,
case MESA_SHADER_FRAGMENT:
   gen.generate_fs_special_vars();
   break;
+   case MESA_SHADER_COMPUTE:
+  gen.generate_cs_special_vars();
+  break;
}
 }
diff --git a/src/glsl/standalone_scaffolding.h 
b/src/glsl/standalone_scaffolding.h
index 327fef2..df783af 100644
--- a/src/glsl/standalone_scaffolding.h
+++ b/src/glsl/standalone_scaffolding.h
@@ -58,6 +58,8 @@ _mesa_shader_enum_to_shader_stage(GLenum v)
   return MESA_SHADER_FRAGMENT;
case GL_GEOMETRY_SHADER:
   return MESA_SHADER_GEOMETRY;
+   case GL_COMPUTE_SHADER:
+  return MESA_SHADER_COMPUTE;
default:
   assert(!bad value in _mesa_shader_enum_to_shader_stage());
   return MESA_SHADER_VERTEX;
diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
index 5855f15..b0cf5da 100644
--- a/src/mesa/main/context.c
+++ b/src/mesa/main/context.c
@@ -498,6 +498,14 @@ init_program_limits(struct gl_context *ctx, 
gl_shader_stage stage,
   prog-MaxInputComponents = 16 * 4; /* old limit not to break tnl and 
swrast */
   prog-MaxOutputComponents = 16 * 4; /* old limit not to break tnl and 
swrast */
   break;
+   case MESA_SHADER_COMPUTE:
+  prog-MaxParameters = 0; /* not meaningful for compute shaders */
+  prog-MaxAttribs = 0; /* not meaningful for compute shaders */
+  prog-MaxAddressRegs = 0; /* not meaningful for compute shaders */
+  prog-MaxUniformComponents = 4 * MAX_UNIFORMS;
+  prog-MaxInputComponents = 0; /* not meaningful for compute shaders */
+  prog-MaxOutputComponents = 0; /* not meaningful for compute shaders */
+  break;
default:
   assert(0  Bad shader stage in init_program_limits());
}
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 7ba7b10..8b88d75 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -404,9 +404,10 @@ typedef enum
MESA_SHADER_VERTEX = 0,
MESA_SHADER_GEOMETRY = 1,
MESA_SHADER_FRAGMENT = 2,
+   MESA_SHADER_COMPUTE = 3,
 } gl_shader_stage;
 
-#define MESA_SHADER_STAGES (MESA_SHADER_FRAGMENT + 1)
+#define MESA_SHADER_STAGES (MESA_SHADER_COMPUTE + 1)
 
 
 /**
diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
index 61ac0e3..519b200 

[Mesa-dev] [PATCH 19/30] mesa/cs: Handle compute shaders in _mesa_use_program().

2014-01-09 Thread Paul Berry
---
 src/mesa/main/shaderapi.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
index 519b200..5188e9c 100644
--- a/src/mesa/main/shaderapi.c
+++ b/src/mesa/main/shaderapi.c
@@ -993,6 +993,7 @@ _mesa_use_program(struct gl_context *ctx, struct 
gl_shader_program *shProg)
 {
use_shader_program(ctx, GL_VERTEX_SHADER, shProg);
use_shader_program(ctx, GL_GEOMETRY_SHADER_ARB, shProg);
+   use_shader_program(ctx, GL_COMPUTE_SHADER, shProg);
use_shader_program(ctx, GL_FRAGMENT_SHADER, shProg);
_mesa_active_program(ctx, shProg, glUseProgram);
 
-- 
1.8.5.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 17/30] glsl/cs: Populate default values for ctx-Const.Program[MESA_SHADER_COMPUTE].

2014-01-09 Thread Paul Berry
---
 src/glsl/main.cpp   | 4 
 src/glsl/standalone_scaffolding.cpp | 4 
 2 files changed, 8 insertions(+)

diff --git a/src/glsl/main.cpp b/src/glsl/main.cpp
index 03b7c78..afc15cb 100644
--- a/src/glsl/main.cpp
+++ b/src/glsl/main.cpp
@@ -50,6 +50,10 @@ initialize_context(struct gl_context *ctx, gl_api api)
 */
ctx-Const.GLSLVersion = glsl_version;
ctx-Extensions.ARB_ES3_compatibility = true;
+   ctx-Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits = 16;
+   ctx-Const.Program[MESA_SHADER_COMPUTE].MaxUniformComponents = 1024;
+   ctx-Const.Program[MESA_SHADER_COMPUTE].MaxInputComponents = 0; /* not used 
*/
+   ctx-Const.Program[MESA_SHADER_COMPUTE].MaxOutputComponents = 0; /* not 
used */
 
switch (ctx-Const.GLSLVersion) {
case 100:
diff --git a/src/glsl/standalone_scaffolding.cpp 
b/src/glsl/standalone_scaffolding.cpp
index fe66067..ab92da8 100644
--- a/src/glsl/standalone_scaffolding.cpp
+++ b/src/glsl/standalone_scaffolding.cpp
@@ -140,6 +140,10 @@ void initialize_context_to_defaults(struct gl_context 
*ctx, gl_api api)
ctx-Const.Program[MESA_SHADER_FRAGMENT].MaxInputComponents = 32;
 
ctx-Const.MaxDrawBuffers = 1;
+   ctx-Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits = 16;
+   ctx-Const.Program[MESA_SHADER_COMPUTE].MaxUniformComponents = 1024;
+   ctx-Const.Program[MESA_SHADER_COMPUTE].MaxInputComponents = 0; /* not used 
*/
+   ctx-Const.Program[MESA_SHADER_COMPUTE].MaxOutputComponents = 0; /* not 
used */
 
/* Set up default shader compiler options. */
struct gl_shader_compiler_options options;
-- 
1.8.5.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 29/30] i965/cs: Create the brw_compute_program struct, and the code to initialize it.

2014-01-09 Thread Paul Berry
---
 src/mesa/drivers/dri/i965/brw_context.h |  8 
 src/mesa/drivers/dri/i965/brw_program.c | 11 +++
 2 files changed, 19 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index df32ccb..abc1783 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -316,6 +316,14 @@ struct brw_fragment_program {
GLuint id;  /** serial no. to identify frag progs, never re-used */
 };
 
+
+/** Subclass of Mesa compute program */
+struct brw_compute_program {
+   struct gl_compute_program program;
+   unsigned id;  /** serial no. to identify frag progs, never re-used */
+};
+
+
 struct brw_shader {
struct gl_shader base;
 
diff --git a/src/mesa/drivers/dri/i965/brw_program.c 
b/src/mesa/drivers/dri/i965/brw_program.c
index 90844e5..2d92acb 100644
--- a/src/mesa/drivers/dri/i965/brw_program.c
+++ b/src/mesa/drivers/dri/i965/brw_program.c
@@ -113,6 +113,17 @@ static struct gl_program *brwNewProgram( struct gl_context 
*ctx,
   }
}
 
+   case GL_COMPUTE_PROGRAM_NV: {
+  struct brw_compute_program *prog = CALLOC_STRUCT(brw_compute_program);
+  if (prog) {
+ prog-id = get_new_program_id(brw-intelScreen);
+
+ return _mesa_init_compute_program(ctx, prog-program, target, id);
+  } else {
+ return NULL;
+  }
+   }
+
default:
   assert(!Unsupported target in brwNewProgram());
   return NULL;
-- 
1.8.5.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 09/30] meta: Replace save_state::{Vertex, Geometry, Fragment}Shader with an array.

2014-01-09 Thread Paul Berry
Since ctx-Shader.Current{Vertex,Geometry,Fragment}Program is an
array, this allows some meta code to be rolled up into loops.
---
 src/mesa/drivers/common/meta.c | 30 ++
 1 file changed, 14 insertions(+), 16 deletions(-)

diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
index 5643e3c..2eeb09b 100644
--- a/src/mesa/drivers/common/meta.c
+++ b/src/mesa/drivers/common/meta.c
@@ -138,9 +138,7 @@ struct save_state
GLboolean FragmentProgramEnabled;
struct gl_fragment_program *FragmentProgram;
GLboolean ATIFragmentShaderEnabled;
-   struct gl_shader_program *VertexShader;
-   struct gl_shader_program *GeometryShader;
-   struct gl_shader_program *FragmentShader;
+   struct gl_shader_program *Shader[MESA_SHADER_STAGES];
struct gl_shader_program *ActiveShader;
 
/** MESA_META_STENCIL_TEST */
@@ -617,12 +615,10 @@ _mesa_meta_begin(struct gl_context *ctx, GLbitfield state)
  _mesa_set_enable(ctx, GL_FRAGMENT_SHADER_ATI, GL_FALSE);
   }
 
-  _mesa_reference_shader_program(ctx, save-VertexShader,
- 
ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX]);
-  _mesa_reference_shader_program(ctx, save-GeometryShader,
- 
ctx-Shader.CurrentProgram[MESA_SHADER_GEOMETRY]);
-  _mesa_reference_shader_program(ctx, save-FragmentShader,
- 
ctx-Shader.CurrentProgram[MESA_SHADER_FRAGMENT]);
+  for (int i = 0; i  MESA_SHADER_STAGES; i++) {
+ _mesa_reference_shader_program(ctx, save-Shader[i],
+ ctx-Shader.CurrentProgram[i]);
+  }
   _mesa_reference_shader_program(ctx, save-ActiveShader,
  ctx-Shader.ActiveProgram);
 
@@ -829,6 +825,7 @@ _mesa_meta_end(struct gl_context *ctx)
 {
struct save_state *save = ctx-Meta-Save[ctx-Meta-SaveStackDepth - 1];
const GLbitfield state = save-SavedState;
+   int i;
 
/* After starting a new occlusion query, initialize the results to the
 * values saved previously. The driver will then continue to increment
@@ -960,23 +957,24 @@ _mesa_meta_end(struct gl_context *ctx)
   save-ATIFragmentShaderEnabled);
   }
 
-  if (ctx-Extensions.ARB_vertex_shader)
-_mesa_use_shader_program(ctx, GL_VERTEX_SHADER, save-VertexShader);
+  if (ctx-Extensions.ARB_vertex_shader) {
+_mesa_use_shader_program(ctx, GL_VERTEX_SHADER,
+  save-Shader[MESA_SHADER_VERTEX]);
+  }
 
   if (_mesa_has_geometry_shaders(ctx))
 _mesa_use_shader_program(ctx, GL_GEOMETRY_SHADER_ARB,
- save-GeometryShader);
+ save-Shader[MESA_SHADER_GEOMETRY]);
 
   if (ctx-Extensions.ARB_fragment_shader)
 _mesa_use_shader_program(ctx, GL_FRAGMENT_SHADER,
- save-FragmentShader);
+ save-Shader[MESA_SHADER_FRAGMENT]);
 
   _mesa_reference_shader_program(ctx, ctx-Shader.ActiveProgram,
 save-ActiveShader);
 
-  _mesa_reference_shader_program(ctx, save-VertexShader, NULL);
-  _mesa_reference_shader_program(ctx, save-GeometryShader, NULL);
-  _mesa_reference_shader_program(ctx, save-FragmentShader, NULL);
+  for (i = 0; i  MESA_SHADER_STAGES; i++)
+ _mesa_reference_shader_program(ctx, save-Shader[i], NULL);
   _mesa_reference_shader_program(ctx, save-ActiveShader, NULL);
}
 
-- 
1.8.5.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 14/30] mesa/cs: Add dispatch API stubs for ARB_compute_shader.

2014-01-09 Thread Paul Berry
---
 src/mapi/glapi/gen/ARB_compute_shader.xml | 40 +++
 src/mapi/glapi/gen/Makefile.am|  1 +
 src/mapi/glapi/gen/gl_API.xml |  4 ++-
 src/mapi/glapi/gen/gl_genexec.py  |  1 +
 src/mesa/Makefile.sources |  1 +
 src/mesa/SConscript   |  1 +
 src/mesa/main/compute.c   | 54 +++
 src/mesa/main/compute.h   | 38 ++
 src/mesa/main/tests/dispatch_sanity.cpp   |  4 +--
 9 files changed, 141 insertions(+), 3 deletions(-)
 create mode 100644 src/mapi/glapi/gen/ARB_compute_shader.xml
 create mode 100644 src/mesa/main/compute.c
 create mode 100644 src/mesa/main/compute.h

diff --git a/src/mapi/glapi/gen/ARB_compute_shader.xml 
b/src/mapi/glapi/gen/ARB_compute_shader.xml
new file mode 100644
index 000..1db373e
--- /dev/null
+++ b/src/mapi/glapi/gen/ARB_compute_shader.xml
@@ -0,0 +1,40 @@
+?xml version=1.0?
+!DOCTYPE OpenGLAPI SYSTEM gl_API.dtd
+
+!-- Note: no GLX protocol info yet. --
+
+
+OpenGLAPI
+
+category name=GL_ARB_compute_shader number=122
+  enum name=COMPUTE_SHADER  value=0x91B9/
+  enum name=MAX_COMPUTE_UNIFORM_BLOCKS  value=0x91BB/
+  enum name=MAX_COMPUTE_TEXTURE_IMAGE_UNITS value=0x91BC/
+  enum name=MAX_COMPUTE_IMAGE_UNIFORMS  value=0x91BD/
+  enum name=MAX_COMPUTE_SHARED_MEMORY_SIZE  value=0x8262/
+  enum name=MAX_COMPUTE_UNIFORM_COMPONENTS  value=0x8263/
+  enum name=MAX_COMPUTE_ATOMIC_COUNTER_BUFFERS  value=0x8264/
+  enum name=MAX_COMPUTE_ATOMIC_COUNTERS value=0x8265/
+  enum name=MAX_COMBINED_COMPUTE_UNIFORM_COMPONENTS value=0x8266/
+  enum name=MAX_COMPUTE_WORK_GROUP_INVOCATIONS  value=0x90EB/
+  enum name=MAX_COMPUTE_WORK_GROUP_COUNTvalue=0x91BE/
+  enum name=MAX_COMPUTE_WORK_GROUP_SIZE value=0x91BF/
+  enum name=COMPUTE_WORK_GROUP_SIZE value=0x8267/
+  enum name=UNIFORM_BLOCK_REFERENCED_BY_COMPUTE_SHADER  value=0x90EC/
+  enum name=ATOMIC_COUNTER_BUFFER_REFERENCED_BY_COMPUTE_SHADER  
value=0x90ED/
+  enum name=DISPATCH_INDIRECT_BUFFERvalue=0x90EE/
+  enum name=DISPATCH_INDIRECT_BUFFER_BINDINGvalue=0x90EF/
+  enum name=COMPUTE_SHADER_BIT  
value=0x0020/
+
+  function name=DispatchCompute offset=assign
+param name=num_groups_x type=GLuint/
+param name=num_groups_y type=GLuint/
+param name=num_groups_z type=GLuint/
+  /function
+
+  function name=DispatchComputeIndirect offset=assign
+param name=indirect type=GLintptr/
+  /function
+/category
+
+/OpenGLAPI
diff --git a/src/mapi/glapi/gen/Makefile.am b/src/mapi/glapi/gen/Makefile.am
index 65bd913..a9cf2a31 100644
--- a/src/mapi/glapi/gen/Makefile.am
+++ b/src/mapi/glapi/gen/Makefile.am
@@ -91,6 +91,7 @@ API_XML = \
ARB_base_instance.xml \
ARB_blend_func_extended.xml \
ARB_color_buffer_float.xml \
+   ARB_compute_shader.xml \
ARB_copy_buffer.xml \
ARB_debug_output.xml \
ARB_depth_buffer_float.xml \
diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml
index 697b2ec..d6cd1f5 100644
--- a/src/mapi/glapi/gen/gl_API.xml
+++ b/src/mapi/glapi/gen/gl_API.xml
@@ -8464,7 +8464,9 @@
 
 xi:include href=ARB_clear_buffer_object.xml 
xmlns:xi=http://www.w3.org/2001/XInclude/
 
-!-- ARB extensions #122...#123 --
+xi:include href=ARB_compute_shader.xml 
xmlns:xi=http://www.w3.org/2001/XInclude/
+
+!-- ARB extension #123 --
 
 xi:include href=ARB_texture_view.xml 
xmlns:xi=http://www.w3.org/2001/XInclude/
 
diff --git a/src/mapi/glapi/gen/gl_genexec.py b/src/mapi/glapi/gen/gl_genexec.py
index b557b3b..e376da3 100644
--- a/src/mapi/glapi/gen/gl_genexec.py
+++ b/src/mapi/glapi/gen/gl_genexec.py
@@ -57,6 +57,7 @@ header = /**
 #include main/clear.h
 #include main/clip.h
 #include main/colortab.h
+#include main/compute.h
 #include main/condrender.h
 #include main/context.h
 #include main/convolve.h
diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources
index 39525bc..2fdc2b6 100644
--- a/src/mesa/Makefile.sources
+++ b/src/mesa/Makefile.sources
@@ -23,6 +23,7 @@ MAIN_FILES = \
$(SRCDIR)main/clear.c \
$(SRCDIR)main/clip.c \
$(SRCDIR)main/colortab.c \
+   $(SRCDIR)main/compute.c \
$(SRCDIR)main/condrender.c \
$(SRCDIR)main/context.c \
$(SRCDIR)main/convolve.c \
diff --git a/src/mesa/SConscript b/src/mesa/SConscript
index bb9b304..24264c8 100644
--- a/src/mesa/SConscript
+++ b/src/mesa/SConscript
@@ -51,6 +51,7 @@ main_sources = [
 'main/clear.c',
 'main/clip.c',
 'main/colortab.c',
+'main/compute.c',
 'main/condrender.c',
 'main/context.c',
 'main/convolve.c',
diff --git a/src/mesa/main/compute.c 

[Mesa-dev] [PATCH 28/30] glsl/cs: Prohibit mixing of compute and non-compute shaders.

2014-01-09 Thread Paul Berry
Fixes piglit test:
spec/ARB_compute_shader/linker/mix_compute_and_non_compute
---
 src/glsl/linker.cpp | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
index 11e0651..f1344ea 100644
--- a/src/glsl/linker.cpp
+++ b/src/glsl/linker.cpp
@@ -2107,6 +2107,13 @@ link_shaders(struct gl_context *ctx, struct 
gl_shader_program *prog)
   goto done;
}
 
+   /* Compute shaders have additional restrictions. */
+   if (num_shaders[MESA_SHADER_COMPUTE]  0 
+   num_shaders[MESA_SHADER_COMPUTE] != prog-NumShaders) {
+  linker_error(prog, Compute shaders may not be linked with any other 
+   type of shader\n);
+   }
+
for (unsigned int i = 0; i  MESA_SHADER_STAGES; i++) {
   if (prog-_LinkedShaders[i] != NULL)
 ctx-Driver.DeleteShader(ctx, prog-_LinkedShaders[i]);
-- 
1.8.5.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 00/30] mesa: Start implementing compute shaders.

2014-01-09 Thread Paul Berry
This is the first of several planned patch series to implement the
extension ARB_compute_shader in Mesa.  This series allows the Mesa
front-end to parse and compile a do-nothing compute shader--that is,
one which contains nothing but a compute shader input layout
declaration and an empty main() function.

Since compute shader support is not yet complete, I haven't enabled
it.  To try it out, set the environment variable
INTEL_COMPUTE_SHADER=1.

With that environment variable set, the series passes all of the
ARB_compute_shader tests I recently sent to the Piglit mailing list,
except for the minmax test.

Patches 01-12 do preparatory refactoring in order to make adding
compute shaders (and other future shader stages) easier.

Patch 13 adds the extension enable flags for compute shaders, both in
the context and in the GLSL compiler.

Patch 14 adds the ARB_compute_shader functions and enums to Mesa's
dispatch code generation logic.  The functions don't do anything yet.

Patch 15 changes the linker so that once the compute shader stage has
been added, it won't mistakenly consider it to be part of the rest of
the graphics pipeline.

Patches 16-20 add MESA_SHADER_COMPUTE to the gl_shader_stage enum, and
update the rest of the code base to accommodate its presence in the
enum.

Patches 21-23 implement some compute shader constants.

Patches 24-26 implement logic to handle compute shader work group
sizes (also known as local sizes).

Patches 27-28 implement two other basic rules of compute shaders (they
don't have user-defined ins/outs, and they can't be mixed with other
shader types).

Finally, patches 29-30 implement the necessary back-end code for the
i965 driver to allow compute shaders to be turned on with an
environment variable.

Once these patches land, I plan to start working on the i965 back end.

[PATCH 01/30] mesa: Replace _mesa_program_index_to_target with 
_mesa_shader_stage_to_program.
[PATCH 02/30] mesa: Make validate_shader_target() non-static.
[PATCH 03/30] main: Allow ctx == NULL in _mesa_validate_shader_target().
[PATCH 04/30] mesa: use _mesa_validate_shader_target() more frequently.
[PATCH 05/30] glsl/linker: Refactor in preparation for adding more shader 
stages.
[PATCH 06/30] mesa: Replace 
ctx-Shader.Current{Vertex,Fragment,Geometry}Program with an array.
[PATCH 07/30] mesa: Fold long lines introduced by the previous patch.
[PATCH 08/30] i965: Fix comments to refer to the new ctx-Shader.CurrentProgram 
array.
[PATCH 09/30] meta: Replace save_state::{Vertex,Geometry,Fragment}Shader with 
an array.
[PATCH 10/30] mesa: Remove ad-hoc arrays of gl_shader_program.
[PATCH 11/30] mesa: Change redundant code into loops in shaderapi.c.
[PATCH 12/30] mesa: Change redundant code into loops in texstate.c.
[PATCH 13/30] mesa/cs: Add extension enable flags for ARB_compute_shader.
[PATCH 14/30] mesa/cs: Add dispatch API stubs for ARB_compute_shader.
[PATCH 15/30] glsl/cs: Change some linker loops to use MESA_SHADER_FRAGMENT as 
a bound.
[PATCH 16/30] mesa/cs: Add a MESA_SHADER_COMPUTE stage and update statements.
[PATCH 17/30] glsl/cs: Populate default values for 
ctx-Const.Program[MESA_SHADER_COMPUTE].
[PATCH 18/30] glsl/cs: update main.cpp to use the .comp extension for compute 
shaders.
[PATCH 19/30] mesa/cs: Handle compute shaders in _mesa_use_program().
[PATCH 20/30] mesa/cs: Create the gl_compute_program struct, and the code to 
initialize it.
[PATCH 21/30] mesa/cs: Implement MAX_COMPUTE_WORK_GROUP_SIZE constant.
[PATCH 22/30] mesa/cs: Implement MAX_COMPUTE_WORK_GROUP_INVOCATIONS constant.
[PATCH 23/30] mesa/cs: Implement MAX_COMPUTE_WORK_GROUP_COUNT constant.
[PATCH 24/30] glsl/cs: Handle compute shader local_size_{x,y,z} declaration.
[PATCH 25/30] mesa/cs: Handle compute shader local size during linking.
[PATCH 26/30] main/cs: Implement query for COMPUTE_WORK_GROUP_SIZE.
[PATCH 27/30] glsl/cs: Prohibit user-defined ins/outs in compute shaders.
[PATCH 28/30] glsl/cs: Prohibit mixing of compute and non-compute shaders.
[PATCH 29/30] i965/cs: Create the brw_compute_program struct, and the code to 
initialize it.
[PATCH 30/30] i965/cs: Allow ARB_compute_shader to be enabled via env var.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 21/30] mesa/cs: Implement MAX_COMPUTE_WORK_GROUP_SIZE constant.

2014-01-09 Thread Paul Berry
---
 src/glsl/builtin_variables.cpp  | 27 +++
 src/glsl/glsl_parser_extras.cpp |  4 
 src/glsl/glsl_parser_extras.h   |  3 +++
 src/glsl/main.cpp   |  3 +++
 src/glsl/standalone_scaffolding.cpp |  3 +++
 src/mesa/main/context.c |  5 +
 src/mesa/main/get.c |  8 
 src/mesa/main/mtypes.h  |  3 +++
 8 files changed, 56 insertions(+)

diff --git a/src/glsl/builtin_variables.cpp b/src/glsl/builtin_variables.cpp
index 17ae087..171bf08 100644
--- a/src/glsl/builtin_variables.cpp
+++ b/src/glsl/builtin_variables.cpp
@@ -390,6 +390,7 @@ private:
  enum ir_variable_mode mode, int slot);
ir_variable *add_uniform(const glsl_type *type, const char *name);
ir_variable *add_const(const char *name, int value);
+   ir_variable *add_const_ivec3(const char *name, int x, int y, int z);
void add_varying(int slot, const glsl_type *type, const char *name,
 const char *name_as_gs_input);
 
@@ -530,6 +531,25 @@ builtin_variable_generator::add_const(const char *name, 
int value)
 }
 
 
+ir_variable *
+builtin_variable_generator::add_const_ivec3(const char *name, int x, int y,
+int z)
+{
+   ir_variable *const var = add_variable(name, glsl_type::ivec3_type,
+ ir_var_auto, -1);
+   ir_constant_data data;
+   memset(data, 0, sizeof(data));
+   data.i[0] = x;
+   data.i[1] = y;
+   data.i[2] = z;
+   var-constant_value = new(var) ir_constant(glsl_type::ivec3_type, data);
+   var-constant_initializer =
+  new(var) ir_constant(glsl_type::ivec3_type, data);
+   var-data.has_initializer = true;
+   return var;
+}
+
+
 void
 builtin_variable_generator::generate_constants()
 {
@@ -660,6 +680,13 @@ builtin_variable_generator::generate_constants()
   add_const(gl_MaxTessControlAtomicCounters, 0);
   add_const(gl_MaxTessEvaluationAtomicCounters, 0);
}
+
+   if (state-is_version(430, 0) || state-ARB_compute_shader_enable) {
+  add_const_ivec3(gl_MaxComputeWorkGroupSize,
+  state-Const.MaxComputeWorkGroupSize[0],
+  state-Const.MaxComputeWorkGroupSize[1],
+  state-Const.MaxComputeWorkGroupSize[2]);
+   }
 }
 
 
diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp
index 33a43c5..8524fc6 100644
--- a/src/glsl/glsl_parser_extras.cpp
+++ b/src/glsl/glsl_parser_extras.cpp
@@ -123,6 +123,10 @@ _mesa_glsl_parse_state::_mesa_glsl_parse_state(struct 
gl_context *_ctx,
this-Const.MaxCombinedAtomicCounters = 
ctx-Const.MaxCombinedAtomicCounters;
this-Const.MaxAtomicBufferBindings = ctx-Const.MaxAtomicBufferBindings;
 
+   /* Compute shader constants */
+   for (unsigned i = 0; i  Elements(this-Const.MaxComputeWorkGroupSize); i++)
+  this-Const.MaxComputeWorkGroupSize[i] = 
ctx-Const.MaxComputeWorkGroupSize[i];
+
this-current_function = NULL;
this-toplevel_ir = NULL;
this-found_return = false;
diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h
index 468707c..be34fd9 100644
--- a/src/glsl/glsl_parser_extras.h
+++ b/src/glsl/glsl_parser_extras.h
@@ -250,6 +250,9 @@ struct _mesa_glsl_parse_state {
   unsigned MaxFragmentAtomicCounters;
   unsigned MaxCombinedAtomicCounters;
   unsigned MaxAtomicBufferBindings;
+
+  /* ARB_compute_shader */
+  unsigned MaxComputeWorkGroupSize[3];
} Const;
 
/**
diff --git a/src/glsl/main.cpp b/src/glsl/main.cpp
index 864c929..bb2054f 100644
--- a/src/glsl/main.cpp
+++ b/src/glsl/main.cpp
@@ -50,6 +50,9 @@ initialize_context(struct gl_context *ctx, gl_api api)
 */
ctx-Const.GLSLVersion = glsl_version;
ctx-Extensions.ARB_ES3_compatibility = true;
+   ctx-Const.MaxComputeWorkGroupSize[0] = 1024;
+   ctx-Const.MaxComputeWorkGroupSize[1] = 1024;
+   ctx-Const.MaxComputeWorkGroupSize[2] = 64;
ctx-Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits = 16;
ctx-Const.Program[MESA_SHADER_COMPUTE].MaxUniformComponents = 1024;
ctx-Const.Program[MESA_SHADER_COMPUTE].MaxInputComponents = 0; /* not used 
*/
diff --git a/src/glsl/standalone_scaffolding.cpp 
b/src/glsl/standalone_scaffolding.cpp
index ab92da8..e8eb529 100644
--- a/src/glsl/standalone_scaffolding.cpp
+++ b/src/glsl/standalone_scaffolding.cpp
@@ -140,6 +140,9 @@ void initialize_context_to_defaults(struct gl_context *ctx, 
gl_api api)
ctx-Const.Program[MESA_SHADER_FRAGMENT].MaxInputComponents = 32;
 
ctx-Const.MaxDrawBuffers = 1;
+   ctx-Const.MaxComputeWorkGroupSize[0] = 1024;
+   ctx-Const.MaxComputeWorkGroupSize[1] = 1024;
+   ctx-Const.MaxComputeWorkGroupSize[2] = 64;
ctx-Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits = 16;
ctx-Const.Program[MESA_SHADER_COMPUTE].MaxUniformComponents = 1024;
ctx-Const.Program[MESA_SHADER_COMPUTE].MaxInputComponents = 0; /* not used 
*/
diff --git 

[Mesa-dev] [PATCH 25/30] mesa/cs: Handle compute shader local size during linking.

2014-01-09 Thread Paul Berry
---
 src/glsl/linker.cpp   | 64 +++
 src/mesa/main/mtypes.h| 17 +
 src/mesa/main/shaderapi.c |  7 ++
 3 files changed, 88 insertions(+)

diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
index 7461b17..11e0651 100644
--- a/src/glsl/linker.cpp
+++ b/src/glsl/linker.cpp
@@ -1284,6 +1284,69 @@ link_gs_inout_layout_qualifiers(struct gl_shader_program 
*prog,
prog-Geom.VerticesOut = linked_shader-Geom.VerticesOut;
 }
 
+
+/**
+ * Perform cross-validation of compute shader local_size_{x,y,z} layout
+ * qualifiers for the attached compute shaders, and propagate them to the
+ * linked CS and linked shader program.
+ */
+static void
+link_cs_input_layout_qualifiers(struct gl_shader_program *prog,
+struct gl_shader *linked_shader,
+struct gl_shader **shader_list,
+unsigned num_shaders)
+{
+   for (int i = 0; i  3; i++)
+  linked_shader-Comp.LocalSize[i] = 0;
+
+   /* This function is called for all shader stages, but it only has an effect
+* for compute shaders.
+*/
+   if (linked_shader-Stage != MESA_SHADER_COMPUTE)
+  return;
+
+   /* From the ARB_compute_shader spec, in the section describing local size
+* declarations:
+*
+* If multiple compute shaders attached to a single program object
+* declare local work-group size, the declarations must be identical;
+* otherwise a link-time error results. Furthermore, if a program
+* object contains any compute shaders, at least one must contain an
+* input layout qualifier specifying the local work sizes of the
+* program, or a link-time error will occur.
+*/
+   for (unsigned sh = 0; sh  num_shaders; sh++) {
+  struct gl_shader *shader = shader_list[sh];
+
+  if (shader-Comp.LocalSize[0] != 0) {
+ if (linked_shader-Comp.LocalSize[0] != 0) {
+for (int i = 0; i  3; i++) {
+   if (linked_shader-Comp.LocalSize[i] !=
+   shader-Comp.LocalSize[i]) {
+  linker_error(prog, compute shader defined with conflicting 
+   local sizes\n);
+  return;
+   }
+}
+ }
+ for (int i = 0; i  3; i++)
+linked_shader-Comp.LocalSize[i] = shader-Comp.LocalSize[i];
+  }
+   }
+
+   /* Just do the intrastage - interstage propagation right now,
+* since we already know we're in the right type of shader program
+* for doing it.
+*/
+   if (linked_shader-Comp.LocalSize[0] == 0) {
+  linker_error(prog, compute shader didn't declare local size\n);
+  return;
+   }
+   for (int i = 0; i  3; i++)
+  prog-Comp.LocalSize[i] = linked_shader-Comp.LocalSize[i];
+}
+
+
 /**
  * Combine a group of shaders for a single stage to generate a linked shader
  *
@@ -1389,6 +1452,7 @@ link_intrastage_shaders(void *mem_ctx,
ralloc_steal(linked, linked-UniformBlocks);
 
link_gs_inout_layout_qualifiers(prog, linked, shader_list, num_shaders);
+   link_cs_input_layout_qualifiers(prog, linked, shader_list, num_shaders);
 
populate_symbol_table(linked);
 
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index e0c88a7..56e120b 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2171,6 +2171,11 @@ struct gl_fragment_program
 struct gl_compute_program
 {
struct gl_program Base;   /** base class */
+
+   /**
+* Size specified using local_size_{x,y,z}.
+*/
+   unsigned LocalSize[3];
 };
 
 
@@ -2606,6 +2611,18 @@ struct gl_shader_program
  0 if not present. */
} Vert;
 
+   /**
+* Compute shader state - copied into gl_compute_program by
+* _mesa_copy_linked_program_data().
+*/
+   struct {
+  /**
+   * If this shader contains a compute stage, size specified using
+   * local_size_{x,y,z}.  Otherwise undefined.
+   */
+  unsigned LocalSize[3];
+   } Comp;
+
/* post-link info: */
unsigned NumUserUniformStorage;
struct gl_uniform_storage *UniformStorage;
diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
index 5188e9c..053f27b 100644
--- a/src/mesa/main/shaderapi.c
+++ b/src/mesa/main/shaderapi.c
@@ -1846,6 +1846,13 @@ _mesa_copy_linked_program_data(gl_shader_stage type,
   dst_gp-UsesEndPrimitive = src-Geom.UsesEndPrimitive;
}
   break;
+   case MESA_SHADER_COMPUTE: {
+  struct gl_compute_program *dst_cp = (struct gl_compute_program *) dst;
+  int i;
+  for (i = 0; i  3; i++)
+ dst_cp-LocalSize[i] = src-Comp.LocalSize[i];
+   }
+  break;
default:
   break;
}
-- 
1.8.5.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


  1   2   3   4   5   6   7   8   9   10   >