Re: [Mesa-dev] [PATCH 3/4] i965: For color clears, only disable writes to components that exist.

2014-03-21 Thread Paul Berry
On 21 March 2014 04:47, Kenneth Graunke  wrote:

> The SIMD16 replicated FB write message only works if we don't need the
> color calculator to mask our framebuffer writes.  Previously, we bailed
> on it if color_mask wasn't .  However, this was
> needlessly strict for formats with fewer than four components - only the
> components that actually exist matter.
>
> WebGL Aquarium attempts to clear a BGRX texture with the ColorMask set
> to .  This will work perfectly fine with the
> replicated data message; we just bailed unnecessarily.
>
> Improves performance of WebGL Aquarium on Iris Pro (at 1920x1080) and
> Bay Trail (at 2048x1152) by about 40% (using Chrome 24).
>
> Signed-off-by: Kenneth Graunke 
> Cc: Dylan Baker 
> Cc: Keith Packard 
> Cc: Eric Anholt 
> ---
>  src/mesa/drivers/dri/i965/brw_blorp_clear.cpp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>

Nice find!  Series is:

Reviewed-by: Paul Berry 


>
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp
> b/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp
> index 45b9fa0..7e19c2c 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp
> @@ -231,7 +231,7 @@ brw_blorp_clear_params::brw_blorp_clear_params(struct
> brw_context *brw,
> /* Constant color writes ignore everyting in blend and color calculator
>  * state.  This is not documented.
>  */
> -   for (int i = 0; i < 4; i++) {
> +   for (int i = 0; i < _mesa_format_num_components(irb->mt->format); i++)
> {
>if (!color_mask[i]) {
>   color_write_disable[i] = true;
>   wm_prog_key.use_simd16_replicated_data = false;
> --
> 1.9.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Drop special case for edgeflag thanks to Marek's change to core.

2014-03-17 Thread Paul Berry
On 17 March 2014 14:54, Eric Anholt  wrote:

> ---
>  src/mesa/drivers/dri/i965/brw_draw_upload.c | 9 -
>  1 file changed, 9 deletions(-)
>

For future cross-referencing it would be nice to mention the SHA of Marek's
change in the commit message.


>
> diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c
> b/src/mesa/drivers/dri/i965/brw_draw_upload.c
> index d42c074..d08e56a 100644
> --- a/src/mesa/drivers/dri/i965/brw_draw_upload.c
> +++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c
> @@ -717,12 +717,6 @@ static void brw_emit_vertices(struct brw_context *brw)
>uint32_t comp2 = BRW_VE1_COMPONENT_STORE_SRC;
>uint32_t comp3 = BRW_VE1_COMPONENT_STORE_SRC;
>
> -  /* The gen4 driver expects edgeflag to come in as a float, and
> passes
> -   * that float on to the tests in the clipper.  Mesa's current vertex
> -   * attribute value for EdgeFlag is stored as a float, which works
> out.
> -   * glEdgeFlagPointer, on the other hand, gives us an unnormalized
> -   * integer ubyte.  Just rewrite that to convert to a float.
> -   */
>if (input->attrib == VERT_ATTRIB_EDGEFLAG) {
>   /* Gen6+ passes edgeflag as sideband along with the vertex,
> instead
>* of in the VUE.  We have to upload it sideband as the last
> vertex
> @@ -732,9 +726,6 @@ static void brw_emit_vertices(struct brw_context *brw)
>  gen6_edgeflag_input = input;
>  continue;
>   }
> -
> - if (format == BRW_SURFACEFORMAT_R8_UINT)
> -format = BRW_SURFACEFORMAT_R8_SSCALED;
>}
>
>switch (input->glarray->Size) {
> --
> 1.9.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/6] glsl: Optimize pow(x, 2) into x * x.

2014-03-11 Thread Paul Berry
On 10 March 2014 17:23, Ian Romanick  wrote:

> I had a pretty similar patch on the top of my pow-optimization branch.
> I also expand x**3 and x**4.  I had hoped that would enable some cases
> to expand then merge to MADs.  It should also be faster on older GENs
> where POW perf sucks.  I didn't send it out because I wanted to add a
> similar optimization in the back end that would turn x*x*x*x back into
> x**4 on GPUs where the POW would be faster.
>

Be careful with that one, though: pow is undefined when x < 0, so x*x*x*x
=> pow(x, 4) isn't a valid conversion in general--it only works if you know
the operation started off as a pow in the first place.  (Note: you can do
x*x*x*x => pow(abs(x), 4), but that doesn't generalize to odd powers).


>
> I also didn't have anything in shader-db that benefitted from x**2 or
> x**3.  It seems like there were a couple that would be modified by a
> x**5 flattening, but I think that would universally be slower
>
> On 03/10/2014 03:54 PM, Matt Turner wrote:
> > Cuts two instructions out of SynMark's Gl32VSInstancing benchmark.
> > ---
> >  src/glsl/opt_algebraic.cpp | 8 
> >  1 file changed, 8 insertions(+)
> >
> > diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp
> > index 5c49a78..8494bd9 100644
> > --- a/src/glsl/opt_algebraic.cpp
> > +++ b/src/glsl/opt_algebraic.cpp
> > @@ -528,6 +528,14 @@
> ir_algebraic_visitor::handle_expression(ir_expression *ir)
> >if (is_vec_two(op_const[0]))
> >   return expr(ir_unop_exp2, ir->operands[1]);
> >
> > +  if (is_vec_two(op_const[1])) {
> > + ir_variable *x = new(ir) ir_variable(ir->operands[1]->type,
> "x",
> > +  ir_var_temporary);
> > + base_ir->insert_before(x);
> > + base_ir->insert_before(assign(x, ir->operands[0]));
> > + return mul(x, x);
> > +  }
> > +
> >break;
> >
> > case ir_unop_rcp:
> >
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC PATCH] i965/gs: add snb support

2014-02-24 Thread Paul Berry
On 24 February 2014 00:36, Ilia Mirkin  wrote:

> Before you read any further, this is nowhere close to working. However
> it's in
> a state where I think most of the structure is there, albeit with a lot of
> XXX
> comments. And I haven't actually implemented the new opcodes I've added.
>
> I was hoping one or two Intel people could take a look at this and let me
> know
> of any pitfalls I'm likely to run into. I've already gotten a lot of help
> and
> advice from Ken, but wanted to put something out publicly.
>
> Any and all feedback much appreciated!
>
> Not-Signed-off-by: Ilia Mirkin 
>

I'm really excited to see you're working on this!  A lot of people are
going to be really happy to see geometry shader support added to Gen6.

AFAICT, your approach is good.  I plan to look at it in more detail later
today or sometime tomorrow.  But before I do, here are some general
comments:

1. For long-term readability, it might be useful to have 3 classes:
vec4_gs_visitor (for stuff that's common to all gs visitors),
gen6_gs_visitor (for the stuff that's specific to gen6 geometry shaders),
and gen7plus_gs_visitor (for the stuff that's specific to gen7+ geometry
shaders).  What you currently have is a little harder to read, because it
means that to understand gen6 geometry shaders, you have to keep in mind
that some of the code in vec4_gs_visitor is never going to be executed
because it will always be overridden by gen6_gs_visitor.

2. Stream output (a.k.a. transform feedback): As you've probably already
figured out, transform feedback on gen6 is tricky because the hardware
doesn't have a dedicated pipeline stage for it; you have to emit code in
the geometry shader to do it.  This is a problem because the GL spec says
that two things are supposed to happen in between the GS stage and
transform feedback: dangling vertex removal (ignoring line strips
containing just a single vertex and ignoring triangle strips containing
less than 2 vertices), and expanding line/triangle strips into individual
line segments/triangles.  In Gen6, these operations happen after the GS
stage, so the code you emit in the geometry shader to do transform feedback
needs to simulate them.

Back when I first implemented transform feedback for Gen6, supporting
user-defined geometry shaders was the furthest thing from my mind, so I
implemented the transform feedback code in raw assembly rather than
re-using the vec4_visitor/vec4_generator infrastructure (see
gen6_sol_program()).  Unfortunately that's going to make it hard to re-use
that code in user-defined geometry shaders.  Also, when I implemented that
code, I was able to make some major simplifying assumptions because there
was no user-defined geometry shader--that meant that the number of vertices
to process was known at compile time, and I didn't have to worry about
dangling vertex removal or expanding line/triangle strips into individual
line segments/triangles (because the hardware automatically performs those
operations before invoking the GS).  You're not going to have that luxury.
My recommendation is: feel free to use gen6_sol_program() as a reference
point, but don't bother trying to re-use it--the simplifying assumptions
are just going to make it too different from what you want.  I'd recommend
leaving gen6_sol_program() around to handle the case where there's no
user-defined geometry shader, and then you can write fresh code for
handling transform feedback when there *is* a user-defined geometry shader.

Before you go too far on the implementation, I'd also recommend writing
some new piglit tests in tests/spec/glsl-1.50/execution/geometry/ to verify
that dangling vertex removal and strip expansion are handled properly for
transform feedback data.  If you don't have gs-supporting hardware at your
disposal to validate those tests against, I (or one of the Intel folks)
would be happy to validate the tests for you.  Note in particular that in
order to expand a triangle strip to triangles, you have to be careful with
the order to make sure that both the provoking vertex and the triangle
orientation are preserved (it's not necessary to follow
ARB_provoking_vertex, though; you can just assume the GL default of
LAST_VERTEX_CONVENTION applies).  For example, if the pre-expansion vertex
order is ABCDEF, then the only correct post-expansion order is
ABC,CBD,CDE,EDF.  I'm not sure whether Gen7+ hardware gets this order
exactly right in all circumstances, but it seems like it should be easy to
get it right on Gen6, because the order is entirely determined by software
:)

I'd also recommend adding some piglit tests to make sure that transform
feedback overflow conditions are handled properly when a geometry shader is
in use, since overflow handling on Gen6 is (partially) the responsibility
of the shader code you generate, and you're not going to be able to
piggy-back on the overflow handling in gen6_sol_program().

One other note: check out this text from the Gen6 programmer's reference
manual (

Re: [Mesa-dev] [PATCH] i965: Don't try to use the hardware blitter for multisampled miptrees.

2014-02-21 Thread Paul Berry
On 21 February 2014 19:15, Kenneth Graunke  wrote:

> The blitter is completely ignorant of MSAA buffer layouts, so any
> attempt to use BLT paths with MSAA buffers is likely to break
> spectacularly.
>
> In most cases, BLORP handles MSAA blits, so we never hit this bug.
> Until recently, it also wasn't worth fixing, since Meta couldn't handle
> MSAA either, so there was nothing to fall back to.  But now there is.
>
> +143 piglit tests on Broadwell (which doesn't have BLORP support).
> Surprisingly, three also start failing.  No changes on Ivybridge.
>

That actually doesn't surprise me too much.  Since color buffers use an
array-like sample layout (sample N in the Nth layer), trying to use the
blitter on a multisampled buffer just gets you sample 0 from each pixel.
 Which for some tests is good enough to pass (even though the blit is
actually doing the wrong thing).  So it's possible that a few of these
tests that were passing by luck with the blitter don't yet have proper Meta
support :)


>
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/intel_blit.c | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/intel_blit.c
> b/src/mesa/drivers/dri/i965/intel_blit.c
> index df85dc9..cedb230 100644
> --- a/src/mesa/drivers/dri/i965/intel_blit.c
> +++ b/src/mesa/drivers/dri/i965/intel_blit.c
> @@ -165,6 +165,10 @@ intel_miptree_blit(struct brw_context *brw,
> mesa_format src_format = _mesa_get_srgb_format_linear(src_mt->format);
> mesa_format dst_format = _mesa_get_srgb_format_linear(dst_mt->format);
>
> +   /* The blitter doesn't understand multisampling at all. */
> +   if (src_mt->num_samples > 0 || dst_mt->num_samples > 0)
> +  return false;
> +
>

You might consider moving this check up to the top of the function, since
it doesn't depend on src_format or dst_format.  But it's not a big deal.
 Either way, the patch is:

Reviewed-by: Paul Berry 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v3 8/9] i965: support gl_InvocationID for gen7

2014-02-06 Thread Paul Berry
On 4 February 2014 15:07, Jordan Justen  wrote:

> v2:
>  * Make gl_InstanceID a system value
>
> v3:
>  * Properly shift from R0.1 into DST.4 by adding
>GS_OPCODE_GET_INSTANCE_ID
>
> Signed-off-by: Jordan Justen 
> ---
>  src/mesa/drivers/dri/i965/brw_defines.h   | 12 
>  src/mesa/drivers/dri/i965/brw_shader.cpp  |  2 ++
>  src/mesa/drivers/dri/i965/brw_vec4.h  |  1 +
>  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp  | 20
> 
>  src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 16 +---
>  5 files changed, 48 insertions(+), 3 deletions(-)
>

Hopefully I will get a chance to look at this in more detail soon, but for
now, consider this:

Acked-by: Paul Berry 


>
> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h
> b/src/mesa/drivers/dri/i965/brw_defines.h
> index 75d09fc..199a699 100644
> --- a/src/mesa/drivers/dri/i965/brw_defines.h
> +++ b/src/mesa/drivers/dri/i965/brw_defines.h
> @@ -902,6 +902,13 @@ enum opcode {
>  *   form the final channel mask.
>  */
> GS_OPCODE_SET_CHANNEL_MASKS,
> +
> +   /**
> +* Get the "Instance ID" fields from the payload.
> +*
> +* - dst is the GRF for gl_InvocationID.
> +*/
> +   GS_OPCODE_GET_INSTANCE_ID,
>  };
>
>  enum brw_urb_write_flags {
> @@ -1538,6 +1545,11 @@ enum brw_message_target {
>  # define BRW_GS_EDGE_INDICATOR_0   (1 << 8)
>  # define BRW_GS_EDGE_INDICATOR_1   (1 << 9)
>
> +/* GS Thread Payload
> + */
> +/* R0 */
> +# define GEN7_GS_PAYLOAD_INSTANCE_ID_SHIFT 27
> +
>  /* 3DSTATE_GS "Output Vertex Size" has an effective maximum of 62.  It's
>   * counted in multiples of 16 bytes.
>   */
> diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp
> b/src/mesa/drivers/dri/i965/brw_shader.cpp
> index 6cc2595..5e3005a 100644
> --- a/src/mesa/drivers/dri/i965/brw_shader.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
> @@ -525,6 +525,8 @@ brw_instruction_name(enum opcode op)
>return "prepare_channel_masks";
> case GS_OPCODE_SET_CHANNEL_MASKS:
>return "set_channel_masks";
> +   case GS_OPCODE_GET_INSTANCE_ID:
> +  return "get_instance_id";
>
> default:
>/* Yes, this leaks.  It's in debug code, it should never occur, and
> if
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h
> b/src/mesa/drivers/dri/i965/brw_vec4.h
> index e17b5cd..182a1e1 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4.h
> +++ b/src/mesa/drivers/dri/i965/brw_vec4.h
> @@ -616,6 +616,7 @@ private:
> void generate_gs_set_dword_2_immed(struct brw_reg dst, struct brw_reg
> src);
> void generate_gs_prepare_channel_masks(struct brw_reg dst);
> void generate_gs_set_channel_masks(struct brw_reg dst, struct brw_reg
> src);
> +   void generate_gs_get_instance_id(struct brw_reg dst);
> void generate_oword_dual_block_offsets(struct brw_reg m1,
>   struct brw_reg index);
> void generate_scratch_write(vec4_instruction *inst,
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> index 94d1e79..a48d829 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> @@ -639,6 +639,22 @@ vec4_generator::generate_gs_set_channel_masks(struct
> brw_reg dst,
>  }
>
>  void
> +vec4_generator::generate_gs_get_instance_id(struct brw_reg dst)
> +{
> +   /* We want to right shift R0.0 & R0.1 by
> GEN7_GS_PAYLOAD_INSTANCE_ID_SHIFT
> +* and store into dst.0 & dst.4. So generate the instruction:
> +*
> +* shr(8) dst<1> R0<1,4,0> GEN7_GS_PAYLOAD_INSTANCE_ID_SHIFT {
> align1 WE_all }
> +*/
> +   brw_push_insn_state(p);
> +   brw_set_access_mode(p, BRW_ALIGN_1);
> +   dst = retype(dst, BRW_REGISTER_TYPE_UD);
> +   struct brw_reg r0(retype(brw_vec8_grf(0, 0), BRW_REGISTER_TYPE_UD));
> +   brw_SHR(p, dst, stride(r0, 1, 4, 0),
> brw_imm_ud(GEN7_GS_PAYLOAD_INSTANCE_ID_SHIFT));
> +   brw_pop_insn_state(p);
> +}
> +
> +void
>  vec4_generator::generate_oword_dual_block_offsets(struct brw_reg m1,
>struct brw_reg index)
>  {
> @@ -1218,6 +1234,10 @@
> vec4_generator::generate_vec4_instruction(vec4_instruction *instruction,
>generate_gs_set_channel_masks(dst, src[0]);
>break;
>
> +   case GS_OPCODE_GET_INSTANCE_ID:
> +  generate_gs_get_instance_id(dst);
> +  break;
> +
> case SHADER_OPCODE_SHADER_TIME_ADD

Re: [Mesa-dev] [PATCH 30/30] i965/cs: Allow ARB_compute_shader to be enabled via env var.

2014-02-05 Thread Paul Berry
On 4 February 2014 10:03, Jordan Justen  wrote:

> >>
> >> What about trying to make use of
> >> MESA_EXTENSION_OVERRIDE=GL_ARB_compute_shader?
> >>
> >> We could add
> >> extensions.c:bool _mesa_is_extension_override_enabled(char *)
> >>
> >> And then
> >> if (_mesa_is_extension_override_enabled("GL_ARB_compute_shader"))
> >>
> >> Or, similarly, get overrides shoved into ctx->Extensions to allow
> >> checking for overrides at this early stage.
> >>
> >> -Jordan
> >
> >
> > I looked into what would be necessary to do this, and it's unfortunately
> > more complicated than it should be due to some flukes about
> initialization
> > order (perhaps this is what you were alluding to when you said "get
> > overrides shoved into ctx->Extensions to allow checking for overrides at
> > this early stage").
>
> Yeah, I'll admit I looked at it a bit, and decided to keep my feedback
> 'hand wavy'. :)
>
> >  Currently, our initialization order is:
> >
> > 1. brwCreateContext() calls brw_initialize_context_constants(), which
> needs
> > to know whether compute shaders are supported in order to set constants
> like
> > MAX_COMPUTE_TEXTURE_IMAGE_UNITS and MAX_UNIFORM_BUFFER_BINDINGS
> correctly.
> >
> > 2. Later, brwCreateContext() calls intelInitExtensions(), which is the
> > function where we'll set ctx->Extensions.ARB_compute_shader to true once
> > compute shaders are fully supported.
> >
> > 3. Much later in initialization, when the context is being made current
> for
> > the first time, core Mesa calls _mesa_make_extension_string(), which
> calls
> > get_extension_override() to modify the the ctx->Extensions table based on
> > the environment variable override.
> >
> > To do what you suggested, I would either have to:
> >
> > A. change 1 to call _mesa_is_extension_override_enabled(); that
> function, in
> > turn, would have to parse the MESA_EXTENSION_OVERRIDE string, but we
> > couldn't reuse the parsing code in _mesa_make_extension_string(), since
> that
> > parsing code updates ctx->Extensions as a side effect, and it's not time
> to
> > update ctx->Extensions yet.  In addition to the code duplication, we
> would
> > have a danger of bugs, since the override takes effect so late in
> > initialization--if we ever accidentally introduced some code in between
> > steps 2 and 3 that checked the value of
> ctx->Extensions.ARB_compute_shader,
> > it would see false even if compute shaders were enabled by override.
> >
> > Or:
> >
> > B. change the order of initialization so that 2 happens first, followed
> by 3
> > and then 1.  In the long run I think this would be a good thing, but
> it's a
> > big change (since it affects all back ends) and I'm not sure it would be
> a
> > good idea to hold up this patch series waiting for it.
> >
> >
> > How would you feel if I landed the patch series as is, and then worked
> on a
> > follow up patch to do B?
>
> Sure, you can have my r-b for this then.
>
> -Jordan
>

Great, thanks.  I've pushed the patches.  I'll work on "B" as soon as I can.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 02/10] glsl: serialize methods for IR instructions

2014-02-04 Thread Paul Berry
> +  }
> +   }
> +
> +   if (constant_value)
> +  constant_value->serialize(mem);
> +
> +   if (constant_initializer)
> +  constant_initializer->serialize(mem);
> +
> +   uint8_t has_interface_type = get_interface_type() ? 1 : 0;
> +
> +   mem.write_uint8_t(has_interface_type);
> +   if (has_interface_type)
> +  serialize_glsl_type(get_interface_type(), mem);
> +}
>

Everything's a nit pick except for the suggestion to use make_unique_id().
With that fixed, this patch is:

Reviewed-by: Paul Berry 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 01/10] glsl: memory_writer helper class for data serialization

2014-02-04 Thread Paul Berry
On 4 February 2014 19:52, Paul Berry  wrote:

> Whoops, I discovered another issue:
>
>
> On 29 January 2014 01:24, Tapani Pälli  wrote:
>
>> Class will be used by the shader binary cache implementation.
>>
>> Signed-off-by: Tapani Pälli 
>> ---
>>  src/glsl/memory_writer.h | 188
>> +++
>>  1 file changed, 188 insertions(+)
>>  create mode 100644 src/glsl/memory_writer.h
>>
>> diff --git a/src/glsl/memory_writer.h b/src/glsl/memory_writer.h
>> new file mode 100644
>> index 000..979169f
>> --- /dev/null
>> +++ b/src/glsl/memory_writer.h
>> @@ -0,0 +1,188 @@
>> +/* -*- c++ -*- */
>> +/*
>> + * Copyright © 2013 Intel Corporation
>> + *
>> + * Permission is hereby granted, free of charge, to any person obtaining
>> a
>> + * copy of this software and associated documentation files (the
>> "Software"),
>> + * to deal in the Software without restriction, including without
>> limitation
>> + * the rights to use, copy, modify, merge, publish, distribute,
>> sublicense,
>> + * and/or sell copies of the Software, and to permit persons to whom the
>> + * Software is furnished to do so, subject to the following conditions:
>> + *
>> + * The above copyright notice and this permission notice (including the
>> next
>> + * paragraph) shall be included in all copies or substantial portions of
>> the
>> + * Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
>> EXPRESS OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
>> MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT
>> SHALL
>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
>> OTHER
>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
>> ARISING
>> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
>> + * DEALINGS IN THE SOFTWARE.
>> + */
>> +
>> +#pragma once
>> +#ifndef MEMORY_WRITER_H
>> +#define MEMORY_WRITER_H
>> +
>> +#include 
>> +#include 
>> +#include 
>> +
>> +#include "main/hash_table.h"
>> +
>> +#ifdef __cplusplus
>> +/**
>> + * Helper class for writing data to memory
>> + *
>> + * This class maintains a dynamically-sized memory buffer and allows
>> + * for data to be efficiently appended to it with automatic resizing.
>> + */
>> +class memory_writer
>> +{
>> +public:
>> +   memory_writer() :
>> +  memory(NULL),
>> +  curr_size(0),
>> +  pos(0)
>> +   {
>> +  data_hash = _mesa_hash_table_create(0, int_equal);
>> +  hash_value = _mesa_hash_data(this, sizeof(memory_writer));
>> +   }
>> +
>> +   ~memory_writer()
>> +   {
>> +  free(memory);
>> +  _mesa_hash_table_destroy(data_hash, NULL);
>> +   }
>> +
>> +   /* user wants to claim the memory */
>> +   char *release_memory(size_t *size)
>> +   {
>> +  /* final realloc to free allocated but unused memory */
>> +  char *result = (char *) realloc(memory, pos);
>> +  *size = pos;
>> +  memory = NULL;
>> +  curr_size = 0;
>> +  pos = 0;
>> +  return result;
>> +   }
>> +
>> +/**
>> + * write functions per type
>> + */
>> +#define DECL_WRITER(type) void write_ ##type (const type data) {\
>> +   write(&data, sizeof(type));\
>> +}
>> +
>> +   DECL_WRITER(int32_t);
>> +   DECL_WRITER(int64_t);
>> +   DECL_WRITER(uint8_t);
>> +   DECL_WRITER(uint32_t);
>> +
>> +   void write_bool(bool data)
>> +   {
>> +  uint8_t val = data;
>> +  write_uint8_t(val);
>> +   }
>> +
>> +   /* write function that reallocates more memory if required */
>> +   void write(const void *data, int size)
>> +   {
>> +  if (!memory || pos > (curr_size - size))
>> + if (!grow(size))
>> +return;
>> +
>> +  memcpy(memory + pos, data, size);
>> +
>> +  pos += size;
>> +   }
>> +
>> +   void overwrite(const void *data, int size, int offset)
>> +   {
>> +  if (offset < 0 || offset + size > pos)
>> + return;
>> +  memcpy(memory + offset, data, size);
>> +   }
>> +
>> +   /* length is written to make reading safe */
>> +   void write_string(const char *str)
>> +   {
>>

Re: [Mesa-dev] [PATCH 01/10] glsl: memory_writer helper class for data serialization

2014-02-04 Thread Paul Berry
Whoops, I discovered another issue:


On 29 January 2014 01:24, Tapani Pälli  wrote:

> Class will be used by the shader binary cache implementation.
>
> Signed-off-by: Tapani Pälli 
> ---
>  src/glsl/memory_writer.h | 188
> +++
>  1 file changed, 188 insertions(+)
>  create mode 100644 src/glsl/memory_writer.h
>
> diff --git a/src/glsl/memory_writer.h b/src/glsl/memory_writer.h
> new file mode 100644
> index 000..979169f
> --- /dev/null
> +++ b/src/glsl/memory_writer.h
> @@ -0,0 +1,188 @@
> +/* -*- c++ -*- */
> +/*
> + * Copyright © 2013 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the
> "Software"),
> + * to deal in the Software without restriction, including without
> limitation
> + * the rights to use, copy, modify, merge, publish, distribute,
> sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the
> next
> + * paragraph) shall be included in all copies or substantial portions of
> the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT
> SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
> OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> + * DEALINGS IN THE SOFTWARE.
> + */
> +
> +#pragma once
> +#ifndef MEMORY_WRITER_H
> +#define MEMORY_WRITER_H
> +
> +#include 
> +#include 
> +#include 
> +
> +#include "main/hash_table.h"
> +
> +#ifdef __cplusplus
> +/**
> + * Helper class for writing data to memory
> + *
> + * This class maintains a dynamically-sized memory buffer and allows
> + * for data to be efficiently appended to it with automatic resizing.
> + */
> +class memory_writer
> +{
> +public:
> +   memory_writer() :
> +  memory(NULL),
> +  curr_size(0),
> +  pos(0)
> +   {
> +  data_hash = _mesa_hash_table_create(0, int_equal);
> +  hash_value = _mesa_hash_data(this, sizeof(memory_writer));
> +   }
> +
> +   ~memory_writer()
> +   {
> +  free(memory);
> +  _mesa_hash_table_destroy(data_hash, NULL);
> +   }
> +
> +   /* user wants to claim the memory */
> +   char *release_memory(size_t *size)
> +   {
> +  /* final realloc to free allocated but unused memory */
> +  char *result = (char *) realloc(memory, pos);
> +  *size = pos;
> +  memory = NULL;
> +  curr_size = 0;
> +  pos = 0;
> +  return result;
> +   }
> +
> +/**
> + * write functions per type
> + */
> +#define DECL_WRITER(type) void write_ ##type (const type data) {\
> +   write(&data, sizeof(type));\
> +}
> +
> +   DECL_WRITER(int32_t);
> +   DECL_WRITER(int64_t);
> +   DECL_WRITER(uint8_t);
> +   DECL_WRITER(uint32_t);
> +
> +   void write_bool(bool data)
> +   {
> +  uint8_t val = data;
> +  write_uint8_t(val);
> +   }
> +
> +   /* write function that reallocates more memory if required */
> +   void write(const void *data, int size)
> +   {
> +  if (!memory || pos > (curr_size - size))
> + if (!grow(size))
> +return;
> +
> +  memcpy(memory + pos, data, size);
> +
> +  pos += size;
> +   }
> +
> +   void overwrite(const void *data, int size, int offset)
> +   {
> +  if (offset < 0 || offset + size > pos)
> + return;
> +  memcpy(memory + offset, data, size);
> +   }
> +
> +   /* length is written to make reading safe */
> +   void write_string(const char *str)
> +   {
> +  uint32_t len = str ? strlen(str) : 0;
> +  write_uint32_t(len);
> +
> +  if (str)
> + write(str, len + 1);
> +   }
> +
> +   unsigned position()
> +   {
> +  return pos;
> +   }
> +
> +   /**
> +* check if some data was written
> +*/
> +   bool data_was_written(void *data, uint32_t id)
> +   {
> +  hash_entry *entry =
> + _mesa_hash_table_search(data_hash, hash_value,
> +(void*) (intptr_t) id);
>

This isn't an efficient way to use a hashtable.  You're always passing the
same hash_value (the one computed in the constructor), which means that
every single hash entry will be stored in the same chain, and lookups will
be really slow.

Since you're only using this hashtable to detect duplicate glsl_types,
there's a much easier approach: take advantage of the fact that glsl_type
is a flyweight class (in other words, code elsewhere in Mesa ensures that
no duplicate glsl_type objects exist anywhere in memory, so any two
glsl_type pointers can be compared simply using pointer equality).  Because
of that, you should be abl

Re: [Mesa-dev] [PATCH 01/10] glsl: memory_writer helper class for data serialization

2014-02-04 Thread Paul Berry
f (offset < 0 || offset + size > pos)
> + return;
>

Similarly, there should be an assertion in this return path, since this
should never happen except in the case of a bug elsewhere in Mesa.


> +  memcpy(memory + offset, data, size);
> +   }
> +
> +   /* length is written to make reading safe */
> +   void write_string(const char *str)
> +   {
> +  uint32_t len = str ? strlen(str) : 0;
> +  write_uint32_t(len);
>

There's a problem here: write_string("") and write_string(NULL) have
indistinguishable effects on the output file.  That *might* be benign for
now (I haven't looked through all the callers to write_string to see
whether the distinction between "" and NULL matters) but even so, this is
the sort of thing that can lead to subtle bugs in the future.

I'd suggest that instead, we store strlen + 1, so that the reader can tell
the difference between NULL and "".


> +
> +  if (str)
> + write(str, len + 1);
>

It's surprising to see you writing len + 1 bytes (so surprising that at
first I thought it must be a mistake).  Reading ahead to patch 3/10, it
looks like you did this so that you could call ralloc_strdup() in
read_string().  But that seems a little wasteful to me, not because we're
wasting a byte but because it means that read_string() traverses the string
twice:

- First it reads the length
- Then it checks that a null terminator exists in the expected place
- Then it calls ralloc_strdup(), which walks the string to see how long it
is (totally unnecessary, since the length is known)
- ralloc_strdup() calls ralloc_array() to allocate memory for the result
- ralloc_strdup() calls memcpy() to make a copy of the string
- ralloc_strdup() stores a null terminator at the end of the copied string.

Personally, I would just call ralloc_array() directly from read_string(),
then memcpy() the contents of the string and store the null terminator.  In
which case there is no need to store the null terminator when writing the
string.

But I'm nit picking at this point, so I'm not going to be a stickler about
it.  If you would prefer to just add a comment explaining why you're
serializing the null terminator, I would be ok with that too.


> +   }
> +
> +   unsigned position()
> +   {
> +  return pos;
> +   }
> +
> +   /**
> +* check if some data was written
> +*/
> +   bool data_was_written(void *data, uint32_t id)
>

For functions that ask questions, I recommend using a word order that's
unambiguously a question (e.g. "was_data_written").  "data_was_written"
sounds more like a statement, so someone might misinterpret this function
as a synonym for mark_data_written().

But I'm nit picking again here.  With the other issues addressed, this
patch is:

Reviewed-by: Paul Berry 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 30/30] i965/cs: Allow ARB_compute_shader to be enabled via env var.

2014-02-04 Thread Paul Berry
On 1 February 2014 23:21, Jordan Justen  wrote:

> On Thu, Jan 9, 2014 at 6:19 PM, Paul Berry 
> wrote:
> > This will allow testing of compute shader functionality before it is
> > completed.
> >
> > To enable ARB_compute_shader functionality in the i965 driver, set
> > INTEL_COMPUTE_SHADER=1.
> > ---
> >  src/mesa/drivers/dri/i965/brw_context.c  | 11 ++-
> >  src/mesa/drivers/dri/i965/intel_extensions.c |  2 ++
> >  2 files changed, 12 insertions(+), 1 deletion(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_context.c
> b/src/mesa/drivers/dri/i965/brw_context.c
> > index 1b42751..76dd9be 100644
> > --- a/src/mesa/drivers/dri/i965/brw_context.c
> > +++ b/src/mesa/drivers/dri/i965/brw_context.c
> > @@ -298,10 +298,17 @@ brw_initialize_context_constants(struct
> brw_context *brw)
> >ctx->Const.Program[MESA_SHADER_GEOMETRY].MaxTextureImageUnits =
> BRW_MAX_TEX_UNIT;
> > else
> >ctx->Const.Program[MESA_SHADER_GEOMETRY].MaxTextureImageUnits = 0;
> > +   if (getenv("INTEL_COMPUTE_SHADER")) {
>
> What about trying to make use of
> MESA_EXTENSION_OVERRIDE=GL_ARB_compute_shader?
>
> We could add
> extensions.c:bool _mesa_is_extension_override_enabled(char *)
>
> And then
> if (_mesa_is_extension_override_enabled("GL_ARB_compute_shader"))
>
> Or, similarly, get overrides shoved into ctx->Extensions to allow
> checking for overrides at this early stage.
>
> -Jordan
>

I looked into what would be necessary to do this, and it's unfortunately
more complicated than it should be due to some flukes about initialization
order (perhaps this is what you were alluding to when you said "get
overrides shoved into ctx->Extensions to allow checking for overrides at
this early stage").  Currently, our initialization order is:

1. brwCreateContext() calls brw_initialize_context_constants(), which needs
to know whether compute shaders are supported in order to set constants
like MAX_COMPUTE_TEXTURE_IMAGE_UNITS and MAX_UNIFORM_BUFFER_BINDINGS
correctly.

2. Later, brwCreateContext() calls intelInitExtensions(), which is the
function where we'll set ctx->Extensions.ARB_compute_shader to true once
compute shaders are fully supported.

3. Much later in initialization, when the context is being made current for
the first time, core Mesa calls _mesa_make_extension_string(), which calls
get_extension_override() to modify the the ctx->Extensions table based on
the environment variable override.

To do what you suggested, I would either have to:

A. change 1 to call _mesa_is_extension_override_enabled(); that function,
in turn, would have to parse the MESA_EXTENSION_OVERRIDE string, but we
couldn't reuse the parsing code in _mesa_make_extension_string(), since
that parsing code updates ctx->Extensions as a side effect, and it's not
time to update ctx->Extensions yet.  In addition to the code duplication,
we would have a danger of bugs, since the override takes effect so late in
initialization--if we ever accidentally introduced some code in between
steps 2 and 3 that checked the value of ctx->Extensions.ARB_compute_shader,
it would see false even if compute shaders were enabled by override.

Or:

B. change the order of initialization so that 2 happens first, followed by
3 and then 1.  In the long run I think this would be a good thing, but it's
a big change (since it affects all back ends) and I'm not sure it would be
a good idea to hold up this patch series waiting for it.


How would you feel if I landed the patch series as is, and then worked on a
follow up patch to do B?


>
> > +  ctx->Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits =
> BRW_MAX_TEX_UNIT;
> > +  ctx->Const.MaxUniformBufferBindings += 12;
> > +   } else {
> > +  ctx->Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits = 0;
> > +   }
> > ctx->Const.MaxCombinedTextureImageUnits =
> >ctx->Const.Program[MESA_SHADER_VERTEX].MaxTextureImageUnits +
> >ctx->Const.Program[MESA_SHADER_FRAGMENT].MaxTextureImageUnits +
> > -  ctx->Const.Program[MESA_SHADER_GEOMETRY].MaxTextureImageUnits;
> > +  ctx->Const.Program[MESA_SHADER_GEOMETRY].MaxTextureImageUnits +
> > +  ctx->Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits;
> >
> > ctx->Const.MaxTextureLevels = 14; /* 8192 */
> > if (ctx->Const.MaxTextureLevels > MAX_TEXTURE_LEVELS)
> > @@ -425,9 +432,11 @@ brw_initialize_context_constants(struct brw_context
> *brw)
> >ctx->Const.Program[MESA_SHADER_FRAGMENT].MaxAtomicCounters =
> MAX_ATOMIC_COUNTERS;
> >ctx->Const.Program[MESA_SHADER_VERTEX]

Re: [Mesa-dev] [PATCH 29/30] i965/cs: Create the brw_compute_program struct, and the code to initialize it.

2014-02-03 Thread Paul Berry
On 1 February 2014 22:37, Jordan Justen  wrote:

> On Thu, Jan 9, 2014 at 6:19 PM, Paul Berry 
> wrote:
> > ---
> >  src/mesa/drivers/dri/i965/brw_context.h |  8 
> >  src/mesa/drivers/dri/i965/brw_program.c | 11 +++
> >  2 files changed, 19 insertions(+)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_context.h
> b/src/mesa/drivers/dri/i965/brw_context.h
> > index df32ccb..abc1783 100644
> > --- a/src/mesa/drivers/dri/i965/brw_context.h
> > +++ b/src/mesa/drivers/dri/i965/brw_context.h
> > @@ -316,6 +316,14 @@ struct brw_fragment_program {
> > GLuint id;  /**< serial no. to identify frag progs, never re-used */
> >  };
> >
> > +
> > +/** Subclass of Mesa compute program */
> > +struct brw_compute_program {
> > +   struct gl_compute_program program;
> > +   unsigned id;  /**< serial no. to identify frag progs, never re-used
> */
>
> "frag" in comment
>

Whoops, thanks.  I've changed it to "serial no. to identify compute
progs...".


>
> Reviewed-by: Jordan Justen 
>
> > +};
> > +
> > +
> >  struct brw_shader {
> > struct gl_shader base;
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_program.c
> b/src/mesa/drivers/dri/i965/brw_program.c
> > index 90844e5..2d92acb 100644
> > --- a/src/mesa/drivers/dri/i965/brw_program.c
> > +++ b/src/mesa/drivers/dri/i965/brw_program.c
> > @@ -113,6 +113,17 @@ static struct gl_program *brwNewProgram( struct
> gl_context *ctx,
> >}
> > }
> >
> > +   case GL_COMPUTE_PROGRAM_NV: {
> > +  struct brw_compute_program *prog =
> CALLOC_STRUCT(brw_compute_program);
> > +  if (prog) {
> > + prog->id = get_new_program_id(brw->intelScreen);
> > +
> > + return _mesa_init_compute_program(ctx, &prog->program, target,
> id);
> > +  } else {
> > + return NULL;
> > +  }
> > +   }
> > +
> > default:
> >assert(!"Unsupported target in brwNewProgram()");
> >return NULL;
> > --
> > 1.8.5.2
> >
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 26/30] main/cs: Implement query for COMPUTE_WORK_GROUP_SIZE.

2014-02-03 Thread Paul Berry
On 1 February 2014 22:28, Jordan Justen  wrote:

> On Thu, Jan 9, 2014 at 6:19 PM, Paul Berry 
> wrote:
> > ---
> >  src/mesa/main/shaderapi.c | 18 ++
> >  1 file changed, 18 insertions(+)
> >
> > diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
> > index 053f27b..680d449 100644
> > --- a/src/mesa/main/shaderapi.c
> > +++ b/src/mesa/main/shaderapi.c
> > @@ -663,6 +663,24 @@ get_programiv(struct gl_context *ctx, GLuint
> program, GLenum pname, GLint *param
> >
> >*params = shProg->NumAtomicBuffers;
> >return;
> > +   case GL_COMPUTE_WORK_GROUP_SIZE: {
> > +  int i;
> > +  if (!_mesa_is_desktop_gl(ctx) ||
> !ctx->Extensions.ARB_compute_shader)
> > + break;
> > +  if (!shProg->LinkStatus) {
> > + _mesa_error(ctx, GL_INVALID_OPERATION, "glGetProgramiv(program
> not "
> > + "linked successfully)");
>
> Nit, but dropping " successfully" sounds better to me. It seems more
> likely that they tried to get the value before linking than that they
> ignored a link error.
>

Sure, I can go along with that.  I've changed it to just
"glGetProgramiv(program not linked)".


>
> 24-26 Reviewed-by: Jordan Justen 
>
> > + return;
> > +  }
> > +  if (shProg->_LinkedShaders[MESA_SHADER_COMPUTE] == NULL) {
> > + _mesa_error(ctx, GL_INVALID_OPERATION, "glGetProgramiv(no
> compute "
> > + "shaders)");
> > + return;
> > +  }
> > +  for (i = 0; i < 3; i++)
> > + params[i] = shProg->Comp.LocalSize[i];
> > +  return;
> > +   }
> > default:
> >break;
> > }
> > --
> > 1.8.5.2
> >
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 22/30] mesa/cs: Implement MAX_COMPUTE_WORK_GROUP_INVOCATIONS constant.

2014-02-03 Thread Paul Berry
On 1 February 2014 21:25, Jordan Justen  wrote:

> On Thu, Jan 9, 2014 at 6:19 PM, Paul Berry 
> wrote:
> > ---
> >  src/glsl/main.cpp   | 1 +
> >  src/glsl/standalone_scaffolding.cpp | 1 +
> >  src/mesa/main/context.c | 1 +
> >  src/mesa/main/get.c | 1 +
> >  src/mesa/main/get_hash_params.py| 3 +++
> >  src/mesa/main/mtypes.h  | 1 +
> >  6 files changed, 8 insertions(+)
> >
> > diff --git a/src/glsl/main.cpp b/src/glsl/main.cpp
> > index bb2054f..94bc1cc 100644
> > --- a/src/glsl/main.cpp
> > +++ b/src/glsl/main.cpp
> > @@ -53,6 +53,7 @@ initialize_context(struct gl_context *ctx, gl_api api)
> > ctx->Const.MaxComputeWorkGroupSize[0] = 1024;
> > ctx->Const.MaxComputeWorkGroupSize[1] = 1024;
> > ctx->Const.MaxComputeWorkGroupSize[2] = 64;
> > +   ctx->Const.MaxComputeWorkGroupInvocations = 1024;
> > ctx->Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits = 16;
> > ctx->Const.Program[MESA_SHADER_COMPUTE].MaxUniformComponents = 1024;
> > ctx->Const.Program[MESA_SHADER_COMPUTE].MaxInputComponents = 0; /*
> not used */
> > diff --git a/src/glsl/standalone_scaffolding.cpp
> b/src/glsl/standalone_scaffolding.cpp
> > index e8eb529..0c83ea3 100644
> > --- a/src/glsl/standalone_scaffolding.cpp
> > +++ b/src/glsl/standalone_scaffolding.cpp
> > @@ -143,6 +143,7 @@ void initialize_context_to_defaults(struct
> gl_context *ctx, gl_api api)
> > ctx->Const.MaxComputeWorkGroupSize[0] = 1024;
> > ctx->Const.MaxComputeWorkGroupSize[1] = 1024;
> > ctx->Const.MaxComputeWorkGroupSize[2] = 64;
> > +   ctx->Const.MaxComputeWorkGroupInvocations = 1024;
> > ctx->Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits = 16;
> > ctx->Const.Program[MESA_SHADER_COMPUTE].MaxUniformComponents = 1024;
> > ctx->Const.Program[MESA_SHADER_COMPUTE].MaxInputComponents = 0; /*
> not used */
> > diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
> > index ebe27b4..942f247 100644
> > --- a/src/mesa/main/context.c
> > +++ b/src/mesa/main/context.c
> > @@ -695,6 +695,7 @@ _mesa_init_constants(struct gl_context *ctx)
> > ctx->Const.MaxComputeWorkGroupSize[0] = 1024;
> > ctx->Const.MaxComputeWorkGroupSize[1] = 1024;
> > ctx->Const.MaxComputeWorkGroupSize[2] = 64;
> > +   ctx->Const.MaxComputeWorkGroupInvocations = 1024;
> >  }
> >
> >
> > diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
> > index 6b914f4..bcbb5d5 100644
> > --- a/src/mesa/main/get.c
> > +++ b/src/mesa/main/get.c
> > @@ -382,6 +382,7 @@ EXTRA_EXT(ARB_texture_multisample);
> >  EXTRA_EXT(ARB_texture_gather);
> >  EXTRA_EXT(ARB_shader_atomic_counters);
> >  EXTRA_EXT(ARB_draw_indirect);
> > +EXTRA_EXT(ARB_compute_shader);
> >
> >  static const int
> >  extra_ARB_color_buffer_float_or_glcore[] = {
> > diff --git a/src/mesa/main/get_hash_params.py
> b/src/mesa/main/get_hash_params.py
> > index 7f025a9..c18e848 100644
> > --- a/src/mesa/main/get_hash_params.py
> > +++ b/src/mesa/main/get_hash_params.py
> > @@ -742,6 +742,9 @@ descriptor=[
> >  # GL_ARB_vertex_attrib_binding
> >[ "MAX_VERTEX_ATTRIB_RELATIVE_OFFSET",
> "CONTEXT_ENUM(Const.MaxVertexAttribRelativeOffset), NO_EXTRA" ],
> >[ "MAX_VERTEX_ATTRIB_BINDINGS",
> "CONTEXT_ENUM(Const.MaxVertexAttribBindings), NO_EXTRA" ],
> > +
> > +# GL_ARB_compute_shader
> > +  [ "MAX_COMPUTE_WORK_GROUP_INVOCATIONS",
> "CONTEXT_ENUM(Const.MaxComputeWorkGroupInvocations),
> extra_ARB_compute_shader" ],
>
> CONTEXT_ENUM and CONTEXT_INT seem to follow the same code paths, but
> would CONTEXT_INT be a better fit here?
>

You're right, CONTEXT_INT makes more sense here.  I've changed it.


>
> Also, how do we decide between adding to gl_constants and just using
> CONST(1024)? My thought would be to use CONST unless we are pretty
> sure drivers will want to vary the value.
>

My reasoning for using gl_constants is that the value of the constant is
determined by a hardware limitation rather than a core Mesa software
limitation, hence it's likely to vary from one back end to the next.  I had
a quick look through our uses of CONST and it looks like most of the time
we use CONST it's when we're talking about a core Mesa software limitation
(e.g. MAX_MODELVIEW_STACK_DEPTH is CONST(MAX_MODELVIEW_STACK_DEPTH) because
that's the size of the per-context data structure we use to ke

Re: [Mesa-dev] [PATCH v2 7/7] i965: support instanced GS on gen7

2014-01-31 Thread Paul Berry
On 28 January 2014 11:22, Jordan Justen  wrote:

> Signed-off-by: Jordan Justen 
> ---
>  src/mesa/drivers/dri/i965/brw_context.h   | 2 ++
>  src/mesa/drivers/dri/i965/brw_defines.h   | 1 +
>  src/mesa/drivers/dri/i965/brw_vec4_gs.c   | 2 ++
>  src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 6 --
>  src/mesa/drivers/dri/i965/gen7_gs_state.c | 2 ++
>  5 files changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_context.h
> b/src/mesa/drivers/dri/i965/brw_context.h
> index 8d098e6..ade4458 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.h
> +++ b/src/mesa/drivers/dri/i965/brw_context.h
> @@ -639,6 +639,8 @@ struct brw_gs_prog_data
>
> bool include_primitive_id;
>
> +   int invocations;
> +
> /**
>  * True if the thread should be dispatched in DUAL_INSTANCE mode,
> false if
>  * it should be dispatched in DUAL_OBJECT mode.
> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h
> b/src/mesa/drivers/dri/i965/brw_defines.h
> index 5fe1aba..eaf6e8f 100644
> --- a/src/mesa/drivers/dri/i965/brw_defines.h
> +++ b/src/mesa/drivers/dri/i965/brw_defines.h
> @@ -1478,6 +1478,7 @@ enum brw_message_target {
>  # define GEN7_GS_CONTROL_DATA_FORMAT_GSCTL_CUT 0
>  # define GEN7_GS_CONTROL_DATA_FORMAT_GSCTL_SID 1
>  # define GEN7_GS_CONTROL_DATA_HEADER_SIZE_SHIFT20
> +# define GEN7_GS_INSTANCE_CONTROL_SHIFT15
>  # define GEN7_GS_DISPATCH_MODE_SINGLE  (0 << 11)
>  # define GEN7_GS_DISPATCH_MODE_DUAL_INSTANCE   (1 << 11)
>  # define GEN7_GS_DISPATCH_MODE_DUAL_OBJECT (2 << 11)
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs.c
> b/src/mesa/drivers/dri/i965/brw_vec4_gs.c
> index abc181b..3c6393f 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_gs.c
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs.c
> @@ -48,6 +48,8 @@ do_gs_prog(struct brw_context *brw,
> c.prog_data.include_primitive_id =
>(gp->program.Base.InputsRead & VARYING_BIT_PRIMITIVE_ID) != 0;
>
> +   c.prog_data.invocations = gp->program.Invocations;
> +
> /* Allocate the references to the uniforms that will end up in the
>  * prog_data associated with the compiled program, and which will be
> freed
>  * by the state cache.
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
> b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
> index 12e137c..21dbc29 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
> @@ -596,9 +596,11 @@ brw_gs_emit(struct brw_context *brw,
> }
>
> /* Compile the geometry shader in DUAL_OBJECT dispatch mode, if we can
> do
> -* so without spilling.
> +* so without spilling. If the GS invocations count > 1, then we can't
> use
> +* dual object mode.
>  */
> -   if (likely(!(INTEL_DEBUG & DEBUG_NO_DUAL_OBJECT_GS))) {
> +   if (c->prog_data.invocations <= 1 ||
> +   likely(!(INTEL_DEBUG & DEBUG_NO_DUAL_OBJECT_GS))) {
>

This should be &&, not ||.

With that fixed, this patch is:

Reviewed-by: Paul Berry 

I sent comments on patches 1, 2, 4, 6, and 7.  Patches 3 and 5 are:

Reviewed-by: Paul Berry 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 6/7] i965: support gl_InvocationID for gen7

2014-01-31 Thread Paul Berry
On 28 January 2014 11:22, Jordan Justen  wrote:

> Signed-off-by: Jordan Justen 
> ---
>  src/mesa/drivers/dri/i965/brw_defines.h   |  5 +
>  src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 24
> ---
>  2 files changed, 26 insertions(+), 3 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h
> b/src/mesa/drivers/dri/i965/brw_defines.h
> index 7f4cd10..5fe1aba 100644
> --- a/src/mesa/drivers/dri/i965/brw_defines.h
> +++ b/src/mesa/drivers/dri/i965/brw_defines.h
> @@ -1500,6 +1500,11 @@ enum brw_message_target {
>  # define BRW_GS_EDGE_INDICATOR_0   (1 << 8)
>  # define BRW_GS_EDGE_INDICATOR_1   (1 << 9)
>
> +/* GS Thread Payload
> + */
> +/* R0 */
> +# define GEN7_GS_PAYLOAD_INSTANCE_ID_SHIFT 27
> +
>  /* 3DSTATE_GS "Output Vertex Size" has an effective maximum of 62.  It's
>   * counted in multiples of 16 bytes.
>   */
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
> b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
> index 40743cc..12e137c 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
> @@ -51,9 +51,27 @@ vec4_gs_visitor::vec4_gs_visitor(struct brw_context
> *brw,
>  dst_reg *
>  vec4_gs_visitor::make_reg_for_system_value(ir_variable *ir)
>  {
> -   /* Geometry shaders don't use any system values. */
> -   assert(!"Unreached");
> -   return NULL;
> +   dst_reg *reg;
> +   src_reg r0(retype(brw_vec4_grf(0, 0), BRW_REGISTER_TYPE_UD));
> +
> +   switch (ir->data.location) {
> +   case SYSTEM_VALUE_INVOCATION_ID:
> +  this->current_annotation = "initialize gl_InvocationID";
> +
> +  reg = new(mem_ctx) dst_reg(this, ir->type);
> +
> +  /* Copy and shift gen7 instance id from R0 into the
> +   * gl_InvocationID register.
> +   */
> +  emit(SHR(*reg, src_reg(r0),
> +   (uint32_t) GEN7_GS_PAYLOAD_INSTANCE_ID_SHIFT));
> +  break;
> +   default:
> +  assert(!"not reached");
> +  break;
> +   }
> +
> +   return reg;
>  }
>

This would do what you want if invocation ID 0 was delivered in the upper
bits of R0.0 and invocation ID 1 was delivered in the upper bits of R0.4.
But that's not the case.  They are delivered in R0.0 and R0.1
respectively.  I think you need to generate this instruction:

SHR (8) dst<1> R0<1;4,0> GEN7_GS_PAYLOAD_INSTANCE_ID_SHIFT

Note that this instruction requires ALIGN1 mode and a goofy register
region, so you can't generate it directly from the visitor; what you'll
have to do instead is create a new opcode to do it.  If you want to see
examples where we've done similar things, grep for
GS_OPCODE_SET_WRITE_OFFSET, GS_OPCODE_SET_VERTEX_COUNT,
GS_OPCODE_SET_DWORD_2_IMMED, GS_OPCODE_PREPARE_CHANNEL_MASKS, or
GS_OPCODE_SET_CHANNEL_MASKS.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 4/7] main/shaderapi: GL_GEOMETRY_SHADER_INVOCATIONS GetProgramiv support

2014-01-31 Thread Paul Berry
On 28 January 2014 11:22, Jordan Justen  wrote:

> Signed-off-by: Jordan Justen 
> ---
>  src/mesa/main/shaderapi.c | 6 ++
>  1 file changed, 6 insertions(+)
>
> diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
> index a8336c9..fb107d5 100644
> --- a/src/mesa/main/shaderapi.c
> +++ b/src/mesa/main/shaderapi.c
> @@ -603,6 +603,12 @@ get_programiv(struct gl_context *ctx, GLuint program,
> GLenum pname, GLint *param
>if (check_gs_query(ctx, shProg))
>   *params = shProg->Geom.VerticesOut;
>return;
> +   case GL_GEOMETRY_SHADER_INVOCATIONS:
> +  if (!has_core_gs)
>

This needs to be:

   if (!has_core_gs || !ctx->Extensions.ARB_gpu_shader5)

so that the query will generate the proper error message on platforms that
don't support ARB_gpu_shader5.


> + break;
> +  if (check_gs_query(ctx, shProg))
> + *params = shProg->Geom.Invocations;
> +  return;
> case GL_GEOMETRY_INPUT_TYPE:
>if (!has_core_gs)
>   break;
> --
> 1.8.5.3
>

With that fixed, the patch is:

Reviewed-by: Paul Berry 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 2/7] glsl/linker: produce gl_shader_program Geom.Invocations

2014-01-31 Thread Paul Berry
On 28 January 2014 11:22, Jordan Justen  wrote:

> Grab the parsed invocation count, check for consistency
> during linking, and finally save the result in
> gl_shader_program Geom.Invocations.
>
> Signed-off-by: Jordan Justen 
> ---
>  src/glsl/glsl_parser_extras.cpp |  2 ++
>  src/glsl/linker.cpp | 18 ++
>  src/mesa/main/mtypes.h  |  2 ++
>  3 files changed, 22 insertions(+)
>
> diff --git a/src/glsl/glsl_parser_extras.cpp
> b/src/glsl/glsl_parser_extras.cpp
> index 87784ed..3e98966 100644
> --- a/src/glsl/glsl_parser_extras.cpp
> +++ b/src/glsl/glsl_parser_extras.cpp
> @@ -1339,6 +1339,8 @@ set_shader_inout_layout(struct gl_shader *shader,
> if (state->out_qualifier->flags.q.max_vertices)
>shader->Geom.VerticesOut = state->out_qualifier->max_vertices;
>
> +   shader->Geom.Invocations = state->gs_invocations;
> +
>

Considering my comment on patch 1, I think this will change to:

shader->Geom.Invocations = 0;
if (state->in_qualifier->flags.q.invocations)
   shader->Geom.Invocations = state->in_qualifier->invocations;



> if (state->gs_input_prim_type_specified) {
>shader->Geom.InputType = state->gs_input_prim_type;
> } else {
> diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
> index 93b4754..800de0b 100644
> --- a/src/glsl/linker.cpp
> +++ b/src/glsl/linker.cpp
> @@ -1206,6 +1206,7 @@ link_gs_inout_layout_qualifiers(struct
> gl_shader_program *prog,
> unsigned num_shaders)
>  {
> linked_shader->Geom.VerticesOut = 0;
> +   linked_shader->Geom.Invocations = 0;
> linked_shader->Geom.InputType = PRIM_UNKNOWN;
> linked_shader->Geom.OutputType = PRIM_UNKNOWN;
>
> @@ -1259,6 +1260,18 @@ link_gs_inout_layout_qualifiers(struct
> gl_shader_program *prog,
>  }
>  linked_shader->Geom.VerticesOut = shader->Geom.VerticesOut;
>}
> +
> +  if (shader->Geom.Invocations != 0) {
> +if (linked_shader->Geom.Invocations != 0 &&
> +linked_shader->Geom.Invocations != shader->Geom.Invocations) {
> +   linker_error(prog, "geometry shader defined with conflicting "
> +"invocation count (%d and %d)\n",
> +linked_shader->Geom.Invocations,
> +shader->Geom.Invocations);
> +   return;
> +}
> +linked_shader->Geom.Invocations = shader->Geom.Invocations;
> +  }
> }
>
> /* Just do the intrastage -> interstage propagation right now,
> @@ -1285,6 +1298,11 @@ link_gs_inout_layout_qualifiers(struct
> gl_shader_program *prog,
>return;
> }
> prog->Geom.VerticesOut = linked_shader->Geom.VerticesOut;
> +
> +   if (linked_shader->Geom.Invocations == 0)
> +  linked_shader->Geom.Invocations = 1;
>

I think it would be equivalent if we dropped these two lines and simply
initialized linked_shader->Geom.Invocations to 1 at the top of the
function--is that right?


> +
> +   prog->Geom.Invocations = linked_shader->Geom.Invocations;
>  }
>
>  /**
> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
> index 3d42a21..2d90b0a 100644
> --- a/src/mesa/main/mtypes.h
> +++ b/src/mesa/main/mtypes.h
> @@ -2401,6 +2401,7 @@ struct gl_shader
>  */
> struct {
>GLint VerticesOut;
> +  GLint Invocations;
>

It would be nice to have a comment above this declaration explaining that
if the shader didn't specify an invocation count, this value will be 1 (so
that back-end implementors know they don't have to coerce 0 to 1).


>/**
> * GL_POINTS, GL_LINES, GL_LINES_ADJACENCY, GL_TRIANGLES, or
> * GL_TRIANGLES_ADJACENCY, or PRIM_UNKNOWN if it's not set in this
> @@ -2599,6 +2600,7 @@ struct gl_shader_program
> struct {
>GLint VerticesIn;
>GLint VerticesOut;
> +  GLint Invocations;
>

Same comment applies here.


>GLenum InputType;  /**< GL_POINTS, GL_LINES, GL_LINES_ADJACENCY_ARB,
>GL_TRIANGLES, or GL_TRIANGLES_ADJACENCY_ARB
> */
>GLenum OutputType; /**< GL_POINTS, GL_LINE_STRIP or
> GL_TRIANGLE_STRIP */
> --
> 1.8.5.3
>

With those minor issues addressed, this patch is:

Reviewed-by: Paul Berry 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 1/7] glsl: parse invocations layout qualifier for ARB_gpu_shader5

2014-01-31 Thread Paul Berry
On 28 January 2014 11:22, Jordan Justen  wrote:

> _mesa_glsl_parse_state::gs_invocations will store the
> invocation count.
>
> Signed-off-by: Jordan Justen 
>

This looks like it will work if the shader contains a line like this:

   layout(triangles, invocations=6) in;

But it won't handle this:

   layout(triangles) in;
   layout(invocations=6) in;

Although it's not explicitly mentioned in the GLSL spec, I believe that
this should be allowed too.  (The spec does explicitly allow similar
splitting of geometry shader output layout qualifiers.)

Note that the way we handle geometry shader input primitive types is
unusual, so it might not have been the best to imitate.  For other types of
shader-global layout qualifiers (i.e. geometry shader output primitive
count and geometry shader max_vertices), the code in glsl_parser.yy simply
merges the qualifiers into the global state->out_qualifier object at the
time that they are parsed.  The reason we handle geometry shader input
primitive types specially is because they have additional semantics (they
cause geometry shader input arrays to be resized), so we can't process them
until ast-to-hir time.  But that doesn't apply to the invocation count.

I'd recommend an in_qualifier field to _mesa_glsl_parse_state, and handling
invocations the same way that geometry shader output qualifiers are handled
(grep for "->out_qualifier" to how this works).  Unfortuneatly, because of
the extra semantics associated with input primitive types, we'll still have
to process them specially.

So I think the code in glsl_parser.yy will need to look something like this:

   | layout_qualifier IN_TOK ';'
   {
  void *ctx = state;
  $$ = NULL;
  if (state->stage != MESA_SHADER_GEOMETRY) {
 _mesa_glsl_error(& @1, state,
  "input layout qualifiers only valid in "
  "geometry shaders");
  } else {
 if ($1.flags.q.prim_type) {
/* Make sure this is a valid input primitive type. */
switch ($1.prim_type) {
case GL_POINTS:
case GL_LINES:
case GL_LINES_ADJACENCY:
case GL_TRIANGLES:
case GL_TRIANGLES_ADJACENCY:
   $$ = new(ctx) ast_gs_input_layout(@1, $1);
   break;
default:
   _mesa_glsl_error(&@1, state,
"invalid geometry shader input primitive
type");
   break;
}
 }
 if (!state->in_qualifier->merge_qualifier(& @1, state, $1))
YYERROR;
  }
   }
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] glsl: Make condition_to_hir() callable from outside ast_iteration_statement.

2014-01-31 Thread Paul Berry
In addition to making it public, we also need to change its first
argument from an ir_loop * to an exec_list *, so that it can be used
to insert the condition anywhere in the IR (rather than just in the
body of the loop).

This will be necessary in order to make continue statements work
properly in do-while loops.

Cc: mesa-sta...@lists.freedesktop.org
---
 src/glsl/ast.h  |  3 +--
 src/glsl/ast_to_hir.cpp | 10 +-
 2 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/src/glsl/ast.h b/src/glsl/ast.h
index 0bda28d..2d6f3a2 100644
--- a/src/glsl/ast.h
+++ b/src/glsl/ast.h
@@ -888,14 +888,13 @@ public:
 
ast_node *body;
 
-private:
/**
 * Generate IR from the condition of a loop
 *
 * This is factored out of ::hir because some loops have the condition
 * test at the top (for and while), and others have it at the end 
(do-while).
 */
-   void condition_to_hir(class ir_loop *, struct _mesa_glsl_parse_state *);
+   void condition_to_hir(exec_list *, struct _mesa_glsl_parse_state *);
 };
 
 
diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 1bfb4e5..950f513 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -4369,14 +4369,14 @@ ast_case_label::hir(exec_list *instructions,
 }
 
 void
-ast_iteration_statement::condition_to_hir(ir_loop *stmt,
+ast_iteration_statement::condition_to_hir(exec_list *instructions,
  struct _mesa_glsl_parse_state *state)
 {
void *ctx = state;
 
if (condition != NULL) {
   ir_rvalue *const cond =
-condition->hir(& stmt->body_instructions, state);
+condition->hir(instructions, state);
 
   if ((cond == NULL)
  || !cond->type->is_boolean() || !cond->type->is_scalar()) {
@@ -4397,7 +4397,7 @@ ast_iteration_statement::condition_to_hir(ir_loop *stmt,
new(ctx) ir_loop_jump(ir_loop_jump::jump_break);
 
 if_stmt->then_instructions.push_tail(break_stmt);
-stmt->body_instructions.push_tail(if_stmt);
+instructions->push_tail(if_stmt);
   }
}
 }
@@ -4432,7 +4432,7 @@ ast_iteration_statement::hir(exec_list *instructions,
state->switch_state.is_switch_innermost = false;
 
if (mode != ast_do_while)
-  condition_to_hir(stmt, state);
+  condition_to_hir(&stmt->body_instructions, state);
 
if (body != NULL)
   body->hir(& stmt->body_instructions, state);
@@ -4441,7 +4441,7 @@ ast_iteration_statement::hir(exec_list *instructions,
   rest_expression->hir(& stmt->body_instructions, state);
 
if (mode == ast_do_while)
-  condition_to_hir(stmt, state);
+  condition_to_hir(&stmt->body_instructions, state);
 
if (mode != ast_do_while)
   state->symbols->pop_scope();
-- 
1.8.5.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] glsl: Fix continue statements in do-while loops.

2014-01-31 Thread Paul Berry
>From the GLSL 4.40 spec, section 6.4 (Jumps):

The continue jump is used only in loops. It skips the remainder of
the body of the inner most loop of which it is inside. For while
and do-while loops, this jump is to the next evaluation of the
loop condition-expression from which the loop continues as
previously defined.

Previously, we incorrectly treated a "continue" statement as jumping
to the top of a do-while loop.

This patch fixes the problem by replicating the loop condition when
converting the "continue" statement to IR.  (We already do a similar
thing in "for" loops, to ensure that "continue" causes the loop
expression to be executed).

Fixes piglit tests:
- glsl-fs-continue-inside-do-while.shader_test
- glsl-vs-continue-inside-do-while.shader_test
- glsl-fs-continue-in-switch-in-do-while.shader_test
- glsl-vs-continue-in-switch-in-do-while.shader_test

Cc: mesa-sta...@lists.freedesktop.org
---
 src/glsl/ast_to_hir.cpp | 23 ++-
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 950f513..8d096ad 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -4029,17 +4029,22 @@ ast_jump_statement::hir(exec_list *instructions,
 _mesa_glsl_error(& loc, state,
  "break may only appear in a loop or a switch");
   } else {
-/* For a loop, inline the for loop expression again,
- * since we don't know where near the end of
- * the loop body the normal copy of it
- * is going to be placed.
+/* For a loop, inline the for loop expression again, since we don't
+ * know where near the end of the loop body the normal copy of it is
+ * going to be placed.  Same goes for the condition for a do-while
+ * loop.
  */
 if (state->loop_nesting_ast != NULL &&
-mode == ast_continue &&
-state->loop_nesting_ast->rest_expression) {
-   state->loop_nesting_ast->rest_expression->hir(instructions,
- state);
-}
+mode == ast_continue) {
+if (state->loop_nesting_ast->rest_expression) {
+   state->loop_nesting_ast->rest_expression->hir(instructions,
+ state);
+}
+if (state->loop_nesting_ast->mode ==
+ast_iteration_statement::ast_do_while) {
+   state->loop_nesting_ast->condition_to_hir(instructions, state);
+}
+ }
 
 if (state->switch_state.is_switch_innermost &&
 mode == ast_break) {
-- 
1.8.5.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH RFC 00/11] glsl: add Single Static Assignment (SSA)

2014-01-31 Thread Paul Berry
On 22 January 2014 09:16, Connor Abbott  wrote:

> This series enables GLSL IR support for SSA, including passes to convert
> to and from SSA form. SSA is a form of the intermediate representation
> of a compiler in which each variable is assigned exactly once. SSA form
> makes many optimizations faster and easier to write, and enables other
> more powerful optimizations. SSA is used in GCC [1] and LLVM [2] as well
> as various compiler backends within Mesa itself, such as r600g-sb and
> Nouveau. Adding support for SSA will allow the various optimizations
> these backends perform to be implemented in one place, instead of
> making each driver reinvent the wheel (as several have already done).
> Additionally, all new backends would recieve these optimizations,
> reducing the burden of writing a compiler backend for a new driver.
>
> Even though no optimization passes are now implemented, I am putting out
> this series to solicit feedback on the design, to make sure I don't have
> to rewrite things before I go ahead and write these new passes.
>
> There are no piglit regressions on Softpipe, except for the
> spec/OpenGL 2.0/max-samplers test, which only passed before because the
> compiler happened to unroll the loop; the extra copies caused by the
> conversion to and from SSA stop the compiler from unrolling, meaning
> that the resulting GLSL IR code contains an indirect sampler index which
> glsl-to-tgsi can't handle.
>

I had a detailed look at your patches and I really like where you're going
with this.  I sent a lot of feedback in response to patches 4, 5, 7, 9, and
10; I don't think any of my feedback would require a major change to your
overall plan.  Nice work!  Consider patches 1, 2, 3, 6, 8, and 11:

Reviewed-by: Paul Berry 

Note, however, that probably patch 11 should be postponed until we've
written something that makes use of the SSA form (i.e. the GVN-GCM
algorithm you mentioned, or a direct conversion from GLSL IR to an SSA
backend).
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH RFC 10/11] glsl: add a pass to convert out of SSA form

2014-01-31 Thread Paul Berry
ere will cause the list visitor to visit the
new node; in fact what is happening is that we're compensating for the fact
that it won't.  How about instead saying something like "We need to do this
here because the list visitor uses foreach_list_safe(), so it will skip any
nodes we insert."


> +
> +   ir->insert_after(phi->dest);
> +   phi->dest->insert_after(assign(phi->dest, var));
> +   phi->dest->data.mode = ir_var_temporary;
>
+
> +   if (phi->if_src != NULL)
> +  ir->then_instructions.push_tail(assign(var, phi->if_src));
> +
> +   if (phi->else_src != NULL)
> +  ir->else_instructions.push_tail(assign(var, phi->else_src));
> +
> +   phi->remove();
> +}
> +
> +static void
> +eliminate_phi_loop_begin(ir_phi_loop_begin *phi, ir_loop *ir, exec_list
> *instrs)
> +{
> +   ir_variable *var = insert_decl(instrs, phi->dest->type,
> ralloc_parent(ir));
> +   ir->body_instructions.push_head(phi->dest);
> +   phi->dest->insert_after(assign(phi->dest, var));
> +   phi->dest->data.mode = ir_var_temporary;
> +
> +   if (phi->enter_src != NULL)
> +  ir->insert_before(assign(var, phi->enter_src));
> +
> +   if (phi->repeat_src != NULL)
> +  ir->body_instructions.push_tail(assign(var, phi->repeat_src));
> +
> +   foreach_list(n, &phi->continue_srcs) {
> +  ir_phi_jump_src *src = (ir_phi_jump_src *) n;
> +
> +  if (src->src != NULL)
> +src->jump->insert_before(assign(var, src->src));
> +   }
> +
> +   phi->remove();
> +}
> +
> +static void
> +eliminate_phi_loop_end(ir_phi_loop_end *phi, ir_loop *ir, exec_list
> *instrs)
> +{
> +   ir_variable *var = insert_decl(instrs, phi->dest->type,
> ralloc_parent(ir));
> +   ir->insert_after(phi->dest);
> +   phi->dest->insert_after(assign(phi->dest, var));
> +   phi->dest->data.mode = ir_var_temporary;
> +
> +   foreach_list(n, &phi->break_srcs) {
> +  ir_phi_jump_src *src = (ir_phi_jump_src *) n;
> +
> +  if (src->src != NULL)
> +src->jump->insert_before(assign(var, src->src));
> +   }
> +
> +   phi->remove();
> +}
> +
> +namespace {
> +
> +class ir_from_ssa_visitor : public ir_hierarchical_visitor
> +{
> +public:
> +   ir_from_ssa_visitor(exec_list *base_instrs) : base_instrs(base_instrs)
> +   {
> +   }
> +
> +   virtual ir_visitor_status visit_leave(ir_if *);
> +   virtual ir_visitor_status visit_enter(ir_loop *);
> +   virtual ir_visitor_status visit_leave(ir_loop *);
> +   virtual ir_visitor_status visit(ir_dereference_variable *);
> +
> +private:
> +   exec_list *base_instrs;
> +};
> +
> +};
> +
> +ir_visitor_status
> +ir_from_ssa_visitor::visit_leave(ir_if *ir)
> +{
> +   foreach_list_safe(n, &ir->phi_nodes) {
> +  eliminate_phi_if((ir_phi_if *) n, ir, this->base_instrs);
> +   }
> +
> +   return visit_continue;
> +}
> +
> +ir_visitor_status
> +ir_from_ssa_visitor::visit_enter(ir_loop *ir)
> +{
> +   foreach_list_safe(n, &ir->begin_phi_nodes) {
> +  eliminate_phi_loop_begin((ir_phi_loop_begin *) n, ir,
> this->base_instrs);
> +   }
> +
> +   return visit_continue;
> +}
> +
> +ir_visitor_status
> +ir_from_ssa_visitor::visit_leave(ir_loop *ir)
> +{
> +   foreach_list_safe(n, &ir->end_phi_nodes) {
> +  eliminate_phi_loop_end((ir_phi_loop_end *) n, ir,
> this->base_instrs);
> +   }
> +
> +   return visit_continue;
> +}
> +
> +ir_visitor_status
> +ir_from_ssa_visitor::visit(ir_dereference_variable *ir)
> +{
> +   if (this->in_assignee && ir->var->data.mode == ir_var_temporary_ssa) {
> +  this->base_ir->insert_before(ir->var);
> +  ir->var->data.mode = ir_var_temporary;
> +   }
> +
> +   return visit_continue;
> +}
> +
> +static void
> +convert_from_ssa_function(exec_list *instructions)
> +{
> +   ir_from_ssa_visitor v(instructions);
> +   v.run(instructions);
> +}
> +
> +void
> +convert_from_ssa(exec_list *instructions)
> +{
> +   foreach_list(node, instructions) {
> +  ir_instruction *ir = (ir_instruction *) node;
> +  ir_function *f = ir->as_function();
> +  if (f) {
> +foreach_list(sig_node, &f->signatures) {
> +   ir_function_signature *sig = (ir_function_signature *)
> sig_node;
> +
> +   convert_from_ssa_function(&sig->body);
> +}
> +  }
> +   }
> +}
>

It looks like the only reason you need to manually walk through the
functions and signatures is so that you can make sure that
ir_from_ssa_visitor::base_instrs always points to the correct function's
instruction list.  Normally the way we achieve that sort of thing is to do:

ir_visitor_status
ir_from_ssa_visitor::visit_enter(ir_function_signature *ir)
{
   this->base_instrs = &ir->body;
   return visit_continue;
}

ir_visitor_status
ir_from_ssa_visitor::visit_leave(ir_function_signature *ir)
{
   this->base_instrs = NULL;
   return visit_continue;
}

That way you could get rid of convert_from_ssa_function() and
convert_from_ssa() would just become:

void
convert_from_ssa(exec_list *instructions)
{
   ir_from_ssa_visitor v;
   v.run(instructions);
}


I don't have really strong feelings about it, though.  Provided that the
confusing comment in eliminate_phi_if() is fixed, this patch is:

Reviewed-by: Paul Berry 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] mesa: Ensure that transform feedback refers to the correct program.

2014-01-28 Thread Paul Berry
On 28 January 2014 13:15, Carl Worth  wrote:

> Paul Berry  writes:
> > Previous to this patch, the _mesa_{Begin,Resume}TransformFeedback
> > functions were using ctx->Shader.CurrentProgram[MESA_SHADER_VERTEX]
>
> I see that this commit was nominated for the stable branch when it was
> committed.
>
> However, it appears to depend at least on commit b22146dc714 which is on
>

I think you mean 3b22146dc714 (mesa: Replace
ctx->Shader.Current{Vertex,Fragment,Geometry}Program with an array.).


> master, but is not on the 10.0 branch. So this patch cannot be applied
> as-is to 10.0.
>
> My current inclination is to simply drop this patch from the list of
> candidates for 10.0. If you think differently, please recommend an
> alternate plan.
>
> -Carl
>

(Also from IRC:  stereotype441: In my recent mail, I should have
included 43e77215b13 as well. (It's another patch I plan to drop from
stable unless you guide me differently.))

In principle we could make versions of these patches that don't depend on
commit 3b22146dc714, but I don't think it's worth it--the bugs I've fixed
in these patches are unlikely to arise until ARB_separate_shader_objects is
supported.

So yeah, I'm ok with dropping these two patches from the list of candidates
for 10.0.

I believe it was Ken who initially suggested marking these patches as
candidates for stable.  Cc'ing him in case he wants to express a dissenting
opinion.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH RFC 07/11] glsl: add SSA infrastructure

2014-01-28 Thread Paul Berry
On 22 January 2014 09:16, Connor Abbott  wrote:

> diff --git a/src/glsl/ir_clone.cpp b/src/glsl/ir_clone.cpp
> index cb732a5..7075579 100644
> --- a/src/glsl/ir_clone.cpp
> +++ b/src/glsl/ir_clone.cpp
> @@ -40,7 +40,15 @@ ir_rvalue::clone(void *mem_ctx, struct hash_table *ht)
> const
>  ir_variable *
>  ir_variable::clone(void *mem_ctx, struct hash_table *ht) const
>  {
> -   ir_variable *var = new(mem_ctx) ir_variable(this->type, this->name,
> +   ir_variable *var;
> +
> +   if (ht) {
> +  var = (ir_variable *)hash_table_find(ht, this);
> +  if (var)
> +return var;
> +   }
>

Can you add a comment explaining why it's necessary to do a hashtable
lookup now?  (I'm guessing it's because a phi node can refer to an SSA
temporary that occurs later in the instruction stream, and therefore when
we go to clone an SSA temporary variable, it's possible that we've already
cloned it, but I'm not 100% certain about that).


> +
> +   var = new(mem_ctx) ir_variable(this->type, this->name,
>(ir_variable_mode)
> this->data.mode);
>

Let's fix up this line to align the opening paren with the initial "t" in
"this->type".


> +
> +ir_phi *
> +ir_phi::clone(void *mem_ctx, struct hash_table *ht) const
> +{
> +   /* shouldn't get here */
> +   assert(0);
> +   return new(mem_ctx) ir_phi();
> +}
>

There's no need to implement this function.  If we leave it out (and leave
out the corresponding declaration of clone() in ir.h's declaration of the
ir_phi class), then ir_phi will be an abstract base class, and the compiler
will make sure we never instantiate it.


> +static ir_phi_jump_src *
> +clone_phi_jump_src(ir_phi_jump_src *src, void *mem_ctx, struct hash_table
> *ht)
> +{
> +   ir_phi_jump_src *new_src = new(mem_ctx) ir_phi_jump_src();
> +   new_src->src = src->src->clone(mem_ctx, ht);
> +   new_src->jump = src->jump->clone(mem_ctx, ht);
> +   return new_src;
>

Let's make a constructor for ir_phi_jump_src that takes src and jump as
arguments, so that we can just do:

   return new(mem_ctx) ir_phi_jump_src(src->src->clone(mem_ctx, ht),
   src->jump->clone(mem_ctx, ht));

+
> +
> +void
> +ir_print_visitor::visit(ir_phi *ir)
> +{
> +   printf("error");
> +}
>

I think if we make ir_phi an abstract base class (like I suggested earlier)
we can drop this function entirely, as well as all the other visit(class
ir_phi *) functions introduced in this patch.


> +
> +
> +void
> +ir_print_visitor::visit(ir_phi_if *ir)
> +{
> +   printf("(phi\n");
>

For consistency with the class names, I'd prefer for this to be:

   printf("(phi_if\n");


> +
> +   indentation++;
> +   indent();
> +   ir->dest->accept(this);
> +   printf("\n");
> +
> +   indent();
> +   if (ir->if_src) {
> +  printf("%s ", unique_name(ir->if_src));
> +   } else {
> +  printf("(undefined) ");
> +   }
> +   if (ir->else_src) {
> +  printf("%s)", unique_name(ir->else_src));
> +   } else {
> +  printf("(undefined))");
> +   }
>

Would it be worth modifying unique_name() so that it returns "(undefined)"
if it's passed a null pointer?  That would allow us to simplify the 10
lines above to:

   printf("%s %s", unique_name(ir->if_src), unique_name(ir->else_src));

As well as similar simplifications in the functions below.


> +   indentation--;
> +}
> +
> +void
> +ir_print_visitor::print_phi_jump_src(ir_phi_jump_src *src)
> +{
> +   printf("(%s ", unique_name(src->jump));
>

I think this is going to make the output IR difficult to follow, since it
won't be obvious that "(break@1 ...)" represents a phi node rather than a
break instruction.  I'd recommend changing this to:

   printf("(phi_jump_src %s ", unique_name(src->jump));


> +   if (src->src) {
> +  printf(" %s)\n", unique_name(src->src));
> +   } else {
> +  printf(" (undefined))\n");
> +   }
> +}
> +
> +void
> +ir_print_visitor::visit(ir_phi_loop_begin *ir)
> +{
> +   printf("(phi\n");
>

As with ir_phi_if, I'd recommend changing this to:

   printf("(phi_loop_begin\n");

A similar comment applies to ir_print_visitor::visit(ir_phi_loop_end *).


> +
> +   indentation++;
> +   indent();
> +   ir->dest->accept(this);
> +   printf("\n");
> +
> +   indent();
> +   if (ir->enter_src) {
> +  printf("%s ", unique_name(ir->enter_src));
> +   } else {
> +  printf("(undefined) ");
> +   }
> +   if (ir->repeat_src) {
> +  printf("%s\n", unique_name(ir->repeat_src));
> +   } else {
> +  printf("(undefined)\n");
> +   }
> +
> +   foreach_list(n, &ir->continue_srcs) {
> +  ir_phi_jump_src *src = (ir_phi_jump_src *) n;
> +  indent();
> +  print_phi_jump_src(src);
> +   }
> +
> +   indentation--;
> +   indent();
> +   printf(")");
> +}
> +
> +
> +
> +
> +ir_visitor_status
> +ir_validate::visit(ir_phi_loop_begin *ir)
> +{
> +   validate_phi_dest(ir, this->ht);
> +
> +   /*
> +* note: ir_phi_loop_begin is a special case in that variables may be,
> and in
> +* fact normally are,

Re: [Mesa-dev] [PATCH RFC 07/11] glsl: add SSA infrastructure

2014-01-28 Thread Paul Berry
On 22 January 2014 09:16, Connor Abbott  wrote:

> This patch introduces all the changes to the IR that are necessary for
> representing programs in the SSA form. This consists of a new variable
> mode, the SSA temporary, which is guarenteed to be written to exactly
> once, and classes to represent phi nodes in the IR.
>
> In the current code, variables are first declared using an ir_variable
> instruction inserted into the instruction stream, and then every
> dereference will point to the ir_variable declared earlier. SSA
> temporaries, however, do not work this way. Instead, the variable
> is declared when it is assigned. That is, the variable is "owned" by
> the one and only instruction where it is defined.
>
> In SSA, phi nodes may exist at the beginning of any "join nodes," or
> basic blocks with more than one predecessor. In our IR, this can happen
> in one of three places:
>
> - After an if statement, where the then branch and the else branch
> converge.
> - At the beginning of a loop, which can be reached from either before
> the loop (on the first iteration), the end of the loop (when we get to
> the end of the loop and jump back to the beginning), or any continue
> statement.
> - At the end of a loop, which can be reached from any break statement
> within the loop.
>
> Accordingly, there are three different types of phi nodes: if phi nodes,
> phi nodes at the beginning of a loop, and phi nodes at the end of a
> loop, all of which derive from the ir_phi base class.
> ---
>  src/glsl/ir.cpp|  56 +++
>  src/glsl/ir.h  | 196
> -
>  src/glsl/ir_clone.cpp  | 147 ---
>  src/glsl/ir_hierarchical_visitor.cpp   |  36 +
>  src/glsl/ir_hierarchical_visitor.h |  11 ++
>  src/glsl/ir_hv_accept.cpp  |  55 ++-
>  src/glsl/ir_print_visitor.cpp  | 196
> -
>  src/glsl/ir_print_visitor.h|  15 ++
>  src/glsl/ir_validate.cpp   | 158 +++-
>  src/glsl/ir_visitor.h  |   8 +
>  src/mesa/drivers/dri/i965/brw_fs.h |   4 +
>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp   |  28 
>  src/mesa/drivers/dri/i965/brw_vec4.h   |   4 +
>  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp |  24 +++
>  src/mesa/program/ir_to_mesa.cpp|  28 
>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp |  29 
>  16 files changed, 956 insertions(+), 39 deletions(-)
>
> diff --git a/src/glsl/ir.cpp b/src/glsl/ir.cpp
> index 1a36bd6..f1ded80 100644
> --- a/src/glsl/ir.cpp
> +++ b/src/glsl/ir.cpp
> @@ -1229,6 +1229,37 @@ ir_loop::ir_loop()
>  }
>
>
> +ir_phi::ir_phi()
> +{
> +   this->dest = NULL;
> +}
>

Rather than have a no-argument constructor for ir_phi that sets this->dest
to NULL, and then have the derived classes set dest to the appropriate
value, let's just do:

ir_phi::ir_phi(ir_variable *dest)
   : dest(dest)
{
}

Then in the derived classes we can just pass dest on through, e.g.:

ir_phi_if::ir_phi_if(ir_variable *dest, ir_variable *if_src,
 ir_variable *else_src)
   : ir_phi(dest), if_src(if_src), else_src(else_src)
{
   this->ir_type = ir_type_phi_if;
}



> +
> +
> +ir_phi_if::ir_phi_if(ir_variable *dest, ir_variable *if_src,
> +ir_variable *else_src)
> +   : if_src(if_src), else_src(else_src)
> +{
> +   this->ir_type = ir_type_phi_if;
> +   this->dest = dest;
> +}
> +
> +
> +ir_phi_loop_begin::ir_phi_loop_begin(ir_variable* dest, ir_variable*
> enter_src,
> +ir_variable* repeat_src)
> +   : enter_src(enter_src), repeat_src(repeat_src)
> +{
> +   this->ir_type = ir_type_phi_loop_begin;
> +   this->dest = dest;
> +}
> +
> +
> +ir_phi_loop_end::ir_phi_loop_end(ir_variable *dest)
> +{
> +   this->ir_type = ir_type_phi_loop_end;
> +   this->dest = dest;
> +}
> +
> +
>  ir_dereference_variable::ir_dereference_variable(ir_variable *var)
>  {
> assert(var != NULL);
> @@ -1554,6 +1585,9 @@ ir_variable::ir_variable(const struct glsl_type
> *type, const char *name,
> this->data.max_array_access = 0;
> this->data.atomic.buffer_index = 0;
> this->data.atomic.offset = 0;
> +   this->ssa_assignment = NULL;
> +   this->ssa_phi = NULL;
> +   this->ssa_call = NULL;
>
> if (type != NULL) {
>if (type->base_type == GLSL_TYPE_SAMPLER)
> @@ -1722,12 +1756,19 @@ steal_memory(ir_instruction *ir, void *new_ctx)
>  {
> ir_variable *var = ir->as_variable();
> ir_constant *constant = ir->as_constant();
> +   ir_dereference_variable *deref = ir->as_dereference_variable();
> +   ir_phi *phi = ir->as_phi();
> +   ir_phi_loop_begin *phi_loop_begin = ir->as_phi_loop_begin();
> +   ir_phi_loop_end *phi_loop_end = ir->as_phi_loop_end();
> if (var != NULL && var->constant_value != NULL)
>steal_memory(var->constant_value,

Re: [Mesa-dev] [PATCH RFC 05/11] glsl: add loop jump visitor

2014-01-27 Thread Paul Berry
On 22 January 2014 09:16, Connor Abbott  wrote:

> +class ir_loop_jumps_visitor : public ir_hierarchical_visitor
> +{
> +public:
> +   ir_loop_jumps_visitor();
> +   ~ir_loop_jumps_visitor();
> +
> +   virtual ir_visitor_status visit_enter(ir_loop *);
> +   virtual ir_visitor_status visit(ir_loop_jump *);
> +
> +   ir_loop_jumps *get_loop_jumps(ir_loop *ir);
> +
> +private:
> +   ir_loop *outer_loop;
>

As with outer_if in the previous patch, I think we could save some
unnecessary hashtable lookups if we simply stored the ir_loop_jumps * here
instead of storing the ir_loop *.  But I don't feel terribly strongly about
it.

Either way, the patch is:

Reviewed-by: Paul Berry 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH RFC 04/11] glsl: add dead branch analysis

2014-01-27 Thread Paul Berry
  if (constant == NULL || constant->is_zero())
> +return visit_continue;
> +   }
> +
> +   visit_return();
> +   return visit_continue;
> +}
> +
> +void
> +ir_dead_branches_visitor::visit_return()
>

It would be nice to have a comment above this function explaining that it
handles both return and discard statements.  (Or consider renaming it to
something like visit_return_or_discard()).


> +{
> +   if (this->outer_if) {
> +  ir_dead_branches *dead_branches =
> this->get_dead_branches(this->outer_if);
> +  if (this->in_then) {
> +dead_branches->then_dead = true;
> +dead_branches->then_dead_return = true;
> +  } else {
> +dead_branches->else_dead = true;
> +dead_branches->else_dead_return = true;
> +  }
> +   }
> +}
> diff --git a/src/glsl/ir_dead_branches.h b/src/glsl/ir_dead_branches.h
> new file mode 100644
> index 000..58688e6
> --- /dev/null
> +++ b/src/glsl/ir_dead_branches.h
> @@ -0,0 +1,78 @@
> +/*
> + * Copyright © 2013 Connor Abbott (con...@abbott.cx)
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the
> "Software"),
> + * to deal in the Software without restriction, including without
> limitation
> + * the rights to use, copy, modify, merge, publish, distribute,
> sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the
> next
> + * paragraph) shall be included in all copies or substantial portions of
> the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT
> SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
> OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> + * DEALINGS IN THE SOFTWARE.
> + */
> +
> +/**
> + * \file ir_dead_branches.h
> + *
> + * Provides a visitor which determines, for each if instruction, whether
> + * control will never flow the from the then-block or else-block
> + * to the next instruction due to jump statements (break, continue,
> return,
> + * discard).
> + */
> +
> +#include "ir.h"
> +#include "ir_visitor.h"
> +
> +class ir_dead_branches
>

It would be nice to have a comment above this class indicating that one
ir_dead_branches object is created for each ir_if.  (It's clear from
context, but for a casual user of this visitor it might not be obvious).


> +{
> +public:
> +   ir_dead_branches(ir_if *ir);
> +
> +   ir_if *ir;
> +
> +   /**
> +* whether a jump statement is guarenteed to be hit when the
> +* then_instructions are run, making the branch from the
> then_instructions
> +* "dead"
> +*/
> +   bool then_dead;
> +   bool else_dead; /** < ditto for the else_instructions */
> +
> +   /** whether the then branch is dead due to a return or discard */
>

This comment is ambiguous.  Some people interpret a boolean which is
documented as "whether A or B" as meaning "A if the bool is true; B if the
bool is false".  I'd recommend changing this comment to something like
"True if the then branch is dead due to a return or discard; false if the
then branch is dead due to a continue".


> +   bool then_dead_return;
> +   bool else_dead_return; /** < ditto for else branch */
> +};
> +
> +class ir_dead_branches_visitor : public ir_hierarchical_visitor
>

It would be nice to have a short comment above this class giving a brief
overview of how the class is intended to be used (i.e. construct it, visit
the IR using it, then call get_dead_branches() to retrieve the dead branch
information corresponding to a given ir_if block).


> +{
> +public:
> +   ir_dead_branches_visitor();
> +   ~ir_dead_branches_visitor();
> +
> +   virtual ir_visitor_status visit_enter(ir_if *);
> +   virtual ir_visitor_status visit_enter(ir_loop *);
> +   virtual ir_visitor_status visit(ir_loop_jump *);
> +   virtual ir_visitor_status visit_enter(ir_return *);
> +   virtual ir_visitor_status visit_enter(ir_discard *);
> +
> +   ir_dead_branches *get_dead_branches(ir_if *ir);
> +
> +private:
> +   void visit_return();
> +
> + 

[Mesa-dev] [PATCH] i965: Replace *_generator::shader with is_glsl boolean.

2014-01-22 Thread Paul Berry
The "shader" field in fs_generator, vec4_generator, and gen8_generator
was only used for one purpose; to figure out if we were compiling an
assembly shader or a GLSL shader.  And it wasn't being used properly:
in vec4 shaders we were always initializing it based on
prog->_LinkedShaders[MESA_SHADER_FRAGMENT], regardless of whether we
were compiling a geometry shader or a vertex shader.

This was a fairly benign problem, since it's unlikely that a
real-world program will try to mix and match GLSL and assembly shaders
using separate shader objects.  But it seems worth fixing.

This patch replaces the shader field with a new is_glsl boolean, and
initializes it based on information from the caller, so that it always
refers to the correct shader stage.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp  |  6 --
 src/mesa/drivers/dri/i965/brw_fs.h|  8 +---
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp| 12 ++--
 src/mesa/drivers/dri/i965/brw_vec4.cpp|  4 ++--
 src/mesa/drivers/dri/i965/brw_vec4.h  |  8 +---
 src/mesa/drivers/dri/i965/brw_vec4_generator.cpp  | 11 +--
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp |  7 +--
 src/mesa/drivers/dri/i965/gen8_fs_generator.cpp   | 13 ++---
 src/mesa/drivers/dri/i965/gen8_generator.cpp  |  6 --
 src/mesa/drivers/dri/i965/gen8_generator.h|  5 +++--
 src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp | 10 +-
 11 files changed, 50 insertions(+), 40 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index a0e4830..c0d65d5 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -3512,11 +3512,13 @@ brw_wm_fs_emit(struct brw_context *brw, struct 
brw_wm_compile *c,
 
const unsigned *assembly = NULL;
if (brw->gen >= 8) {
-  gen8_fs_generator g(brw, c, prog, fp, v.dual_src_output.file != 
BAD_FILE);
+  gen8_fs_generator g(brw, c, prog, fp, v.dual_src_output.file != BAD_FILE,
+  shader != NULL);
   assembly = g.generate_assembly(&v.instructions, simd16_instructions,
  final_assembly_size);
} else {
-  fs_generator g(brw, c, prog, fp, v.dual_src_output.file != BAD_FILE);
+  fs_generator g(brw, c, prog, fp, v.dual_src_output.file != BAD_FILE,
+ shader != NULL);
   assembly = g.generate_assembly(&v.instructions, simd16_instructions,
  final_assembly_size);
}
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index a903908..ad0aa99 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -512,7 +512,8 @@ public:
 struct brw_wm_compile *c,
 struct gl_shader_program *prog,
 struct gl_fragment_program *fp,
-bool dual_source_output);
+bool dual_source_output,
+bool is_glsl);
~fs_generator();
 
const unsigned *generate_assembly(exec_list *simd8_instructions,
@@ -615,7 +616,6 @@ private:
struct brw_wm_compile *c;
 
struct gl_shader_program *prog;
-   struct gl_shader *shader;
const struct gl_fragment_program *fp;
 
unsigned dispatch_width; /**< 8 or 16 */
@@ -623,6 +623,7 @@ private:
exec_list discard_halt_patches;
bool dual_source_output;
void *mem_ctx;
+   const bool is_glsl;
 };
 
 /**
@@ -637,7 +638,8 @@ public:
  struct brw_wm_compile *c,
  struct gl_shader_program *prog,
  struct gl_fragment_program *fp,
- bool dual_source_output);
+ bool dual_source_output,
+ bool is_glsl);
~gen8_fs_generator();
 
const unsigned *generate_assembly(exec_list *simd8_instructions,
diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index e701fc5..a8e81b8 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
@@ -40,14 +40,14 @@ fs_generator::fs_generator(struct brw_context *brw,
struct brw_wm_compile *c,
struct gl_shader_program *prog,
struct gl_fragment_program *fp,
-   bool dual_source_output)
+   bool dual_source_output,
+   bool is_glsl)
 
-   : brw(brw), c(c), prog(prog), fp(fp), dual_source_output(dual_source_output)
+   : brw(brw), c(c), prog(prog), fp(fp),
+ dual_source_output(dual_source_output), is_glsl(is_glsl)
 {
ctx = &brw->ctx;
 
-   shader = prog ? prog->_LinkedShaders[MESA_SHADER_FRAGMENT] : NULL;
-
mem_ctx = c;
 
p = rzalloc(mem_ctx, struct brw_compile);
@@ -1301,7 +1301,7 @@ fs_generator::generate_code(exec_list *instructi

Re: [Mesa-dev] [PATCH V3 7/8] glsl: remove remaining is_array variables

2014-01-22 Thread Paul Berry
On 22 January 2014 03:33, Timothy Arceri  wrote:

> Previously the reason we needed is_array was because we used array_size ==
> NULL to
> represent both non-arrays and unsized arrays.  Now that we use a non-NULL
> array_specifier to represent an unsized array, is_array is redundant.
>
> Signed-off-by: Timothy Arceri 
> ---
>  src/glsl/ast.h  | 32 +++-
>  src/glsl/ast_to_hir.cpp | 10 +-
>  src/glsl/ast_type.cpp   |  6 ++
>  src/glsl/glsl_parser.yy | 30 +++---
>  src/glsl/glsl_parser_extras.cpp | 15 ++-
>  5 files changed, 35 insertions(+), 58 deletions(-)
>

When I try to build this patch I get the compile error:

  CXX  ast_to_hir.lo
../../src/glsl/ast_to_hir.cpp: In member function 'virtual ir_rvalue*
ast_declarator_list::hir(exec_list*, _mesa_glsl_parse_state*)':
../../src/glsl/ast_to_hir.cpp:2877:12: error: 'class ast_declaration' has
no member named 'is_array'
   assert(!decl->is_array);
^

I'm guessing you missed this because you are building in release mode, so
asserts didn't get compiled.  If that's the case, I'd like to encourage you
to do a debug build and double-check that piglit tests still pass.  You can
do that by adding the "--enable-debug" option when you run ./autogen.sh.


With that fixed, the series is:

Reviewed-by: Paul Berry 

I'll try to push my patch "glsl: Simplify aggregate type inference to
prepare for ARB_arrays_of_arrays." within the next hour.  Do you have
commit access to Mesa or would you like me to push the series for you?
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [v2 23/23] i965/blorp: switch eu-emitter to use FS IR and fs_generator

2014-01-22 Thread Paul Berry
On 22 January 2014 09:17, Topi Pohjolainen wrote:

> No regressions on IVB (piglit quick + unit tests).
>
> v2 (Paul):
>   - no need to patch the unit tests anymore. Original logic
> was altered and unit tests updated to match the
> fs-generator
>   - lrp emission moves from the blorp compiler core into the
> emitter here (previously there was a separate refactoring
> patch which is not really needed anymore as the lrp logic
> got refactored when the original lrp logic got fixed).
>   - pass 'BRW_BLORP_RENDERBUFFER_BINDING_TABLE_INDEX' to the
> generator in fs_inst::target instead of hardcoding it
>
> CC: Paul Berry 
> Signed-off-by: Topi Pohjolainen 
>

Thanks, Topi.  The whole series is now:

Reviewed-by: Paul Berry 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] mesa: Ensure that transform feedback refers to the correct program.

2014-01-22 Thread Paul Berry
On 22 January 2014 08:20, Kenneth Graunke  wrote:

> On 01/22/2014 06:07 AM, Paul Berry wrote:
> > @@ -376,25 +376,48 @@ _mesa_compute_max_transform_feedback_vertices(
> >   **/
> >
> >
> > +/**
> > + * Figure out which stage of the pipeline is the source of transform
> feedback
> > + * data given the current context state, and return its
> gl_shader_program.
> > + *
> > + * If no active program can generate transform feedback data (i.e. no
> vertex
> > + * shader is active), returns NULL.
> > + */
> > +static struct gl_shader_program *
> > +get_xfb_source(struct gl_context *ctx)
> > +{
> > +   int i;
> > +   for (i = MESA_SHADER_FRAGMENT - 1; i >= MESA_SHADER_VERTEX; i--) {
>
> I think this would be clearer as:
>
> for (i = MESA_SHADER_GEOMETRY; i >= MESA_SHADER_VERTEX; i--) {
>...
> }
>
> Note that the pipeline ordering is:
> Vertex -> Tess. Control -> Tess. Eval -> Geometry -> Transform Feedback
> (http://www.opengl.org/wiki/Rendering_Pipeline_Overview)
>
> So either implementation would work even with tessellation shaders.
>
> Either way, this series is:
> Reviewed-by: Kenneth Graunke 
> Cc: 10.0 
>
>
That's a good point--I like your suggestion.  Thanks for the review!
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] i965/gen7: Use to the correct program when uploading transform feedback state.

2014-01-22 Thread Paul Berry
Transform feedback may come from either the geometry shader or the
vertex shader, so we can't use
ctx->Shader.CurrentProgram[MESA_SHADER_VERTEX] to find the current
post-link transform feedback information.  Fortunately we can use
ctx->TransformFeedback.CurrentObject->shader_program.
---
 src/mesa/drivers/dri/i965/brw_state_upload.c |  1 +
 src/mesa/drivers/dri/i965/gen7_sol_state.c   | 15 +--
 2 files changed, 6 insertions(+), 10 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c 
b/src/mesa/drivers/dri/i965/brw_state_upload.c
index af554eb..ca256c2 100644
--- a/src/mesa/drivers/dri/i965/brw_state_upload.c
+++ b/src/mesa/drivers/dri/i965/brw_state_upload.c
@@ -305,6 +305,7 @@ void brw_init_state( struct brw_context *brw )
STATIC_ASSERT(BRW_NUM_STATE_BITS <= 8 * sizeof(brw->state.dirty.brw));
 
ctx->DriverFlags.NewTransformFeedback = BRW_NEW_TRANSFORM_FEEDBACK;
+   ctx->DriverFlags.NewTransformFeedbackProg = BRW_NEW_TRANSFORM_FEEDBACK;
ctx->DriverFlags.NewRasterizerDiscard = BRW_NEW_RASTERIZER_DISCARD;
ctx->DriverFlags.NewUniformBuffer = BRW_NEW_UNIFORM_BUFFER;
ctx->DriverFlags.NewAtomicBuffer = BRW_NEW_ATOMIC_BUFFER;
diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c 
b/src/mesa/drivers/dri/i965/gen7_sol_state.c
index 1ca2b74..5d2f243 100644
--- a/src/mesa/drivers/dri/i965/gen7_sol_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c
@@ -39,14 +39,11 @@ static void
 upload_3dstate_so_buffers(struct brw_context *brw)
 {
struct gl_context *ctx = &brw->ctx;
-   /* BRW_NEW_VERTEX_PROGRAM */
-   const struct gl_shader_program *vs_prog =
-  ctx->Shader.CurrentProgram[MESA_SHADER_VERTEX];
-   const struct gl_transform_feedback_info *linked_xfb_info =
-  &vs_prog->LinkedTransformFeedback;
/* BRW_NEW_TRANSFORM_FEEDBACK */
struct gl_transform_feedback_object *xfb_obj =
   ctx->TransformFeedback.CurrentObject;
+   const struct gl_transform_feedback_info *linked_xfb_info =
+  &xfb_obj->shader_program->LinkedTransformFeedback;
int i;
 
/* Set up the up to 4 output buffers.  These are the ranges defined in the
@@ -102,12 +99,11 @@ gen7_upload_3dstate_so_decl_list(struct brw_context *brw,
  const struct brw_vue_map *vue_map)
 {
struct gl_context *ctx = &brw->ctx;
-   /* BRW_NEW_VERTEX_PROGRAM */
-   const struct gl_shader_program *vs_prog =
-  ctx->Shader.CurrentProgram[MESA_SHADER_VERTEX];
/* BRW_NEW_TRANSFORM_FEEDBACK */
+   struct gl_transform_feedback_object *xfb_obj =
+  ctx->TransformFeedback.CurrentObject;
const struct gl_transform_feedback_info *linked_xfb_info =
-  &vs_prog->LinkedTransformFeedback;
+  &xfb_obj->shader_program->LinkedTransformFeedback;
uint16_t so_decl[128];
int buffer_mask = 0;
int next_offset[4] = {0, 0, 0, 0};
@@ -260,7 +256,6 @@ const struct brw_tracked_state gen7_sol_state = {
.dirty = {
   .mesa  = (_NEW_LIGHT),
   .brw   = (BRW_NEW_BATCH |
-   BRW_NEW_VERTEX_PROGRAM |
 BRW_NEW_VUE_MAP_GEOM_OUT |
 BRW_NEW_TRANSFORM_FEEDBACK)
},
-- 
1.8.5.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] mesa: Ensure that transform feedback refers to the correct program.

2014-01-22 Thread Paul Berry
Previous to this patch, the _mesa_{Begin,Resume}TransformFeedback
functions were using ctx->Shader.CurrentProgram[MESA_SHADER_VERTEX] to
find the program that would be the source of transform feedback data.
This isn't correct--if there's a geometry shader present it should be
ctx->Shader.CurrentProgram[MESA_SHADER_GEOMETRY].  (These might be
different if separate shader objects are in use).

This patch creates a function get_xfb_source(), which figures out the
correct program to use based on GL state, and updates
_mesa_{Begin,Resume}TransformFeedback to call it.  get_xfb_source() is
written in terms of the gl_shader_stage enum, so it should not need
modification when we add tessellation shaders in the future.  It also
creates a new driver flag, NewTransformFeedbackProg, which is flagged
whenever this program changes.

To reduce future confusion, this patch also rewords some comments and
error message text to avoid referring to vertex shaders.
---
 src/mesa/main/mtypes.h|  8 --
 src/mesa/main/transformfeedback.c | 52 +--
 2 files changed, 45 insertions(+), 15 deletions(-)

diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 3dd9678..7fd3298 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -1815,8 +1815,9 @@ struct gl_transform_feedback_object
 
/**
 * The shader program active when BeginTransformFeedback() was called.
-* When active and unpaused, this equals
-* ctx->Shader.CurrentProgram[MESA_SHADER_VERTEX].
+* When active and unpaused, this equals ctx->Shader.CurrentProgram[stage],
+* where stage is the pipeline stage that is the source of data for
+* transform feedback.
 */
struct gl_shader_program *shader_program;
 
@@ -3779,6 +3780,9 @@ struct gl_driver_flags
/** gl_context::TransformFeedback::CurrentObject */
GLbitfield NewTransformFeedback;
 
+   /** gl_context::TransformFeedback::CurrentObject::shader_program */
+   GLbitfield NewTransformFeedbackProg;
+
/** gl_context::RasterDiscard */
GLbitfield NewRasterizerDiscard;
 
diff --git a/src/mesa/main/transformfeedback.c 
b/src/mesa/main/transformfeedback.c
index 74897ba..9376a9e 100644
--- a/src/mesa/main/transformfeedback.c
+++ b/src/mesa/main/transformfeedback.c
@@ -24,7 +24,7 @@
 
 
 /*
- * Vertex transform feedback support.
+ * Transform feedback support.
  *
  * Authors:
  *   Brian Paul
@@ -376,25 +376,48 @@ _mesa_compute_max_transform_feedback_vertices(
  **/
 
 
+/**
+ * Figure out which stage of the pipeline is the source of transform feedback
+ * data given the current context state, and return its gl_shader_program.
+ *
+ * If no active program can generate transform feedback data (i.e. no vertex
+ * shader is active), returns NULL.
+ */
+static struct gl_shader_program *
+get_xfb_source(struct gl_context *ctx)
+{
+   int i;
+   for (i = MESA_SHADER_FRAGMENT - 1; i >= MESA_SHADER_VERTEX; i--) {
+  if (ctx->Shader.CurrentProgram[i] != NULL)
+ return ctx->Shader.CurrentProgram[i];
+   }
+   return NULL;
+}
+
+
 void GLAPIENTRY
 _mesa_BeginTransformFeedback(GLenum mode)
 {
struct gl_transform_feedback_object *obj;
-   struct gl_transform_feedback_info *info;
+   struct gl_transform_feedback_info *info = NULL;
+   struct gl_shader_program *source;
GLuint i;
unsigned vertices_per_prim;
GET_CURRENT_CONTEXT(ctx);
 
obj = ctx->TransformFeedback.CurrentObject;
 
-   if (ctx->Shader.CurrentProgram[MESA_SHADER_VERTEX] == NULL) {
+   /* Figure out what pipeline stage is the source of data for transform
+* feedback.
+*/
+   source = get_xfb_source(ctx);
+   if (source == NULL) {
   _mesa_error(ctx, GL_INVALID_OPERATION,
   "glBeginTransformFeedback(no program active)");
   return;
}
 
-   info =
-  &ctx->Shader.CurrentProgram[MESA_SHADER_VERTEX]->LinkedTransformFeedback;
+   info = &source->LinkedTransformFeedback;
 
if (info->NumOutputs == 0) {
   _mesa_error(ctx, GL_INVALID_OPERATION,
@@ -452,7 +475,10 @@ _mesa_BeginTransformFeedback(GLenum mode)
   obj->GlesRemainingPrims = max_vertices / vertices_per_prim;
}
 
-   obj->shader_program = ctx->Shader.CurrentProgram[MESA_SHADER_VERTEX];
+   if (obj->shader_program != source) {
+  ctx->NewDriverState |= ctx->DriverFlags.NewTransformFeedbackProg;
+  obj->shader_program = source;
+   }
 
assert(ctx->Driver.BeginTransformFeedback);
ctx->Driver.BeginTransformFeedback(ctx, mode, obj);
@@ -519,7 +545,7 @@ bind_buffer_range(struct gl_context *ctx, GLuint index,
 
 
 /**
- * Specify a buffer object to receive vertex shader results.  Plus,
+ * Specify a buffer object to receive transform feedback results.  Plus,
  * specify the starting offset to place the results, and max size.
  * Called from the glBindBufferRange() function.
  */
@@ -563,7 +589,7 @@ _mesa_bind_buffer_range_transform_feedback(struct 
gl_context *ctx,
 
 
 /**
- * Specify a buffer object to receive vertex shad

Re: [Mesa-dev] [PATCH] glsl: Add image type to the GLSL IR.

2014-01-22 Thread Paul Berry
On 22 January 2014 04:50, Francisco Jerez  wrote:

> Paul Berry  writes:
>
> > On 15 January 2014 11:42, Francisco Jerez  wrote:
> >
> >> v2: Reuse the glsl_sampler_dim enum for images.  Reuse the
> >> glsl_type::sampler_* fields instead of creating new ones specific
> >> to image types.  Reuse the same constructor as for samplers adding
> >> a new 'base_type' argument.
> >> ---
> >> Is this what you had in mind Paul?
> >>
> >
> > Yes, that seems reasonable.  Thanks.
>
> Does your previous reviewed-by still apply for v2 of this patch?
>
> Thanks.
>

Yes.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 13/23] i965/fs: Take into account reg_offset consistently for MRF regs.

2014-01-21 Thread Paul Berry
On 15 January 2014 14:01, Francisco Jerez  wrote:

> Paul Berry  writes:
>
> > On 2 December 2013 11:31, Francisco Jerez  wrote:
> >
> >> Until now it was only being taken into account in the VEC4 back-end
> >> but not in the FS back-end.  Do it in both cases.
> >> ---
> >>  src/mesa/drivers/dri/i965/brw_fs.h |  2 +-
> >>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 10 ++
> >>  src/mesa/drivers/dri/i965/brw_shader.h |  7 ---
> >>  3 files changed, 11 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/src/mesa/drivers/dri/i965/brw_fs.h
> >> b/src/mesa/drivers/dri/i965/brw_fs.h
> >> index 2c36d9f..f918f7e 100644
> >> --- a/src/mesa/drivers/dri/i965/brw_fs.h
> >> +++ b/src/mesa/drivers/dri/i965/brw_fs.h
> >> @@ -615,4 +615,4 @@ bool brw_do_channel_expressions(struct exec_list
> >> *instructions);
> >>  bool brw_do_vector_splitting(struct exec_list *instructions);
> >>  bool brw_fs_precompile(struct gl_context *ctx, struct gl_shader_program
> >> *prog);
> >>
> >> -struct brw_reg brw_reg_from_fs_reg(fs_reg *reg);
> >> +struct brw_reg brw_reg_from_fs_reg(fs_reg *reg, unsigned
> dispatch_width);
> >> diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> >> b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> >> index 8d310a1..1de59eb 100644
> >> --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> >> +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> >> @@ -981,8 +981,9 @@ static uint32_t brw_file_from_reg(fs_reg *reg)
> >>  }
> >>
> >>  struct brw_reg
> >> -brw_reg_from_fs_reg(fs_reg *reg)
> >> +brw_reg_from_fs_reg(fs_reg *reg, unsigned dispatch_width)
> >>  {
> >> +   const int reg_size = 4 * dispatch_width;
> >>
> >
> > What happens when reg.type is UW and dispatch_width is 16?  In that case,
> > we would compute reg_size == 64, but the correct value seems like it's
> > actually 32 in this case.
> >
> > Are we perhaps relying on reg.type being a 32-bit type?  If so, maybe we
> > should add an assertion:
> >
> >assert(type_sz(reg.type) == 4);
> >
>
> Nope, "reg_size" is supposed to be the size in bytes of a ::reg_offset
> unit, i.e. one hardware register in SIMD8 mode and two hardware
> registers in SIMD16 mode as the comment at the definition of
> ::reg_offset explains.  The fixed factor of four is intentional and
> correct no matter what the register type is.
>
> Thanks.
>

Ok, I see.  It appears that both this function *and* the comment above
reg_offset are assuming that the data type is 32 bit.  The comment above
reg_offset says:

* For pre-register-allocation GRFs and MRFs, this is in units of a
* float per pixel (1 hardware register for SIMD8 or SIMD4x2 mode,
* or 2 registers for SIMD16 mode).  For uniforms, this is in units
* of 1 float.

but when we get around to adding support for double-precision floats (a
feature of GL 4.0), this will no longer work; for double precision types
we'll need reg_offset to be measured in units of at least 4 hardware
registers in SIMD16 mode to avoid overlap.

Similarly, for types that are 16 bits, if we consider reg_offset to be
measured in units of 2 hardware registers in SIMD16 mode, we're actually
wasting registers, since all 16 values actually fit in a single hardware
register.  That's not really a big deal right now, since we use 16-bit
types so rarely in the FS back-end.  In fact, I bet we never use a nonzero
reg_offset on a 16-bit type.

It still seems to me that it would be worth putting an assertion here to
help alert us to what needs to change when we add double precision support
(or if someday we have hardware that supports half float computation).  I'm
not 100% sure what the assertion could be.  "assert(type_sz(reg->type) ==
4);" was an optimistic guess.  We might have to do
"assert(type_sz(reg->type) == 4 || reg->reg_offset == 0);" in order to
avoid tripping on the rare cases where we currently use 16-bit types in
fragment shaders.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 10/23] i965/fs: Remove fs_reg::sechalf.

2014-01-21 Thread Paul Berry
On 21 January 2014 19:21, Paul Berry  wrote:

>
> not realize that is is safe to call it on a register with 32 bits per
> component, because they might erroneously think "oh, this function only
> works on 16-component registers; that is, registers that
>

Oops, sorry about the extraneous sentence fragment there.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 10/23] i965/fs: Remove fs_reg::sechalf.

2014-01-21 Thread Paul Berry
On 15 January 2014 13:53, Francisco Jerez  wrote:

> Paul Berry  writes:
>
> > On 2 December 2013 11:31, Francisco Jerez  wrote:
> >
> >> +/**
> >> + * Get either of the 8-component halves of a 16-component register.
> >> + */
> >> +static inline fs_reg
> >> +half(const fs_reg ®, unsigned idx)
> >> +{
> >> +   assert(idx == 0 || (reg.file != HW_REG && reg.file != IMM));
> >> +   return byte_offset(reg, 8 * idx * reg.stride * type_sz(reg.type));
> >> +}
> >> +
> >>
> >
> > I'd like to see a comment explaining that it's safe to call this function
> > on a register with 32 bits per component, since brw_reg.h's byte_offset()
> > handles byte offsets greater than 32 by incrementing the register number.
> >
> I've added a comment on the allowed range of fs_reg::subreg_offset here
> [1].  Do you think that's clear enough?
>

The comment you added is good, but it doesn't address my concern.  My
concern is that callers of half() might see the phrase "Get either of the
8-component halves of a 16-component register" and reasonably think this
means "get either of the 8-components halves of a register that stores 16
components in its 256 bits".  Since most 16-wide FS "registers" are
actually register pairs that store 16 components in 512 bits, they might be
misled into thinking that half() doesn't work on these register pairs.
Adding a comment to fs_reg::subreg_offset doesn't help that situation
because it doesn't change the misleading text above the half() function.

How about changing the comment above half() to say this:

/**
 * Get either of the 8-component halves of a 16-component register.
 *
 * Note: this also works if \c reg represents a SIMD16 pair of registers.
 */


not realize that is is safe to call it on a register with 32 bits per
component, because they might erroneously think "oh, this function only
works on 16-component registers; that is, registers that


>
> Thanks.
>
> > Also, for sanity's sake, it would be nice for this function to include
> the
> > assertion:
> >
> >assert(idx < 2);
> >
> > With those changes, this patch is:
> >
> > Reviewed-by: Paul Berry 
>
> [1]
> http://cgit.freedesktop.org/~currojerez/mesa/commit/?h=image-load-store&id=14ec9dfd42fec1ee5e5e7f48f98a36fe620cf4e6
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: Add image type to the GLSL IR.

2014-01-21 Thread Paul Berry
On 15 January 2014 11:42, Francisco Jerez  wrote:

> v2: Reuse the glsl_sampler_dim enum for images.  Reuse the
> glsl_type::sampler_* fields instead of creating new ones specific
> to image types.  Reuse the same constructor as for samplers adding
> a new 'base_type' argument.
> ---
> Is this what you had in mind Paul?
>

Yes, that seems reasonable.  Thanks.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH V2 3/8] glsl: Add array specifier to ast code

2014-01-21 Thread Paul Berry
On 21 January 2014 14:54, Timothy Arceri  wrote:

> Hi Paul
>
> Thanks for the thorough review and taking the time to explain how to make
> improvements. I can already see a number of improvements that can be made
> to patch 5 based on the suggestions you have already made so you could hold
> off reviewing that one if you havent started already and I've post a new
> series in a couple of days.
>

No problem.  As you can see, I already sent out my own ideas about patch
5/8.  Sorry about the mid-air collision on that--I didn't notice your reply
until I had already finished my review of the whole series.  Hopefully my
patch will do what you need, and you won't need to work on patch 5 any
further.

Thanks for working on this!
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH V2 8/8] docs: Mark ARB_arrays_of_arrays as started

2014-01-21 Thread Paul Berry
On 21 January 2014 04:19, Timothy Arceri  wrote:

> Signed-off-by: Timothy Arceri 
> ---
>  docs/GL3.txt | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>

Reviewed-by: Paul Berry 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH V2 7/8] glsl: create type name for arrays of arrays

2014-01-21 Thread Paul Berry
On 21 January 2014 04:19, Timothy Arceri  wrote:

> We need to insert outermost dimensions in the correct spot otherwise
> the dimension order will be backwards
>
> Signed-off-by: Timothy Arceri 
> ---
>  src/glsl/glsl_types.cpp | 14 --
>  1 file changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/src/glsl/glsl_types.cpp b/src/glsl/glsl_types.cpp
> index 12d4ac0..f7f3117 100644
> --- a/src/glsl/glsl_types.cpp
> +++ b/src/glsl/glsl_types.cpp
> @@ -300,8 +300,18 @@ glsl_type::glsl_type(const glsl_type *array, unsigned
> length) :
>
> if (length == 0)
>snprintf(n, name_length, "%s[]", array->name);
> -   else
> -  snprintf(n, name_length, "%s[%u]", array->name, length);
> +   else {
> +  /* insert outermost dimensions in the correct spot */
> +  const char *pos = strchr(array->name, '[');
> +  if (pos) {
> + int idx = pos - array->name;
> + snprintf(n, idx+1, "%s", array->name);
> + snprintf(n + idx, name_length, "[%u]", length);
> + snprintf(n + strlen(n), name_length, "%s", array->name + idx);
>

Technically the last two snprintf's are unsafe, since the second argument
is a length, so we need to subtract the offset we supplied to the first
argument.

Also, there's no need to split the second snprintf from the third; both of
them can be combined to:

snprintf(n + idx, name_length - idx, "[%u]%s", length, array->name + idx);

Good catch on noticing that it's necessary to reverse the order of the
dimensions.  With that fixed, this patch is:

Reviewed-by: Paul Berry 


> +  } else {
> + snprintf(n, name_length, "%s[%u]", array->name, length);
> +  }
> +   }
>
> this->name = n;
>  }
> --
> 1.8.3.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH V2 6/8] glsl: Allow arrays of arrays as input to vertex shader

2014-01-21 Thread Paul Berry
On 21 January 2014 18:23, Paul Berry  wrote:

> On 21 January 2014 04:19, Timothy Arceri  wrote:
>
>> Signed-off-by: Timothy Arceri 
>> ---
>>  src/glsl/ast_to_hir.cpp | 4 
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
>> index 226d128..62b7ec2 100644
>> --- a/src/glsl/ast_to_hir.cpp
>> +++ b/src/glsl/ast_to_hir.cpp
>> @@ -3176,6 +3176,10 @@ ast_declarator_list::hir(exec_list *instructions,
>>if (state->is_version(120, 300))
>>   break;
>>/* FALLTHROUGH */
>> +case GLSL_TYPE_ARRAY:
>> +   if (state->ARB_arrays_of_arrays_enable)
>> +  break;
>> +   /* FALLTHROUGH */
>> default:
>>_mesa_glsl_error(& loc, state,
>> "vertex shader input / attribute cannot
>> have "
>>
>
> I see two problems with this:
>
> 1. Since the case above has a fall-through, this has an unintended side
> effect: if ARB_arrays_of_arrays is enabled, then integer vertex attributes
> will be allowed prior to GLSL 1.20.  Although it's unlikely that anyone
> will ever try to write a GLSL 1.10 shader using ARB_arrays_of_arrays, it
> would be nice to get the logic right.
>
> 2. When the type of the input is an array of arrays, we need to recurse
> through it to make sure it doesn't contain any prohibited types (e.g. an
> array of arrays of structs should still be prohibited).
>
> I think the easiest solution to this would be to replace the line:
>
> const glsl_type *check_type = var->type->is_array()
>? var->type->fields.array : var->type;
>
> with something like this:
>
> const glsl_type *check_type = var->type;
> while (check_type->is_array())
>check_type = check_type->element_type();
>
> Then we can leave the switch statement alone.
>
> With that fixed, this patch is:
>
> Reviewed-by: Paul Berry 
>

As an aside, this is just one example of what I suspect are many places
where the GLSL compiler makes the implicit assumption that an array can't
contain an array.  Once we think ARB_arrays_of_arrays support is complete,
it would probably be worth it for someone with a lot of familiarity with
the GLSL front end to grep for is_array() to make sure we haven't missed
anything :)
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH V2 6/8] glsl: Allow arrays of arrays as input to vertex shader

2014-01-21 Thread Paul Berry
On 21 January 2014 04:19, Timothy Arceri  wrote:

> Signed-off-by: Timothy Arceri 
> ---
>  src/glsl/ast_to_hir.cpp | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
> index 226d128..62b7ec2 100644
> --- a/src/glsl/ast_to_hir.cpp
> +++ b/src/glsl/ast_to_hir.cpp
> @@ -3176,6 +3176,10 @@ ast_declarator_list::hir(exec_list *instructions,
>if (state->is_version(120, 300))
>   break;
>/* FALLTHROUGH */
> +case GLSL_TYPE_ARRAY:
> +   if (state->ARB_arrays_of_arrays_enable)
> +  break;
> +   /* FALLTHROUGH */
> default:
>_mesa_glsl_error(& loc, state,
> "vertex shader input / attribute cannot
> have "
>

I see two problems with this:

1. Since the case above has a fall-through, this has an unintended side
effect: if ARB_arrays_of_arrays is enabled, then integer vertex attributes
will be allowed prior to GLSL 1.20.  Although it's unlikely that anyone
will ever try to write a GLSL 1.10 shader using ARB_arrays_of_arrays, it
would be nice to get the logic right.

2. When the type of the input is an array of arrays, we need to recurse
through it to make sure it doesn't contain any prohibited types (e.g. an
array of arrays of structs should still be prohibited).

I think the easiest solution to this would be to replace the line:

const glsl_type *check_type = var->type->is_array()
   ? var->type->fields.array : var->type;

with something like this:

const glsl_type *check_type = var->type;
while (check_type->is_array())
   check_type = check_type->element_type();

Then we can leave the switch statement alone.

With that fixed, this patch is:

Reviewed-by: Paul Berry 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH V2 5/8] glsl: Aggregate initializer support for arrays of array

2014-01-21 Thread Paul Berry
On 21 January 2014 04:19, Timothy Arceri  wrote:

> Signed-off-by: Timothy Arceri 
> ---
>  src/glsl/ast.h  | 19 +++-
>  src/glsl/ast_function.cpp   | 14 +++--
>  src/glsl/ast_to_hir.cpp | 29 +-
>  src/glsl/glsl_parser.yy | 36 +-
>  src/glsl/glsl_parser_extras.cpp | 68
> +++--
>  5 files changed, 139 insertions(+), 27 deletions(-)
>

I believe I've found an easier solution to this: please see "[PATCH] glsl:
Simplify aggregate type inference to prepare for ARB_arrays_of_arrays.",
which I believe makes this patch unnecessary.


>
> diff --git a/src/glsl/ast.h b/src/glsl/ast.h
> index 4dda32e..8bccca7 100644
> --- a/src/glsl/ast.h
> +++ b/src/glsl/ast.h
> @@ -308,10 +308,16 @@ public:
>: ast_expression(ast_aggregate, NULL, NULL, NULL),
>  constructor_type(NULL)
> {
> -  /* empty */
> +  array_dimension = 1;
> +  is_array = false;
> }
>
> ast_type_specifier *constructor_type;
> +   unsigned array_dimension;
> +
> +   ast_array_specifier *array_specifier;
> +   bool is_array;
> +
> virtual ir_rvalue *hir(exec_list *instructions,
>struct _mesa_glsl_parse_state *state);
>  };
> @@ -588,6 +594,11 @@ public:
>  struct _mesa_glsl_parse_state *state)
>const;
>
> +   const struct glsl_type *glsl_type(const char **name,
> + struct _mesa_glsl_parse_state *state,
> + bool skip_outer_dimension)
> +  const;
> +
> virtual void print(void) const;
>
> ir_rvalue *hir(exec_list *, struct _mesa_glsl_parse_state *);
> @@ -1005,4 +1016,10 @@ extern void
>  check_builtin_array_max_size(const char *name, unsigned size,
>   YYLTYPE loc, struct _mesa_glsl_parse_state
> *state);
>
> +extern const glsl_type *
> +process_array_type(YYLTYPE *loc, const glsl_type *base,
> +   ast_array_specifier *array_specifier,
> +  struct _mesa_glsl_parse_state *state,
> +   bool skip_first_dim);
> +
>  #endif /* AST_H */
> diff --git a/src/glsl/ast_function.cpp b/src/glsl/ast_function.cpp
> index 2d05d07..3b38cb6 100644
> --- a/src/glsl/ast_function.cpp
> +++ b/src/glsl/ast_function.cpp
> @@ -1693,8 +1693,18 @@ ast_aggregate_initializer::hir(exec_list
> *instructions,
>_mesa_glsl_error(&loc, state, "type of C-style initializer
> unknown");
>return ir_rvalue::error_value(ctx);
> }
> -   const glsl_type *const constructor_type =
> -  this->constructor_type->glsl_type(&name, state);
> +   const glsl_type *constructor_type =
> +  this->constructor_type->glsl_type(&name, state,
> +this->array_dimension > 1 &&
> !this->is_array);
> +
> +   /* This only handles "vec4 foo[..]".  The earlier
> constructor_type->glsl_type(...)
> +* call already handled the "vec4[..] foo" case.
> +*/
> +   if (this->is_array) {
> +  constructor_type =
> + process_array_type(&loc, constructor_type, this->array_specifier,
> +state, this->array_dimension > 1);
> +   }
>
> if (!state->ARB_shading_language_420pack_enable) {
>_mesa_glsl_error(&loc, state, "C-style initialization requires the "
> diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
> index d02c9ff..226d128 100644
> --- a/src/glsl/ast_to_hir.cpp
> +++ b/src/glsl/ast_to_hir.cpp
> @@ -1817,10 +1817,11 @@ process_array_size(exec_node *node,
> return result;
>  }
>
> -static const glsl_type *
> +const glsl_type *
>  process_array_type(YYLTYPE *loc, const glsl_type *base,
> ast_array_specifier *array_specifier,
> -   struct _mesa_glsl_parse_state *state)
> +   struct _mesa_glsl_parse_state *state,
> +   bool skip_first_dim)
>  {
> const glsl_type *array_type = NULL;
> const glsl_type *element_type = base;
> @@ -1865,6 +1866,8 @@ process_array_type(YYLTYPE *loc, const glsl_type
> *base,
>
>unsigned array_size;
>for (/* nothing */; !node->is_head_sentinel(); node = node->prev) {
> + if (skip_first_dim && node->prev->is_head_sentinel())
> +break;
>   array_size = process_array_size(node, state);
>   array_type_temp = glsl_type::get_array_instance(array_type_temp,
>   array_size);
> @@ -1891,6 +1894,14 @@ const glsl_type *
>  ast_type_specifier::glsl_type(const char **name,
>   struct _mesa_glsl_parse_state *state) const
>  {
> +   return this->glsl_type(name, state, false);
> +}
> +
> +const glsl_type *
> +ast_type_specifier::glsl_type(const char **name,
> + struct _mesa_glsl_parse_state *state,
> +  bool skip_outer_dim) const
> +{
> 

[Mesa-dev] [PATCH] glsl: Simplify aggregate type inference to prepare for ARB_arrays_of_arrays.

2014-01-21 Thread Paul Berry
Most of the time it is not necessary to perform type inference to
compile GLSL; the type of every expression can be inferred from the
contents of the expression itself (and previous type declarations).
The exception is aggregate initializers: their type is determined by
the LHS of the variable being assigned to.  For example, in the
statement:

   mat2 foo = { { 1, 2 }, { 3, 4 } };

the type of { 1, 2 } is only known to be vec2 (as opposed to, say,
ivec2, uvec2, int[2], or a struct) because of the fact that the result
is being assigned to a mat2.

Previous to this patch, we handled this situation by doing some type
inference during parsing: when parsing a declaration like the one
above, we would call _mesa_set_aggregate_type(), which would infer the
type of each aggregate initializer and store it in the corresponding
ast_aggregate_initializer::constructor_type field.  Since this
happened at parse time, we couldn't do the type inference using
glsl_type objects; we had to use ast_type_specifiers, which are much
more awkward to work with.  Things are about to get more complicated
when we add support for ARB_arrays_of_arrays.

This patch simplifies things by postponing the call to
_mesa_set_aggregate_type() until ast-to-hir time, when we have access
to glsl_type objects.  As a side benefit, we only need to have one
call to _mesa_set_aggregate_type() now, instead of six.
---

Timothy: I was inspired to write this patch by the complexities you
encountered during "[PATCH V2 5/8] glsl: Aggregate initializer support
for arrays of array".  Can you try rebasing your series on top of this
patch to see if it simplifies things?  I believe that with these
changes, you should be able to drop patch 5/8 entirely.

 src/glsl/ast.h  |  16 +++--
 src/glsl/ast_function.cpp   |   8 +--
 src/glsl/ast_to_hir.cpp |   7 +++
 src/glsl/glsl_parser.yy |  27 
 src/glsl/glsl_parser_extras.cpp | 136 +++-
 5 files changed, 46 insertions(+), 148 deletions(-)

diff --git a/src/glsl/ast.h b/src/glsl/ast.h
index 76911f0..b24052b 100644
--- a/src/glsl/ast.h
+++ b/src/glsl/ast.h
@@ -296,7 +296,16 @@ public:
   /* empty */
}
 
-   ast_type_specifier *constructor_type;
+   /**
+* glsl_type of the aggregate, which is inferred from the LHS of whatever
+* the aggregate is being used to initialize.  This can't be inferred at
+* parse time (since the parser deals with ast_type_specifiers, not
+* glsl_types), so the parser leaves it NULL.  However, the ast-to-hir
+* conversion code makes sure to fill it in with the appropriate type
+* before hir() is called.
+*/
+   const glsl_type *constructor_type;
+
virtual ir_rvalue *hir(exec_list *instructions,
   struct _mesa_glsl_parse_state *state);
 };
@@ -978,9 +987,8 @@ _mesa_ast_array_index_to_hir(void *mem_ctx,
 YYLTYPE &loc, YYLTYPE &idx_loc);
 
 extern void
-_mesa_ast_set_aggregate_type(const ast_type_specifier *type,
- ast_expression *expr,
- _mesa_glsl_parse_state *state);
+_mesa_ast_set_aggregate_type(const glsl_type *type,
+ ast_expression *expr);
 
 void
 emit_function(_mesa_glsl_parse_state *state, ir_function *f);
diff --git a/src/glsl/ast_function.cpp b/src/glsl/ast_function.cpp
index 2d05d07..4c5b0e4 100644
--- a/src/glsl/ast_function.cpp
+++ b/src/glsl/ast_function.cpp
@@ -1687,14 +1687,12 @@ ast_aggregate_initializer::hir(exec_list *instructions,
 {
void *ctx = state;
YYLTYPE loc = this->get_location();
-   const char *name;
 
if (!this->constructor_type) {
   _mesa_glsl_error(&loc, state, "type of C-style initializer unknown");
   return ir_rvalue::error_value(ctx);
}
-   const glsl_type *const constructor_type =
-  this->constructor_type->glsl_type(&name, state);
+   const glsl_type *const constructor_type = this->constructor_type;
 
if (!state->ARB_shading_language_420pack_enable) {
   _mesa_glsl_error(&loc, state, "C-style initialization requires the "
@@ -1702,12 +1700,12 @@ ast_aggregate_initializer::hir(exec_list *instructions,
   return ir_rvalue::error_value(ctx);
}
 
-   if (this->constructor_type->is_array) {
+   if (constructor_type->is_array()) {
   return process_array_constructor(instructions, constructor_type, &loc,
&this->expressions, state);
}
 
-   if (this->constructor_type->structure) {
+   if (constructor_type->is_record()) {
   return process_record_constructor(instructions, constructor_type, &loc,
 &this->expressions, state);
}
diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 4cc8eb1..8d13610 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -2593,6 +2593,13 @@ process_initializer(ir_variable *var, ast_declaration 
*decl,
 

Re: [Mesa-dev] [PATCH V2 4/8] glsl: only call mark_max_array if we are assigning an array

2014-01-21 Thread Paul Berry
On 21 January 2014 04:19, Timothy Arceri  wrote:

> Signed-off-by: Timothy Arceri 
> ---
>  src/glsl/ast_to_hir.cpp | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
>

Can you say more in the commit message about why you're making this
change?  Is this fixing an existing bug?  Is it preparing for a later
change (which would otherwise create a bug if the change weren't made)?  Or
are you doing it just because it seems reasonable to do?


>
> diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
> index 326aa58..d02c9ff 100644
> --- a/src/glsl/ast_to_hir.cpp
> +++ b/src/glsl/ast_to_hir.cpp
> @@ -830,8 +830,10 @@ do_assignment(exec_list *instructions, struct
> _mesa_glsl_parse_state *state,
>
>  rhs->type->array_size());
>  d->type = var->type;
>}
> -  mark_whole_array_access(rhs);
> -  mark_whole_array_access(lhs);
> +  if (lhs->type->is_array()) {
> + mark_whole_array_access(rhs);
> + mark_whole_array_access(lhs);
> +  }
> }
>
> /* Most callers of do_assignment (assign, add_assign, pre_inc/dec,
> --
> 1.8.3.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH V2 3/8] glsl: Add array specifier to ast code

2014-01-21 Thread Paul Berry
On 21 January 2014 04:19, Timothy Arceri  wrote:

> Signed-off-by: Timothy Arceri 
> ---
>  src/glsl/ast.h  |  44 +++
>  src/glsl/ast_array_index.cpp|  13 
>  src/glsl/ast_to_hir.cpp | 160
> +++-
>  src/glsl/ast_type.cpp   |   8 +-
>  src/glsl/glsl_parser_extras.cpp |  30 
>  src/glsl/glsl_parser_extras.h   |   2 +
>  6 files changed, 167 insertions(+), 90 deletions(-)
>
> diff --git a/src/glsl/ast.h b/src/glsl/ast.h
> index 76911f0..4dda32e 100644
> --- a/src/glsl/ast.h
> +++ b/src/glsl/ast.h
> @@ -276,6 +276,21 @@ private:
> bool cons;
>  };
>
> +class ast_array_specifier : public ast_node {
> +public:
> +   ast_array_specifier()
> +  : ast_node()
>

Note: it's not necessary to explicitly mention the base class in the
initializer list when its constructor takes no arguments.


> +   {
> +  dimension_count = 1;
> +   }
>

It would be nice to have the constructor(s) of ast_array_specifier
initialize it to a consistent state.  As written, it creates an
ast_array_specifier with a dimension count of 1, but with is_unsized_array
set to false and an empty array_dimensions list, which isn't sensible.

I'd recommend having two constructors:

/** Unsized array specifier ([]) */
explicit ast_array_specifier(const struct YYLTYPE &locp)
  : dimension_count(1), is_unsized_array(true)
{
   set_location(locp);
}

and one for a sized array:

/** Sized array specifier ([dim]) */
ast_array_specifier(ast_expression *dim)
  : dimension_count(1), is_unsized_array(false)
{
   set_location(locp);
   array_dimensions.push_tail(&dim->link);
}

This would have the added advantage of making it easier for a reader of the
class definition to understand how the data structure works.


It would also be nice to have a method which adds a dimension to the array
specifier and maintains the proper invariant on dimension_count:

void add_dimension(ast_expression *dim)
{
   array_dimensions.push_tail(&dim->link);
   dimension_count++;
}

That way all the logic for maintaining internal consistency of the data
structure is here inside the class definition, rather than in the parser
where it's harder to verify that it's correct.

>
> +
> +   virtual void print(void) const;
> +
> +   unsigned dimension_count;
>

It would be nice to have a comment above dimension_count explaining that
this includes both sized and unsized dimensions.


> +   bool is_unsized_array;
>

It would be nice to have a comment above is_unsized_array explaining that
if true, this means that the array has an unsized outermost dimension.


> +   exec_list array_dimensions;
>

It would be nice to have a comment explaining that this list contains
objects of type ast_node containing the sized dimensions only, in
outermost-to-innermost order.


> +};
>
+
>  /**
>   * C-style aggregate initialization class
>   *
> @@ -325,14 +340,15 @@ public:
>
>  class ast_declaration : public ast_node {
>  public:
> -   ast_declaration(const char *identifier, bool is_array, ast_expression
> *array_size,
> -  ast_expression *initializer);
> +   ast_declaration(const char *identifier, bool is_array,
> +   ast_array_specifier *array_specifier,
> +   ast_expression *initializer);
> virtual void print(void) const;
>
> const char *identifier;
>
> bool is_array;
> -   ast_expression *array_size;
> +   ast_array_specifier *array_specifier;
>

Previously the reason we needed is_array was because we used array_size ==
NULL to represent both non-arrays and unsized arrays.  Now that we use a
non-NULL array_specifier to represent an unsized array, is_array is
redundant--it should be true exactly when array_specifier != NULL.  To
avoid redunancy, I think we should drop is_array from this class entirely.

The same goes for the is_array fields of ast_type_specifier,
ast_parameter_declarator, and ast_interface_block.

If you wanted to remove all of that in a follow up patch, that would be
fine by me.


>
> ast_expression *initializer;
>  };
> diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
> index 4cc8eb1..326aa58 100644
> --- a/src/glsl/ast_to_hir.cpp
> +++ b/src/glsl/ast_to_hir.cpp
> @@ -1771,63 +1771,116 @@ ast_compound_statement::hir(exec_list
> *instructions,
> return NULL;
>  }
>
> +static const unsigned
> +process_array_size(exec_node *node,
> +   struct _mesa_glsl_parse_state *state)
>

Since it's hard to infer what this function does from its type signature,
I'd recommend adding a comment above it saying something like "Evaluate the
given exec_node (which should be an ast_node representing a single array
dimension) and return its integer value."


> +{
> +   int result = 0;
>

This should have type "unsigned" for consistency with the function
signature and the use of "size->value.u[0]" below.


> +   exec_list dummy_instructions;
> +
> +   ast_node *array_size = exec_node_data(ast_node, node, link);
> +

Re: [Mesa-dev] [PATCH V2 2/8] glsl: Add arrays_of_arrays to yacc definition

2014-01-21 Thread Paul Berry
On 21 January 2014 04:19, Timothy Arceri  wrote:

> Signed-off-by: Timothy Arceri 
> ---
>  src/glsl/glsl_parser.yy | 153
> +---
>  1 file changed, 66 insertions(+), 87 deletions(-)
>
> diff --git a/src/glsl/glsl_parser.yy b/src/glsl/glsl_parser.yy
> index 1c56d6f..6d63668 100644
> --- a/src/glsl/glsl_parser.yy
> +++ b/src/glsl/glsl_parser.yy
> @@ -1611,19 +1565,55 @@ storage_qualifier:
> }
> ;
>
> -type_specifier:
> -   type_specifier_nonarray
> -   | type_specifier_nonarray '[' ']'
> +array_specifier:
> +   '[' ']'
> {
> -  $$ = $1;
> -  $$->is_array = true;
> -  $$->array_size = NULL;
> +  void *ctx = state;
> +  $$ = new(ctx) ast_array_specifier();
> +  $$->is_unsized_array = true;
> +  $$->set_location(yylloc);
> +   }
> +   | '[' constant_expression ']'
> +   {
> +  void *ctx = state;
> +  $$ = new(ctx) ast_array_specifier();
> +  $$->array_dimensions.push_tail(& $2->link);
> +  $$->is_unsized_array = false;
> +  $$->set_location(yylloc);
> +   }
> +   | array_specifier '[' ']'
> +   {
> +  if (!state->ARB_arrays_of_arrays_enable) {
> + _mesa_glsl_error(& @1, state,
> +  "#version 120 / GL_ARB_arrays_of_arrays "
> +  "required for defining arrays of arrays");
>

This error message is confusing.  It could be reasonably interpreted as
saying that either version 120 or GL_ARB_arrays_of_arrays is sufficient to
allow arrays of arrays, which isn't true.  I would just drop the mention of
version 120 entirely--the only reason it's significant at all is that
version 120 is the first version of GLSL to mention that arrays of arrays
aren't allowed, and that was just because it was left out of version 110 as
an oversight.



> +  } else {
> + _mesa_glsl_error(& @1, state,
> +  "only the outermost array dimension can "
> +  "be unsized");
> +  }
>

To prevent uninitialized data in the error condition, I'd recommend doing

$$ = $1;

here.  Otherwse we might crash before the user can see the error message.


> }
> -   | type_specifier_nonarray '[' constant_expression ']'
> +   | array_specifier '[' constant_expression ']'
> +   {
> +  if (!state->ARB_arrays_of_arrays_enable) {
> + _mesa_glsl_error(& @1, state,
> +  "#version 120 / GL_ARB_arrays_of_arrays "
> +  "required for defining arrays of arrays");
>

Similar comment about dropping "#version 120" from the error message here.


> +  }
> +
>

I think we need to add:

$$ = $1;

here.


> +  $$->array_dimensions.push_tail(& $3->link);
> +  $$->dimension_count++;
> +  $$->set_location(yylloc);
>

Assuming I'm right about adding "$$ = $1;" above, there's no need to call
set_location() here.  The location was already set when the
ast_array_specifier was created.


> +   }
> +   ;
> +
> +type_specifier:
> +   type_specifier_nonarray
> +   | type_specifier_nonarray array_specifier
> {
>$$ = $1;
>$$->is_array = true;
> -  $$->array_size = $3;
> +  $$->array_specifier = $2;
> }
> ;
>

With those changes, this patch is:

Reviewed-by: Paul Berry 

Note, however, that some of my review comments on patch 3/8 will probably
affect the content of this patch.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH V2 1/8] mesa: Add ARB_arrays_of_arrays

2014-01-21 Thread Paul Berry
On 21 January 2014 04:19, Timothy Arceri  wrote:

> Signed-off-by: Timothy Arceri 
> ---
>  src/mesa/main/extensions.c | 1 +
>  src/mesa/main/mtypes.h | 1 +
>  2 files changed, 2 insertions(+)
>

Since this is a GLSL extension, you also need to update
glcpp_parser_create() (in glcpp-parse.y) to call add_builtin_define(parser,
"GL_ARB_arrays_of_arrays") if extensions->ARB_arrays_of_arrays is set.

With that fixed, this patch is:

Reviewed-by: Paul Berry 

Personally I would also add the extension to the
_mesa_glsl_supported_extensions[] table in this patch, as well as add the
bools ARB_arrays_of_arrays_enable and ARB_arrays_of_arrays_warn to
_mesa_glsl_parse_state.  I see you've done that as part of patch 3/8
instead.  I suppose that's a matter of personal preference though.


>
> diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c
> index 2e0ccc3..95eb7ca 100644
> --- a/src/mesa/main/extensions.c
> +++ b/src/mesa/main/extensions.c
> @@ -80,6 +80,7 @@ static const struct extension extension_table[] = {
> /* ARB Extensions */
> { "GL_ARB_ES2_compatibility",
> o(ARB_ES2_compatibility),   GL, 2009 },
> { "GL_ARB_ES3_compatibility",
> o(ARB_ES3_compatibility),   GL, 2012 },
> +   { "GL_ARB_arrays_of_arrays",
>  o(ARB_arrays_of_arrays),GL, 2012 },
> { "GL_ARB_base_instance",   o(ARB_base_instance),
>   GL, 2011 },
> { "GL_ARB_blend_func_extended",
> o(ARB_blend_func_extended), GL, 2009 },
> { "GL_ARB_clear_buffer_object", o(dummy_true),
>  GL, 2012 },
> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
> index 33df682..eef67a8 100644
> --- a/src/mesa/main/mtypes.h
> +++ b/src/mesa/main/mtypes.h
> @@ -3366,6 +3366,7 @@ struct gl_extensions
> GLboolean ANGLE_texture_compression_dxt;
> GLboolean ARB_ES2_compatibility;
> GLboolean ARB_ES3_compatibility;
> +   GLboolean ARB_arrays_of_arrays;
> GLboolean ARB_base_instance;
> GLboolean ARB_blend_func_extended;
> GLboolean ARB_color_buffer_float;
> --
> 1.8.3.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Two small blorp fixes found during refactoring

2014-01-21 Thread Paul Berry
On 21 January 2014 08:07, Topi Pohjolainen wrote:

> These are not known to have caused any problems but they would get
> fixed when switching into fs-generator. Both fixes need
> corresponding patching of the unit tests and fixing them aforehand
> makes the switch possible without any changes in the tests.
>
> Topi Pohjolainen (2):
>   i965/blorp: patch jump counters also for endif
>   i965/blorp: use BRW_COMPRESSION_2NDHALF for second half LPR
>
>  src/mesa/drivers/dri/i965/brw_blorp_blit.cpp   | 48
> ++
>  .../drivers/dri/i965/test_blorp_blit_eu_gen.cpp| 30 +++---
>  2 files changed, 47 insertions(+), 31 deletions(-)
>

Series is:

Reviewed-by: Paul Berry 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Reserve space for "Vertex Count" in GS outputs.

2014-01-20 Thread Paul Berry
On 20 January 2014 22:56, Kenneth Graunke  wrote:

> v2: Also increment ir->offset in the GS visitor, rather than at the
> final assembly generation stage (requested by Paul).
>
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/brw_vec4_gs.c   | 6 ++
>  src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 7 +++
>  2 files changed, 13 insertions(+)
>
> Hey Paul,
>
> I think I implemented all of your feedback from last time.  (Previously,
> I'd done this in gen8_vec4_generator, and you asked me to move it to the
> visitor.  This patch does that - and it is indeed much nicer.)
>
>  --Ken
>

Looks good, thanks!

Reviewed-by: Paul Berry 


>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs.c
> b/src/mesa/drivers/dri/i965/brw_vec4_gs.c
> index 018b0b6..c40112b 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_gs.c
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs.c
> @@ -194,6 +194,12 @@ do_gs_prog(struct brw_context *brw,
>c.prog_data.output_vertex_size_hwords * 32 *
> gp->program.VerticesOut;
> output_size_bytes += 32 * c.prog_data.control_data_header_size_hwords;
>
> +   /* Broadwell stores "Vertex Count" as a full 8 DWord (32 byte) URB
> output,
> +* which comes before the control header.
> +*/
> +   if (brw->gen >= 8)
> +  output_size_bytes += 32;
> +
> assert(output_size_bytes >= 1);
> if (output_size_bytes > GEN7_MAX_GS_URB_ENTRY_SIZE_BYTES)
>return false;
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
> b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
> index f33c80d..f0351ea 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
> @@ -267,6 +267,13 @@ vec4_gs_visitor::emit_urb_write_opcode(bool complete)
>
> vec4_instruction *inst = emit(GS_OPCODE_URB_WRITE);
> inst->offset = c->prog_data.control_data_header_size_hwords;
> +
> +   /* We need to increment Global Offset by 1 to make room for Broadwell's
> +* extra "Vertex Count" payload at the beginning of the URB entry.
> +*/
> +   if (brw->gen >= 8)
> +  inst->offset++;
> +
> inst->urb_write_flags = BRW_URB_WRITE_PER_SLOT_OFFSET;
> return inst;
>  }
> --
> 1.8.5.2
>
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 42/42] i965/blorp: switch eu-emitter to use FS IR and fs_generator

2014-01-20 Thread Paul Berry
On 20 December 2013 06:39, Topi Pohjolainen wrote:

> Unfortunately the unit tests need to be patched as well. This is
> because the direct eu-emitter only patches jump counters for
> if-else (see patch_IF_ELSE()) while the fs_generator patches the
> endif as well (see brw_set_uip_jip()).
>

I think the fact that blorp wasn't patching jump counters for if-else was a
previously undiscovered bug in blorp.  I suspect that the reason we never
discovered it was because blorp doesn't contain complex enough flow control
to cause an endif to ever execute a jump.

I'd recommend splitting this patch into two patches:

1. fix the previously undiscovered blorp bug (by having blorp call
brw_set_uip_jip() at the appropriate time) and update the unit tests.

2. switch eu-emitter to use FS IR and fs_generator.

That will make it a lot easier to reassure ourselves that the changes are
correct, and in the unlikely event that we later discover that they aren't,
it'll make it easier to bisect to the problem.

With this patch split, it is:

Reviewed-by: Paul Berry 

I've sent comments on patches 24, 27, 37, 38, 39, and 41.  All the others
are:

Reviewed-by: Paul Berry 


I really like how you structured this series, Topi.  It was easy to follow
what you did and to reassure myself that it was correct.  And I'm really
glad you did it, since it led us to discover two preexisting bugs!  Nice
work.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 24/42] i965/blorp: move emission of sample combining into eu-emitter

2014-01-20 Thread Paul Berry
On 20 January 2014 19:36, Paul Berry  wrote:

> On 20 December 2013 06:38, Topi Pohjolainen wrote:
>
>> diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp
>> b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp
>> index b189aa2..dcfd82b 100644
>> --- a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp
>> @@ -148,3 +148,15 @@ brw_blorp_eu_emitter::emit_render_target_write(const
>> struct brw_reg &src0,
>>  true /* eot */,
>>  use_header);
>>  }
>> +
>> +void
>> +brw_blorp_eu_emitter::emit_combine(unsigned texture_data_type,
>> +   const struct brw_reg &dst,
>> +   const struct brw_reg &src_1,
>> +   const struct brw_reg &src_2)
>> +{
>> +   if (texture_data_type == BRW_REGISTER_TYPE_F)
>> +  brw_ADD(&func, dst, src_1, src_2);
>> +   else
>> +  brw_AVG(&func, dst, src_1, src_2);
>> +}
>>
>
> It's a bit of an awkward split to have most of the algorithm for combining
> samples in brw_blorp_blit_program::manual_blend_average(), but the choice
> of whether to use ADD or AVG is here in
> brw_blorp_eu_emitter::emit_combine().  How about if we replace
> texture_data_type with a bool called "combine_using_add"?  That way someone
> reading manual_blend_average() won't have to refer to emit_combine() to
> understand what the algorithm does; and similarly someone reading
> emt_combine() won't have to look at manual_blend_average() to understand
> why we use ADD for floats and AVG for ints.
>

On further reflection, I think it would be even better to replace the
texture_data_type argument with an opcode argument--that way the caller can
pass in BRW_OPCODE_ADD or BRW_OPCODE_AVG.  Once we get to patch 42/42, this
function can be changed to just plumb the opcode straight through into the
fs_inst constructor.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 41/42] i965/eu: introduce blorp specific flavour of lrp

2014-01-20 Thread Paul Berry
On 20 December 2013 06:39, Topi Pohjolainen wrote:

> This is rather ugly but as I couldn't think of anything better
> for now and wanted to get the rest of the series under review,
> I left it as it is.
> Even though immediately surrounding code has tabs this piece is
> written space-indented.
>
> Signed-off-by: Topi Pohjolainen 
>

This patch is unnecessary.  The blorp specific "flavour" of lrp is actually
a previously undiscovered bug in blorp: blorp should be using
BRW_COMPRESSION_2NDHALF for its second half LRPs.  I believe the bug may be
benign, but we should fix it anyhow, and I think we can fix it by just
dropping this patch entirely.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 39/42] i965/fs: introduce blorp specific rt-write for fs_generator

2014-01-20 Thread Paul Berry
On 20 December 2013 06:39, Topi Pohjolainen wrote:

> diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> index df91235..4c159e6 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> @@ -190,6 +190,21 @@ fs_generator::generate_fb_write(fs_inst *inst)
> mark_surface_used(surf_index);
>  }
>
> +void
> +fs_generator::generate_blorp_fb_write(fs_inst *inst)
> +{
> +   brw_fb_WRITE(p,
> +16 /* dispatch_width */,
> +inst->base_mrf,
> +brw_reg_from_fs_reg(&inst->src[0]),
> +BRW_DATAPORT_RENDER_TARGET_WRITE_SIMD16_SINGLE_SOURCE,
> +1 /* BRW_BLORP_RENDERBUFFER_BINDING_TABLE_INDEX */,
>

This seems like it would lead to a maintenance burden if anyone ever
decides to change BRW_BLORP_RENDERBUFFER_BINDING_TABLE_INDEX to a number
other than 1.  Can we move the declaration of
BRW_BLORP_RENDERBUFFER_BINDING_TABLE_INDEX to somewhere that's accessible
to this file so that we can just pass it in directly?  Alternatively, we
could store it in inst->target.

With that issue addressed, this patch is:

Reviewed-by: Paul Berry 


> +inst->mlen,
> +0,
> +true,
> +inst->header_present);
> +}
> +
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 38/42] i965/fs: allow unit tests to dump the final patched assembly

2014-01-20 Thread Paul Berry
On 20 December 2013 06:39, Topi Pohjolainen wrote:

> Unit tests comparing generated blorp programs to known good need
> to have the dump in designated file instead of in default
> standard output. The comparison also expects the jump counters
> of if-else-instructions to be correctly set and hence the dump
> needs to be taken _after_ 'patch_IF_ELSE()' is run (the default
> dump of the fs_generator does this before).
>
> Signed-off-by: Topi Pohjolainen 
> ---
>  src/mesa/drivers/dri/i965/brw_fs.h |  7 +--
>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 15 +--
>  2 files changed, 14 insertions(+), 8 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.h
> b/src/mesa/drivers/dri/i965/brw_fs.h
> index 9bef07c..d40d0a8 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.h
> +++ b/src/mesa/drivers/dri/i965/brw_fs.h
> @@ -510,10 +510,13 @@ public:
>
> const unsigned *generate_assembly(exec_list *simd8_instructions,
>   exec_list *simd16_instructions,
> - unsigned *assembly_size);
> + unsigned *assembly_size,
> + bool dump_enabled = false,
> + FILE *dump_file = stdout);
>

Nit pick: rather than add 2 args, wouldn't it be easier to just add the
dump_file arg, with a default value of NULL, and only dump instructions if
it is non-NULL?

Either way, the patch is:

Reviewed-by: Paul Berry 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 37/42] i965/blorp: wrap brw_IF/ELSE/ENDIF() into eu-emitter

2014-01-20 Thread Paul Berry
On 20 December 2013 06:39, Topi Pohjolainen wrote:

> diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.h
> b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.h
> index 1ecf076..3f2301c 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.h
> +++ b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.h
> @@ -162,6 +162,24 @@ protected:
>brw_RNDD(&func, dst, src);
> }
>
> +   inline void emit_if(int op,
> +   const struct brw_reg &x,
> +   const struct brw_reg &y)
> +   {
> +  brw_CMP(&func, vec16(brw_null_reg()), op, x, y);
> +  brw_IF(&func, BRW_EXECUTE_16);
> +   }
>

Can we maybe call tihs "emit_cmp_if()", so that it's clear when looking at
the caller that it performs both a CMP and an IF?

With that change, this patch is:

Reviewed-by: Paul Berry 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 27/42] i965/blorp: wrap LRP

2014-01-20 Thread Paul Berry
On 20 December 2013 06:38, Topi Pohjolainen wrote:

> The split of the emission of the two halfs into single emission
> call prapares for fs_generator support that already does similar
> thing. No regressions seen on IVB (unit tests and piglit quick).
>
> Signed-off-by: Topi Pohjolainen 
> ---
>  src/mesa/drivers/dri/i965/brw_blorp_blit.cpp  | 26
> ++
>  src/mesa/drivers/dri/i965/brw_blorp_blit_eu.h | 13 +
>  2 files changed, 23 insertions(+), 16 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> index 1b7310b..b95104e 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> @@ -1679,29 +1679,23 @@
> brw_blorp_blit_program::manual_blend_bilinear(unsigned num_samples)
> }
>
>  #define SAMPLE(x, y) offset(texture_data[x], y)
> -   brw_set_access_mode(&func, BRW_ALIGN_16);
> -   brw_set_compression_control(&func, BRW_COMPRESSION_NONE);
> for (int index = 3; index > 0; ) {
>/* Since we're doing SIMD16, 4 color channels fits in to 8
> registers.
> * Counter value of 8 in 'for' loop below is used to interpolate all
> * the color components.
> */
> -  for (int k = 0; k < 8; ++k)
> - brw_LRP(&func,
> - vec8(SAMPLE(index - 1, k)),
> - offset(x_frac, k & 1),
> - SAMPLE(index, k),
> - SAMPLE(index - 1, k));
> +  for (int k = 0; k < 8; k += 2)
> + emit_lrp(vec8(SAMPLE(index - 1, k)),
> +  offset(x_frac, k & 1),
>

This line can just be "x_frac" now, because k is always even, so (k & 1) is
always zero.


> +  SAMPLE(index, k),
> +  SAMPLE(index - 1, k));
>index -= 2;
> }
> -   for (int k = 0; k < 8; ++k)
> -  brw_LRP(&func,
> -  vec8(SAMPLE(0, k)),
> -  offset(y_frac, k & 1),
> -  vec8(SAMPLE(2, k)),
> -  vec8(SAMPLE(0, k)));
> -   brw_set_compression_control(&func, BRW_COMPRESSION_COMPRESSED);
> -   brw_set_access_mode(&func, BRW_ALIGN_1);
> +   for (int k = 0; k < 8; k += 2)
> +  emit_lrp(vec8(SAMPLE(0, k)),
> +   offset(y_frac, k & 1),
>

Same comment applies here.


> +   vec8(SAMPLE(2, k)),
> +   vec8(SAMPLE(0, k)));
>  #undef SAMPLE
>  }
>

With those two things fixed, this patch is:

Reviewed-by: Paul Berry 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 24/42] i965/blorp: move emission of sample combining into eu-emitter

2014-01-20 Thread Paul Berry
On 20 December 2013 06:38, Topi Pohjolainen wrote:

> diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp
> b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp
> index b189aa2..dcfd82b 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp
> @@ -148,3 +148,15 @@ brw_blorp_eu_emitter::emit_render_target_write(const
> struct brw_reg &src0,
>  true /* eot */,
>  use_header);
>  }
> +
> +void
> +brw_blorp_eu_emitter::emit_combine(unsigned texture_data_type,
> +   const struct brw_reg &dst,
> +   const struct brw_reg &src_1,
> +   const struct brw_reg &src_2)
> +{
> +   if (texture_data_type == BRW_REGISTER_TYPE_F)
> +  brw_ADD(&func, dst, src_1, src_2);
> +   else
> +  brw_AVG(&func, dst, src_1, src_2);
> +}
>

It's a bit of an awkward split to have most of the algorithm for combining
samples in brw_blorp_blit_program::manual_blend_average(), but the choice
of whether to use ADD or AVG is here in
brw_blorp_eu_emitter::emit_combine().  How about if we replace
texture_data_type with a bool called "combine_using_add"?  That way someone
reading manual_blend_average() won't have to refer to emit_combine() to
understand what the algorithm does; and similarly someone reading
emt_combine() won't have to look at manual_blend_average() to understand
why we use ADD for floats and AVG for ints.

With that change, this patch is:

Reviewed-by: Paul Berry 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965: Add GS support to INTEL_DEBUG=shader_time.

2014-01-17 Thread Paul Berry
Previously, time spent in geometry shaders would be counted as part of
the vertex shader time.
---
 src/mesa/drivers/dri/i965/brw_context.h   |  3 +++
 src/mesa/drivers/dri/i965/brw_program.c   | 10 +-
 src/mesa/drivers/dri/i965/brw_vec4.cpp|  6 +++---
 src/mesa/drivers/dri/i965/brw_vec4.h  |  9 -
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp |  3 ++-
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp| 10 --
 src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp |  3 ++-
 src/mesa/drivers/dri/i965/test_vec4_register_coalesce.cpp |  3 ++-
 8 files changed, 37 insertions(+), 10 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 9c51646..972eb9f 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -783,6 +783,9 @@ enum shader_time_shader_type {
ST_VS,
ST_VS_WRITTEN,
ST_VS_RESET,
+   ST_GS,
+   ST_GS_WRITTEN,
+   ST_GS_RESET,
ST_FS8,
ST_FS8_WRITTEN,
ST_FS8_RESET,
diff --git a/src/mesa/drivers/dri/i965/brw_program.c 
b/src/mesa/drivers/dri/i965/brw_program.c
index abfc921..a6a2403 100644
--- a/src/mesa/drivers/dri/i965/brw_program.c
+++ b/src/mesa/drivers/dri/i965/brw_program.c
@@ -288,7 +288,7 @@ get_written_and_reset(struct brw_context *brw, int i,
   uint64_t *written, uint64_t *reset)
 {
enum shader_time_shader_type type = brw->shader_time.types[i];
-   assert(type == ST_VS || type == ST_FS8 || type == ST_FS16);
+   assert(type == ST_VS || type == ST_GS || type == ST_FS8 || type == ST_FS16);
 
/* Find where we recorded written and reset. */
int wi, ri;
@@ -340,6 +340,8 @@ brw_report_shader_time(struct brw_context *brw)
   switch (type) {
   case ST_VS_WRITTEN:
   case ST_VS_RESET:
+  case ST_GS_WRITTEN:
+  case ST_GS_RESET:
   case ST_FS8_WRITTEN:
   case ST_FS8_RESET:
   case ST_FS16_WRITTEN:
@@ -349,6 +351,7 @@ brw_report_shader_time(struct brw_context *brw)
  continue;
 
   case ST_VS:
+  case ST_GS:
   case ST_FS8:
   case ST_FS16:
  get_written_and_reset(brw, i, &written, &reset);
@@ -372,6 +375,7 @@ brw_report_shader_time(struct brw_context *brw)
 
   switch (type) {
   case ST_VS:
+  case ST_GS:
   case ST_FS8:
   case ST_FS16:
  total_by_type[type] += scaled[i];
@@ -432,6 +436,9 @@ brw_report_shader_time(struct brw_context *brw)
   case ST_VS:
  stage = "vs";
  break;
+  case ST_GS:
+ stage = "gs";
+ break;
   case ST_FS8:
  stage = "fs8";
  break;
@@ -449,6 +456,7 @@ brw_report_shader_time(struct brw_context *brw)
 
printf("\n");
print_shader_time_line("total", "vs", -1, total_by_type[ST_VS], total);
+   print_shader_time_line("total", "gs", -1, total_by_type[ST_GS], total);
print_shader_time_line("total", "fs8", -1, total_by_type[ST_FS8], total);
print_shader_time_line("total", "fs16", -1, total_by_type[ST_FS16], total);
 }
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index 9d3735a..2e319e38 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -1494,10 +1494,10 @@ vec4_visitor::emit_shader_time_end()
 */
emit(ADD(diff, src_reg(diff), src_reg(-2u)));
 
-   emit_shader_time_write(ST_VS, src_reg(diff));
-   emit_shader_time_write(ST_VS_WRITTEN, src_reg(1u));
+   emit_shader_time_write(st_base, src_reg(diff));
+   emit_shader_time_write(st_written, src_reg(1u));
emit(BRW_OPCODE_ELSE);
-   emit_shader_time_write(ST_VS_RESET, src_reg(1u));
+   emit_shader_time_write(st_reset, src_reg(1u));
emit(BRW_OPCODE_ENDIF);
 }
 
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
b/src/mesa/drivers/dri/i965/brw_vec4.h
index d4029d8..f2ecb2b 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -231,7 +231,10 @@ public:
struct brw_shader *shader,
void *mem_ctx,
 bool debug_flag,
-bool no_spills);
+bool no_spills,
+shader_time_shader_type st_base,
+shader_time_shader_type st_written,
+shader_time_shader_type st_reset);
~vec4_visitor();
 
dst_reg dst_null_f()
@@ -547,6 +550,10 @@ private:
 * If true, then register allocation should fail instead of spilling.
 */
const bool no_spills;
+
+   const shader_time_shader_type st_base;
+   const shader_time_shader_type st_written;
+   const shader_time_shader_type st_reset;
 };
 
 
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
index 5d5b169..8d8db5e 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visito

[Mesa-dev] [PATCH] i965: Modify some error messages to refer to "vec4" instead of "vs".

2014-01-17 Thread Paul Berry
These messages are in code that is shared between the VS and GS
back-ends, so use the terminology "vec4" to avoid confusion.
---
 src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 8 
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp   | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
index c1ef81d..51e88d2 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
@@ -327,7 +327,7 @@ vec4_generator::generate_tex(vec4_instruction *inst,
  }
  break;
   default:
-assert(!"should not get here: invalid VS texture opcode");
+assert(!"should not get here: invalid vec4 texture opcode");
 break;
   }
} else {
@@ -356,7 +356,7 @@ vec4_generator::generate_tex(vec4_instruction *inst,
 assert(inst->mlen == 2);
 break;
   default:
-assert(!"should not get here: invalid VS texture opcode");
+assert(!"should not get here: invalid vec4 texture opcode");
 break;
   }
}
@@ -1218,10 +1218,10 @@ 
vec4_generator::generate_vec4_instruction(vec4_instruction *instruction,
 
default:
   if (inst->opcode < (int) ARRAY_SIZE(opcode_descs)) {
- _mesa_problem(&brw->ctx, "Unsupported opcode in `%s' in VS\n",
+ _mesa_problem(&brw->ctx, "Unsupported opcode in `%s' in vec4\n",
opcode_descs[inst->opcode].name);
   } else {
- _mesa_problem(&brw->ctx, "Unsupported opcode %d in VS", inst->opcode);
+ _mesa_problem(&brw->ctx, "Unsupported opcode %d in vec4", 
inst->opcode);
   }
   abort();
}
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index 3b8cef6..81906e8 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
@@ -3343,7 +3343,7 @@ vec4_visitor::fail(const char *format, ...)
va_start(va, format);
msg = ralloc_vasprintf(mem_ctx, format, va);
va_end(va);
-   msg = ralloc_asprintf(mem_ctx, "VS compile failed: %s\n", msg);
+   msg = ralloc_asprintf(mem_ctx, "vec4 compile failed: %s\n", msg);
 
this->fail_msg = msg;
 
-- 
1.8.5.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [wip 6/9] glsl: ir_deserializer class for the binary shader cache

2014-01-17 Thread Paul Berry
On 14 January 2014 03:35, Tapani Pälli  wrote:

>  On 01/14/2014 01:24 AM, Paul Berry wrote:
>
> On 2 January 2014 03:58, Tapani Pälli  wrote:
>
>   +   var->state_slots = NULL;
>> +
>> +   if (var->num_state_slots > 0) {
>> +  var->state_slots = ralloc_array(var, ir_state_slot,
>> + var->num_state_slots);
>>
>
>  Security issue: if the shader blob is corrupted, num_state_slots could
> be outrageously large, causing this memory allocation to fail (which would
> then cause the loop below to perform illegal memory accesses).  We should
> validate that num_state_slots is within a reasonable range, and if it
> isn't, abort reading the binary blob.
>
>
> What is the expected reasonable range for state_slots?
>

I believe the uniform with the most state slots is gl_LightSource, which
has 96.

What I'd suggest doing (in a separate patch) is: add a

#define MAX_NUM_STATE_SLOTS 96

somewhere, and then in builtin_variable_generator::add_uniform(), right
after setting uni->num_state_slots, add:

assert(uni->num_state_slots <= MAX_NUM_STATE_SLOTS);

That way, in the unlikely event that we ever add a GL feature that requires
more than this number of state slots, the assertion failure will alert us
that we need to increase MAX_NUM_STATE_SLOTS.


>
>
>+   /* fill parse state from shader header information */
>> +   switch (shader->Type) {
>> +   case GL_VERTEX_SHADER:
>> +  state->target = MESA_SHADER_VERTEX;
>> +  break;
>> +   case GL_FRAGMENT_SHADER:
>> +  state->target = MESA_SHADER_FRAGMENT;
>> +  break;
>> +   case GL_GEOMETRY_SHADER_ARB:
>> +  state->target = MESA_SHADER_GEOMETRY;
>> +  break;
>>
>
>  There should be a default case here to handle corrupted blobs with an
> illegal type.
>
>
> I'm now doing _mesa_shader_enum_to_shader_stage(shader->Type) here which
> will assert if type is illegal.
>

An assertion isn't good enough, because assertions don't get checked in
release builds.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [wip 1/9] glsl: memory_writer helper class for data serialization

2014-01-17 Thread Paul Berry
On 14 January 2014 02:35, Tapani Pälli  wrote:

>  On 01/13/2014 08:27 PM, Paul Berry wrote:
>
> On 2 January 2014 03:58, Tapani Pälli  wrote:
>
>> Class will be used by the shader binary cache implementation.
>>
>> Signed-off-by: Tapani Pälli 
>> ---
>>  src/glsl/memory_writer.h | 147
>> +++
>>  1 file changed, 147 insertions(+)
>>  create mode 100644 src/glsl/memory_writer.h
>>
>> diff --git a/src/glsl/memory_writer.h b/src/glsl/memory_writer.h
>> new file mode 100644
>> index 000..a6c6b55
>> --- /dev/null
>> +++ b/src/glsl/memory_writer.h
>> @@ -0,0 +1,147 @@
>> +/* -*- c++ -*- */
>> +/*
>> + * Copyright © 2013 Intel Corporation
>> + *
>> + * Permission is hereby granted, free of charge, to any person obtaining
>> a
>> + * copy of this software and associated documentation files (the
>> "Software"),
>> + * to deal in the Software without restriction, including without
>> limitation
>> + * the rights to use, copy, modify, merge, publish, distribute,
>> sublicense,
>> + * and/or sell copies of the Software, and to permit persons to whom the
>> + * Software is furnished to do so, subject to the following conditions:
>> + *
>> + * The above copyright notice and this permission notice (including the
>> next
>> + * paragraph) shall be included in all copies or substantial portions of
>> the
>> + * Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
>> EXPRESS OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
>> MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT
>> SHALL
>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
>> OTHER
>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
>> ARISING
>> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
>> + * DEALINGS IN THE SOFTWARE.
>> + */
>> +
>> +#pragma once
>> +#ifndef MEMORY_WRITER_H
>> +#define MEMORY_WRITER_H
>> +
>> +#include 
>> +#include 
>> +#include 
>> +
>> +#ifdef __cplusplus
>> +/**
>> + * Helper class for writing data to memory
>> + *
>> + * This class maintains a dynamically-sized memory buffer and allows
>> + * for data to be efficiently appended to it with automatic resizing.
>> + */
>> +class memory_writer
>> +{
>> +public:
>> +   memory_writer() :
>> +  memory(NULL),
>> +  curr_size(0),
>> +  pos(0) {}
>> +
>> +   ~memory_writer()
>> +   {
>> +  free(memory);
>> +   }
>> +
>> +   /* user wants to claim the memory */
>> +   char *release_memory(size_t *size)
>> +   {
>> +  /* final realloc to free allocated but unused memory */
>> +  char *result = (char *) realloc(memory, pos);
>> +  *size = pos;
>> +  memory = NULL;
>> +  curr_size = 0;
>> +  pos = 0;
>> +  return result;
>> +   }
>> +
>> +/**
>> + * write functions per type
>> + */
>> +#define DECL_WRITER(type) int write_ ##type (const type data) {\
>> +   return write(&data, sizeof(type));\
>> +}
>> +
>> +   DECL_WRITER(int32_t);
>> +   DECL_WRITER(int64_t);
>> +   DECL_WRITER(uint8_t);
>> +   DECL_WRITER(uint32_t);
>> +
>> +   int write_bool(bool data)
>> +   {
>> +  uint8_t val = data;
>> +  return write_uint8_t(val);
>> +   }
>> +
>> +   /* write function that reallocates more memory if required */
>> +   int write(const void *data, int32_t size)
>> +   {
>> +  if (!memory || pos > (int32_t)(curr_size - size))
>> + if (grow(size))
>> +return -1;
>> +
>> +  memcpy(memory + pos, data, size);
>> +
>> +  pos += size;
>> +  return 0;
>> +   }
>> +
>> +   int overwrite(const void *data, int32_t size, int32_t offset)
>> +   {
>> +  if (offset < 0 || offset + size > pos)
>> + return -1;
>> +  memcpy(memory + offset, data, size);
>> +  return 0;
>> +   }
>> +
>> +   int write_string(const char *str)
>> +   {
>> +  if (!str)
>> + return -1;
>> +  char terminator = '\0';
>> +  write(str, strlen(str));
>> +  write(&terminator, 1);
>>
>
>  C strings include a terminator, so there's no reason to write out the
> string contents and the terminator separtely.  You can just do:
>
>  write(str, strlen(str) + 1);
>
> Also, don't forget to propagate the return code to the caller:
>
>  return write(str, strlen(str) + 1);
>
>
> Ah right, will do. It could be that this needs to be modified back to how
> it was though (to write to length of the string too). I think it is
> otherwise impossible for the reader side to know if terminator really
> exists and it might continue to read forever (or call strlen for
> unterminated string).
>

I don't know of any C library function that does exactly what you want.  Of
course you could always code up your own loop.  But I think I agree that
storing the length first is the best approach.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [wip 8/9] glsl: functions to serialize gl_shader and gl_shader_program

2014-01-17 Thread Paul Berry
On 15 January 2014 02:46, Tapani  wrote:

>  On 01/14/2014 07:53 PM, Paul Berry wrote:
>
> On 2 January 2014 03:58, Tapani Pälli  wrote:
>
>>   +static void
>>> +serialize_uniform_storage(gl_uniform_storage *uni, memory_writer &blob)
>>>
>>
>>  I don't think this is right.  The ARB_get_program_binary spec says:
>>
>> 8. How does ProgramBinary interact with uniform values, including
>>shader-specified defaults?
>>
>> RESOLVED: All uniforms specified in the binary are reset to their
>> shader-
>> specified defaults, or to zero if they aren't specified, when the
>> program
>> binary is loaded. The spec language has been updated to specify this
>> behavior.
>>
>>  The OES_get_program_binary spec doesn't mention uniforms at all, but I
>> believe this is not intended to indicate a behavioural difference--it's
>> just a consequence of the fact that ARB_get_program_binary was written
>> later, so they had a chance to clarify things.  In fact, issue #7 in
>> ARB_get_program_binary specifically says:
>>
>> 7. How does this spec differ from the OES_get_program_binary and why?
>>
>> ...
>>
>> There are other areas where language was clarified, but this is meant
>> to
>> be consistent with the intent of the original specification and not to
>> alter previously established behavior.
>>
>>  So I believe we shouldn't be serializing uniform values.
>>
>>
>> For me this seemed much easier way to serialize than recreate it though.
> Would it be enough if I reset the default values in place?
>

Yes, I think that would be reasonable.  Thanks!
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [wip 9/9] mesa: OES_get_program_binary extension functionality

2014-01-17 Thread Paul Berry
On 16 January 2014 05:37, Tapani Pälli  wrote:

>  On 01/15/2014 06:13 PM, Paul Berry wrote:
>
> On 2 January 2014 03:58, Tapani Pälli  wrote:
>
>> Signed-off-by: Tapani Pälli 
>> ---
>>  src/mesa/main/shaderapi.c | 44
>> ++--
>>  1 file changed, 38 insertions(+), 6 deletions(-)
>>
>> +   char *data = mesa_program_serialize(shProg, &size);
>> +
>> +   /* we have more data that can fit to user given buffer */
>> +   if (size > bufSize) {
>> +  _mesa_error(ctx, GL_INVALID_OPERATION, __FUNCTION__);
>> +  if (data)
>> + free(data);
>>
>
>  Why would we ever expect mesa_program_serialize to set size to a nonzero
> value but return NULL?  It seems like this could only happen if there's a
> bug, in which case this really ought to be
>
>  assert(data !=NULL);
>
>  Also, it's safe to call free() on a NULL pointer.  According to the C
> standard, freeing a NULL pointer does nothing.
>
>
> sure, will fix
>
>
>
>
>> +  return;
>> +   }
>> +
>> +   if (data) {
>>
>
>  Similarly, this if-statement is unnecessary.
>
>
> it is required for memcpy but not for free
>

Oops, you're right.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [wip 0/9] GL_OES_get_program_binary extension

2014-01-15 Thread Paul Berry
On 2 January 2014 03:58, Tapani Pälli  wrote:

> Hi;
>
> Here's another take on the get_program_binary extension. I've rewritten big
> chunks of it based on Paul's review and comments earlier. Here's a brief
> list
> out of my head of things changed:
>
> - uses mesa compilation time as verification method for cache, not git sha
> - much smaller serialization code size, now as part of IR classes
> - smaller binary blob size
> - does not dump separate 'prototypes block' for functions but instead
>  deserialization iterates over the whole blob for functions first
> - hashtables written faster than previously (iterated only once)
> - no 'unique_id' for ir_variables but using the address as identifier
> - patch set broken down to more individual patches/changes
>
> Mesa branch with these patches applied on top:
> http://cgit.freedesktop.org/~tpalli/mesa/log/?h=oes_get_program_binary
>
> I have plans for the addition of driver backend data but I was hoping this
> to
> be as a separate addition on top. Basically I extend gl_shader_program
> struct
> and offer API for drivers to dump their binary data there, some proto here:
>
> http://cgit.freedesktop.org/~tpalli/mesa/log/?h=driver_interface
> (in order to work more data needed like the keys but this is just test)
>
> I have also 'automatic' cache implementation here that can be used to
> verify
> the cache functionality with any application (without extension):
> http://cgit.freedesktop.org/~tpalli/mesa/log/?h=new_serialization
>
> Any comments appreciated;
>
> Tapani Pälli (9):
>   glsl: memory_writer helper class for data serialization
>   glsl: serialize methods for IR instructions
>   glsl: memory_map helper class for data deserialization
>   glsl: add MESA_SHADER_CACHE_MAGIC string for shader binary cache
>   glsl: export populate_symbol_table function
>   glsl: ir_deserializer class for the binary shader cache
>   mesa: iterate method for string_to_uint_map
>   glsl: functions to serialize gl_shader and gl_shader_program
>   mesa: OES_get_program_binary extension functionality
>

I sent comments on all patches but 4, 5, and 7.  Those patches are:

Reviewed-by: Paul Berry 

I see that you've already responded to some of my comments.  I'm on a
business trip this week so I have limited time to answer email, but I'll
try to respond to your responses as soon as I can.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [wip 9/9] mesa: OES_get_program_binary extension functionality

2014-01-15 Thread Paul Berry
On 2 January 2014 03:58, Tapani Pälli  wrote:

> Signed-off-by: Tapani Pälli 
> ---
>  src/mesa/main/shaderapi.c | 44
> ++--
>  1 file changed, 38 insertions(+), 6 deletions(-)
>
> diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
> index 57511e8..c07b226 100644
> --- a/src/mesa/main/shaderapi.c
> +++ b/src/mesa/main/shaderapi.c
> @@ -57,6 +57,7 @@
>  #include "../glsl/ir.h"
>  #include "../glsl/ir_uniform.h"
>  #include "../glsl/program.h"
> +#include "../glsl/shader_cache.h"
>
>  /** Define this to enable shader substitution (see below) */
>  #define SHADER_SUBST 0
> @@ -1632,8 +1633,26 @@ _mesa_GetProgramBinary(GLuint program, GLsizei
> bufSize, GLsizei *length,
> if (length != NULL)
>*length = 0;
>
> -   (void) binaryFormat;
> -   (void) binary;
> +   size_t size = 0;
> +   char *data = mesa_program_serialize(shProg, &size);
> +
> +   /* we have more data that can fit to user given buffer */
> +   if (size > bufSize) {
> +  _mesa_error(ctx, GL_INVALID_OPERATION, __FUNCTION__);
> +  if (data)
> + free(data);
>

Why would we ever expect mesa_program_serialize to set size to a nonzero
value but return NULL?  It seems like this could only happen if there's a
bug, in which case this really ought to be

assert(data !=NULL);

Also, it's safe to call free() on a NULL pointer.  According to the C
standard, freeing a NULL pointer does nothing.


> +  return;
> +   }
> +
> +   if (data) {
>

Similarly, this if-statement is unnecessary.


> +  memcpy(binary, data, size);
> +  free(data);
> +   }
> +
> +   if (length != NULL)
> +  *length = size;
> +
> +   *binaryFormat = 0;
>  }
>
>  void GLAPIENTRY
> @@ -1647,10 +1666,23 @@ _mesa_ProgramBinary(GLuint program, GLenum
> binaryFormat,
> if (!shProg)
>return;
>
> -   (void) binaryFormat;
> -   (void) binary;
> -   (void) length;
> -   _mesa_error(ctx, GL_INVALID_OPERATION, __FUNCTION__);
> +   if (length <= 0)
> +  return;
>

In the case of an invalid length, we need to make sure to set the program's
LinkStatus to false.  Also, the information log needs to be cleared, in
accordance with this text from OES_get_program_binary:

If ProgramBinaryOES failed, any information about a previous link or
load of
that program object is lost.  Thus, a failed load does not restore the
old
state of .



> +
> +   /* free possible existing data and initialize structure */
> +   _mesa_free_shader_program_data(ctx, shProg);
> +   _mesa_init_shader_program(ctx, shProg);
> +
> +   /* fill structure from a binary blob */
> +   if (mesa_program_deserialize(shProg, binary, length)) {
> +  _mesa_error(ctx, GL_INVALID_VALUE, "glProgramBinary(binary
> incompatible)");
>

This seems wrong to me.  From the OES_get_program_binary spec:

...  and  must be
those returned by a previous call to GetProgramBinaryOES, and 
must
be the length of the program binary as returned by GetProgramBinaryOES
or
GetProgramiv with  PROGRAM_BINARY_LENGTH_OES.  The program binary
will fail to load if these conditions are not met.

...

A program object's program binary is replaced by calls to LinkProgram or
ProgramBinaryOES.  Either command sets the program object's LINK_STATUS
to
TRUE or FALSE, as queried with GetProgramiv, to reflect success or
failure.
Either command also updates its information log, queried with
GetProgramInfoLog, to provide details about warnings or errors.

I believe this means that if deserialization fails, it should not be a GL
error.  It should simply be treated as a link failure.


> +  return;
> +   }
> +
> +   /* driver specific link, optimizations and what not */
> +   ctx->Driver.LinkShader(ctx, shProg);
>

Now I'm confused.  I thought a major part of the purpose of this extension
was that it would store the post-link program, so that not only does
glProgramBinary() avoid the runtime penalty of parsing and compiling, it
also avoids the runtime penalty of link-time optimizations.  Calling
ctx->Driver.LinkShader seems like it defeats that purpose.  It seems like
what we ought to be doing is store the data that ctx->Driver.LinkShader
*produces* in the binary blob, so that once it's loaded there's no further
linking necessary.  If there is a small amount of driver-specific hook
necessary, that should be placed in a new ctx->Driver function rather than
incurring the overhead of another link.


> +
> +   _mesa_ValidateProgram(program);
>

I don't think this is correct.  ValidateProgram() doesn't mean "check that
the program is well-formed".  It means "check whether the program can
execute given the current GL state".  A lot of GL apps compile all their
shaders during startup, long before they've established the correct GL
state for running those programs.  It's reasonable to assume that apps
using this extension will make their calls to glProgramBinary() during
startup as well.  So calling ValidateProgram() from gl

Re: [Mesa-dev] [wip 8/9] glsl: functions to serialize gl_shader and gl_shader_program

2014-01-14 Thread Paul Berry
On 2 January 2014 03:58, Tapani Pälli  wrote:

> diff --git a/src/glsl/shader_cache.cpp b/src/glsl/shader_cache.cpp
> new file mode 100644
> index 000..4b5de45
> --- /dev/null
> +++ b/src/glsl/shader_cache.cpp
> @@ -0,0 +1,489 @@
> +/* -*- c++ -*- */
> +/*
> + * Copyright © 2013 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the
> "Software"),
> + * to deal in the Software without restriction, including without
> limitation
> + * the rights to use, copy, modify, merge, publish, distribute,
> sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the
> next
> + * paragraph) shall be included in all copies or substantial portions of
> the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT
> SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
> OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> + * DEALINGS IN THE SOFTWARE.
> + */
> +
> +#include "main/shaderobj.h"
> +#include "main/uniforms.h"
> +#include "main/macros.h"
> +#include "program/hash_table.h"
> +#include "ir_serialize.h"
> +#include "ir_deserializer.h"
> +#include "shader_cache_magic.h"
> +
> +/**
> + * It is currently unknown if we need these to be available.
> + * There can be calls in mesa/main (like glGetAttachedShaders) that use
> + * the Shaders list.
> + */
> +const bool STORE_UNLINKED_SHADERS = false;
>

I think it really exaggerates our level of uncertainty to say that it is
"currently unknown" whether we need unlinked shaders to be available.
Speaking for myself at least, I'm quite convinced that we don't.  AFAICT,
OES_get_program_binary was purposefully architected so taht we don't need
to store the unlinked shaders.  GL has always maintained two independent
collections of state about any given program: pre-link state and post-link
state.  There's a one-way flow of information from the pre-link state to
the post-link state--once you've linked a program using glLinkShader(),
there's no way to get the unlinked information back.  This isn't a problem
for most users of GL because most people don't try to modify the pre-link
state after linking the program.  However it's perfectly permissible for
someone to create a program, attach shaders, link, and then detatch all the
shaders.  Since detatching the shaders affects pre-link state, the program
would still work after this point (since the post-link information would
not have changed since the link) but it would be impossible to query
information about the individual shaders anymore.

A similar situation exists when the client uses OES_get_attached_shaders.
glProgramBinaryOES() bypasses the pre-link state completely, and drops the
compiled binary directly into the post-link state.  From the
OES_get_program_binary spec:

Note that ProgramBinaryOES disregards any shader objects attached to the
program object, as these shader objects are used only by LinkProgram.

And later:

7. Can BindAttribLocation be called after ProgramBinaryOES to remap an
   attribute location used by the program binary?

RESOLVED: No.  BindAttribLocation only affects the result of a
subsequent
call to LinkProgram.  LinkProgram operates on the attached shader
objects
and replaces any program binary loaded prior to LinkProgram.  So there
is no
mechanism to remap an attribute location after loading a program binary.

However, an application is free to remap an attribute location prior to
retrieving the program binary.  By calling BindAttribLocation followed
by
LinkProgram, an application can remap the attribute location.  If this
is
followed by a call to GetProgramBinaryOES, the retrieved program binary
will
include the desired attribute location assignment.

So if the user creates a program and calls glProgramBinaryOES() followed by
glGetAttachedShaders(), they will see no shaders, since the pre-link state
is still in its initial configuration of having no shaders attached.

IMHO, trying to plan for the contingency case where we're wrong about this
is just going to lead to confusion and make the code hard to maintain.  I
think we should drop this const, and remove the code that would have been
executed if STORE_UNLINKED_SHADERS were true.  In the unlikely event that
it turns out we were wrong about this (or Khronos makes a change to the
spec that makes it necessary to store unlinked shaders) we can always
change the code in a fu

Re: [Mesa-dev] [wip 6/9] glsl: ir_deserializer class for the binary shader cache

2014-01-13 Thread Paul Berry
On 2 January 2014 03:58, Tapani Pälli  wrote:

> +
> +
> +/**
> + * Reads header part of the binary blob. Main purpose of this header is to
> + * validate that cached shader was produced with same Mesa driver version.
> + */
> +int
> +ir_deserializer::read_header(struct gl_shader *shader)
> +{
> +   char *cache_magic_id = map->read_string();
> +   char *driver_vendor = map->read_string();
> +   char *driver_renderer = map->read_string();
> +
> +   /* only used or debug output, silence compiler warning */
> +   (void) driver_vendor;
> +   (void) driver_renderer;
>

A single version of Mesa potentially supports many different hardware
types, and those different hardware types may define different values of
GLSL built-in constants.  They also may require core Mesa to do different
sets of lowering passes during compilation.  So we can't just ignore
driver_vendor and driver_renderer.  We need to reject the binary blob if
they don't match.


> +
> +   shader->Version = map->read_uint32_t();
> +   shader->Type = map->read_uint32_t();
> +   shader->IsES = map->read_uint8_t();
> +
> +   CACHE_DEBUG("%s: version %d, type 0x%x, %s (mesa %s)\n[%s %s]\n",
> +   __func__,  shader->Version, shader->Type,
> +   (shader->IsES) ? "glsl es" : "desktop glsl",
> +   cache_magic_id, driver_vendor, driver_renderer);
> +
> +   const char *magic = mesa_get_shader_cache_magic();
> +
> +   if (memcmp(cache_magic_id, magic, strlen(magic)))
> +  return DIFFERENT_MESA_VER;
>

If cache_magic_id is "foobar" and magic is "foo", this will erroneusly
consider them equal.  The correct way to do this is to use strcmp().


> +
> +   /* post-link data */
> +   shader->num_samplers = map->read_uint32_t();
> +   shader->active_samplers = map->read_uint32_t();
> +   shader->shadow_samplers = map->read_uint32_t();
> +   shader->num_uniform_components = map->read_uint32_t();
> +   shader->num_combined_uniform_components = map->read_uint32_t();
> +   shader->uses_builtin_functions = map->read_uint8_t();
> +
> +   map->read(&shader->Geom, sizeof(shader->Geom));
> +
> +   for (unsigned i = 0; i < MAX_SAMPLERS; i++)
> +  shader->SamplerUnits[i] = map->read_uint8_t();
> +
> +   for (unsigned i = 0; i < MAX_SAMPLERS; i++)
> +  shader->SamplerTargets[i] = (gl_texture_index) map->read_int32_t();
> +
> +   return 0;
> +}
> +
> +
> +const glsl_type *
> +ir_deserializer::read_glsl_type()
> +{
> +   char *name = map->read_string();
> +   uint32_t type_size = map->read_uint32_t();
> +
> +   const glsl_type *existing_type =
> +  state->symbols->get_type(name);
> +
> +   /* if type exists, move read pointer forward and return type */
> +   if (existing_type) {
> +  map->ffwd(type_size);
> +  return existing_type;
> +   }
> +
> +   uint8_t base_type = map->read_uint8_t();
> +   uint32_t length = map->read_uint32_t();
> +   uint8_t vector_elms = map->read_uint8_t();
> +   uint8_t matrix_cols = map->read_uint8_t();
> +   uint8_t interface_packing = map->read_uint8_t();
> +
> +   /* array type has additional element_type information */
> +   if (base_type == GLSL_TYPE_ARRAY) {
> +  const glsl_type *element_type = read_glsl_type();
> +  if (!element_type) {
> + CACHE_DEBUG("error reading array element type\n");
> + return NULL;
> +  }
> +  return glsl_type::get_array_instance(element_type, length);
> +   }
> +
> +   /* structures have fields containing of names and types */
> +   else if (base_type == GLSL_TYPE_STRUCT ||
> +  base_type == GLSL_TYPE_INTERFACE) {
> +  glsl_struct_field *fields = ralloc_array(mem_ctx,
> + glsl_struct_field, length);
> +
> +  if (!fields)
> + return glsl_type::error_type;
> +
> +  for (unsigned k = 0; k < length; k++) {
> + uint8_t row_major, interpolation, centroid;
> + int32_t location;
> + char *field_name = map->read_string();
> + fields[k].name = _mesa_strdup(field_name);
> + fields[k].type = read_glsl_type();
> + row_major = map->read_uint8_t();
> + location = map->read_int32_t();
> + interpolation = map->read_uint8_t();
> + centroid = map->read_uint8_t();
> + fields[k].row_major = row_major;
> + fields[k].location = location;
> + fields[k].interpolation = interpolation;
> + fields[k].centroid = centroid;
>

Another security issue: if the binary blob is corrupted, length may be
outrageously large (e.g. 0x).  We need a way for this loop to bail
out and exit if it tries to read past the end of the binary blob.


> +  }
> +
> +  const glsl_type *ret_type = NULL;
> +
> +  if (base_type == GLSL_TYPE_STRUCT)
> + ret_type = glsl_type::get_record_instance(fields, length, name);
> +  else if (base_type == GLSL_TYPE_INTERFACE)
> + ret_type = glsl_type::get_interface_instance(fields,
> +length, (glsl_interface_packing) interface_packing, name);
> +
> +  /* free allo

Re: [Mesa-dev] [wip 3/9] glsl: memory_map helper class for data deserialization

2014-01-13 Thread Paul Berry
On 2 January 2014 03:58, Tapani Pälli  wrote:

> Class will be used by the shader binary cache implementation.
>
> Signed-off-by: Tapani Pälli 
> ---
>  src/glsl/memory_map.h | 174
> ++
>  1 file changed, 174 insertions(+)
>  create mode 100644 src/glsl/memory_map.h
>
> diff --git a/src/glsl/memory_map.h b/src/glsl/memory_map.h
> new file mode 100644
> index 000..1b68b72
> --- /dev/null
> +++ b/src/glsl/memory_map.h
> @@ -0,0 +1,174 @@
> +/* -*- c++ -*- */
> +/*
> + * Copyright © 2013 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the
> "Software"),
> + * to deal in the Software without restriction, including without
> limitation
> + * the rights to use, copy, modify, merge, publish, distribute,
> sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the
> next
> + * paragraph) shall be included in all copies or substantial portions of
> the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT
> SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
> OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> + * DEALINGS IN THE SOFTWARE.
> + */
> +
> +#pragma once
> +#ifndef MEMORY_MAP_H
> +#define MEMORY_MAP_H
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#ifdef __cplusplus
> +
> +/**
> + * Helper class to read data
> + *
> + * Class can read either from user given memory or from a file. On Linux
> + * file reading wraps around the Posix functions for mapping a file into
> + * the process's address space. Other OS may need different
> implementation.
> + */
> +class memory_map
> +{
> +public:
> +   memory_map() :
> +  mode(memory_map::READ_MEM),
> +  fd(0),
> +  cache_size(0),
> +  cache_mmap(NULL),
> +  cache_mmap_p(NULL)
> +   {
> +  /* only used by read_string() */
> +  mem_ctx = ralloc_context(NULL);
> +   }
> +
> +   /* read from disk */
> +   int map(const char *path)
> +   {
> +  struct stat stat_info;
> +  if (stat(path, &stat_info) != 0)
> + return -1;
>

As before, I'm not thrilled with the use of -1 to mean failure and 0 to
mean success, because it forces the caller to use counterintuitive if
statements.  I'd prefer for map() to return a bool with true meaning
success and false meaning failure.


> +
> +  mode = memory_map::READ_MAP;
> +  cache_size = stat_info.st_size;
> +
> +  fd = open(path, O_RDONLY);
> +  if (fd) {
> + cache_mmap_p = cache_mmap = (char *)
> +mmap(NULL, cache_size, PROT_READ, MAP_PRIVATE, fd, 0);
> + return (cache_mmap == MAP_FAILED) ? -1 : 0;
>

MAP_FAILED is a nonzero value, so if this error condition ever occurs, the
destructor will errneously try to call munmap().

What I'd recommend doing instead is:

void *mmap_result = mmap(...);
if (mmap_result == MAP_FAILED) {
   close(fd);
   return -1;
}
cache_mmap_p = cache_mmap = (char *) mmap_result;
return 0;

>
> +  }
> +  return -1;
> +   }
> +
> +   /* read from memory */
> +   int map(const void *memory, size_t size)
> +   {
> +  cache_mmap_p = cache_mmap = (char *) memory;
> +  cache_size = size;
> +  return 0;
> +   }
>

IMHO, functions that cannot fail should return void.


> +
> +   /* wrap a portion from another map */
> +   int map(memory_map &map, size_t size)
> +   {
> +  cache_mmap_p = cache_mmap = map.cache_mmap_p;
> +  cache_size = size;
> +  map.ffwd(size);
> +  return 0;
> +   }
> +
> +   ~memory_map() {
> +  if (cache_mmap && mode == READ_MAP) {
> + munmap(cache_mmap, cache_size);
> + close(fd);
> +  }
> +  ralloc_free(mem_ctx);
> +   }
> +
> +   /* move read pointer forward */
> +   inline void ffwd(int len)
> +   {
> +  cache_mmap_p += len;
> +   }
> +
> +   inline void jump(unsigned pos)
> +   {
> +  cache_mmap_p = cache_mmap + pos;
> +   }
> +
> +
> +   /* position of read pointer */
> +   inline uint32_t position()
> +   {
> +  return cache_mmap_p - cache_mmap;
> +   }
> +
> +   inline char *read_string()
> +   {
> +  char *str = ralloc_strdup(mem_ctx, cache_mmap_p);
> +  ffwd(strlen(str)+1);
> +  return str;
>

This is problematic from a security perspective.  If the client provides
corrupted data that ends in a truncated string (lacking a null terminator)
that could cause ralloc_strdup() to try to read beyond the end of the
file.  We 

Re: [Mesa-dev] [wip 2/9] glsl: serialize methods for IR instructions

2014-01-13 Thread Paul Berry
On 2 January 2014 03:58, Tapani Pälli  wrote:

> diff --git a/src/glsl/ir_serialize.cpp b/src/glsl/ir_serialize.cpp
> new file mode 100644
> index 000..30ca018
> --- /dev/null
> +++ b/src/glsl/ir_serialize.cpp
> @@ -0,0 +1,392 @@
> +/* -*- c++ -*- */
> +/*
> + * Copyright © 2013 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the
> "Software"),
> + * to deal in the Software without restriction, including without
> limitation
> + * the rights to use, copy, modify, merge, publish, distribute,
> sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the
> next
> + * paragraph) shall be included in all copies or substantial portions of
> the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT
> SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
> OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> + * DEALINGS IN THE SOFTWARE.
> + */
> +
> +#include "ir_serialize.h"
> +
> +
> +/**
> + * Wraps serialization of an ir instruction, writes ir_type
> + * and length of each instruction package as a header for it
> + */
> +void
> +ir_instruction::serialize(memory_writer &mem)
> +{
> +   uint32_t data_len = 0;
> +   uint8_t ir_type = this->ir_type;
> +   mem.write_uint8_t(ir_type);
> +
> +   int32_t start_pos = mem.position();
> +   mem.write_uint32_t(data_len);
> +
> +   this->serialize_data(mem);
> +
> +   data_len = mem.position() - start_pos - sizeof(data_len);
> +   mem.overwrite(&data_len, sizeof(data_len), start_pos);
>

This function isn't checking the return values from mem.write_*(), so
there's no way for it to detect failure.  Also, since this function returns
void, there's no way for it to notify the caller of failure.  A similar
comment applies to all of the other serialize*() functions in this patch.
(Of course, considering our previous discussion about potentially removing
these int return values, this issue may be moot).


> +}
> +
> +
> +
> +
> +static void
> +serialize_glsl_type(const glsl_type *type, memory_writer &mem)
>

The last time I reviewed this series, I mentioned the idea of making a
hashtable that maps each glsl_type to a small integer, so that we could
serialize each type just once (see
http://lists.freedesktop.org/archives/mesa-dev/2013-November/047740.html).
At the time, it sounded like you liked that idea.  Have you made that
change?  It looks to me like you've stopped serializing the built-in types,
but user-defined types are still serialized each time they occur.

With those two issues addressed, the patch is:

Reviewed-by: Paul Berry 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [wip 1/9] glsl: memory_writer helper class for data serialization

2014-01-13 Thread Paul Berry
On 2 January 2014 03:58, Tapani Pälli  wrote:

> Class will be used by the shader binary cache implementation.
>
> Signed-off-by: Tapani Pälli 
> ---
>  src/glsl/memory_writer.h | 147
> +++
>  1 file changed, 147 insertions(+)
>  create mode 100644 src/glsl/memory_writer.h
>
> diff --git a/src/glsl/memory_writer.h b/src/glsl/memory_writer.h
> new file mode 100644
> index 000..a6c6b55
> --- /dev/null
> +++ b/src/glsl/memory_writer.h
> @@ -0,0 +1,147 @@
> +/* -*- c++ -*- */
> +/*
> + * Copyright © 2013 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the
> "Software"),
> + * to deal in the Software without restriction, including without
> limitation
> + * the rights to use, copy, modify, merge, publish, distribute,
> sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the
> next
> + * paragraph) shall be included in all copies or substantial portions of
> the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT
> SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
> OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> + * DEALINGS IN THE SOFTWARE.
> + */
> +
> +#pragma once
> +#ifndef MEMORY_WRITER_H
> +#define MEMORY_WRITER_H
> +
> +#include 
> +#include 
> +#include 
> +
> +#ifdef __cplusplus
> +/**
> + * Helper class for writing data to memory
> + *
> + * This class maintains a dynamically-sized memory buffer and allows
> + * for data to be efficiently appended to it with automatic resizing.
> + */
> +class memory_writer
> +{
> +public:
> +   memory_writer() :
> +  memory(NULL),
> +  curr_size(0),
> +  pos(0) {}
> +
> +   ~memory_writer()
> +   {
> +  free(memory);
> +   }
> +
> +   /* user wants to claim the memory */
> +   char *release_memory(size_t *size)
> +   {
> +  /* final realloc to free allocated but unused memory */
> +  char *result = (char *) realloc(memory, pos);
> +  *size = pos;
> +  memory = NULL;
> +  curr_size = 0;
> +  pos = 0;
> +  return result;
> +   }
> +
> +/**
> + * write functions per type
> + */
> +#define DECL_WRITER(type) int write_ ##type (const type data) {\
> +   return write(&data, sizeof(type));\
> +}
> +
> +   DECL_WRITER(int32_t);
> +   DECL_WRITER(int64_t);
> +   DECL_WRITER(uint8_t);
> +   DECL_WRITER(uint32_t);
> +
> +   int write_bool(bool data)
> +   {
> +  uint8_t val = data;
> +  return write_uint8_t(val);
> +   }
> +
> +   /* write function that reallocates more memory if required */
> +   int write(const void *data, int32_t size)
> +   {
> +  if (!memory || pos > (int32_t)(curr_size - size))
> + if (grow(size))
> +return -1;
> +
> +  memcpy(memory + pos, data, size);
> +
> +  pos += size;
> +  return 0;
> +   }
> +
> +   int overwrite(const void *data, int32_t size, int32_t offset)
> +   {
> +  if (offset < 0 || offset + size > pos)
> + return -1;
> +  memcpy(memory + offset, data, size);
> +  return 0;
> +   }
> +
> +   int write_string(const char *str)
> +   {
> +  if (!str)
> + return -1;
> +  char terminator = '\0';
> +  write(str, strlen(str));
> +  write(&terminator, 1);
>

C strings include a terminator, so there's no reason to write out the
string contents and the terminator separtely.  You can just do:

write(str, strlen(str) + 1);

Also, don't forget to propagate the return code to the caller:

return write(str, strlen(str) + 1);


> +  return 0;
> +   }
> +
> +   inline int32_t position() { return pos; }
> +
> +
> +private:
> +
> +   /* reallocate more memory */
> +   int grow(int32_t size)
> +   {
> +  int32_t new_size = 2 * (curr_size + size);
> +  char *more_mem = (char *) realloc(memory, new_size);
> +  if (more_mem == NULL) {
> + free(memory);
> + memory = NULL;
> + return -1;
> +  } else {
> + memory = more_mem;
> + curr_size = new_size;
> + return 0;
> +  }
> +   }
> +
> +   /* allocated memory */
> +   char *memory;
> +
> +   /* current size of the whole allocation */
> +   int32_t curr_size;
> +
> +   /* write position / size of the data written */
> +   int32_t pos;
> +};
> +
> +#endif /* ifdef __cplusplus */
> +
> +#endif /* MEMORY_WRITER_H */
> --
> 1.8.3.1
>
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop

[Mesa-dev] [PATCH] i965: Ensure that all necessary state is re-emitted if we run out of aperture.

2014-01-10 Thread Paul Berry
Prior to this patch, if we ran out of aperture space during
brw_try_draw_prims(), we would rewind the batch buffer pointer
(potentially throwing some state that may have been emitted by
brw_upload_state()), flush the batch, and then try again.  However, we
wouldn't reset the dirty bits to the state they had before the call to
brw_upload_state().  As a result, when we tried again, there was a
danger that we wouldn't re-emit all the necessary state.  (Note: prior
to the introduction of hardware contexts, this wasn't a problem
because flushing the batch forced all state to be re-emitted).

This patch fixes the problem by leaving the dirty bits set at the end
of brw_upload_state(); we only clear them after we have determined
that we don't need to rewind the batch buffer.

Cc: 10.0 9.2 
---
 src/mesa/drivers/dri/i965/brw_draw.c |  6 ++
 src/mesa/drivers/dri/i965/brw_state.h|  1 +
 src/mesa/drivers/dri/i965/brw_state_upload.c | 14 ++
 3 files changed, 21 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_draw.c 
b/src/mesa/drivers/dri/i965/brw_draw.c
index b898cd3..ec56a21 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -499,6 +499,12 @@ retry:
}
 }
   }
+
+  /* Now that we know we haven't run out of aperture space, we can safely
+   * reset the dirty bits.
+   */
+  if (brw->state.dirty.brw)
+ brw_clear_dirty_bits(brw);
}
 
if (brw->always_flush_batch)
diff --git a/src/mesa/drivers/dri/i965/brw_state.h 
b/src/mesa/drivers/dri/i965/brw_state.h
index a148125..e9428fb 100644
--- a/src/mesa/drivers/dri/i965/brw_state.h
+++ b/src/mesa/drivers/dri/i965/brw_state.h
@@ -143,6 +143,7 @@ brw_depthbuffer_format(struct brw_context *brw);
  * brw_state.c
  */
 void brw_upload_state(struct brw_context *brw);
+void brw_clear_dirty_bits(struct brw_context *brw);
 void brw_init_state(struct brw_context *brw);
 void brw_destroy_state(struct brw_context *brw);
 
diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c 
b/src/mesa/drivers/dri/i965/brw_state_upload.c
index 1eb3a79..7490df4 100644
--- a/src/mesa/drivers/dri/i965/brw_state_upload.c
+++ b/src/mesa/drivers/dri/i965/brw_state_upload.c
@@ -573,6 +573,20 @@ void brw_upload_state(struct brw_context *brw)
 fprintf(stderr, "\n");
   }
}
+}
+
 
+/**
+ * Clear dirty bits to account for the fact that the state emitted by
+ * brw_upload_state() has been committed to the hardware.  This is a separate
+ * call from brw_upload_state() because it's possible that after the call to
+ * brw_upload_state(), we will discover that we've run out of aperture space,
+ * and need to rewind the batch buffer to the state it had before the
+ * brw_upload_state() call.
+ */
+void
+brw_clear_dirty_bits(struct brw_context *brw)
+{
+   struct brw_state_flags *state = &brw->state.dirty;
memset(state, 0, sizeof(*state));
 }
-- 
1.8.5.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 12/30] mesa: Change redundant code into loops in texstate.c.

2014-01-10 Thread Paul Berry
On 10 January 2014 11:41, Kenneth Graunke  wrote:

> On 01/09/2014 06:19 PM, Paul Berry wrote:
> > This is possible now that ctx->Shader.CurrentProgram is an array.
> > ---
> >  src/mesa/main/texstate.c | 75
> +++-
> >  1 file changed, 29 insertions(+), 46 deletions(-)
> >
> > diff --git a/src/mesa/main/texstate.c b/src/mesa/main/texstate.c
> > index b9c76da..905a9d5 100644
> > --- a/src/mesa/main/texstate.c
> > +++ b/src/mesa/main/texstate.c
> > @@ -526,27 +526,20 @@ static void
> >  update_texture_state( struct gl_context *ctx )
> >  {
> > GLuint unit;
> > -   struct gl_program *fprog = NULL;
> > -   struct gl_program *vprog = NULL;
> > -   struct gl_program *gprog = NULL;
> > +   struct gl_program *prog[MESA_SHADER_STAGES];
> > GLbitfield enabledFragUnits = 0x0;
> > -
> > -   if (ctx->Shader.CurrentProgram[MESA_SHADER_VERTEX] &&
> > -   ctx->Shader.CurrentProgram[MESA_SHADER_VERTEX]->LinkStatus) {
> > -  vprog =
> ctx->Shader.CurrentProgram[MESA_SHADER_VERTEX]->_LinkedShaders[MESA_SHADER_VERTEX]->Program;
> > -   }
> > -
> > -   if (ctx->Shader.CurrentProgram[MESA_SHADER_GEOMETRY] &&
> > -   ctx->Shader.CurrentProgram[MESA_SHADER_GEOMETRY]->LinkStatus) {
> > -  gprog =
> ctx->Shader.CurrentProgram[MESA_SHADER_GEOMETRY]->_LinkedShaders[MESA_SHADER_GEOMETRY]->Program;
> > -   }
> > -
> > -   if (ctx->Shader.CurrentProgram[MESA_SHADER_FRAGMENT] &&
> > -   ctx->Shader.CurrentProgram[MESA_SHADER_FRAGMENT]->LinkStatus) {
> > -  fprog =
> ctx->Shader.CurrentProgram[MESA_SHADER_FRAGMENT]->_LinkedShaders[MESA_SHADER_FRAGMENT]->Program;
> > -   }
> > -   else if (ctx->FragmentProgram._Enabled) {
> > -  fprog = &ctx->FragmentProgram.Current->Base;
> > +   int i;
> > +
> > +   for (i = 0; i < MESA_SHADER_STAGES; i++) {
> > +  if (ctx->Shader.CurrentProgram[i] &&
> > +  ctx->Shader.CurrentProgram[i]->LinkStatus) {
> > + prog[i] =
> ctx->Shader.CurrentProgram[i]->_LinkedShaders[i]->Program;
> > +  } else {
> > + if (i == MESA_SHADER_FRAGMENT && ctx->FragmentProgram._Enabled)
> > +prog[i] = &ctx->FragmentProgram.Current->Base;
> > + else
> > +prog[i] = NULL;
> > +  }
> > }
> >
> > /* TODO: only set this if there are actual changes */
> > @@ -562,9 +555,7 @@ update_texture_state( struct gl_context *ctx )
> >  */
> > for (unit = 0; unit < ctx->Const.MaxCombinedTextureImageUnits;
> unit++) {
> >struct gl_texture_unit *texUnit = &ctx->Texture.Unit[unit];
> > -  GLbitfield enabledVertTargets = 0x0;
> > -  GLbitfield enabledFragTargets = 0x0;
> > -  GLbitfield enabledGeomTargets = 0x0;
> > +  GLbitfield enabledTargetsByStage[MESA_SHADER_STAGES];
> >GLbitfield enabledTargets = 0x0;
> >GLuint texIndex;
> >
> > @@ -574,25 +565,16 @@ update_texture_state( struct gl_context *ctx )
> > * by a fragment program/program.  When multiple flags are set,
> we'll
> > * settle on the one with highest priority (see below).
> > */
> > -  if (vprog) {
> > - enabledVertTargets |= vprog->TexturesUsed[unit];
> > +  for (i = 0; i < MESA_SHADER_STAGES; i++) {
> > + if (prog[i])
> > +enabledTargetsByStage[i] = prog[i]->TexturesUsed[unit];
> > + else if (i == MESA_SHADER_FRAGMENT)
> > +enabledTargetsByStage[i] = texUnit->Enabled;
> > + else
> > +enabledTargetsByStage[i] = 0;
> > + enabledTargets |= enabledTargetsByStage[i];
>
> This doesn't look equivalent.  Previously,
> enabled{Vert,Frag,Geom}Targets would accumulate bits over every
> iteration through units.  Now, enabledTargetsByStage only reflects the
> most recent unit.
>
> enabledTargets still accumulates properly, but you use
> enabledTargetsByStage below, so I think it needs to as well.
>
> You just need to use |= in both places, like the old code.
>

It's definitely possible that I got something wrong in this patch--I didn't
completely grok what the code was doing and instead strived to just
preserve the existing semantics.

But I'm pretty sure I got the enabledTargetsByStage part right.  Here's
what the code used to look like--the enabled{Vert,Frag,Geom}Targets
variables were all res

Re: [Mesa-dev] [PATCH 19/30] mesa/cs: Handle compute shaders in _mesa_use_program().

2014-01-10 Thread Paul Berry
On 9 January 2014 22:32, Chris Forbes  wrote:

> Minor nit, but could CS be done after the ordered pipeline stages, for
> consistency?
>

Sure, no problem.

In fact, I just realized that if I change the "type" parameter of
use_shader_program to gl_shader_stage, I can call it in a loop.  I'll do
that in a follow-up patch.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 11/30] mesa: Change redundant code into loops in shaderapi.c.

2014-01-10 Thread Paul Berry
On 9 January 2014 20:03, Chris Forbes  wrote:

> This is a slightly odd construction (although copied from the existing
> code):
>
> >  +  if ((shProg == NULL) || (shProg->_LinkedShaders[stage] == NULL))
> >  +shProg = NULL;
>

You're right.  This would be much better as:

   if ((shProg != NULL) && (shProg->_LinkedShaders[stage] == NULL))
  shProg = NULL;

I'll update it in a follow-up patch.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 05/30] glsl/linker: Refactor in preparation for adding more shader stages.

2014-01-10 Thread Paul Berry
On 9 January 2014 22:17, Chris Forbes  wrote:

> This is a nice cleanup; I like that this brings both writes to
> prog->LastClipDistanceArraySize together -- but it looks like the
> behavior changes slightly.
>
> Previously, if there was no VS and no GS, then we would never write
> prog->LastClipDistanceArraySize. Now we'll read an old junk value
> (potentially from a previous linking of the same program object with
> different shaders attached) from prog->Vert.ClipDistanceArraySize
> (since we never called validate_vertex_shader_executable) -- but we'll
> never end up actually using it, since it's only used for transform
> feedback of gl_ClipDistance.
>

Yeah, that's a good point.  I agree that it's completely benign, but I
think I could have made things clearer.  I've changed it to this:

   if (num_shaders[MESA_SHADER_GEOMETRY] > 0)
  prog->LastClipDistanceArraySize = prog->Geom.ClipDistanceArraySize;
   else if (num_shaders[MESA_SHADER_VERTEX] > 0)
  prog->LastClipDistanceArraySize = prog->Vert.ClipDistanceArraySize;
   else
  prog->LastClipDistanceArraySize = 0; /* Not used */
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] i965: Fix clears of layered framebuffers with mismatched layer counts.

2014-01-09 Thread Paul Berry
On 9 January 2014 18:19, Chris Forbes  wrote:

> I'm not convinced this is correct/safe, at least for the meta clear.
>
> The GL 3.2 spec says, on page 221 (page 235 of the PDF):
>
>   * If the number of layers of each attachment are not all identical,
> rendering
> will be limited to the smallest number of layers of any attachment.
>
> And then in the description of layered framebuffers on page 240 (page
> 254 of the PDF):
>
>If the fragment’s layer number is negative, or greater than the
> minimum number of layers of any attachment,
>the effects of the fragment on the framebuffer contents are undefined.
>
> Blorp clears can obviously do whatever they like, since they don't
> directly use the user's framebuffer configuration.
>

Yeah, there was some discussion about that here:
http://lists.freedesktop.org/archives/mesa-dev/2013-December/049492.html

Basically, my justification for doing it this way is that even though it's
undefined by GL what happens when you write to a layer that doesn't exist
in all attachments, we know that it works in i965, and i965 is the only
user of this code (it's the only back-end that supports layered
framebuffers and uses Meta).

In fact, I would venture to say that i965 is likely to be the only user of
this code for the forseeable future, since the only other back-end that
uses Meta is the i915 driver, and it's never going to be capable of
supporting layered framebuffers.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 30/30] i965/cs: Allow ARB_compute_shader to be enabled via env var.

2014-01-09 Thread Paul Berry
This will allow testing of compute shader functionality before it is
completed.

To enable ARB_compute_shader functionality in the i965 driver, set
INTEL_COMPUTE_SHADER=1.
---
 src/mesa/drivers/dri/i965/brw_context.c  | 11 ++-
 src/mesa/drivers/dri/i965/intel_extensions.c |  2 ++
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index 1b42751..76dd9be 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -298,10 +298,17 @@ brw_initialize_context_constants(struct brw_context *brw)
   ctx->Const.Program[MESA_SHADER_GEOMETRY].MaxTextureImageUnits = 
BRW_MAX_TEX_UNIT;
else
   ctx->Const.Program[MESA_SHADER_GEOMETRY].MaxTextureImageUnits = 0;
+   if (getenv("INTEL_COMPUTE_SHADER")) {
+  ctx->Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits = 
BRW_MAX_TEX_UNIT;
+  ctx->Const.MaxUniformBufferBindings += 12;
+   } else {
+  ctx->Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits = 0;
+   }
ctx->Const.MaxCombinedTextureImageUnits =
   ctx->Const.Program[MESA_SHADER_VERTEX].MaxTextureImageUnits +
   ctx->Const.Program[MESA_SHADER_FRAGMENT].MaxTextureImageUnits +
-  ctx->Const.Program[MESA_SHADER_GEOMETRY].MaxTextureImageUnits;
+  ctx->Const.Program[MESA_SHADER_GEOMETRY].MaxTextureImageUnits +
+  ctx->Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits;
 
ctx->Const.MaxTextureLevels = 14; /* 8192 */
if (ctx->Const.MaxTextureLevels > MAX_TEXTURE_LEVELS)
@@ -425,9 +432,11 @@ brw_initialize_context_constants(struct brw_context *brw)
   ctx->Const.Program[MESA_SHADER_FRAGMENT].MaxAtomicCounters = 
MAX_ATOMIC_COUNTERS;
   ctx->Const.Program[MESA_SHADER_VERTEX].MaxAtomicCounters = 
MAX_ATOMIC_COUNTERS;
   ctx->Const.Program[MESA_SHADER_GEOMETRY].MaxAtomicCounters = 
MAX_ATOMIC_COUNTERS;
+  ctx->Const.Program[MESA_SHADER_COMPUTE].MaxAtomicCounters = 
MAX_ATOMIC_COUNTERS;
   ctx->Const.Program[MESA_SHADER_FRAGMENT].MaxAtomicBuffers = BRW_MAX_ABO;
   ctx->Const.Program[MESA_SHADER_VERTEX].MaxAtomicBuffers = BRW_MAX_ABO;
   ctx->Const.Program[MESA_SHADER_GEOMETRY].MaxAtomicBuffers = BRW_MAX_ABO;
+  ctx->Const.Program[MESA_SHADER_COMPUTE].MaxAtomicBuffers = BRW_MAX_ABO;
   ctx->Const.MaxCombinedAtomicBuffers = 3 * BRW_MAX_ABO;
}
 
diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c 
b/src/mesa/drivers/dri/i965/intel_extensions.c
index de07b7f..27bc97b 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -294,6 +294,8 @@ intelInitExtensions(struct gl_context *ctx)
  ctx->Extensions.ARB_transform_feedback_instanced = true;
  ctx->Extensions.ARB_draw_indirect = true;
   }
+  if (getenv("INTEL_COMPUTE_SHADER"))
+ ctx->Extensions.ARB_compute_shader = true;
}
 
if (brw->gen == 5 || can_write_oacontrol(brw))
-- 
1.8.5.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 13/30] mesa/cs: Add extension enable flags for ARB_compute_shader.

2014-01-09 Thread Paul Berry
---
 src/glsl/glcpp/glcpp-parse.y| 3 +++
 src/glsl/glsl_parser_extras.cpp | 1 +
 src/glsl/glsl_parser_extras.h   | 2 ++
 src/glsl/standalone_scaffolding.cpp | 1 +
 src/mesa/main/extensions.c  | 1 +
 src/mesa/main/mtypes.h  | 1 +
 6 files changed, 9 insertions(+)

diff --git a/src/glsl/glcpp/glcpp-parse.y b/src/glsl/glcpp/glcpp-parse.y
index 55c4981..359bcea 100644
--- a/src/glsl/glcpp/glcpp-parse.y
+++ b/src/glsl/glcpp/glcpp-parse.y
@@ -1284,6 +1284,9 @@ glcpp_parser_create (const struct gl_extensions 
*extensions, int api)
 
  if (extensions->AMD_shader_trinary_minmax)
 add_builtin_define(parser, "GL_AMD_shader_trinary_minmax", 1);
+
+  if (extensions->ARB_compute_shader)
+ add_builtin_define(parser, "GL_ARB_compute_shader", 1);
   }
}
 
diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp
index 21dc3ab..33a43c5 100644
--- a/src/glsl/glsl_parser_extras.cpp
+++ b/src/glsl/glsl_parser_extras.cpp
@@ -513,6 +513,7 @@ static const _mesa_glsl_extension 
_mesa_glsl_supported_extensions[] = {
EXT(ARB_shader_atomic_counters, true,  false, 
ARB_shader_atomic_counters),
EXT(ARB_sample_shading, true,  false, ARB_sample_shading),
EXT(AMD_shader_trinary_minmax,  true,  false, dummy_true),
+   EXT(ARB_compute_shader, true,  false, ARB_compute_shader),
 };
 
 #undef EXT
diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h
index 2444a96..468707c 100644
--- a/src/glsl/glsl_parser_extras.h
+++ b/src/glsl/glsl_parser_extras.h
@@ -352,6 +352,8 @@ struct _mesa_glsl_parse_state {
bool ARB_shader_atomic_counters_warn;
bool AMD_shader_trinary_minmax_enable;
bool AMD_shader_trinary_minmax_warn;
+   bool ARB_compute_shader_enable;
+   bool ARB_compute_shader_warn;
/*@}*/
 
/** Extensions supported by the OpenGL implementation. */
diff --git a/src/glsl/standalone_scaffolding.cpp 
b/src/glsl/standalone_scaffolding.cpp
index 257d2e7..fe66067 100644
--- a/src/glsl/standalone_scaffolding.cpp
+++ b/src/glsl/standalone_scaffolding.cpp
@@ -91,6 +91,7 @@ void initialize_context_to_defaults(struct gl_context *ctx, 
gl_api api)
 
ctx->Extensions.dummy_false = false;
ctx->Extensions.dummy_true = true;
+   ctx->Extensions.ARB_compute_shader = true;
ctx->Extensions.ARB_conservative_depth = true;
ctx->Extensions.ARB_draw_instanced = true;
ctx->Extensions.ARB_ES2_compatibility = true;
diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c
index efda548..fb95550 100644
--- a/src/mesa/main/extensions.c
+++ b/src/mesa/main/extensions.c
@@ -84,6 +84,7 @@ static const struct extension extension_table[] = {
{ "GL_ARB_blend_func_extended", o(ARB_blend_func_extended), 
GL, 2009 },
{ "GL_ARB_clear_buffer_object", o(dummy_true),  
GL, 2012 },
{ "GL_ARB_color_buffer_float",  o(ARB_color_buffer_float),  
GL, 2004 },
+   { "GL_ARB_compute_shader",  o(ARB_compute_shader),  
GL, 2012 },
{ "GL_ARB_copy_buffer", o(dummy_true),  
GL, 2008 },
{ "GL_ARB_conservative_depth",  o(ARB_conservative_depth),  
GL, 2011 },
{ "GL_ARB_debug_output",o(dummy_true),  
GL, 2009 },
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index bdd2804..7ba7b10 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -3330,6 +3330,7 @@ struct gl_extensions
GLboolean ARB_base_instance;
GLboolean ARB_blend_func_extended;
GLboolean ARB_color_buffer_float;
+   GLboolean ARB_compute_shader;
GLboolean ARB_conservative_depth;
GLboolean ARB_depth_buffer_float;
GLboolean ARB_depth_clamp;
-- 
1.8.5.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 27/30] glsl/cs: Prohibit user-defined ins/outs in compute shaders.

2014-01-09 Thread Paul Berry
---
 src/glsl/ast_to_hir.cpp | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index c1b1d6a..ed3a4f9 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -2220,6 +2220,13 @@ apply_type_qualifier_to_variable(const struct 
ast_type_qualifier *qual,
   var->data.mode = ir_var_uniform;
 
if (!is_parameter && is_varying_var(var, state->stage)) {
+  /* User-defined ins/outs are not permitted in compute shaders. */
+  if (state->stage == MESA_SHADER_COMPUTE) {
+ _mesa_glsl_error(loc, state,
+  "user-defined input and output variables are not "
+  "permitted in compute shaders");
+  }
+
   /* This variable is being used to link data between shader stages (in
* pre-glsl-1.30 parlance, it's a "varying").  Check that it has a type
* that is allowed for such purposes.
-- 
1.8.5.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 04/30] mesa: use _mesa_validate_shader_target() more frequently.

2014-01-09 Thread Paul Berry
This patch replaces code in _mesa_new_shader() and delete_shader_cb()
that checks the type of a shader with calls to
_mesa_validate_shader_target().  This has two advantages: it allows
for a more thorough check (since _mesa_validate_shader_target()
doesn't permit shader targets that aren't supported by the back-end),
and it reduces the amount of code that will need to be modified when
adding new shader stages.
---
 src/mesa/main/shaderobj.c | 4 ++--
 src/mesa/main/shared.c| 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/mesa/main/shaderobj.c b/src/mesa/main/shaderobj.c
index dc81bbc..4f4bb69 100644
--- a/src/mesa/main/shaderobj.c
+++ b/src/mesa/main/shaderobj.c
@@ -34,6 +34,7 @@
 #include "main/context.h"
 #include "main/hash.h"
 #include "main/mtypes.h"
+#include "main/shaderapi.h"
 #include "main/shaderobj.h"
 #include "main/uniforms.h"
 #include "program/program.h"
@@ -105,8 +106,7 @@ struct gl_shader *
 _mesa_new_shader(struct gl_context *ctx, GLuint name, GLenum type)
 {
struct gl_shader *shader;
-   assert(type == GL_FRAGMENT_SHADER || type == GL_VERTEX_SHADER ||
-  type == GL_GEOMETRY_SHADER_ARB);
+   assert(_mesa_validate_shader_target(ctx, type));
shader = rzalloc(NULL, struct gl_shader);
if (shader) {
   shader->Type = type;
diff --git a/src/mesa/main/shared.c b/src/mesa/main/shared.c
index 2f73cf3..c11c7f9 100644
--- a/src/mesa/main/shared.c
+++ b/src/mesa/main/shared.c
@@ -38,6 +38,7 @@
 #include "dlist.h"
 #include "samplerobj.h"
 #include "set.h"
+#include "shaderapi.h"
 #include "shaderobj.h"
 #include "syncobj.h"
 
@@ -218,8 +219,7 @@ delete_shader_cb(GLuint id, void *data, void *userData)
 {
struct gl_context *ctx = (struct gl_context *) userData;
struct gl_shader *sh = (struct gl_shader *) data;
-   if (sh->Type == GL_FRAGMENT_SHADER || sh->Type == GL_VERTEX_SHADER ||
-   sh->Type == GL_GEOMETRY_SHADER) {
+   if (_mesa_validate_shader_target(ctx, sh->Type)) {
   ctx->Driver.DeleteShader(ctx, sh);
}
else {
-- 
1.8.5.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 26/30] main/cs: Implement query for COMPUTE_WORK_GROUP_SIZE.

2014-01-09 Thread Paul Berry
---
 src/mesa/main/shaderapi.c | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
index 053f27b..680d449 100644
--- a/src/mesa/main/shaderapi.c
+++ b/src/mesa/main/shaderapi.c
@@ -663,6 +663,24 @@ get_programiv(struct gl_context *ctx, GLuint program, 
GLenum pname, GLint *param
 
   *params = shProg->NumAtomicBuffers;
   return;
+   case GL_COMPUTE_WORK_GROUP_SIZE: {
+  int i;
+  if (!_mesa_is_desktop_gl(ctx) || !ctx->Extensions.ARB_compute_shader)
+ break;
+  if (!shProg->LinkStatus) {
+ _mesa_error(ctx, GL_INVALID_OPERATION, "glGetProgramiv(program not "
+ "linked successfully)");
+ return;
+  }
+  if (shProg->_LinkedShaders[MESA_SHADER_COMPUTE] == NULL) {
+ _mesa_error(ctx, GL_INVALID_OPERATION, "glGetProgramiv(no compute "
+ "shaders)");
+ return;
+  }
+  for (i = 0; i < 3; i++)
+ params[i] = shProg->Comp.LocalSize[i];
+  return;
+   }
default:
   break;
}
-- 
1.8.5.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 23/30] mesa/cs: Implement MAX_COMPUTE_WORK_GROUP_COUNT constant.

2014-01-09 Thread Paul Berry
---
 src/glsl/builtin_variables.cpp  | 4 
 src/glsl/glsl_parser_extras.cpp | 2 ++
 src/glsl/glsl_parser_extras.h   | 1 +
 src/glsl/main.cpp   | 3 +++
 src/glsl/standalone_scaffolding.cpp | 3 +++
 src/mesa/main/context.c | 3 +++
 src/mesa/main/get.c | 8 
 src/mesa/main/mtypes.h  | 1 +
 8 files changed, 25 insertions(+)

diff --git a/src/glsl/builtin_variables.cpp b/src/glsl/builtin_variables.cpp
index 171bf08..f94f5da 100644
--- a/src/glsl/builtin_variables.cpp
+++ b/src/glsl/builtin_variables.cpp
@@ -682,6 +682,10 @@ builtin_variable_generator::generate_constants()
}
 
if (state->is_version(430, 0) || state->ARB_compute_shader_enable) {
+  add_const_ivec3("gl_MaxComputeWorkGroupCount",
+  state->Const.MaxComputeWorkGroupCount[0],
+  state->Const.MaxComputeWorkGroupCount[1],
+  state->Const.MaxComputeWorkGroupCount[2]);
   add_const_ivec3("gl_MaxComputeWorkGroupSize",
   state->Const.MaxComputeWorkGroupSize[0],
   state->Const.MaxComputeWorkGroupSize[1],
diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp
index 8524fc6..65c504f 100644
--- a/src/glsl/glsl_parser_extras.cpp
+++ b/src/glsl/glsl_parser_extras.cpp
@@ -124,6 +124,8 @@ _mesa_glsl_parse_state::_mesa_glsl_parse_state(struct 
gl_context *_ctx,
this->Const.MaxAtomicBufferBindings = ctx->Const.MaxAtomicBufferBindings;
 
/* Compute shader constants */
+   for (unsigned i = 0; i < Elements(this->Const.MaxComputeWorkGroupCount); 
i++)
+  this->Const.MaxComputeWorkGroupCount[i] = 
ctx->Const.MaxComputeWorkGroupCount[i];
for (unsigned i = 0; i < Elements(this->Const.MaxComputeWorkGroupSize); i++)
   this->Const.MaxComputeWorkGroupSize[i] = 
ctx->Const.MaxComputeWorkGroupSize[i];
 
diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h
index be34fd9..68a3d54 100644
--- a/src/glsl/glsl_parser_extras.h
+++ b/src/glsl/glsl_parser_extras.h
@@ -252,6 +252,7 @@ struct _mesa_glsl_parse_state {
   unsigned MaxAtomicBufferBindings;
 
   /* ARB_compute_shader */
+  unsigned MaxComputeWorkGroupCount[3];
   unsigned MaxComputeWorkGroupSize[3];
} Const;
 
diff --git a/src/glsl/main.cpp b/src/glsl/main.cpp
index 94bc1cc..5ea57d5 100644
--- a/src/glsl/main.cpp
+++ b/src/glsl/main.cpp
@@ -50,6 +50,9 @@ initialize_context(struct gl_context *ctx, gl_api api)
 */
ctx->Const.GLSLVersion = glsl_version;
ctx->Extensions.ARB_ES3_compatibility = true;
+   ctx->Const.MaxComputeWorkGroupCount[0] = 65535;
+   ctx->Const.MaxComputeWorkGroupCount[1] = 65535;
+   ctx->Const.MaxComputeWorkGroupCount[2] = 65535;
ctx->Const.MaxComputeWorkGroupSize[0] = 1024;
ctx->Const.MaxComputeWorkGroupSize[1] = 1024;
ctx->Const.MaxComputeWorkGroupSize[2] = 64;
diff --git a/src/glsl/standalone_scaffolding.cpp 
b/src/glsl/standalone_scaffolding.cpp
index 0c83ea3..cce2dce 100644
--- a/src/glsl/standalone_scaffolding.cpp
+++ b/src/glsl/standalone_scaffolding.cpp
@@ -140,6 +140,9 @@ void initialize_context_to_defaults(struct gl_context *ctx, 
gl_api api)
ctx->Const.Program[MESA_SHADER_FRAGMENT].MaxInputComponents = 32;
 
ctx->Const.MaxDrawBuffers = 1;
+   ctx->Const.MaxComputeWorkGroupCount[0] = 65535;
+   ctx->Const.MaxComputeWorkGroupCount[1] = 65535;
+   ctx->Const.MaxComputeWorkGroupCount[2] = 65535;
ctx->Const.MaxComputeWorkGroupSize[0] = 1024;
ctx->Const.MaxComputeWorkGroupSize[1] = 1024;
ctx->Const.MaxComputeWorkGroupSize[2] = 64;
diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
index 942f247..6d7e467 100644
--- a/src/mesa/main/context.c
+++ b/src/mesa/main/context.c
@@ -692,6 +692,9 @@ _mesa_init_constants(struct gl_context *ctx)
ctx->Const.MaxVertexAttribBindings = MAX_VERTEX_GENERIC_ATTRIBS;
 
/* GL_ARB_compute_shader */
+   ctx->Const.MaxComputeWorkGroupCount[0] = 65535;
+   ctx->Const.MaxComputeWorkGroupCount[1] = 65535;
+   ctx->Const.MaxComputeWorkGroupCount[2] = 65535;
ctx->Const.MaxComputeWorkGroupSize[0] = 1024;
ctx->Const.MaxComputeWorkGroupSize[1] = 1024;
ctx->Const.MaxComputeWorkGroupSize[2] = 64;
diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
index bcbb5d5..e977521 100644
--- a/src/mesa/main/get.c
+++ b/src/mesa/main/get.c
@@ -1791,6 +1791,14 @@ find_value_indexed(const char *func, GLenum pname, 
GLuint index, union value *v)
   v->value_int = 
ctx->Array.ArrayObj->VertexBinding[VERT_ATTRIB_GENERIC(index)].Stride;
   return TYPE_INT;
 
+   case GL_MAX_COMPUTE_WORK_GROUP_COUNT:
+  if (!_mesa_is_desktop_gl(ctx) || !ctx->Extensions.ARB_compute_shader)
+ goto invalid_enum;
+  if (index >= 3)
+ goto invalid_value;
+  v->value_int = ctx->Const.MaxComputeWorkGroupCount[index];
+  return TYPE_INT;
+
case GL_MAX_COMPUTE_WORK_GROUP_SIZE:
   if (!_mesa_is_desktop_gl(ctx) || !ctx->Extensions

[Mesa-dev] [PATCH 11/30] mesa: Change redundant code into loops in shaderapi.c.

2014-01-09 Thread Paul Berry
This is possible now that ctx->Shader.CurrentProgram is an array.
---
 src/mesa/main/shaderapi.c | 39 +--
 1 file changed, 9 insertions(+), 30 deletions(-)

diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
index d57748c..61ac0e3 100644
--- a/src/mesa/main/shaderapi.c
+++ b/src/mesa/main/shaderapi.c
@@ -130,11 +130,11 @@ _mesa_init_shader_state(struct gl_context *ctx)
 void
 _mesa_free_shader_state(struct gl_context *ctx)
 {
-   _mesa_reference_shader_program(ctx, 
&ctx->Shader.CurrentProgram[MESA_SHADER_VERTEX], NULL);
-   _mesa_reference_shader_program(ctx, 
&ctx->Shader.CurrentProgram[MESA_SHADER_GEOMETRY],
- NULL);
-   _mesa_reference_shader_program(ctx, 
&ctx->Shader.CurrentProgram[MESA_SHADER_FRAGMENT],
- NULL);
+   int i;
+   for (i = 0; i < MESA_SHADER_STAGES; i++) {
+  _mesa_reference_shader_program(ctx, &ctx->Shader.CurrentProgram[i],
+ NULL);
+   }
_mesa_reference_shader_program(ctx, &ctx->Shader._CurrentFragmentProgram,
  NULL);
_mesa_reference_shader_program(ctx, &ctx->Shader.ActiveProgram, NULL);
@@ -946,32 +946,11 @@ use_shader_program(struct gl_context *ctx, GLenum type,
   struct gl_shader_program *shProg)
 {
struct gl_shader_program **target;
+   gl_shader_stage stage = _mesa_shader_enum_to_shader_stage(type);
 
-   switch (type) {
-   case GL_VERTEX_SHADER:
-  target = &ctx->Shader.CurrentProgram[MESA_SHADER_VERTEX];
-  if ((shProg == NULL)
- || (shProg->_LinkedShaders[MESA_SHADER_VERTEX] == NULL)) {
-shProg = NULL;
-  }
-  break;
-   case GL_GEOMETRY_SHADER_ARB:
-  target = &ctx->Shader.CurrentProgram[MESA_SHADER_GEOMETRY];
-  if ((shProg == NULL)
- || (shProg->_LinkedShaders[MESA_SHADER_GEOMETRY] == NULL)) {
-shProg = NULL;
-  }
-  break;
-   case GL_FRAGMENT_SHADER:
-  target = &ctx->Shader.CurrentProgram[MESA_SHADER_FRAGMENT];
-  if ((shProg == NULL)
- || (shProg->_LinkedShaders[MESA_SHADER_FRAGMENT] == NULL)) {
-shProg = NULL;
-  }
-  break;
-   default:
-  return;
-   }
+   target = &ctx->Shader.CurrentProgram[stage];
+   if ((shProg == NULL) || (shProg->_LinkedShaders[stage] == NULL))
+  shProg = NULL;
 
if (*target != shProg) {
   FLUSH_VERTICES(ctx, _NEW_PROGRAM | _NEW_PROGRAM_CONSTANTS);
-- 
1.8.5.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 22/30] mesa/cs: Implement MAX_COMPUTE_WORK_GROUP_INVOCATIONS constant.

2014-01-09 Thread Paul Berry
---
 src/glsl/main.cpp   | 1 +
 src/glsl/standalone_scaffolding.cpp | 1 +
 src/mesa/main/context.c | 1 +
 src/mesa/main/get.c | 1 +
 src/mesa/main/get_hash_params.py| 3 +++
 src/mesa/main/mtypes.h  | 1 +
 6 files changed, 8 insertions(+)

diff --git a/src/glsl/main.cpp b/src/glsl/main.cpp
index bb2054f..94bc1cc 100644
--- a/src/glsl/main.cpp
+++ b/src/glsl/main.cpp
@@ -53,6 +53,7 @@ initialize_context(struct gl_context *ctx, gl_api api)
ctx->Const.MaxComputeWorkGroupSize[0] = 1024;
ctx->Const.MaxComputeWorkGroupSize[1] = 1024;
ctx->Const.MaxComputeWorkGroupSize[2] = 64;
+   ctx->Const.MaxComputeWorkGroupInvocations = 1024;
ctx->Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits = 16;
ctx->Const.Program[MESA_SHADER_COMPUTE].MaxUniformComponents = 1024;
ctx->Const.Program[MESA_SHADER_COMPUTE].MaxInputComponents = 0; /* not used 
*/
diff --git a/src/glsl/standalone_scaffolding.cpp 
b/src/glsl/standalone_scaffolding.cpp
index e8eb529..0c83ea3 100644
--- a/src/glsl/standalone_scaffolding.cpp
+++ b/src/glsl/standalone_scaffolding.cpp
@@ -143,6 +143,7 @@ void initialize_context_to_defaults(struct gl_context *ctx, 
gl_api api)
ctx->Const.MaxComputeWorkGroupSize[0] = 1024;
ctx->Const.MaxComputeWorkGroupSize[1] = 1024;
ctx->Const.MaxComputeWorkGroupSize[2] = 64;
+   ctx->Const.MaxComputeWorkGroupInvocations = 1024;
ctx->Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits = 16;
ctx->Const.Program[MESA_SHADER_COMPUTE].MaxUniformComponents = 1024;
ctx->Const.Program[MESA_SHADER_COMPUTE].MaxInputComponents = 0; /* not used 
*/
diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
index ebe27b4..942f247 100644
--- a/src/mesa/main/context.c
+++ b/src/mesa/main/context.c
@@ -695,6 +695,7 @@ _mesa_init_constants(struct gl_context *ctx)
ctx->Const.MaxComputeWorkGroupSize[0] = 1024;
ctx->Const.MaxComputeWorkGroupSize[1] = 1024;
ctx->Const.MaxComputeWorkGroupSize[2] = 64;
+   ctx->Const.MaxComputeWorkGroupInvocations = 1024;
 }
 
 
diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
index 6b914f4..bcbb5d5 100644
--- a/src/mesa/main/get.c
+++ b/src/mesa/main/get.c
@@ -382,6 +382,7 @@ EXTRA_EXT(ARB_texture_multisample);
 EXTRA_EXT(ARB_texture_gather);
 EXTRA_EXT(ARB_shader_atomic_counters);
 EXTRA_EXT(ARB_draw_indirect);
+EXTRA_EXT(ARB_compute_shader);
 
 static const int
 extra_ARB_color_buffer_float_or_glcore[] = {
diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
index 7f025a9..c18e848 100644
--- a/src/mesa/main/get_hash_params.py
+++ b/src/mesa/main/get_hash_params.py
@@ -742,6 +742,9 @@ descriptor=[
 # GL_ARB_vertex_attrib_binding
   [ "MAX_VERTEX_ATTRIB_RELATIVE_OFFSET", 
"CONTEXT_ENUM(Const.MaxVertexAttribRelativeOffset), NO_EXTRA" ],
   [ "MAX_VERTEX_ATTRIB_BINDINGS", 
"CONTEXT_ENUM(Const.MaxVertexAttribBindings), NO_EXTRA" ],
+
+# GL_ARB_compute_shader
+  [ "MAX_COMPUTE_WORK_GROUP_INVOCATIONS", 
"CONTEXT_ENUM(Const.MaxComputeWorkGroupInvocations), extra_ARB_compute_shader" 
],
 ]},
 
 # Enums restricted to OpenGL Core profile
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index d3f3a30..6481dc1 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -3323,6 +3323,7 @@ struct gl_constants
 
/** GL_ARB_compute_shader */
GLuint MaxComputeWorkGroupSize[3];
+   GLuint MaxComputeWorkGroupInvocations;
 };
 
 
-- 
1.8.5.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 24/30] glsl/cs: Handle compute shader local_size_{x, y, z} declaration.

2014-01-09 Thread Paul Berry
---
 src/glsl/ast.h  | 34 +++
 src/glsl/ast_to_hir.cpp | 79 ++
 src/glsl/ast_type.cpp   | 14 ++
 src/glsl/builtin_variables.cpp  | 20 +
 src/glsl/glsl_lexer.ll  |  3 +-
 src/glsl/glsl_parser.yy | 94 -
 src/glsl/glsl_parser_extras.cpp | 51 --
 src/glsl/glsl_parser_extras.h   | 15 +++
 src/mesa/main/mtypes.h  | 11 +
 9 files changed, 285 insertions(+), 36 deletions(-)

diff --git a/src/glsl/ast.h b/src/glsl/ast.h
index 76911f0..fa00d75 100644
--- a/src/glsl/ast.h
+++ b/src/glsl/ast.h
@@ -414,6 +414,12 @@ struct ast_type_qualifier {
 unsigned prim_type:1;
 unsigned max_vertices:1;
 /** \} */
+
+ /**
+  * local_size_{x,y,z} flags for compute shaders.  Bit 0 represents
+  * local_size_x, and so on.
+  */
+ unsigned local_size:3;
   }
   /** \brief Set of flags, accessed by name. */
   q;
@@ -464,6 +470,13 @@ struct ast_type_qualifier {
int offset;
 
/**
+* Local size specified via GL_ARB_compute_shader's "local_size_{x,y,z}"
+* layout qualifier.  Element i of this array is only valid if
+* flags.q.local_size & (1 << i) is set.
+*/
+   int local_size[3];
+
+   /**
 * Return true if and only if an interpolation qualifier is present.
 */
bool has_interpolation() const;
@@ -961,6 +974,27 @@ private:
const GLenum prim_type;
 };
 
+
+/**
+ * AST node representing a decalaration of the input layout for compute
+ * shaders.
+ */
+class ast_cs_input_layout : public ast_node
+{
+public:
+   ast_cs_input_layout(const struct YYLTYPE &locp, const unsigned *local_size)
+   {
+  memcpy(this->local_size, local_size, sizeof(this->local_size));
+  set_location(locp);
+   }
+
+   virtual ir_rvalue *hir(exec_list *instructions,
+  struct _mesa_glsl_parse_state *state);
+
+private:
+   unsigned local_size[3];
+};
+
 /*@}*/
 
 extern void
diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index a26745d..c1b1d6a 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -77,6 +77,7 @@ _mesa_ast_to_hir(exec_list *instructions, struct 
_mesa_glsl_parse_state *state)
state->toplevel_ir = instructions;
 
state->gs_input_prim_type_specified = false;
+   state->cs_input_local_size_specified = false;
 
/* Section 4.2 of the GLSL 1.20 specification states:
 * "The built-in functions are scoped in a scope outside the global scope
@@ -5239,6 +5240,84 @@ ast_gs_input_layout::hir(exec_list *instructions,
 }
 
 
+ir_rvalue *
+ast_cs_input_layout::hir(exec_list *instructions,
+ struct _mesa_glsl_parse_state *state)
+{
+   YYLTYPE loc = this->get_location();
+
+   /* If any compute input layout declaration preceded this one, make sure it
+* was consistent with this one.
+*/
+   if (state->cs_input_local_size_specified) {
+  for (int i = 0; i < 3; i++) {
+ if (state->cs_input_local_size[i] != this->local_size[i]) {
+_mesa_glsl_error(&loc, state,
+ "compute shader input layout does not match"
+ " previous declaration");
+return NULL;
+ }
+  }
+   }
+
+   /* From the ARB_compute_shader specification:
+*
+* If the local size of the shader in any dimension is greater
+* than the maximum size supported by the implementation for that
+* dimension, a compile-time error results.
+*
+* It is not clear from the spec how the error should be reported if
+* the total size of the work group exceeds
+* MAX_COMPUTE_WORK_GROUP_INVOCATIONS, but it seems reasonable to
+* report it at compile time as well.
+*/
+   GLuint64 total_invocations = 1;
+   for (int i = 0; i < 3; i++) {
+  if (this->local_size[i] > state->ctx->Const.MaxComputeWorkGroupSize[i]) {
+ _mesa_glsl_error(&loc, state,
+  "local_size_%c exceeds MAX_COMPUTE_WORK_GROUP_SIZE"
+  " (%d)", 'x' + i,
+  state->ctx->Const.MaxComputeWorkGroupSize[i]);
+ break;
+  }
+  total_invocations *= this->local_size[i];
+  if (total_invocations >
+  state->ctx->Const.MaxComputeWorkGroupInvocations) {
+ _mesa_glsl_error(&loc, state,
+  "product of local_sizes exceeds "
+  "MAX_COMPUTE_WORK_GROUP_INVOCATIONS (%d)",
+  state->ctx->Const.MaxComputeWorkGroupInvocations);
+ break;
+  }
+   }
+
+   state->cs_input_local_size_specified = true;
+   for (int i = 0; i < 3; i++)
+  state->cs_input_local_size[i] = this->local_size[i];
+
+   /* We may now declare the built-in constant gl_WorkGroupSize (see
+* builtin_variable_generator::generate_constants() for why we didn't
+* d

[Mesa-dev] [PATCH 21/30] mesa/cs: Implement MAX_COMPUTE_WORK_GROUP_SIZE constant.

2014-01-09 Thread Paul Berry
---
 src/glsl/builtin_variables.cpp  | 27 +++
 src/glsl/glsl_parser_extras.cpp |  4 
 src/glsl/glsl_parser_extras.h   |  3 +++
 src/glsl/main.cpp   |  3 +++
 src/glsl/standalone_scaffolding.cpp |  3 +++
 src/mesa/main/context.c |  5 +
 src/mesa/main/get.c |  8 
 src/mesa/main/mtypes.h  |  3 +++
 8 files changed, 56 insertions(+)

diff --git a/src/glsl/builtin_variables.cpp b/src/glsl/builtin_variables.cpp
index 17ae087..171bf08 100644
--- a/src/glsl/builtin_variables.cpp
+++ b/src/glsl/builtin_variables.cpp
@@ -390,6 +390,7 @@ private:
  enum ir_variable_mode mode, int slot);
ir_variable *add_uniform(const glsl_type *type, const char *name);
ir_variable *add_const(const char *name, int value);
+   ir_variable *add_const_ivec3(const char *name, int x, int y, int z);
void add_varying(int slot, const glsl_type *type, const char *name,
 const char *name_as_gs_input);
 
@@ -530,6 +531,25 @@ builtin_variable_generator::add_const(const char *name, 
int value)
 }
 
 
+ir_variable *
+builtin_variable_generator::add_const_ivec3(const char *name, int x, int y,
+int z)
+{
+   ir_variable *const var = add_variable(name, glsl_type::ivec3_type,
+ ir_var_auto, -1);
+   ir_constant_data data;
+   memset(&data, 0, sizeof(data));
+   data.i[0] = x;
+   data.i[1] = y;
+   data.i[2] = z;
+   var->constant_value = new(var) ir_constant(glsl_type::ivec3_type, &data);
+   var->constant_initializer =
+  new(var) ir_constant(glsl_type::ivec3_type, &data);
+   var->data.has_initializer = true;
+   return var;
+}
+
+
 void
 builtin_variable_generator::generate_constants()
 {
@@ -660,6 +680,13 @@ builtin_variable_generator::generate_constants()
   add_const("gl_MaxTessControlAtomicCounters", 0);
   add_const("gl_MaxTessEvaluationAtomicCounters", 0);
}
+
+   if (state->is_version(430, 0) || state->ARB_compute_shader_enable) {
+  add_const_ivec3("gl_MaxComputeWorkGroupSize",
+  state->Const.MaxComputeWorkGroupSize[0],
+  state->Const.MaxComputeWorkGroupSize[1],
+  state->Const.MaxComputeWorkGroupSize[2]);
+   }
 }
 
 
diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp
index 33a43c5..8524fc6 100644
--- a/src/glsl/glsl_parser_extras.cpp
+++ b/src/glsl/glsl_parser_extras.cpp
@@ -123,6 +123,10 @@ _mesa_glsl_parse_state::_mesa_glsl_parse_state(struct 
gl_context *_ctx,
this->Const.MaxCombinedAtomicCounters = 
ctx->Const.MaxCombinedAtomicCounters;
this->Const.MaxAtomicBufferBindings = ctx->Const.MaxAtomicBufferBindings;
 
+   /* Compute shader constants */
+   for (unsigned i = 0; i < Elements(this->Const.MaxComputeWorkGroupSize); i++)
+  this->Const.MaxComputeWorkGroupSize[i] = 
ctx->Const.MaxComputeWorkGroupSize[i];
+
this->current_function = NULL;
this->toplevel_ir = NULL;
this->found_return = false;
diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h
index 468707c..be34fd9 100644
--- a/src/glsl/glsl_parser_extras.h
+++ b/src/glsl/glsl_parser_extras.h
@@ -250,6 +250,9 @@ struct _mesa_glsl_parse_state {
   unsigned MaxFragmentAtomicCounters;
   unsigned MaxCombinedAtomicCounters;
   unsigned MaxAtomicBufferBindings;
+
+  /* ARB_compute_shader */
+  unsigned MaxComputeWorkGroupSize[3];
} Const;
 
/**
diff --git a/src/glsl/main.cpp b/src/glsl/main.cpp
index 864c929..bb2054f 100644
--- a/src/glsl/main.cpp
+++ b/src/glsl/main.cpp
@@ -50,6 +50,9 @@ initialize_context(struct gl_context *ctx, gl_api api)
 */
ctx->Const.GLSLVersion = glsl_version;
ctx->Extensions.ARB_ES3_compatibility = true;
+   ctx->Const.MaxComputeWorkGroupSize[0] = 1024;
+   ctx->Const.MaxComputeWorkGroupSize[1] = 1024;
+   ctx->Const.MaxComputeWorkGroupSize[2] = 64;
ctx->Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits = 16;
ctx->Const.Program[MESA_SHADER_COMPUTE].MaxUniformComponents = 1024;
ctx->Const.Program[MESA_SHADER_COMPUTE].MaxInputComponents = 0; /* not used 
*/
diff --git a/src/glsl/standalone_scaffolding.cpp 
b/src/glsl/standalone_scaffolding.cpp
index ab92da8..e8eb529 100644
--- a/src/glsl/standalone_scaffolding.cpp
+++ b/src/glsl/standalone_scaffolding.cpp
@@ -140,6 +140,9 @@ void initialize_context_to_defaults(struct gl_context *ctx, 
gl_api api)
ctx->Const.Program[MESA_SHADER_FRAGMENT].MaxInputComponents = 32;
 
ctx->Const.MaxDrawBuffers = 1;
+   ctx->Const.MaxComputeWorkGroupSize[0] = 1024;
+   ctx->Const.MaxComputeWorkGroupSize[1] = 1024;
+   ctx->Const.MaxComputeWorkGroupSize[2] = 64;
ctx->Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits = 16;
ctx->Const.Program[MESA_SHADER_COMPUTE].MaxUniformComponents = 1024;
ctx->Const.Program[MESA_SHADER_COMPUTE].MaxInputComponents = 0; /*

[Mesa-dev] [PATCH 25/30] mesa/cs: Handle compute shader local size during linking.

2014-01-09 Thread Paul Berry
---
 src/glsl/linker.cpp   | 64 +++
 src/mesa/main/mtypes.h| 17 +
 src/mesa/main/shaderapi.c |  7 ++
 3 files changed, 88 insertions(+)

diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
index 7461b17..11e0651 100644
--- a/src/glsl/linker.cpp
+++ b/src/glsl/linker.cpp
@@ -1284,6 +1284,69 @@ link_gs_inout_layout_qualifiers(struct gl_shader_program 
*prog,
prog->Geom.VerticesOut = linked_shader->Geom.VerticesOut;
 }
 
+
+/**
+ * Perform cross-validation of compute shader local_size_{x,y,z} layout
+ * qualifiers for the attached compute shaders, and propagate them to the
+ * linked CS and linked shader program.
+ */
+static void
+link_cs_input_layout_qualifiers(struct gl_shader_program *prog,
+struct gl_shader *linked_shader,
+struct gl_shader **shader_list,
+unsigned num_shaders)
+{
+   for (int i = 0; i < 3; i++)
+  linked_shader->Comp.LocalSize[i] = 0;
+
+   /* This function is called for all shader stages, but it only has an effect
+* for compute shaders.
+*/
+   if (linked_shader->Stage != MESA_SHADER_COMPUTE)
+  return;
+
+   /* From the ARB_compute_shader spec, in the section describing local size
+* declarations:
+*
+* If multiple compute shaders attached to a single program object
+* declare local work-group size, the declarations must be identical;
+* otherwise a link-time error results. Furthermore, if a program
+* object contains any compute shaders, at least one must contain an
+* input layout qualifier specifying the local work sizes of the
+* program, or a link-time error will occur.
+*/
+   for (unsigned sh = 0; sh < num_shaders; sh++) {
+  struct gl_shader *shader = shader_list[sh];
+
+  if (shader->Comp.LocalSize[0] != 0) {
+ if (linked_shader->Comp.LocalSize[0] != 0) {
+for (int i = 0; i < 3; i++) {
+   if (linked_shader->Comp.LocalSize[i] !=
+   shader->Comp.LocalSize[i]) {
+  linker_error(prog, "compute shader defined with conflicting "
+   "local sizes\n");
+  return;
+   }
+}
+ }
+ for (int i = 0; i < 3; i++)
+linked_shader->Comp.LocalSize[i] = shader->Comp.LocalSize[i];
+  }
+   }
+
+   /* Just do the intrastage -> interstage propagation right now,
+* since we already know we're in the right type of shader program
+* for doing it.
+*/
+   if (linked_shader->Comp.LocalSize[0] == 0) {
+  linker_error(prog, "compute shader didn't declare local size\n");
+  return;
+   }
+   for (int i = 0; i < 3; i++)
+  prog->Comp.LocalSize[i] = linked_shader->Comp.LocalSize[i];
+}
+
+
 /**
  * Combine a group of shaders for a single stage to generate a linked shader
  *
@@ -1389,6 +1452,7 @@ link_intrastage_shaders(void *mem_ctx,
ralloc_steal(linked, linked->UniformBlocks);
 
link_gs_inout_layout_qualifiers(prog, linked, shader_list, num_shaders);
+   link_cs_input_layout_qualifiers(prog, linked, shader_list, num_shaders);
 
populate_symbol_table(linked);
 
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index e0c88a7..56e120b 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2171,6 +2171,11 @@ struct gl_fragment_program
 struct gl_compute_program
 {
struct gl_program Base;   /**< base class */
+
+   /**
+* Size specified using local_size_{x,y,z}.
+*/
+   unsigned LocalSize[3];
 };
 
 
@@ -2606,6 +2611,18 @@ struct gl_shader_program
  0 if not present. */
} Vert;
 
+   /**
+* Compute shader state - copied into gl_compute_program by
+* _mesa_copy_linked_program_data().
+*/
+   struct {
+  /**
+   * If this shader contains a compute stage, size specified using
+   * local_size_{x,y,z}.  Otherwise undefined.
+   */
+  unsigned LocalSize[3];
+   } Comp;
+
/* post-link info: */
unsigned NumUserUniformStorage;
struct gl_uniform_storage *UniformStorage;
diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
index 5188e9c..053f27b 100644
--- a/src/mesa/main/shaderapi.c
+++ b/src/mesa/main/shaderapi.c
@@ -1846,6 +1846,13 @@ _mesa_copy_linked_program_data(gl_shader_stage type,
   dst_gp->UsesEndPrimitive = src->Geom.UsesEndPrimitive;
}
   break;
+   case MESA_SHADER_COMPUTE: {
+  struct gl_compute_program *dst_cp = (struct gl_compute_program *) dst;
+  int i;
+  for (i = 0; i < 3; i++)
+ dst_cp->LocalSize[i] = src->Comp.LocalSize[i];
+   }
+  break;
default:
   break;
}
-- 
1.8.5.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 00/30] mesa: Start implementing compute shaders.

2014-01-09 Thread Paul Berry
This is the first of several planned patch series to implement the
extension ARB_compute_shader in Mesa.  This series allows the Mesa
front-end to parse and compile a "do-nothing" compute shader--that is,
one which contains nothing but a compute shader input layout
declaration and an empty main() function.

Since compute shader support is not yet complete, I haven't enabled
it.  To try it out, set the environment variable
INTEL_COMPUTE_SHADER=1.

With that environment variable set, the series passes all of the
ARB_compute_shader tests I recently sent to the Piglit mailing list,
except for the "minmax" test.

Patches 01-12 do preparatory refactoring in order to make adding
compute shaders (and other future shader stages) easier.

Patch 13 adds the extension enable flags for compute shaders, both in
the context and in the GLSL compiler.

Patch 14 adds the ARB_compute_shader functions and enums to Mesa's
dispatch code generation logic.  The functions don't do anything yet.

Patch 15 changes the linker so that once the compute shader stage has
been added, it won't mistakenly consider it to be part of the rest of
the graphics pipeline.

Patches 16-20 add MESA_SHADER_COMPUTE to the gl_shader_stage enum, and
update the rest of the code base to accommodate its presence in the
enum.

Patches 21-23 implement some compute shader constants.

Patches 24-26 implement logic to handle compute shader work group
sizes (also known as "local sizes").

Patches 27-28 implement two other basic rules of compute shaders (they
don't have user-defined ins/outs, and they can't be mixed with other
shader types).

Finally, patches 29-30 implement the necessary back-end code for the
i965 driver to allow compute shaders to be turned on with an
environment variable.

Once these patches land, I plan to start working on the i965 back end.

[PATCH 01/30] mesa: Replace _mesa_program_index_to_target with 
_mesa_shader_stage_to_program.
[PATCH 02/30] mesa: Make validate_shader_target() non-static.
[PATCH 03/30] main: Allow ctx == NULL in _mesa_validate_shader_target().
[PATCH 04/30] mesa: use _mesa_validate_shader_target() more frequently.
[PATCH 05/30] glsl/linker: Refactor in preparation for adding more shader 
stages.
[PATCH 06/30] mesa: Replace 
ctx->Shader.Current{Vertex,Fragment,Geometry}Program with an array.
[PATCH 07/30] mesa: Fold long lines introduced by the previous patch.
[PATCH 08/30] i965: Fix comments to refer to the new ctx->Shader.CurrentProgram 
array.
[PATCH 09/30] meta: Replace save_state::{Vertex,Geometry,Fragment}Shader with 
an array.
[PATCH 10/30] mesa: Remove ad-hoc arrays of gl_shader_program.
[PATCH 11/30] mesa: Change redundant code into loops in shaderapi.c.
[PATCH 12/30] mesa: Change redundant code into loops in texstate.c.
[PATCH 13/30] mesa/cs: Add extension enable flags for ARB_compute_shader.
[PATCH 14/30] mesa/cs: Add dispatch API stubs for ARB_compute_shader.
[PATCH 15/30] glsl/cs: Change some linker loops to use MESA_SHADER_FRAGMENT as 
a bound.
[PATCH 16/30] mesa/cs: Add a MESA_SHADER_COMPUTE stage and update statements.
[PATCH 17/30] glsl/cs: Populate default values for 
ctx->Const.Program[MESA_SHADER_COMPUTE].
[PATCH 18/30] glsl/cs: update main.cpp to use the ".comp" extension for compute 
shaders.
[PATCH 19/30] mesa/cs: Handle compute shaders in _mesa_use_program().
[PATCH 20/30] mesa/cs: Create the gl_compute_program struct, and the code to 
initialize it.
[PATCH 21/30] mesa/cs: Implement MAX_COMPUTE_WORK_GROUP_SIZE constant.
[PATCH 22/30] mesa/cs: Implement MAX_COMPUTE_WORK_GROUP_INVOCATIONS constant.
[PATCH 23/30] mesa/cs: Implement MAX_COMPUTE_WORK_GROUP_COUNT constant.
[PATCH 24/30] glsl/cs: Handle compute shader local_size_{x,y,z} declaration.
[PATCH 25/30] mesa/cs: Handle compute shader local size during linking.
[PATCH 26/30] main/cs: Implement query for COMPUTE_WORK_GROUP_SIZE.
[PATCH 27/30] glsl/cs: Prohibit user-defined ins/outs in compute shaders.
[PATCH 28/30] glsl/cs: Prohibit mixing of compute and non-compute shaders.
[PATCH 29/30] i965/cs: Create the brw_compute_program struct, and the code to 
initialize it.
[PATCH 30/30] i965/cs: Allow ARB_compute_shader to be enabled via env var.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 19/30] mesa/cs: Handle compute shaders in _mesa_use_program().

2014-01-09 Thread Paul Berry
---
 src/mesa/main/shaderapi.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
index 519b200..5188e9c 100644
--- a/src/mesa/main/shaderapi.c
+++ b/src/mesa/main/shaderapi.c
@@ -993,6 +993,7 @@ _mesa_use_program(struct gl_context *ctx, struct 
gl_shader_program *shProg)
 {
use_shader_program(ctx, GL_VERTEX_SHADER, shProg);
use_shader_program(ctx, GL_GEOMETRY_SHADER_ARB, shProg);
+   use_shader_program(ctx, GL_COMPUTE_SHADER, shProg);
use_shader_program(ctx, GL_FRAGMENT_SHADER, shProg);
_mesa_active_program(ctx, shProg, "glUseProgram");
 
-- 
1.8.5.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 17/30] glsl/cs: Populate default values for ctx->Const.Program[MESA_SHADER_COMPUTE].

2014-01-09 Thread Paul Berry
---
 src/glsl/main.cpp   | 4 
 src/glsl/standalone_scaffolding.cpp | 4 
 2 files changed, 8 insertions(+)

diff --git a/src/glsl/main.cpp b/src/glsl/main.cpp
index 03b7c78..afc15cb 100644
--- a/src/glsl/main.cpp
+++ b/src/glsl/main.cpp
@@ -50,6 +50,10 @@ initialize_context(struct gl_context *ctx, gl_api api)
 */
ctx->Const.GLSLVersion = glsl_version;
ctx->Extensions.ARB_ES3_compatibility = true;
+   ctx->Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits = 16;
+   ctx->Const.Program[MESA_SHADER_COMPUTE].MaxUniformComponents = 1024;
+   ctx->Const.Program[MESA_SHADER_COMPUTE].MaxInputComponents = 0; /* not used 
*/
+   ctx->Const.Program[MESA_SHADER_COMPUTE].MaxOutputComponents = 0; /* not 
used */
 
switch (ctx->Const.GLSLVersion) {
case 100:
diff --git a/src/glsl/standalone_scaffolding.cpp 
b/src/glsl/standalone_scaffolding.cpp
index fe66067..ab92da8 100644
--- a/src/glsl/standalone_scaffolding.cpp
+++ b/src/glsl/standalone_scaffolding.cpp
@@ -140,6 +140,10 @@ void initialize_context_to_defaults(struct gl_context 
*ctx, gl_api api)
ctx->Const.Program[MESA_SHADER_FRAGMENT].MaxInputComponents = 32;
 
ctx->Const.MaxDrawBuffers = 1;
+   ctx->Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits = 16;
+   ctx->Const.Program[MESA_SHADER_COMPUTE].MaxUniformComponents = 1024;
+   ctx->Const.Program[MESA_SHADER_COMPUTE].MaxInputComponents = 0; /* not used 
*/
+   ctx->Const.Program[MESA_SHADER_COMPUTE].MaxOutputComponents = 0; /* not 
used */
 
/* Set up default shader compiler options. */
struct gl_shader_compiler_options options;
-- 
1.8.5.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 29/30] i965/cs: Create the brw_compute_program struct, and the code to initialize it.

2014-01-09 Thread Paul Berry
---
 src/mesa/drivers/dri/i965/brw_context.h |  8 
 src/mesa/drivers/dri/i965/brw_program.c | 11 +++
 2 files changed, 19 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index df32ccb..abc1783 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -316,6 +316,14 @@ struct brw_fragment_program {
GLuint id;  /**< serial no. to identify frag progs, never re-used */
 };
 
+
+/** Subclass of Mesa compute program */
+struct brw_compute_program {
+   struct gl_compute_program program;
+   unsigned id;  /**< serial no. to identify frag progs, never re-used */
+};
+
+
 struct brw_shader {
struct gl_shader base;
 
diff --git a/src/mesa/drivers/dri/i965/brw_program.c 
b/src/mesa/drivers/dri/i965/brw_program.c
index 90844e5..2d92acb 100644
--- a/src/mesa/drivers/dri/i965/brw_program.c
+++ b/src/mesa/drivers/dri/i965/brw_program.c
@@ -113,6 +113,17 @@ static struct gl_program *brwNewProgram( struct gl_context 
*ctx,
   }
}
 
+   case GL_COMPUTE_PROGRAM_NV: {
+  struct brw_compute_program *prog = CALLOC_STRUCT(brw_compute_program);
+  if (prog) {
+ prog->id = get_new_program_id(brw->intelScreen);
+
+ return _mesa_init_compute_program(ctx, &prog->program, target, id);
+  } else {
+ return NULL;
+  }
+   }
+
default:
   assert(!"Unsupported target in brwNewProgram()");
   return NULL;
-- 
1.8.5.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 09/30] meta: Replace save_state::{Vertex, Geometry, Fragment}Shader with an array.

2014-01-09 Thread Paul Berry
Since ctx->Shader.Current{Vertex,Geometry,Fragment}Program is an
array, this allows some meta code to be rolled up into loops.
---
 src/mesa/drivers/common/meta.c | 30 ++
 1 file changed, 14 insertions(+), 16 deletions(-)

diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
index 5643e3c..2eeb09b 100644
--- a/src/mesa/drivers/common/meta.c
+++ b/src/mesa/drivers/common/meta.c
@@ -138,9 +138,7 @@ struct save_state
GLboolean FragmentProgramEnabled;
struct gl_fragment_program *FragmentProgram;
GLboolean ATIFragmentShaderEnabled;
-   struct gl_shader_program *VertexShader;
-   struct gl_shader_program *GeometryShader;
-   struct gl_shader_program *FragmentShader;
+   struct gl_shader_program *Shader[MESA_SHADER_STAGES];
struct gl_shader_program *ActiveShader;
 
/** MESA_META_STENCIL_TEST */
@@ -617,12 +615,10 @@ _mesa_meta_begin(struct gl_context *ctx, GLbitfield state)
  _mesa_set_enable(ctx, GL_FRAGMENT_SHADER_ATI, GL_FALSE);
   }
 
-  _mesa_reference_shader_program(ctx, &save->VertexShader,
- 
ctx->Shader.CurrentProgram[MESA_SHADER_VERTEX]);
-  _mesa_reference_shader_program(ctx, &save->GeometryShader,
- 
ctx->Shader.CurrentProgram[MESA_SHADER_GEOMETRY]);
-  _mesa_reference_shader_program(ctx, &save->FragmentShader,
- 
ctx->Shader.CurrentProgram[MESA_SHADER_FRAGMENT]);
+  for (int i = 0; i < MESA_SHADER_STAGES; i++) {
+ _mesa_reference_shader_program(ctx, &save->Shader[i],
+ ctx->Shader.CurrentProgram[i]);
+  }
   _mesa_reference_shader_program(ctx, &save->ActiveShader,
  ctx->Shader.ActiveProgram);
 
@@ -829,6 +825,7 @@ _mesa_meta_end(struct gl_context *ctx)
 {
struct save_state *save = &ctx->Meta->Save[ctx->Meta->SaveStackDepth - 1];
const GLbitfield state = save->SavedState;
+   int i;
 
/* After starting a new occlusion query, initialize the results to the
 * values saved previously. The driver will then continue to increment
@@ -960,23 +957,24 @@ _mesa_meta_end(struct gl_context *ctx)
   save->ATIFragmentShaderEnabled);
   }
 
-  if (ctx->Extensions.ARB_vertex_shader)
-_mesa_use_shader_program(ctx, GL_VERTEX_SHADER, save->VertexShader);
+  if (ctx->Extensions.ARB_vertex_shader) {
+_mesa_use_shader_program(ctx, GL_VERTEX_SHADER,
+  save->Shader[MESA_SHADER_VERTEX]);
+  }
 
   if (_mesa_has_geometry_shaders(ctx))
 _mesa_use_shader_program(ctx, GL_GEOMETRY_SHADER_ARB,
- save->GeometryShader);
+ save->Shader[MESA_SHADER_GEOMETRY]);
 
   if (ctx->Extensions.ARB_fragment_shader)
 _mesa_use_shader_program(ctx, GL_FRAGMENT_SHADER,
- save->FragmentShader);
+ save->Shader[MESA_SHADER_FRAGMENT]);
 
   _mesa_reference_shader_program(ctx, &ctx->Shader.ActiveProgram,
 save->ActiveShader);
 
-  _mesa_reference_shader_program(ctx, &save->VertexShader, NULL);
-  _mesa_reference_shader_program(ctx, &save->GeometryShader, NULL);
-  _mesa_reference_shader_program(ctx, &save->FragmentShader, NULL);
+  for (i = 0; i < MESA_SHADER_STAGES; i++)
+ _mesa_reference_shader_program(ctx, &save->Shader[i], NULL);
   _mesa_reference_shader_program(ctx, &save->ActiveShader, NULL);
}
 
-- 
1.8.5.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 14/30] mesa/cs: Add dispatch API stubs for ARB_compute_shader.

2014-01-09 Thread Paul Berry
---
 src/mapi/glapi/gen/ARB_compute_shader.xml | 40 +++
 src/mapi/glapi/gen/Makefile.am|  1 +
 src/mapi/glapi/gen/gl_API.xml |  4 ++-
 src/mapi/glapi/gen/gl_genexec.py  |  1 +
 src/mesa/Makefile.sources |  1 +
 src/mesa/SConscript   |  1 +
 src/mesa/main/compute.c   | 54 +++
 src/mesa/main/compute.h   | 38 ++
 src/mesa/main/tests/dispatch_sanity.cpp   |  4 +--
 9 files changed, 141 insertions(+), 3 deletions(-)
 create mode 100644 src/mapi/glapi/gen/ARB_compute_shader.xml
 create mode 100644 src/mesa/main/compute.c
 create mode 100644 src/mesa/main/compute.h

diff --git a/src/mapi/glapi/gen/ARB_compute_shader.xml 
b/src/mapi/glapi/gen/ARB_compute_shader.xml
new file mode 100644
index 000..1db373e
--- /dev/null
+++ b/src/mapi/glapi/gen/ARB_compute_shader.xml
@@ -0,0 +1,40 @@
+
+
+
+
+
+
+
+
+
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+
+  
+
+
+
+  
+
+  
+
+  
+
+
+
diff --git a/src/mapi/glapi/gen/Makefile.am b/src/mapi/glapi/gen/Makefile.am
index 65bd913..a9cf2a31 100644
--- a/src/mapi/glapi/gen/Makefile.am
+++ b/src/mapi/glapi/gen/Makefile.am
@@ -91,6 +91,7 @@ API_XML = \
ARB_base_instance.xml \
ARB_blend_func_extended.xml \
ARB_color_buffer_float.xml \
+   ARB_compute_shader.xml \
ARB_copy_buffer.xml \
ARB_debug_output.xml \
ARB_depth_buffer_float.xml \
diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml
index 697b2ec..d6cd1f5 100644
--- a/src/mapi/glapi/gen/gl_API.xml
+++ b/src/mapi/glapi/gen/gl_API.xml
@@ -8464,7 +8464,9 @@
 
 http://www.w3.org/2001/XInclude"/>
 
-
+http://www.w3.org/2001/XInclude"/>
+
+
 
 http://www.w3.org/2001/XInclude"/>
 
diff --git a/src/mapi/glapi/gen/gl_genexec.py b/src/mapi/glapi/gen/gl_genexec.py
index b557b3b..e376da3 100644
--- a/src/mapi/glapi/gen/gl_genexec.py
+++ b/src/mapi/glapi/gen/gl_genexec.py
@@ -57,6 +57,7 @@ header = """/**
 #include "main/clear.h"
 #include "main/clip.h"
 #include "main/colortab.h"
+#include "main/compute.h"
 #include "main/condrender.h"
 #include "main/context.h"
 #include "main/convolve.h"
diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources
index 39525bc..2fdc2b6 100644
--- a/src/mesa/Makefile.sources
+++ b/src/mesa/Makefile.sources
@@ -23,6 +23,7 @@ MAIN_FILES = \
$(SRCDIR)main/clear.c \
$(SRCDIR)main/clip.c \
$(SRCDIR)main/colortab.c \
+   $(SRCDIR)main/compute.c \
$(SRCDIR)main/condrender.c \
$(SRCDIR)main/context.c \
$(SRCDIR)main/convolve.c \
diff --git a/src/mesa/SConscript b/src/mesa/SConscript
index bb9b304..24264c8 100644
--- a/src/mesa/SConscript
+++ b/src/mesa/SConscript
@@ -51,6 +51,7 @@ main_sources = [
 'main/clear.c',
 'main/clip.c',
 'main/colortab.c',
+'main/compute.c',
 'main/condrender.c',
 'main/context.c',
 'main/convolve.c',
diff --git a/src/mesa/main/compute.c b/src/mesa/main/compute.c
new file mode 100644
index 000..575
--- /dev/null
+++ b/src/mesa/main/compute.c
@@ -0,0 +1,54 @@
+/*
+ * Copyright © 2014 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include "glheader.h"
+#include "compute.h"
+#include "context.h"
+
+void GLAPIENTRY
+_mesa_DispatchCompute(GLuint num_groups_x,
+  GLuint num_groups_y,
+  GLuint num_groups_z)
+{
+   GET_CURRENT_CONTEXT(ctx);
+
+   if (ctx->Extensions.ARB_compute_shader) {
+  assert(!"TODO");
+   } else {
+  _mesa_error(ctx, GL_INVALID_OPERATION,
+  "unsupported function (glDispatchCompute) called");
+   }
+}
+
+extern void GLAPIENTRY
+_mesa_DispatchComputeIndirect(GLintptr indirect)
+{
+   GET_CURRENT_CONTEXT(ctx);
+
+   if (ctx->Extensions.ARB_comput

[Mesa-dev] [PATCH 28/30] glsl/cs: Prohibit mixing of compute and non-compute shaders.

2014-01-09 Thread Paul Berry
Fixes piglit test:
spec/ARB_compute_shader/linker/mix_compute_and_non_compute
---
 src/glsl/linker.cpp | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
index 11e0651..f1344ea 100644
--- a/src/glsl/linker.cpp
+++ b/src/glsl/linker.cpp
@@ -2107,6 +2107,13 @@ link_shaders(struct gl_context *ctx, struct 
gl_shader_program *prog)
   goto done;
}
 
+   /* Compute shaders have additional restrictions. */
+   if (num_shaders[MESA_SHADER_COMPUTE] > 0 &&
+   num_shaders[MESA_SHADER_COMPUTE] != prog->NumShaders) {
+  linker_error(prog, "Compute shaders may not be linked with any other "
+   "type of shader\n");
+   }
+
for (unsigned int i = 0; i < MESA_SHADER_STAGES; i++) {
   if (prog->_LinkedShaders[i] != NULL)
 ctx->Driver.DeleteShader(ctx, prog->_LinkedShaders[i]);
-- 
1.8.5.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 12/30] mesa: Change redundant code into loops in texstate.c.

2014-01-09 Thread Paul Berry
This is possible now that ctx->Shader.CurrentProgram is an array.
---
 src/mesa/main/texstate.c | 75 +++-
 1 file changed, 29 insertions(+), 46 deletions(-)

diff --git a/src/mesa/main/texstate.c b/src/mesa/main/texstate.c
index b9c76da..905a9d5 100644
--- a/src/mesa/main/texstate.c
+++ b/src/mesa/main/texstate.c
@@ -526,27 +526,20 @@ static void
 update_texture_state( struct gl_context *ctx )
 {
GLuint unit;
-   struct gl_program *fprog = NULL;
-   struct gl_program *vprog = NULL;
-   struct gl_program *gprog = NULL;
+   struct gl_program *prog[MESA_SHADER_STAGES];
GLbitfield enabledFragUnits = 0x0;
-
-   if (ctx->Shader.CurrentProgram[MESA_SHADER_VERTEX] &&
-   ctx->Shader.CurrentProgram[MESA_SHADER_VERTEX]->LinkStatus) {
-  vprog = 
ctx->Shader.CurrentProgram[MESA_SHADER_VERTEX]->_LinkedShaders[MESA_SHADER_VERTEX]->Program;
-   }
-
-   if (ctx->Shader.CurrentProgram[MESA_SHADER_GEOMETRY] &&
-   ctx->Shader.CurrentProgram[MESA_SHADER_GEOMETRY]->LinkStatus) {
-  gprog = 
ctx->Shader.CurrentProgram[MESA_SHADER_GEOMETRY]->_LinkedShaders[MESA_SHADER_GEOMETRY]->Program;
-   }
-
-   if (ctx->Shader.CurrentProgram[MESA_SHADER_FRAGMENT] &&
-   ctx->Shader.CurrentProgram[MESA_SHADER_FRAGMENT]->LinkStatus) {
-  fprog = 
ctx->Shader.CurrentProgram[MESA_SHADER_FRAGMENT]->_LinkedShaders[MESA_SHADER_FRAGMENT]->Program;
-   }
-   else if (ctx->FragmentProgram._Enabled) {
-  fprog = &ctx->FragmentProgram.Current->Base;
+   int i;
+
+   for (i = 0; i < MESA_SHADER_STAGES; i++) {
+  if (ctx->Shader.CurrentProgram[i] &&
+  ctx->Shader.CurrentProgram[i]->LinkStatus) {
+ prog[i] = ctx->Shader.CurrentProgram[i]->_LinkedShaders[i]->Program;
+  } else {
+ if (i == MESA_SHADER_FRAGMENT && ctx->FragmentProgram._Enabled)
+prog[i] = &ctx->FragmentProgram.Current->Base;
+ else
+prog[i] = NULL;
+  }
}
 
/* TODO: only set this if there are actual changes */
@@ -562,9 +555,7 @@ update_texture_state( struct gl_context *ctx )
 */
for (unit = 0; unit < ctx->Const.MaxCombinedTextureImageUnits; unit++) {
   struct gl_texture_unit *texUnit = &ctx->Texture.Unit[unit];
-  GLbitfield enabledVertTargets = 0x0;
-  GLbitfield enabledFragTargets = 0x0;
-  GLbitfield enabledGeomTargets = 0x0;
+  GLbitfield enabledTargetsByStage[MESA_SHADER_STAGES];
   GLbitfield enabledTargets = 0x0;
   GLuint texIndex;
 
@@ -574,25 +565,16 @@ update_texture_state( struct gl_context *ctx )
* by a fragment program/program.  When multiple flags are set, we'll
* settle on the one with highest priority (see below).
*/
-  if (vprog) {
- enabledVertTargets |= vprog->TexturesUsed[unit];
+  for (i = 0; i < MESA_SHADER_STAGES; i++) {
+ if (prog[i])
+enabledTargetsByStage[i] = prog[i]->TexturesUsed[unit];
+ else if (i == MESA_SHADER_FRAGMENT)
+enabledTargetsByStage[i] = texUnit->Enabled;
+ else
+enabledTargetsByStage[i] = 0;
+ enabledTargets |= enabledTargetsByStage[i];
   }
 
-  if (gprog) {
- enabledGeomTargets |= gprog->TexturesUsed[unit];
-  }
-
-  if (fprog) {
- enabledFragTargets |= fprog->TexturesUsed[unit];
-  }
-  else {
- /* fixed-function fragment program */
- enabledFragTargets |= texUnit->Enabled;
-  }
-
-  enabledTargets = enabledVertTargets | enabledFragTargets |
-   enabledGeomTargets;
-
   texUnit->_ReallyEnabled = 0x0;
 
   if (enabledTargets == 0x0) {
@@ -624,7 +606,7 @@ update_texture_state( struct gl_context *ctx )
   }
 
   if (!texUnit->_ReallyEnabled) {
- if (fprog) {
+ if (prog[MESA_SHADER_FRAGMENT]) {
 /* If we get here it means the shader is expecting a texture
  * object, but there isn't one (or it's incomplete).  Use the
  * fallback texture.
@@ -654,25 +636,26 @@ update_texture_state( struct gl_context *ctx )
 
   ctx->Texture._EnabledUnits |= (1 << unit);
 
-  if (enabledFragTargets)
+  if (enabledTargetsByStage[MESA_SHADER_FRAGMENT])
  enabledFragUnits |= (1 << unit);
 
-  if (!fprog)
+  if (!prog[MESA_SHADER_FRAGMENT])
  update_tex_combine(ctx, texUnit);
}
 
 
/* Determine which texture coordinate sets are actually needed */
-   if (fprog) {
+   if (prog[MESA_SHADER_FRAGMENT]) {
   const GLuint coordMask = (1 << MAX_TEXTURE_COORD_UNITS) - 1;
   ctx->Texture._EnabledCoordUnits
- = (fprog->InputsRead >> VARYING_SLOT_TEX0) & coordMask;
+ = (prog[MESA_SHADER_FRAGMENT]->InputsRead >> VARYING_SLOT_TEX0) &
+ coordMask;
}
else {
   ctx->Texture._EnabledCoordUnits = enabledFragUnits;
}
 
-   if (!fprog || !vprog)
+   if (!prog[MESA_SHADER_FRAGMENT] || !prog[MESA_SHADER_VERTEX])
   update_texgen(ctx);
 }
 
-- 
1.8.5

[Mesa-dev] [PATCH 16/30] mesa/cs: Add a MESA_SHADER_COMPUTE stage and update switch statements.

2014-01-09 Thread Paul Berry
This patch adds MESA_SHADER_COMPUTE to the gl_shader_stage enum.
Also, where it is trivial to do so, it adds a compute shader case to
switch statements that switch based on the type of shader.  This
avoids "unhandled switch case" compiler warnings.
---
 src/glsl/ast_to_hir.cpp   |  9 +
 src/glsl/builtin_variables.cpp| 17 +
 src/glsl/standalone_scaffolding.h |  2 ++
 src/mesa/main/context.c   |  8 
 src/mesa/main/mtypes.h|  3 ++-
 src/mesa/main/shaderapi.c |  5 +
 src/mesa/main/shaderobj.h |  2 ++
 src/mesa/program/prog_print.c |  3 +++
 src/mesa/program/program.h|  4 
 9 files changed, 52 insertions(+), 1 deletion(-)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index ef8e699..a26745d 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -2093,6 +2093,12 @@ validate_explicit_location(const struct 
ast_type_qualifier *qual,
 
   fail = true;
   break;
+
+   case MESA_SHADER_COMPUTE:
+  _mesa_glsl_error(loc, state,
+   "compute shader variables cannot be given "
+   "explicit locations");
+  return;
};
 
if (fail) {
@@ -2275,6 +2281,9 @@ apply_type_qualifier_to_variable(const struct 
ast_type_qualifier *qual,
 if (var->data.mode == ir_var_shader_in)
var->data.invariant = true;
 break;
+  case MESA_SHADER_COMPUTE:
+ /* Invariance isn't meaningful in compute shaders. */
+ break;
   }
}
 
diff --git a/src/glsl/builtin_variables.cpp b/src/glsl/builtin_variables.cpp
index f630923..17ae087 100644
--- a/src/glsl/builtin_variables.cpp
+++ b/src/glsl/builtin_variables.cpp
@@ -356,6 +356,7 @@ public:
void generate_vs_special_vars();
void generate_gs_special_vars();
void generate_fs_special_vars();
+   void generate_cs_special_vars();
void generate_varyings();
 
 private:
@@ -866,6 +867,16 @@ builtin_variable_generator::generate_fs_special_vars()
 
 
 /**
+ * Generate variables which only exist in compute shaders.
+ */
+void
+builtin_variable_generator::generate_cs_special_vars()
+{
+   /* TODO: finish this. */
+}
+
+
+/**
  * Add a single "varying" variable.  The variable's type and direction (input
  * or output) are adjusted as appropriate for the type of shader being
  * compiled.  For geometry shaders using {ARB,EXT}_geometry_shader4,
@@ -886,6 +897,9 @@ builtin_variable_generator::add_varying(int slot, const 
glsl_type *type,
case MESA_SHADER_FRAGMENT:
   add_input(slot, type, name);
   break;
+   case MESA_SHADER_COMPUTE:
+  /* Compute shaders don't have varyings. */
+  break;
}
 }
 
@@ -973,5 +987,8 @@ _mesa_glsl_initialize_variables(exec_list *instructions,
case MESA_SHADER_FRAGMENT:
   gen.generate_fs_special_vars();
   break;
+   case MESA_SHADER_COMPUTE:
+  gen.generate_cs_special_vars();
+  break;
}
 }
diff --git a/src/glsl/standalone_scaffolding.h 
b/src/glsl/standalone_scaffolding.h
index 327fef2..df783af 100644
--- a/src/glsl/standalone_scaffolding.h
+++ b/src/glsl/standalone_scaffolding.h
@@ -58,6 +58,8 @@ _mesa_shader_enum_to_shader_stage(GLenum v)
   return MESA_SHADER_FRAGMENT;
case GL_GEOMETRY_SHADER:
   return MESA_SHADER_GEOMETRY;
+   case GL_COMPUTE_SHADER:
+  return MESA_SHADER_COMPUTE;
default:
   assert(!"bad value in _mesa_shader_enum_to_shader_stage()");
   return MESA_SHADER_VERTEX;
diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
index 5855f15..b0cf5da 100644
--- a/src/mesa/main/context.c
+++ b/src/mesa/main/context.c
@@ -498,6 +498,14 @@ init_program_limits(struct gl_context *ctx, 
gl_shader_stage stage,
   prog->MaxInputComponents = 16 * 4; /* old limit not to break tnl and 
swrast */
   prog->MaxOutputComponents = 16 * 4; /* old limit not to break tnl and 
swrast */
   break;
+   case MESA_SHADER_COMPUTE:
+  prog->MaxParameters = 0; /* not meaningful for compute shaders */
+  prog->MaxAttribs = 0; /* not meaningful for compute shaders */
+  prog->MaxAddressRegs = 0; /* not meaningful for compute shaders */
+  prog->MaxUniformComponents = 4 * MAX_UNIFORMS;
+  prog->MaxInputComponents = 0; /* not meaningful for compute shaders */
+  prog->MaxOutputComponents = 0; /* not meaningful for compute shaders */
+  break;
default:
   assert(0 && "Bad shader stage in init_program_limits()");
}
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 7ba7b10..8b88d75 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -404,9 +404,10 @@ typedef enum
MESA_SHADER_VERTEX = 0,
MESA_SHADER_GEOMETRY = 1,
MESA_SHADER_FRAGMENT = 2,
+   MESA_SHADER_COMPUTE = 3,
 } gl_shader_stage;
 
-#define MESA_SHADER_STAGES (MESA_SHADER_FRAGMENT + 1)
+#define MESA_SHADER_STAGES (MESA_SHADER_COMPUTE + 1)
 
 
 /**
diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
in

  1   2   3   4   5   6   7   8   9   10   >