Re: [Mesa-dev] [PATCH 3/4] i965: For color clears, only disable writes to components that exist.
On 21 March 2014 04:47, Kenneth Graunke kenn...@whitecape.org wrote: The SIMD16 replicated FB write message only works if we don't need the color calculator to mask our framebuffer writes. Previously, we bailed on it if color_mask wasn't true, true, true, true. However, this was needlessly strict for formats with fewer than four components - only the components that actually exist matter. WebGL Aquarium attempts to clear a BGRX texture with the ColorMask set to true, true, true, false. This will work perfectly fine with the replicated data message; we just bailed unnecessarily. Improves performance of WebGL Aquarium on Iris Pro (at 1920x1080) and Bay Trail (at 2048x1152) by about 40% (using Chrome 24). Signed-off-by: Kenneth Graunke kenn...@whitecape.org Cc: Dylan Baker baker.dyla...@gmail.com Cc: Keith Packard kei...@keithp.com Cc: Eric Anholt e...@anholt.net --- src/mesa/drivers/dri/i965/brw_blorp_clear.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) Nice find! Series is: Reviewed-by: Paul Berry stereotype...@gmail.com diff --git a/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp b/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp index 45b9fa0..7e19c2c 100644 --- a/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp +++ b/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp @@ -231,7 +231,7 @@ brw_blorp_clear_params::brw_blorp_clear_params(struct brw_context *brw, /* Constant color writes ignore everyting in blend and color calculator * state. This is not documented. */ - for (int i = 0; i 4; i++) { + for (int i = 0; i _mesa_format_num_components(irb-mt-format); i++) { if (!color_mask[i]) { color_write_disable[i] = true; wm_prog_key.use_simd16_replicated_data = false; -- 1.9.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Drop special case for edgeflag thanks to Marek's change to core.
On 17 March 2014 14:54, Eric Anholt e...@anholt.net wrote: --- src/mesa/drivers/dri/i965/brw_draw_upload.c | 9 - 1 file changed, 9 deletions(-) For future cross-referencing it would be nice to mention the SHA of Marek's change in the commit message. diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c b/src/mesa/drivers/dri/i965/brw_draw_upload.c index d42c074..d08e56a 100644 --- a/src/mesa/drivers/dri/i965/brw_draw_upload.c +++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c @@ -717,12 +717,6 @@ static void brw_emit_vertices(struct brw_context *brw) uint32_t comp2 = BRW_VE1_COMPONENT_STORE_SRC; uint32_t comp3 = BRW_VE1_COMPONENT_STORE_SRC; - /* The gen4 driver expects edgeflag to come in as a float, and passes - * that float on to the tests in the clipper. Mesa's current vertex - * attribute value for EdgeFlag is stored as a float, which works out. - * glEdgeFlagPointer, on the other hand, gives us an unnormalized - * integer ubyte. Just rewrite that to convert to a float. - */ if (input-attrib == VERT_ATTRIB_EDGEFLAG) { /* Gen6+ passes edgeflag as sideband along with the vertex, instead * of in the VUE. We have to upload it sideband as the last vertex @@ -732,9 +726,6 @@ static void brw_emit_vertices(struct brw_context *brw) gen6_edgeflag_input = input; continue; } - - if (format == BRW_SURFACEFORMAT_R8_UINT) -format = BRW_SURFACEFORMAT_R8_SSCALED; } switch (input-glarray-Size) { -- 1.9.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/6] glsl: Optimize pow(x, 2) into x * x.
On 10 March 2014 17:23, Ian Romanick i...@freedesktop.org wrote: I had a pretty similar patch on the top of my pow-optimization branch. I also expand x**3 and x**4. I had hoped that would enable some cases to expand then merge to MADs. It should also be faster on older GENs where POW perf sucks. I didn't send it out because I wanted to add a similar optimization in the back end that would turn x*x*x*x back into x**4 on GPUs where the POW would be faster. Be careful with that one, though: pow is undefined when x 0, so x*x*x*x = pow(x, 4) isn't a valid conversion in general--it only works if you know the operation started off as a pow in the first place. (Note: you can do x*x*x*x = pow(abs(x), 4), but that doesn't generalize to odd powers). I also didn't have anything in shader-db that benefitted from x**2 or x**3. It seems like there were a couple that would be modified by a x**5 flattening, but I think that would universally be slower On 03/10/2014 03:54 PM, Matt Turner wrote: Cuts two instructions out of SynMark's Gl32VSInstancing benchmark. --- src/glsl/opt_algebraic.cpp | 8 1 file changed, 8 insertions(+) diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp index 5c49a78..8494bd9 100644 --- a/src/glsl/opt_algebraic.cpp +++ b/src/glsl/opt_algebraic.cpp @@ -528,6 +528,14 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir) if (is_vec_two(op_const[0])) return expr(ir_unop_exp2, ir-operands[1]); + if (is_vec_two(op_const[1])) { + ir_variable *x = new(ir) ir_variable(ir-operands[1]-type, x, + ir_var_temporary); + base_ir-insert_before(x); + base_ir-insert_before(assign(x, ir-operands[0])); + return mul(x, x); + } + break; case ir_unop_rcp: ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [RFC PATCH] i965/gs: add snb support
On 24 February 2014 00:36, Ilia Mirkin imir...@alum.mit.edu wrote: Before you read any further, this is nowhere close to working. However it's in a state where I think most of the structure is there, albeit with a lot of XXX comments. And I haven't actually implemented the new opcodes I've added. I was hoping one or two Intel people could take a look at this and let me know of any pitfalls I'm likely to run into. I've already gotten a lot of help and advice from Ken, but wanted to put something out publicly. Any and all feedback much appreciated! Not-Signed-off-by: Ilia Mirkin imir...@alum.mit.edu I'm really excited to see you're working on this! A lot of people are going to be really happy to see geometry shader support added to Gen6. AFAICT, your approach is good. I plan to look at it in more detail later today or sometime tomorrow. But before I do, here are some general comments: 1. For long-term readability, it might be useful to have 3 classes: vec4_gs_visitor (for stuff that's common to all gs visitors), gen6_gs_visitor (for the stuff that's specific to gen6 geometry shaders), and gen7plus_gs_visitor (for the stuff that's specific to gen7+ geometry shaders). What you currently have is a little harder to read, because it means that to understand gen6 geometry shaders, you have to keep in mind that some of the code in vec4_gs_visitor is never going to be executed because it will always be overridden by gen6_gs_visitor. 2. Stream output (a.k.a. transform feedback): As you've probably already figured out, transform feedback on gen6 is tricky because the hardware doesn't have a dedicated pipeline stage for it; you have to emit code in the geometry shader to do it. This is a problem because the GL spec says that two things are supposed to happen in between the GS stage and transform feedback: dangling vertex removal (ignoring line strips containing just a single vertex and ignoring triangle strips containing less than 2 vertices), and expanding line/triangle strips into individual line segments/triangles. In Gen6, these operations happen after the GS stage, so the code you emit in the geometry shader to do transform feedback needs to simulate them. Back when I first implemented transform feedback for Gen6, supporting user-defined geometry shaders was the furthest thing from my mind, so I implemented the transform feedback code in raw assembly rather than re-using the vec4_visitor/vec4_generator infrastructure (see gen6_sol_program()). Unfortunately that's going to make it hard to re-use that code in user-defined geometry shaders. Also, when I implemented that code, I was able to make some major simplifying assumptions because there was no user-defined geometry shader--that meant that the number of vertices to process was known at compile time, and I didn't have to worry about dangling vertex removal or expanding line/triangle strips into individual line segments/triangles (because the hardware automatically performs those operations before invoking the GS). You're not going to have that luxury. My recommendation is: feel free to use gen6_sol_program() as a reference point, but don't bother trying to re-use it--the simplifying assumptions are just going to make it too different from what you want. I'd recommend leaving gen6_sol_program() around to handle the case where there's no user-defined geometry shader, and then you can write fresh code for handling transform feedback when there *is* a user-defined geometry shader. Before you go too far on the implementation, I'd also recommend writing some new piglit tests in tests/spec/glsl-1.50/execution/geometry/ to verify that dangling vertex removal and strip expansion are handled properly for transform feedback data. If you don't have gs-supporting hardware at your disposal to validate those tests against, I (or one of the Intel folks) would be happy to validate the tests for you. Note in particular that in order to expand a triangle strip to triangles, you have to be careful with the order to make sure that both the provoking vertex and the triangle orientation are preserved (it's not necessary to follow ARB_provoking_vertex, though; you can just assume the GL default of LAST_VERTEX_CONVENTION applies). For example, if the pre-expansion vertex order is ABCDEF, then the only correct post-expansion order is ABC,CBD,CDE,EDF. I'm not sure whether Gen7+ hardware gets this order exactly right in all circumstances, but it seems like it should be easy to get it right on Gen6, because the order is entirely determined by software :) I'd also recommend adding some piglit tests to make sure that transform feedback overflow conditions are handled properly when a geometry shader is in use, since overflow handling on Gen6 is (partially) the responsibility of the shader code you generate, and you're not going to be able to piggy-back on the overflow handling in gen6_sol_program(). One other note: check out this text from the Gen6
Re: [Mesa-dev] [PATCH] i965: Don't try to use the hardware blitter for multisampled miptrees.
On 21 February 2014 19:15, Kenneth Graunke kenn...@whitecape.org wrote: The blitter is completely ignorant of MSAA buffer layouts, so any attempt to use BLT paths with MSAA buffers is likely to break spectacularly. In most cases, BLORP handles MSAA blits, so we never hit this bug. Until recently, it also wasn't worth fixing, since Meta couldn't handle MSAA either, so there was nothing to fall back to. But now there is. +143 piglit tests on Broadwell (which doesn't have BLORP support). Surprisingly, three also start failing. No changes on Ivybridge. That actually doesn't surprise me too much. Since color buffers use an array-like sample layout (sample N in the Nth layer), trying to use the blitter on a multisampled buffer just gets you sample 0 from each pixel. Which for some tests is good enough to pass (even though the blit is actually doing the wrong thing). So it's possible that a few of these tests that were passing by luck with the blitter don't yet have proper Meta support :) Signed-off-by: Kenneth Graunke kenn...@whitecape.org --- src/mesa/drivers/dri/i965/intel_blit.c | 4 1 file changed, 4 insertions(+) diff --git a/src/mesa/drivers/dri/i965/intel_blit.c b/src/mesa/drivers/dri/i965/intel_blit.c index df85dc9..cedb230 100644 --- a/src/mesa/drivers/dri/i965/intel_blit.c +++ b/src/mesa/drivers/dri/i965/intel_blit.c @@ -165,6 +165,10 @@ intel_miptree_blit(struct brw_context *brw, mesa_format src_format = _mesa_get_srgb_format_linear(src_mt-format); mesa_format dst_format = _mesa_get_srgb_format_linear(dst_mt-format); + /* The blitter doesn't understand multisampling at all. */ + if (src_mt-num_samples 0 || dst_mt-num_samples 0) + return false; + You might consider moving this check up to the top of the function, since it doesn't depend on src_format or dst_format. But it's not a big deal. Either way, the patch is: Reviewed-by: Paul Berry stereotype...@gmail.com ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3 8/9] i965: support gl_InvocationID for gen7
On 4 February 2014 15:07, Jordan Justen jordan.l.jus...@intel.com wrote: v2: * Make gl_InstanceID a system value v3: * Properly shift from R0.1 into DST.4 by adding GS_OPCODE_GET_INSTANCE_ID Signed-off-by: Jordan Justen jordan.l.jus...@intel.com --- src/mesa/drivers/dri/i965/brw_defines.h | 12 src/mesa/drivers/dri/i965/brw_shader.cpp | 2 ++ src/mesa/drivers/dri/i965/brw_vec4.h | 1 + src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 20 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 16 +--- 5 files changed, 48 insertions(+), 3 deletions(-) Hopefully I will get a chance to look at this in more detail soon, but for now, consider this: Acked-by: Paul Berry stereotype...@gmail.com diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h index 75d09fc..199a699 100644 --- a/src/mesa/drivers/dri/i965/brw_defines.h +++ b/src/mesa/drivers/dri/i965/brw_defines.h @@ -902,6 +902,13 @@ enum opcode { * form the final channel mask. */ GS_OPCODE_SET_CHANNEL_MASKS, + + /** +* Get the Instance ID fields from the payload. +* +* - dst is the GRF for gl_InvocationID. +*/ + GS_OPCODE_GET_INSTANCE_ID, }; enum brw_urb_write_flags { @@ -1538,6 +1545,11 @@ enum brw_message_target { # define BRW_GS_EDGE_INDICATOR_0 (1 8) # define BRW_GS_EDGE_INDICATOR_1 (1 9) +/* GS Thread Payload + */ +/* R0 */ +# define GEN7_GS_PAYLOAD_INSTANCE_ID_SHIFT 27 + /* 3DSTATE_GS Output Vertex Size has an effective maximum of 62. It's * counted in multiples of 16 bytes. */ diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp b/src/mesa/drivers/dri/i965/brw_shader.cpp index 6cc2595..5e3005a 100644 --- a/src/mesa/drivers/dri/i965/brw_shader.cpp +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp @@ -525,6 +525,8 @@ brw_instruction_name(enum opcode op) return prepare_channel_masks; case GS_OPCODE_SET_CHANNEL_MASKS: return set_channel_masks; + case GS_OPCODE_GET_INSTANCE_ID: + return get_instance_id; default: /* Yes, this leaks. It's in debug code, it should never occur, and if diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h index e17b5cd..182a1e1 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.h +++ b/src/mesa/drivers/dri/i965/brw_vec4.h @@ -616,6 +616,7 @@ private: void generate_gs_set_dword_2_immed(struct brw_reg dst, struct brw_reg src); void generate_gs_prepare_channel_masks(struct brw_reg dst); void generate_gs_set_channel_masks(struct brw_reg dst, struct brw_reg src); + void generate_gs_get_instance_id(struct brw_reg dst); void generate_oword_dual_block_offsets(struct brw_reg m1, struct brw_reg index); void generate_scratch_write(vec4_instruction *inst, diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp index 94d1e79..a48d829 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp @@ -639,6 +639,22 @@ vec4_generator::generate_gs_set_channel_masks(struct brw_reg dst, } void +vec4_generator::generate_gs_get_instance_id(struct brw_reg dst) +{ + /* We want to right shift R0.0 R0.1 by GEN7_GS_PAYLOAD_INSTANCE_ID_SHIFT +* and store into dst.0 dst.4. So generate the instruction: +* +* shr(8) dst1 R01,4,0 GEN7_GS_PAYLOAD_INSTANCE_ID_SHIFT { align1 WE_all } +*/ + brw_push_insn_state(p); + brw_set_access_mode(p, BRW_ALIGN_1); + dst = retype(dst, BRW_REGISTER_TYPE_UD); + struct brw_reg r0(retype(brw_vec8_grf(0, 0), BRW_REGISTER_TYPE_UD)); + brw_SHR(p, dst, stride(r0, 1, 4, 0), brw_imm_ud(GEN7_GS_PAYLOAD_INSTANCE_ID_SHIFT)); + brw_pop_insn_state(p); +} + +void vec4_generator::generate_oword_dual_block_offsets(struct brw_reg m1, struct brw_reg index) { @@ -1218,6 +1234,10 @@ vec4_generator::generate_vec4_instruction(vec4_instruction *instruction, generate_gs_set_channel_masks(dst, src[0]); break; + case GS_OPCODE_GET_INSTANCE_ID: + generate_gs_get_instance_id(dst); + break; + case SHADER_OPCODE_SHADER_TIME_ADD: brw_shader_time_add(p, src[0], prog_data-base.binding_table.shader_time_start); diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp index 40743cc..360703b 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp @@ -51,9 +51,19 @@ vec4_gs_visitor::vec4_gs_visitor(struct brw_context *brw, dst_reg * vec4_gs_visitor::make_reg_for_system_value(ir_variable *ir) { - /* Geometry
Re: [Mesa-dev] [PATCH 30/30] i965/cs: Allow ARB_compute_shader to be enabled via env var.
On 4 February 2014 10:03, Jordan Justen jljus...@gmail.com wrote: What about trying to make use of MESA_EXTENSION_OVERRIDE=GL_ARB_compute_shader? We could add extensions.c:bool _mesa_is_extension_override_enabled(char *) And then if (_mesa_is_extension_override_enabled(GL_ARB_compute_shader)) Or, similarly, get overrides shoved into ctx-Extensions to allow checking for overrides at this early stage. -Jordan I looked into what would be necessary to do this, and it's unfortunately more complicated than it should be due to some flukes about initialization order (perhaps this is what you were alluding to when you said get overrides shoved into ctx-Extensions to allow checking for overrides at this early stage). Yeah, I'll admit I looked at it a bit, and decided to keep my feedback 'hand wavy'. :) Currently, our initialization order is: 1. brwCreateContext() calls brw_initialize_context_constants(), which needs to know whether compute shaders are supported in order to set constants like MAX_COMPUTE_TEXTURE_IMAGE_UNITS and MAX_UNIFORM_BUFFER_BINDINGS correctly. 2. Later, brwCreateContext() calls intelInitExtensions(), which is the function where we'll set ctx-Extensions.ARB_compute_shader to true once compute shaders are fully supported. 3. Much later in initialization, when the context is being made current for the first time, core Mesa calls _mesa_make_extension_string(), which calls get_extension_override() to modify the the ctx-Extensions table based on the environment variable override. To do what you suggested, I would either have to: A. change 1 to call _mesa_is_extension_override_enabled(); that function, in turn, would have to parse the MESA_EXTENSION_OVERRIDE string, but we couldn't reuse the parsing code in _mesa_make_extension_string(), since that parsing code updates ctx-Extensions as a side effect, and it's not time to update ctx-Extensions yet. In addition to the code duplication, we would have a danger of bugs, since the override takes effect so late in initialization--if we ever accidentally introduced some code in between steps 2 and 3 that checked the value of ctx-Extensions.ARB_compute_shader, it would see false even if compute shaders were enabled by override. Or: B. change the order of initialization so that 2 happens first, followed by 3 and then 1. In the long run I think this would be a good thing, but it's a big change (since it affects all back ends) and I'm not sure it would be a good idea to hold up this patch series waiting for it. How would you feel if I landed the patch series as is, and then worked on a follow up patch to do B? Sure, you can have my r-b for this then. -Jordan Great, thanks. I've pushed the patches. I'll work on B as soon as I can. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 30/30] i965/cs: Allow ARB_compute_shader to be enabled via env var.
On 1 February 2014 23:21, Jordan Justen jljus...@gmail.com wrote: On Thu, Jan 9, 2014 at 6:19 PM, Paul Berry stereotype...@gmail.com wrote: This will allow testing of compute shader functionality before it is completed. To enable ARB_compute_shader functionality in the i965 driver, set INTEL_COMPUTE_SHADER=1. --- src/mesa/drivers/dri/i965/brw_context.c | 11 ++- src/mesa/drivers/dri/i965/intel_extensions.c | 2 ++ 2 files changed, 12 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c index 1b42751..76dd9be 100644 --- a/src/mesa/drivers/dri/i965/brw_context.c +++ b/src/mesa/drivers/dri/i965/brw_context.c @@ -298,10 +298,17 @@ brw_initialize_context_constants(struct brw_context *brw) ctx-Const.Program[MESA_SHADER_GEOMETRY].MaxTextureImageUnits = BRW_MAX_TEX_UNIT; else ctx-Const.Program[MESA_SHADER_GEOMETRY].MaxTextureImageUnits = 0; + if (getenv(INTEL_COMPUTE_SHADER)) { What about trying to make use of MESA_EXTENSION_OVERRIDE=GL_ARB_compute_shader? We could add extensions.c:bool _mesa_is_extension_override_enabled(char *) And then if (_mesa_is_extension_override_enabled(GL_ARB_compute_shader)) Or, similarly, get overrides shoved into ctx-Extensions to allow checking for overrides at this early stage. -Jordan I looked into what would be necessary to do this, and it's unfortunately more complicated than it should be due to some flukes about initialization order (perhaps this is what you were alluding to when you said get overrides shoved into ctx-Extensions to allow checking for overrides at this early stage). Currently, our initialization order is: 1. brwCreateContext() calls brw_initialize_context_constants(), which needs to know whether compute shaders are supported in order to set constants like MAX_COMPUTE_TEXTURE_IMAGE_UNITS and MAX_UNIFORM_BUFFER_BINDINGS correctly. 2. Later, brwCreateContext() calls intelInitExtensions(), which is the function where we'll set ctx-Extensions.ARB_compute_shader to true once compute shaders are fully supported. 3. Much later in initialization, when the context is being made current for the first time, core Mesa calls _mesa_make_extension_string(), which calls get_extension_override() to modify the the ctx-Extensions table based on the environment variable override. To do what you suggested, I would either have to: A. change 1 to call _mesa_is_extension_override_enabled(); that function, in turn, would have to parse the MESA_EXTENSION_OVERRIDE string, but we couldn't reuse the parsing code in _mesa_make_extension_string(), since that parsing code updates ctx-Extensions as a side effect, and it's not time to update ctx-Extensions yet. In addition to the code duplication, we would have a danger of bugs, since the override takes effect so late in initialization--if we ever accidentally introduced some code in between steps 2 and 3 that checked the value of ctx-Extensions.ARB_compute_shader, it would see false even if compute shaders were enabled by override. Or: B. change the order of initialization so that 2 happens first, followed by 3 and then 1. In the long run I think this would be a good thing, but it's a big change (since it affects all back ends) and I'm not sure it would be a good idea to hold up this patch series waiting for it. How would you feel if I landed the patch series as is, and then worked on a follow up patch to do B? + ctx-Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits = BRW_MAX_TEX_UNIT; + ctx-Const.MaxUniformBufferBindings += 12; + } else { + ctx-Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits = 0; + } ctx-Const.MaxCombinedTextureImageUnits = ctx-Const.Program[MESA_SHADER_VERTEX].MaxTextureImageUnits + ctx-Const.Program[MESA_SHADER_FRAGMENT].MaxTextureImageUnits + - ctx-Const.Program[MESA_SHADER_GEOMETRY].MaxTextureImageUnits; + ctx-Const.Program[MESA_SHADER_GEOMETRY].MaxTextureImageUnits + + ctx-Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits; ctx-Const.MaxTextureLevels = 14; /* 8192 */ if (ctx-Const.MaxTextureLevels MAX_TEXTURE_LEVELS) @@ -425,9 +432,11 @@ brw_initialize_context_constants(struct brw_context *brw) ctx-Const.Program[MESA_SHADER_FRAGMENT].MaxAtomicCounters = MAX_ATOMIC_COUNTERS; ctx-Const.Program[MESA_SHADER_VERTEX].MaxAtomicCounters = MAX_ATOMIC_COUNTERS; ctx-Const.Program[MESA_SHADER_GEOMETRY].MaxAtomicCounters = MAX_ATOMIC_COUNTERS; + ctx-Const.Program[MESA_SHADER_COMPUTE].MaxAtomicCounters = MAX_ATOMIC_COUNTERS; ctx-Const.Program[MESA_SHADER_FRAGMENT].MaxAtomicBuffers = BRW_MAX_ABO; ctx-Const.Program[MESA_SHADER_VERTEX].MaxAtomicBuffers = BRW_MAX_ABO; ctx-Const.Program[MESA_SHADER_GEOMETRY].MaxAtomicBuffers = BRW_MAX_ABO; + ctx-Const.Program
Re: [Mesa-dev] [PATCH 01/10] glsl: memory_writer helper class for data serialization
on the output file. That *might* be benign for now (I haven't looked through all the callers to write_string to see whether the distinction between and NULL matters) but even so, this is the sort of thing that can lead to subtle bugs in the future. I'd suggest that instead, we store strlen + 1, so that the reader can tell the difference between NULL and . + + if (str) + write(str, len + 1); It's surprising to see you writing len + 1 bytes (so surprising that at first I thought it must be a mistake). Reading ahead to patch 3/10, it looks like you did this so that you could call ralloc_strdup() in read_string(). But that seems a little wasteful to me, not because we're wasting a byte but because it means that read_string() traverses the string twice: - First it reads the length - Then it checks that a null terminator exists in the expected place - Then it calls ralloc_strdup(), which walks the string to see how long it is (totally unnecessary, since the length is known) - ralloc_strdup() calls ralloc_array() to allocate memory for the result - ralloc_strdup() calls memcpy() to make a copy of the string - ralloc_strdup() stores a null terminator at the end of the copied string. Personally, I would just call ralloc_array() directly from read_string(), then memcpy() the contents of the string and store the null terminator. In which case there is no need to store the null terminator when writing the string. But I'm nit picking at this point, so I'm not going to be a stickler about it. If you would prefer to just add a comment explaining why you're serializing the null terminator, I would be ok with that too. + } + + unsigned position() + { + return pos; + } + + /** +* check if some data was written +*/ + bool data_was_written(void *data, uint32_t id) For functions that ask questions, I recommend using a word order that's unambiguously a question (e.g. was_data_written). data_was_written sounds more like a statement, so someone might misinterpret this function as a synonym for mark_data_written(). But I'm nit picking again here. With the other issues addressed, this patch is: Reviewed-by: Paul Berry stereotype...@gmail.com ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 01/10] glsl: memory_writer helper class for data serialization
Whoops, I discovered another issue: On 29 January 2014 01:24, Tapani Pälli tapani.pa...@intel.com wrote: Class will be used by the shader binary cache implementation. Signed-off-by: Tapani Pälli tapani.pa...@intel.com --- src/glsl/memory_writer.h | 188 +++ 1 file changed, 188 insertions(+) create mode 100644 src/glsl/memory_writer.h diff --git a/src/glsl/memory_writer.h b/src/glsl/memory_writer.h new file mode 100644 index 000..979169f --- /dev/null +++ b/src/glsl/memory_writer.h @@ -0,0 +1,188 @@ +/* -*- c++ -*- */ +/* + * Copyright © 2013 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER + * DEALINGS IN THE SOFTWARE. + */ + +#pragma once +#ifndef MEMORY_WRITER_H +#define MEMORY_WRITER_H + +#include stdlib.h +#include unistd.h +#include string.h + +#include main/hash_table.h + +#ifdef __cplusplus +/** + * Helper class for writing data to memory + * + * This class maintains a dynamically-sized memory buffer and allows + * for data to be efficiently appended to it with automatic resizing. + */ +class memory_writer +{ +public: + memory_writer() : + memory(NULL), + curr_size(0), + pos(0) + { + data_hash = _mesa_hash_table_create(0, int_equal); + hash_value = _mesa_hash_data(this, sizeof(memory_writer)); + } + + ~memory_writer() + { + free(memory); + _mesa_hash_table_destroy(data_hash, NULL); + } + + /* user wants to claim the memory */ + char *release_memory(size_t *size) + { + /* final realloc to free allocated but unused memory */ + char *result = (char *) realloc(memory, pos); + *size = pos; + memory = NULL; + curr_size = 0; + pos = 0; + return result; + } + +/** + * write functions per type + */ +#define DECL_WRITER(type) void write_ ##type (const type data) {\ + write(data, sizeof(type));\ +} + + DECL_WRITER(int32_t); + DECL_WRITER(int64_t); + DECL_WRITER(uint8_t); + DECL_WRITER(uint32_t); + + void write_bool(bool data) + { + uint8_t val = data; + write_uint8_t(val); + } + + /* write function that reallocates more memory if required */ + void write(const void *data, int size) + { + if (!memory || pos (curr_size - size)) + if (!grow(size)) +return; + + memcpy(memory + pos, data, size); + + pos += size; + } + + void overwrite(const void *data, int size, int offset) + { + if (offset 0 || offset + size pos) + return; + memcpy(memory + offset, data, size); + } + + /* length is written to make reading safe */ + void write_string(const char *str) + { + uint32_t len = str ? strlen(str) : 0; + write_uint32_t(len); + + if (str) + write(str, len + 1); + } + + unsigned position() + { + return pos; + } + + /** +* check if some data was written +*/ + bool data_was_written(void *data, uint32_t id) + { + hash_entry *entry = + _mesa_hash_table_search(data_hash, hash_value, +(void*) (intptr_t) id); This isn't an efficient way to use a hashtable. You're always passing the same hash_value (the one computed in the constructor), which means that every single hash entry will be stored in the same chain, and lookups will be really slow. Since you're only using this hashtable to detect duplicate glsl_types, there's a much easier approach: take advantage of the fact that glsl_type is a flyweight class (in other words, code elsewhere in Mesa ensures that no duplicate glsl_type objects exist anywhere in memory, so any two glsl_type pointers can be compared simply using pointer equality). Because of that, you should be able to just use a pointer-based hashtable, like ir_variable_refcount_visitor does. In short, that
Re: [Mesa-dev] [PATCH 01/10] glsl: memory_writer helper class for data serialization
On 4 February 2014 19:52, Paul Berry stereotype...@gmail.com wrote: Whoops, I discovered another issue: On 29 January 2014 01:24, Tapani Pälli tapani.pa...@intel.com wrote: Class will be used by the shader binary cache implementation. Signed-off-by: Tapani Pälli tapani.pa...@intel.com --- src/glsl/memory_writer.h | 188 +++ 1 file changed, 188 insertions(+) create mode 100644 src/glsl/memory_writer.h diff --git a/src/glsl/memory_writer.h b/src/glsl/memory_writer.h new file mode 100644 index 000..979169f --- /dev/null +++ b/src/glsl/memory_writer.h @@ -0,0 +1,188 @@ +/* -*- c++ -*- */ +/* + * Copyright © 2013 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER + * DEALINGS IN THE SOFTWARE. + */ + +#pragma once +#ifndef MEMORY_WRITER_H +#define MEMORY_WRITER_H + +#include stdlib.h +#include unistd.h +#include string.h + +#include main/hash_table.h + +#ifdef __cplusplus +/** + * Helper class for writing data to memory + * + * This class maintains a dynamically-sized memory buffer and allows + * for data to be efficiently appended to it with automatic resizing. + */ +class memory_writer +{ +public: + memory_writer() : + memory(NULL), + curr_size(0), + pos(0) + { + data_hash = _mesa_hash_table_create(0, int_equal); + hash_value = _mesa_hash_data(this, sizeof(memory_writer)); + } + + ~memory_writer() + { + free(memory); + _mesa_hash_table_destroy(data_hash, NULL); + } + + /* user wants to claim the memory */ + char *release_memory(size_t *size) + { + /* final realloc to free allocated but unused memory */ + char *result = (char *) realloc(memory, pos); + *size = pos; + memory = NULL; + curr_size = 0; + pos = 0; + return result; + } + +/** + * write functions per type + */ +#define DECL_WRITER(type) void write_ ##type (const type data) {\ + write(data, sizeof(type));\ +} + + DECL_WRITER(int32_t); + DECL_WRITER(int64_t); + DECL_WRITER(uint8_t); + DECL_WRITER(uint32_t); + + void write_bool(bool data) + { + uint8_t val = data; + write_uint8_t(val); + } + + /* write function that reallocates more memory if required */ + void write(const void *data, int size) + { + if (!memory || pos (curr_size - size)) + if (!grow(size)) +return; + + memcpy(memory + pos, data, size); + + pos += size; + } + + void overwrite(const void *data, int size, int offset) + { + if (offset 0 || offset + size pos) + return; + memcpy(memory + offset, data, size); + } + + /* length is written to make reading safe */ + void write_string(const char *str) + { + uint32_t len = str ? strlen(str) : 0; + write_uint32_t(len); + + if (str) + write(str, len + 1); + } + + unsigned position() + { + return pos; + } + + /** +* check if some data was written +*/ + bool data_was_written(void *data, uint32_t id) + { + hash_entry *entry = + _mesa_hash_table_search(data_hash, hash_value, +(void*) (intptr_t) id); This isn't an efficient way to use a hashtable. You're always passing the same hash_value (the one computed in the constructor), which means that every single hash entry will be stored in the same chain, and lookups will be really slow. Since you're only using this hashtable to detect duplicate glsl_types, there's a much easier approach: take advantage of the fact that glsl_type is a flyweight class (in other words, code elsewhere in Mesa ensures that no duplicate glsl_type objects exist anywhere in memory, so any two glsl_type pointers can be compared simply using pointer equality). Because of that, you should be able to just use
Re: [Mesa-dev] [PATCH 02/10] glsl: serialize methods for IR instructions
On 29 January 2014 01:24, Tapani Pälli tapani.pa...@intel.com wrote: + + +/** + * Serialization of exec_list, writes length of the list + + * calls serialize_data for each instruction + */ +void +serialize_list(exec_list *list, memory_writer mem) +{ + uint32_t list_len = 0; + foreach_list(node, list) + list_len++; + + mem.write_uint32_t(list_len); + + foreach_list(node, list) + ((ir_instruction *)node)-serialize(mem); Nit pick: we could avoid having to walk the list twice by using mem.overwrite() to write out the length once we've written all the list elements. A similar comment applies to ir_call::serialize_data(). +} + + +static void +serialize_glsl_type(const glsl_type *type, memory_writer mem) +{ + uint32_t data_len = 0; + + mem.write_string(type-name); + + unsigned start_pos = mem.position(); + mem.write_uint32_t(data_len); + + uint32_t type_hash; + + /** +* notify reader if a user defined type exists already +* (has been serialized before) +*/ + uint8_t user_type_exists = 0; + + /* serialize only user defined types */ + switch (type-base_type) { + case GLSL_TYPE_ARRAY: + case GLSL_TYPE_STRUCT: + case GLSL_TYPE_INTERFACE: + break; + default: + goto glsl_type_serilization_epilogue; + } + With the changes I recommended in the last patch, I believe we can replace everything from here... + type_hash = _mesa_hash_data(type, sizeof(glsl_type)); + + /* check if this type has been written before */ + if (mem.data_was_written((void *)type, type_hash)) + user_type_exists = 1; + else + mem.mark_data_written((void *)type, type_hash); ...to here with: user_type_exists = mem.make_unique_id(type, type_hash); Although I'd recommend renaming type_hash to type_id so that there's no confusion about the fact that it's a unique ID and not a hash (hashes aren't guaranteed to be unique). + + mem.write_uint8_t(user_type_exists); + mem.write_uint32_t(type_hash); + + /* no need to write again ... */ + if (user_type_exists) + goto glsl_type_serilization_epilogue; Nit pick: how about instead of the goto, just do: if (!user_type_exists) { /* glsl type data */ mem.write_uint32_t((uint32_t) type-length); ... } data_len = mem.position() - start_pos - sizeof(data_len); ... + + /* glsl type data */ + mem.write_uint32_t((uint32_t)type-length); + mem.write_uint8_t((uint8_t)type-base_type); + mem.write_uint8_t((uint8_t)type-interface_packing); + + if (type-base_type == GLSL_TYPE_ARRAY) + serialize_glsl_type(type-element_type(), mem); + else if (type-base_type == GLSL_TYPE_STRUCT || + type-base_type == GLSL_TYPE_INTERFACE) { + glsl_struct_field *field = type-fields.structure; + for (unsigned k = 0; k type-length; k++, field++) { + mem.write_string(field-name); + serialize_glsl_type(field-type, mem); + mem.write_uint8_t((uint8_t)field-row_major); + mem.write_int32_t(field-location); + mem.write_uint8_t((uint8_t)field-interpolation); + mem.write_uint8_t((uint8_t)field-centroid); + } + } + +glsl_type_serilization_epilogue: + + data_len = mem.position() - start_pos - sizeof(data_len); + mem.overwrite(data_len, sizeof(data_len), start_pos); +} + + +void +ir_variable::serialize_data(memory_writer mem) +{ + /* name can be NULL, see ir_print_visitor for explanation */ + const char *non_null_name = name ? name : parameter; Since mem.write_string handles NULL strings, can we get rid of this? + int64_t unique_id = (int64_t) (intptr_t) this; + uint8_t mode = data.mode; + uint8_t has_constant_value = constant_value ? 1 : 0; + uint8_t has_constant_initializer = constant_initializer ? 1 : 0; + + serialize_glsl_type(type, mem); + + mem.write_string(non_null_name); + mem.write_int64_t(unique_id); + mem.write_uint8_t(mode); + + mem.write(data, sizeof(data)); + + mem.write_uint32_t(num_state_slots); + mem.write_uint8_t(has_constant_value); + mem.write_uint8_t(has_constant_initializer); + + for (unsigned i = 0; i num_state_slots; i++) { + mem.write_int32_t(state_slots[i].swizzle); + for (unsigned j = 0; j 5; j++) { + mem.write_int32_t(state_slots[i].tokens[j]); + } + } + + if (constant_value) + constant_value-serialize(mem); + + if (constant_initializer) + constant_initializer-serialize(mem); + + uint8_t has_interface_type = get_interface_type() ? 1 : 0; + + mem.write_uint8_t(has_interface_type); + if (has_interface_type) + serialize_glsl_type(get_interface_type(), mem); +} Everything's a nit pick except for the suggestion to use make_unique_id(). With that fixed, this patch is: Reviewed-by: Paul Berry stereotype...@gmail.com ___ mesa-dev mailing list mesa-dev
Re: [Mesa-dev] [PATCH 22/30] mesa/cs: Implement MAX_COMPUTE_WORK_GROUP_INVOCATIONS constant.
On 1 February 2014 21:25, Jordan Justen jljus...@gmail.com wrote: On Thu, Jan 9, 2014 at 6:19 PM, Paul Berry stereotype...@gmail.com wrote: --- src/glsl/main.cpp | 1 + src/glsl/standalone_scaffolding.cpp | 1 + src/mesa/main/context.c | 1 + src/mesa/main/get.c | 1 + src/mesa/main/get_hash_params.py| 3 +++ src/mesa/main/mtypes.h | 1 + 6 files changed, 8 insertions(+) diff --git a/src/glsl/main.cpp b/src/glsl/main.cpp index bb2054f..94bc1cc 100644 --- a/src/glsl/main.cpp +++ b/src/glsl/main.cpp @@ -53,6 +53,7 @@ initialize_context(struct gl_context *ctx, gl_api api) ctx-Const.MaxComputeWorkGroupSize[0] = 1024; ctx-Const.MaxComputeWorkGroupSize[1] = 1024; ctx-Const.MaxComputeWorkGroupSize[2] = 64; + ctx-Const.MaxComputeWorkGroupInvocations = 1024; ctx-Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits = 16; ctx-Const.Program[MESA_SHADER_COMPUTE].MaxUniformComponents = 1024; ctx-Const.Program[MESA_SHADER_COMPUTE].MaxInputComponents = 0; /* not used */ diff --git a/src/glsl/standalone_scaffolding.cpp b/src/glsl/standalone_scaffolding.cpp index e8eb529..0c83ea3 100644 --- a/src/glsl/standalone_scaffolding.cpp +++ b/src/glsl/standalone_scaffolding.cpp @@ -143,6 +143,7 @@ void initialize_context_to_defaults(struct gl_context *ctx, gl_api api) ctx-Const.MaxComputeWorkGroupSize[0] = 1024; ctx-Const.MaxComputeWorkGroupSize[1] = 1024; ctx-Const.MaxComputeWorkGroupSize[2] = 64; + ctx-Const.MaxComputeWorkGroupInvocations = 1024; ctx-Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits = 16; ctx-Const.Program[MESA_SHADER_COMPUTE].MaxUniformComponents = 1024; ctx-Const.Program[MESA_SHADER_COMPUTE].MaxInputComponents = 0; /* not used */ diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c index ebe27b4..942f247 100644 --- a/src/mesa/main/context.c +++ b/src/mesa/main/context.c @@ -695,6 +695,7 @@ _mesa_init_constants(struct gl_context *ctx) ctx-Const.MaxComputeWorkGroupSize[0] = 1024; ctx-Const.MaxComputeWorkGroupSize[1] = 1024; ctx-Const.MaxComputeWorkGroupSize[2] = 64; + ctx-Const.MaxComputeWorkGroupInvocations = 1024; } diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c index 6b914f4..bcbb5d5 100644 --- a/src/mesa/main/get.c +++ b/src/mesa/main/get.c @@ -382,6 +382,7 @@ EXTRA_EXT(ARB_texture_multisample); EXTRA_EXT(ARB_texture_gather); EXTRA_EXT(ARB_shader_atomic_counters); EXTRA_EXT(ARB_draw_indirect); +EXTRA_EXT(ARB_compute_shader); static const int extra_ARB_color_buffer_float_or_glcore[] = { diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py index 7f025a9..c18e848 100644 --- a/src/mesa/main/get_hash_params.py +++ b/src/mesa/main/get_hash_params.py @@ -742,6 +742,9 @@ descriptor=[ # GL_ARB_vertex_attrib_binding [ MAX_VERTEX_ATTRIB_RELATIVE_OFFSET, CONTEXT_ENUM(Const.MaxVertexAttribRelativeOffset), NO_EXTRA ], [ MAX_VERTEX_ATTRIB_BINDINGS, CONTEXT_ENUM(Const.MaxVertexAttribBindings), NO_EXTRA ], + +# GL_ARB_compute_shader + [ MAX_COMPUTE_WORK_GROUP_INVOCATIONS, CONTEXT_ENUM(Const.MaxComputeWorkGroupInvocations), extra_ARB_compute_shader ], CONTEXT_ENUM and CONTEXT_INT seem to follow the same code paths, but would CONTEXT_INT be a better fit here? You're right, CONTEXT_INT makes more sense here. I've changed it. Also, how do we decide between adding to gl_constants and just using CONST(1024)? My thought would be to use CONST unless we are pretty sure drivers will want to vary the value. My reasoning for using gl_constants is that the value of the constant is determined by a hardware limitation rather than a core Mesa software limitation, hence it's likely to vary from one back end to the next. I had a quick look through our uses of CONST and it looks like most of the time we use CONST it's when we're talking about a core Mesa software limitation (e.g. MAX_MODELVIEW_STACK_DEPTH is CONST(MAX_MODELVIEW_STACK_DEPTH) because that's the size of the per-context data structure we use to keep track of the modelview matrix stack. I admit that we haven't historically been terribly consistent about this decision, though :) 21-23 Reviewed-by: Jordan Justen jordan.l.jus...@intel.com ]}, # Enums restricted to OpenGL Core profile diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h index d3f3a30..6481dc1 100644 --- a/src/mesa/main/mtypes.h +++ b/src/mesa/main/mtypes.h @@ -3323,6 +3323,7 @@ struct gl_constants /** GL_ARB_compute_shader */ GLuint MaxComputeWorkGroupSize[3]; + GLuint MaxComputeWorkGroupInvocations; }; -- 1.8.5.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 26/30] main/cs: Implement query for COMPUTE_WORK_GROUP_SIZE.
On 1 February 2014 22:28, Jordan Justen jljus...@gmail.com wrote: On Thu, Jan 9, 2014 at 6:19 PM, Paul Berry stereotype...@gmail.com wrote: --- src/mesa/main/shaderapi.c | 18 ++ 1 file changed, 18 insertions(+) diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c index 053f27b..680d449 100644 --- a/src/mesa/main/shaderapi.c +++ b/src/mesa/main/shaderapi.c @@ -663,6 +663,24 @@ get_programiv(struct gl_context *ctx, GLuint program, GLenum pname, GLint *param *params = shProg-NumAtomicBuffers; return; + case GL_COMPUTE_WORK_GROUP_SIZE: { + int i; + if (!_mesa_is_desktop_gl(ctx) || !ctx-Extensions.ARB_compute_shader) + break; + if (!shProg-LinkStatus) { + _mesa_error(ctx, GL_INVALID_OPERATION, glGetProgramiv(program not + linked successfully)); Nit, but dropping successfully sounds better to me. It seems more likely that they tried to get the value before linking than that they ignored a link error. Sure, I can go along with that. I've changed it to just glGetProgramiv(program not linked). 24-26 Reviewed-by: Jordan Justen jordan.l.jus...@intel.com + return; + } + if (shProg-_LinkedShaders[MESA_SHADER_COMPUTE] == NULL) { + _mesa_error(ctx, GL_INVALID_OPERATION, glGetProgramiv(no compute + shaders)); + return; + } + for (i = 0; i 3; i++) + params[i] = shProg-Comp.LocalSize[i]; + return; + } default: break; } -- 1.8.5.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 29/30] i965/cs: Create the brw_compute_program struct, and the code to initialize it.
On 1 February 2014 22:37, Jordan Justen jljus...@gmail.com wrote: On Thu, Jan 9, 2014 at 6:19 PM, Paul Berry stereotype...@gmail.com wrote: --- src/mesa/drivers/dri/i965/brw_context.h | 8 src/mesa/drivers/dri/i965/brw_program.c | 11 +++ 2 files changed, 19 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h index df32ccb..abc1783 100644 --- a/src/mesa/drivers/dri/i965/brw_context.h +++ b/src/mesa/drivers/dri/i965/brw_context.h @@ -316,6 +316,14 @@ struct brw_fragment_program { GLuint id; /** serial no. to identify frag progs, never re-used */ }; + +/** Subclass of Mesa compute program */ +struct brw_compute_program { + struct gl_compute_program program; + unsigned id; /** serial no. to identify frag progs, never re-used */ frag in comment Whoops, thanks. I've changed it to serial no. to identify compute progs Reviewed-by: Jordan Justen jordan.l.jus...@intel.com +}; + + struct brw_shader { struct gl_shader base; diff --git a/src/mesa/drivers/dri/i965/brw_program.c b/src/mesa/drivers/dri/i965/brw_program.c index 90844e5..2d92acb 100644 --- a/src/mesa/drivers/dri/i965/brw_program.c +++ b/src/mesa/drivers/dri/i965/brw_program.c @@ -113,6 +113,17 @@ static struct gl_program *brwNewProgram( struct gl_context *ctx, } } + case GL_COMPUTE_PROGRAM_NV: { + struct brw_compute_program *prog = CALLOC_STRUCT(brw_compute_program); + if (prog) { + prog-id = get_new_program_id(brw-intelScreen); + + return _mesa_init_compute_program(ctx, prog-program, target, id); + } else { + return NULL; + } + } + default: assert(!Unsupported target in brwNewProgram()); return NULL; -- 1.8.5.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH RFC 10/11] glsl: add a pass to convert out of SSA form
-else_instructions.push_tail(assign(var, phi-else_src)); + + phi-remove(); +} + +static void +eliminate_phi_loop_begin(ir_phi_loop_begin *phi, ir_loop *ir, exec_list *instrs) +{ + ir_variable *var = insert_decl(instrs, phi-dest-type, ralloc_parent(ir)); + ir-body_instructions.push_head(phi-dest); + phi-dest-insert_after(assign(phi-dest, var)); + phi-dest-data.mode = ir_var_temporary; + + if (phi-enter_src != NULL) + ir-insert_before(assign(var, phi-enter_src)); + + if (phi-repeat_src != NULL) + ir-body_instructions.push_tail(assign(var, phi-repeat_src)); + + foreach_list(n, phi-continue_srcs) { + ir_phi_jump_src *src = (ir_phi_jump_src *) n; + + if (src-src != NULL) +src-jump-insert_before(assign(var, src-src)); + } + + phi-remove(); +} + +static void +eliminate_phi_loop_end(ir_phi_loop_end *phi, ir_loop *ir, exec_list *instrs) +{ + ir_variable *var = insert_decl(instrs, phi-dest-type, ralloc_parent(ir)); + ir-insert_after(phi-dest); + phi-dest-insert_after(assign(phi-dest, var)); + phi-dest-data.mode = ir_var_temporary; + + foreach_list(n, phi-break_srcs) { + ir_phi_jump_src *src = (ir_phi_jump_src *) n; + + if (src-src != NULL) +src-jump-insert_before(assign(var, src-src)); + } + + phi-remove(); +} + +namespace { + +class ir_from_ssa_visitor : public ir_hierarchical_visitor +{ +public: + ir_from_ssa_visitor(exec_list *base_instrs) : base_instrs(base_instrs) + { + } + + virtual ir_visitor_status visit_leave(ir_if *); + virtual ir_visitor_status visit_enter(ir_loop *); + virtual ir_visitor_status visit_leave(ir_loop *); + virtual ir_visitor_status visit(ir_dereference_variable *); + +private: + exec_list *base_instrs; +}; + +}; + +ir_visitor_status +ir_from_ssa_visitor::visit_leave(ir_if *ir) +{ + foreach_list_safe(n, ir-phi_nodes) { + eliminate_phi_if((ir_phi_if *) n, ir, this-base_instrs); + } + + return visit_continue; +} + +ir_visitor_status +ir_from_ssa_visitor::visit_enter(ir_loop *ir) +{ + foreach_list_safe(n, ir-begin_phi_nodes) { + eliminate_phi_loop_begin((ir_phi_loop_begin *) n, ir, this-base_instrs); + } + + return visit_continue; +} + +ir_visitor_status +ir_from_ssa_visitor::visit_leave(ir_loop *ir) +{ + foreach_list_safe(n, ir-end_phi_nodes) { + eliminate_phi_loop_end((ir_phi_loop_end *) n, ir, this-base_instrs); + } + + return visit_continue; +} + +ir_visitor_status +ir_from_ssa_visitor::visit(ir_dereference_variable *ir) +{ + if (this-in_assignee ir-var-data.mode == ir_var_temporary_ssa) { + this-base_ir-insert_before(ir-var); + ir-var-data.mode = ir_var_temporary; + } + + return visit_continue; +} + +static void +convert_from_ssa_function(exec_list *instructions) +{ + ir_from_ssa_visitor v(instructions); + v.run(instructions); +} + +void +convert_from_ssa(exec_list *instructions) +{ + foreach_list(node, instructions) { + ir_instruction *ir = (ir_instruction *) node; + ir_function *f = ir-as_function(); + if (f) { +foreach_list(sig_node, f-signatures) { + ir_function_signature *sig = (ir_function_signature *) sig_node; + + convert_from_ssa_function(sig-body); +} + } + } +} It looks like the only reason you need to manually walk through the functions and signatures is so that you can make sure that ir_from_ssa_visitor::base_instrs always points to the correct function's instruction list. Normally the way we achieve that sort of thing is to do: ir_visitor_status ir_from_ssa_visitor::visit_enter(ir_function_signature *ir) { this-base_instrs = ir-body; return visit_continue; } ir_visitor_status ir_from_ssa_visitor::visit_leave(ir_function_signature *ir) { this-base_instrs = NULL; return visit_continue; } That way you could get rid of convert_from_ssa_function() and convert_from_ssa() would just become: void convert_from_ssa(exec_list *instructions) { ir_from_ssa_visitor v; v.run(instructions); } I don't have really strong feelings about it, though. Provided that the confusing comment in eliminate_phi_if() is fixed, this patch is: Reviewed-by: Paul Berry stereotype...@gmail.com ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH RFC 00/11] glsl: add Single Static Assignment (SSA)
On 22 January 2014 09:16, Connor Abbott cwabbo...@gmail.com wrote: This series enables GLSL IR support for SSA, including passes to convert to and from SSA form. SSA is a form of the intermediate representation of a compiler in which each variable is assigned exactly once. SSA form makes many optimizations faster and easier to write, and enables other more powerful optimizations. SSA is used in GCC [1] and LLVM [2] as well as various compiler backends within Mesa itself, such as r600g-sb and Nouveau. Adding support for SSA will allow the various optimizations these backends perform to be implemented in one place, instead of making each driver reinvent the wheel (as several have already done). Additionally, all new backends would recieve these optimizations, reducing the burden of writing a compiler backend for a new driver. Even though no optimization passes are now implemented, I am putting out this series to solicit feedback on the design, to make sure I don't have to rewrite things before I go ahead and write these new passes. There are no piglit regressions on Softpipe, except for the spec/OpenGL 2.0/max-samplers test, which only passed before because the compiler happened to unroll the loop; the extra copies caused by the conversion to and from SSA stop the compiler from unrolling, meaning that the resulting GLSL IR code contains an indirect sampler index which glsl-to-tgsi can't handle. I had a detailed look at your patches and I really like where you're going with this. I sent a lot of feedback in response to patches 4, 5, 7, 9, and 10; I don't think any of my feedback would require a major change to your overall plan. Nice work! Consider patches 1, 2, 3, 6, 8, and 11: Reviewed-by: Paul Berry stereotype...@gmail.com Note, however, that probably patch 11 should be postponed until we've written something that makes use of the SSA form (i.e. the GVN-GCM algorithm you mentioned, or a direct conversion from GLSL IR to an SSA backend). ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] glsl: Fix continue statements in do-while loops.
From the GLSL 4.40 spec, section 6.4 (Jumps): The continue jump is used only in loops. It skips the remainder of the body of the inner most loop of which it is inside. For while and do-while loops, this jump is to the next evaluation of the loop condition-expression from which the loop continues as previously defined. Previously, we incorrectly treated a continue statement as jumping to the top of a do-while loop. This patch fixes the problem by replicating the loop condition when converting the continue statement to IR. (We already do a similar thing in for loops, to ensure that continue causes the loop expression to be executed). Fixes piglit tests: - glsl-fs-continue-inside-do-while.shader_test - glsl-vs-continue-inside-do-while.shader_test - glsl-fs-continue-in-switch-in-do-while.shader_test - glsl-vs-continue-in-switch-in-do-while.shader_test Cc: mesa-sta...@lists.freedesktop.org --- src/glsl/ast_to_hir.cpp | 23 ++- 1 file changed, 14 insertions(+), 9 deletions(-) diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp index 950f513..8d096ad 100644 --- a/src/glsl/ast_to_hir.cpp +++ b/src/glsl/ast_to_hir.cpp @@ -4029,17 +4029,22 @@ ast_jump_statement::hir(exec_list *instructions, _mesa_glsl_error( loc, state, break may only appear in a loop or a switch); } else { -/* For a loop, inline the for loop expression again, - * since we don't know where near the end of - * the loop body the normal copy of it - * is going to be placed. +/* For a loop, inline the for loop expression again, since we don't + * know where near the end of the loop body the normal copy of it is + * going to be placed. Same goes for the condition for a do-while + * loop. */ if (state-loop_nesting_ast != NULL -mode == ast_continue -state-loop_nesting_ast-rest_expression) { - state-loop_nesting_ast-rest_expression-hir(instructions, - state); -} +mode == ast_continue) { +if (state-loop_nesting_ast-rest_expression) { + state-loop_nesting_ast-rest_expression-hir(instructions, + state); +} +if (state-loop_nesting_ast-mode == +ast_iteration_statement::ast_do_while) { + state-loop_nesting_ast-condition_to_hir(instructions, state); +} + } if (state-switch_state.is_switch_innermost mode == ast_break) { -- 1.8.5.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/2] glsl: Make condition_to_hir() callable from outside ast_iteration_statement.
In addition to making it public, we also need to change its first argument from an ir_loop * to an exec_list *, so that it can be used to insert the condition anywhere in the IR (rather than just in the body of the loop). This will be necessary in order to make continue statements work properly in do-while loops. Cc: mesa-sta...@lists.freedesktop.org --- src/glsl/ast.h | 3 +-- src/glsl/ast_to_hir.cpp | 10 +- 2 files changed, 6 insertions(+), 7 deletions(-) diff --git a/src/glsl/ast.h b/src/glsl/ast.h index 0bda28d..2d6f3a2 100644 --- a/src/glsl/ast.h +++ b/src/glsl/ast.h @@ -888,14 +888,13 @@ public: ast_node *body; -private: /** * Generate IR from the condition of a loop * * This is factored out of ::hir because some loops have the condition * test at the top (for and while), and others have it at the end (do-while). */ - void condition_to_hir(class ir_loop *, struct _mesa_glsl_parse_state *); + void condition_to_hir(exec_list *, struct _mesa_glsl_parse_state *); }; diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp index 1bfb4e5..950f513 100644 --- a/src/glsl/ast_to_hir.cpp +++ b/src/glsl/ast_to_hir.cpp @@ -4369,14 +4369,14 @@ ast_case_label::hir(exec_list *instructions, } void -ast_iteration_statement::condition_to_hir(ir_loop *stmt, +ast_iteration_statement::condition_to_hir(exec_list *instructions, struct _mesa_glsl_parse_state *state) { void *ctx = state; if (condition != NULL) { ir_rvalue *const cond = -condition-hir( stmt-body_instructions, state); +condition-hir(instructions, state); if ((cond == NULL) || !cond-type-is_boolean() || !cond-type-is_scalar()) { @@ -4397,7 +4397,7 @@ ast_iteration_statement::condition_to_hir(ir_loop *stmt, new(ctx) ir_loop_jump(ir_loop_jump::jump_break); if_stmt-then_instructions.push_tail(break_stmt); -stmt-body_instructions.push_tail(if_stmt); +instructions-push_tail(if_stmt); } } } @@ -4432,7 +4432,7 @@ ast_iteration_statement::hir(exec_list *instructions, state-switch_state.is_switch_innermost = false; if (mode != ast_do_while) - condition_to_hir(stmt, state); + condition_to_hir(stmt-body_instructions, state); if (body != NULL) body-hir( stmt-body_instructions, state); @@ -4441,7 +4441,7 @@ ast_iteration_statement::hir(exec_list *instructions, rest_expression-hir( stmt-body_instructions, state); if (mode == ast_do_while) - condition_to_hir(stmt, state); + condition_to_hir(stmt-body_instructions, state); if (mode != ast_do_while) state-symbols-pop_scope(); -- 1.8.5.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 1/7] glsl: parse invocations layout qualifier for ARB_gpu_shader5
On 28 January 2014 11:22, Jordan Justen jordan.l.jus...@intel.com wrote: _mesa_glsl_parse_state::gs_invocations will store the invocation count. Signed-off-by: Jordan Justen jordan.l.jus...@intel.com This looks like it will work if the shader contains a line like this: layout(triangles, invocations=6) in; But it won't handle this: layout(triangles) in; layout(invocations=6) in; Although it's not explicitly mentioned in the GLSL spec, I believe that this should be allowed too. (The spec does explicitly allow similar splitting of geometry shader output layout qualifiers.) Note that the way we handle geometry shader input primitive types is unusual, so it might not have been the best to imitate. For other types of shader-global layout qualifiers (i.e. geometry shader output primitive count and geometry shader max_vertices), the code in glsl_parser.yy simply merges the qualifiers into the global state-out_qualifier object at the time that they are parsed. The reason we handle geometry shader input primitive types specially is because they have additional semantics (they cause geometry shader input arrays to be resized), so we can't process them until ast-to-hir time. But that doesn't apply to the invocation count. I'd recommend an in_qualifier field to _mesa_glsl_parse_state, and handling invocations the same way that geometry shader output qualifiers are handled (grep for -out_qualifier to how this works). Unfortuneatly, because of the extra semantics associated with input primitive types, we'll still have to process them specially. So I think the code in glsl_parser.yy will need to look something like this: | layout_qualifier IN_TOK ';' { void *ctx = state; $$ = NULL; if (state-stage != MESA_SHADER_GEOMETRY) { _mesa_glsl_error( @1, state, input layout qualifiers only valid in geometry shaders); } else { if ($1.flags.q.prim_type) { /* Make sure this is a valid input primitive type. */ switch ($1.prim_type) { case GL_POINTS: case GL_LINES: case GL_LINES_ADJACENCY: case GL_TRIANGLES: case GL_TRIANGLES_ADJACENCY: $$ = new(ctx) ast_gs_input_layout(@1, $1); break; default: _mesa_glsl_error(@1, state, invalid geometry shader input primitive type); break; } } if (!state-in_qualifier-merge_qualifier( @1, state, $1)) YYERROR; } } ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 2/7] glsl/linker: produce gl_shader_program Geom.Invocations
On 28 January 2014 11:22, Jordan Justen jordan.l.jus...@intel.com wrote: Grab the parsed invocation count, check for consistency during linking, and finally save the result in gl_shader_program Geom.Invocations. Signed-off-by: Jordan Justen jordan.l.jus...@intel.com --- src/glsl/glsl_parser_extras.cpp | 2 ++ src/glsl/linker.cpp | 18 ++ src/mesa/main/mtypes.h | 2 ++ 3 files changed, 22 insertions(+) diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp index 87784ed..3e98966 100644 --- a/src/glsl/glsl_parser_extras.cpp +++ b/src/glsl/glsl_parser_extras.cpp @@ -1339,6 +1339,8 @@ set_shader_inout_layout(struct gl_shader *shader, if (state-out_qualifier-flags.q.max_vertices) shader-Geom.VerticesOut = state-out_qualifier-max_vertices; + shader-Geom.Invocations = state-gs_invocations; + Considering my comment on patch 1, I think this will change to: shader-Geom.Invocations = 0; if (state-in_qualifier-flags.q.invocations) shader-Geom.Invocations = state-in_qualifier-invocations; if (state-gs_input_prim_type_specified) { shader-Geom.InputType = state-gs_input_prim_type; } else { diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp index 93b4754..800de0b 100644 --- a/src/glsl/linker.cpp +++ b/src/glsl/linker.cpp @@ -1206,6 +1206,7 @@ link_gs_inout_layout_qualifiers(struct gl_shader_program *prog, unsigned num_shaders) { linked_shader-Geom.VerticesOut = 0; + linked_shader-Geom.Invocations = 0; linked_shader-Geom.InputType = PRIM_UNKNOWN; linked_shader-Geom.OutputType = PRIM_UNKNOWN; @@ -1259,6 +1260,18 @@ link_gs_inout_layout_qualifiers(struct gl_shader_program *prog, } linked_shader-Geom.VerticesOut = shader-Geom.VerticesOut; } + + if (shader-Geom.Invocations != 0) { +if (linked_shader-Geom.Invocations != 0 +linked_shader-Geom.Invocations != shader-Geom.Invocations) { + linker_error(prog, geometry shader defined with conflicting +invocation count (%d and %d)\n, +linked_shader-Geom.Invocations, +shader-Geom.Invocations); + return; +} +linked_shader-Geom.Invocations = shader-Geom.Invocations; + } } /* Just do the intrastage - interstage propagation right now, @@ -1285,6 +1298,11 @@ link_gs_inout_layout_qualifiers(struct gl_shader_program *prog, return; } prog-Geom.VerticesOut = linked_shader-Geom.VerticesOut; + + if (linked_shader-Geom.Invocations == 0) + linked_shader-Geom.Invocations = 1; I think it would be equivalent if we dropped these two lines and simply initialized linked_shader-Geom.Invocations to 1 at the top of the function--is that right? + + prog-Geom.Invocations = linked_shader-Geom.Invocations; } /** diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h index 3d42a21..2d90b0a 100644 --- a/src/mesa/main/mtypes.h +++ b/src/mesa/main/mtypes.h @@ -2401,6 +2401,7 @@ struct gl_shader */ struct { GLint VerticesOut; + GLint Invocations; It would be nice to have a comment above this declaration explaining that if the shader didn't specify an invocation count, this value will be 1 (so that back-end implementors know they don't have to coerce 0 to 1). /** * GL_POINTS, GL_LINES, GL_LINES_ADJACENCY, GL_TRIANGLES, or * GL_TRIANGLES_ADJACENCY, or PRIM_UNKNOWN if it's not set in this @@ -2599,6 +2600,7 @@ struct gl_shader_program struct { GLint VerticesIn; GLint VerticesOut; + GLint Invocations; Same comment applies here. GLenum InputType; /** GL_POINTS, GL_LINES, GL_LINES_ADJACENCY_ARB, GL_TRIANGLES, or GL_TRIANGLES_ADJACENCY_ARB */ GLenum OutputType; /** GL_POINTS, GL_LINE_STRIP or GL_TRIANGLE_STRIP */ -- 1.8.5.3 With those minor issues addressed, this patch is: Reviewed-by: Paul Berry stereotype...@gmail.com ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 4/7] main/shaderapi: GL_GEOMETRY_SHADER_INVOCATIONS GetProgramiv support
On 28 January 2014 11:22, Jordan Justen jordan.l.jus...@intel.com wrote: Signed-off-by: Jordan Justen jordan.l.jus...@intel.com --- src/mesa/main/shaderapi.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c index a8336c9..fb107d5 100644 --- a/src/mesa/main/shaderapi.c +++ b/src/mesa/main/shaderapi.c @@ -603,6 +603,12 @@ get_programiv(struct gl_context *ctx, GLuint program, GLenum pname, GLint *param if (check_gs_query(ctx, shProg)) *params = shProg-Geom.VerticesOut; return; + case GL_GEOMETRY_SHADER_INVOCATIONS: + if (!has_core_gs) This needs to be: if (!has_core_gs || !ctx-Extensions.ARB_gpu_shader5) so that the query will generate the proper error message on platforms that don't support ARB_gpu_shader5. + break; + if (check_gs_query(ctx, shProg)) + *params = shProg-Geom.Invocations; + return; case GL_GEOMETRY_INPUT_TYPE: if (!has_core_gs) break; -- 1.8.5.3 With that fixed, the patch is: Reviewed-by: Paul Berry stereotype...@gmail.com ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 6/7] i965: support gl_InvocationID for gen7
On 28 January 2014 11:22, Jordan Justen jordan.l.jus...@intel.com wrote: Signed-off-by: Jordan Justen jordan.l.jus...@intel.com --- src/mesa/drivers/dri/i965/brw_defines.h | 5 + src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 24 --- 2 files changed, 26 insertions(+), 3 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h index 7f4cd10..5fe1aba 100644 --- a/src/mesa/drivers/dri/i965/brw_defines.h +++ b/src/mesa/drivers/dri/i965/brw_defines.h @@ -1500,6 +1500,11 @@ enum brw_message_target { # define BRW_GS_EDGE_INDICATOR_0 (1 8) # define BRW_GS_EDGE_INDICATOR_1 (1 9) +/* GS Thread Payload + */ +/* R0 */ +# define GEN7_GS_PAYLOAD_INSTANCE_ID_SHIFT 27 + /* 3DSTATE_GS Output Vertex Size has an effective maximum of 62. It's * counted in multiples of 16 bytes. */ diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp index 40743cc..12e137c 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp @@ -51,9 +51,27 @@ vec4_gs_visitor::vec4_gs_visitor(struct brw_context *brw, dst_reg * vec4_gs_visitor::make_reg_for_system_value(ir_variable *ir) { - /* Geometry shaders don't use any system values. */ - assert(!Unreached); - return NULL; + dst_reg *reg; + src_reg r0(retype(brw_vec4_grf(0, 0), BRW_REGISTER_TYPE_UD)); + + switch (ir-data.location) { + case SYSTEM_VALUE_INVOCATION_ID: + this-current_annotation = initialize gl_InvocationID; + + reg = new(mem_ctx) dst_reg(this, ir-type); + + /* Copy and shift gen7 instance id from R0 into the + * gl_InvocationID register. + */ + emit(SHR(*reg, src_reg(r0), + (uint32_t) GEN7_GS_PAYLOAD_INSTANCE_ID_SHIFT)); + break; + default: + assert(!not reached); + break; + } + + return reg; } This would do what you want if invocation ID 0 was delivered in the upper bits of R0.0 and invocation ID 1 was delivered in the upper bits of R0.4. But that's not the case. They are delivered in R0.0 and R0.1 respectively. I think you need to generate this instruction: SHR (8) dst1 R01;4,0 GEN7_GS_PAYLOAD_INSTANCE_ID_SHIFT Note that this instruction requires ALIGN1 mode and a goofy register region, so you can't generate it directly from the visitor; what you'll have to do instead is create a new opcode to do it. If you want to see examples where we've done similar things, grep for GS_OPCODE_SET_WRITE_OFFSET, GS_OPCODE_SET_VERTEX_COUNT, GS_OPCODE_SET_DWORD_2_IMMED, GS_OPCODE_PREPARE_CHANNEL_MASKS, or GS_OPCODE_SET_CHANNEL_MASKS. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 7/7] i965: support instanced GS on gen7
On 28 January 2014 11:22, Jordan Justen jordan.l.jus...@intel.com wrote: Signed-off-by: Jordan Justen jordan.l.jus...@intel.com --- src/mesa/drivers/dri/i965/brw_context.h | 2 ++ src/mesa/drivers/dri/i965/brw_defines.h | 1 + src/mesa/drivers/dri/i965/brw_vec4_gs.c | 2 ++ src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 6 -- src/mesa/drivers/dri/i965/gen7_gs_state.c | 2 ++ 5 files changed, 11 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h index 8d098e6..ade4458 100644 --- a/src/mesa/drivers/dri/i965/brw_context.h +++ b/src/mesa/drivers/dri/i965/brw_context.h @@ -639,6 +639,8 @@ struct brw_gs_prog_data bool include_primitive_id; + int invocations; + /** * True if the thread should be dispatched in DUAL_INSTANCE mode, false if * it should be dispatched in DUAL_OBJECT mode. diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h index 5fe1aba..eaf6e8f 100644 --- a/src/mesa/drivers/dri/i965/brw_defines.h +++ b/src/mesa/drivers/dri/i965/brw_defines.h @@ -1478,6 +1478,7 @@ enum brw_message_target { # define GEN7_GS_CONTROL_DATA_FORMAT_GSCTL_CUT 0 # define GEN7_GS_CONTROL_DATA_FORMAT_GSCTL_SID 1 # define GEN7_GS_CONTROL_DATA_HEADER_SIZE_SHIFT20 +# define GEN7_GS_INSTANCE_CONTROL_SHIFT15 # define GEN7_GS_DISPATCH_MODE_SINGLE (0 11) # define GEN7_GS_DISPATCH_MODE_DUAL_INSTANCE (1 11) # define GEN7_GS_DISPATCH_MODE_DUAL_OBJECT (2 11) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs.c b/src/mesa/drivers/dri/i965/brw_vec4_gs.c index abc181b..3c6393f 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_gs.c +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs.c @@ -48,6 +48,8 @@ do_gs_prog(struct brw_context *brw, c.prog_data.include_primitive_id = (gp-program.Base.InputsRead VARYING_BIT_PRIMITIVE_ID) != 0; + c.prog_data.invocations = gp-program.Invocations; + /* Allocate the references to the uniforms that will end up in the * prog_data associated with the compiled program, and which will be freed * by the state cache. diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp index 12e137c..21dbc29 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp @@ -596,9 +596,11 @@ brw_gs_emit(struct brw_context *brw, } /* Compile the geometry shader in DUAL_OBJECT dispatch mode, if we can do -* so without spilling. +* so without spilling. If the GS invocations count 1, then we can't use +* dual object mode. */ - if (likely(!(INTEL_DEBUG DEBUG_NO_DUAL_OBJECT_GS))) { + if (c-prog_data.invocations = 1 || + likely(!(INTEL_DEBUG DEBUG_NO_DUAL_OBJECT_GS))) { This should be , not ||. With that fixed, this patch is: Reviewed-by: Paul Berry stereotype...@gmail.com I sent comments on patches 1, 2, 4, 6, and 7. Patches 3 and 5 are: Reviewed-by: Paul Berry stereotype...@gmail.com ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH RFC 07/11] glsl: add SSA infrastructure
On 22 January 2014 09:16, Connor Abbott cwabbo...@gmail.com wrote: This patch introduces all the changes to the IR that are necessary for representing programs in the SSA form. This consists of a new variable mode, the SSA temporary, which is guarenteed to be written to exactly once, and classes to represent phi nodes in the IR. In the current code, variables are first declared using an ir_variable instruction inserted into the instruction stream, and then every dereference will point to the ir_variable declared earlier. SSA temporaries, however, do not work this way. Instead, the variable is declared when it is assigned. That is, the variable is owned by the one and only instruction where it is defined. In SSA, phi nodes may exist at the beginning of any join nodes, or basic blocks with more than one predecessor. In our IR, this can happen in one of three places: - After an if statement, where the then branch and the else branch converge. - At the beginning of a loop, which can be reached from either before the loop (on the first iteration), the end of the loop (when we get to the end of the loop and jump back to the beginning), or any continue statement. - At the end of a loop, which can be reached from any break statement within the loop. Accordingly, there are three different types of phi nodes: if phi nodes, phi nodes at the beginning of a loop, and phi nodes at the end of a loop, all of which derive from the ir_phi base class. --- src/glsl/ir.cpp| 56 +++ src/glsl/ir.h | 196 - src/glsl/ir_clone.cpp | 147 --- src/glsl/ir_hierarchical_visitor.cpp | 36 + src/glsl/ir_hierarchical_visitor.h | 11 ++ src/glsl/ir_hv_accept.cpp | 55 ++- src/glsl/ir_print_visitor.cpp | 196 - src/glsl/ir_print_visitor.h| 15 ++ src/glsl/ir_validate.cpp | 158 +++- src/glsl/ir_visitor.h | 8 + src/mesa/drivers/dri/i965/brw_fs.h | 4 + src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 28 src/mesa/drivers/dri/i965/brw_vec4.h | 4 + src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 24 +++ src/mesa/program/ir_to_mesa.cpp| 28 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 29 16 files changed, 956 insertions(+), 39 deletions(-) diff --git a/src/glsl/ir.cpp b/src/glsl/ir.cpp index 1a36bd6..f1ded80 100644 --- a/src/glsl/ir.cpp +++ b/src/glsl/ir.cpp @@ -1229,6 +1229,37 @@ ir_loop::ir_loop() } +ir_phi::ir_phi() +{ + this-dest = NULL; +} Rather than have a no-argument constructor for ir_phi that sets this-dest to NULL, and then have the derived classes set dest to the appropriate value, let's just do: ir_phi::ir_phi(ir_variable *dest) : dest(dest) { } Then in the derived classes we can just pass dest on through, e.g.: ir_phi_if::ir_phi_if(ir_variable *dest, ir_variable *if_src, ir_variable *else_src) : ir_phi(dest), if_src(if_src), else_src(else_src) { this-ir_type = ir_type_phi_if; } + + +ir_phi_if::ir_phi_if(ir_variable *dest, ir_variable *if_src, +ir_variable *else_src) + : if_src(if_src), else_src(else_src) +{ + this-ir_type = ir_type_phi_if; + this-dest = dest; +} + + +ir_phi_loop_begin::ir_phi_loop_begin(ir_variable* dest, ir_variable* enter_src, +ir_variable* repeat_src) + : enter_src(enter_src), repeat_src(repeat_src) +{ + this-ir_type = ir_type_phi_loop_begin; + this-dest = dest; +} + + +ir_phi_loop_end::ir_phi_loop_end(ir_variable *dest) +{ + this-ir_type = ir_type_phi_loop_end; + this-dest = dest; +} + + ir_dereference_variable::ir_dereference_variable(ir_variable *var) { assert(var != NULL); @@ -1554,6 +1585,9 @@ ir_variable::ir_variable(const struct glsl_type *type, const char *name, this-data.max_array_access = 0; this-data.atomic.buffer_index = 0; this-data.atomic.offset = 0; + this-ssa_assignment = NULL; + this-ssa_phi = NULL; + this-ssa_call = NULL; if (type != NULL) { if (type-base_type == GLSL_TYPE_SAMPLER) @@ -1722,12 +1756,19 @@ steal_memory(ir_instruction *ir, void *new_ctx) { ir_variable *var = ir-as_variable(); ir_constant *constant = ir-as_constant(); + ir_dereference_variable *deref = ir-as_dereference_variable(); + ir_phi *phi = ir-as_phi(); + ir_phi_loop_begin *phi_loop_begin = ir-as_phi_loop_begin(); + ir_phi_loop_end *phi_loop_end = ir-as_phi_loop_end(); if (var != NULL var-constant_value != NULL) steal_memory(var-constant_value, ir); if (var != NULL var-constant_initializer != NULL) steal_memory(var-constant_initializer, ir); + if
Re: [Mesa-dev] [PATCH RFC 07/11] glsl: add SSA infrastructure
On 22 January 2014 09:16, Connor Abbott cwabbo...@gmail.com wrote: diff --git a/src/glsl/ir_clone.cpp b/src/glsl/ir_clone.cpp index cb732a5..7075579 100644 --- a/src/glsl/ir_clone.cpp +++ b/src/glsl/ir_clone.cpp @@ -40,7 +40,15 @@ ir_rvalue::clone(void *mem_ctx, struct hash_table *ht) const ir_variable * ir_variable::clone(void *mem_ctx, struct hash_table *ht) const { - ir_variable *var = new(mem_ctx) ir_variable(this-type, this-name, + ir_variable *var; + + if (ht) { + var = (ir_variable *)hash_table_find(ht, this); + if (var) +return var; + } Can you add a comment explaining why it's necessary to do a hashtable lookup now? (I'm guessing it's because a phi node can refer to an SSA temporary that occurs later in the instruction stream, and therefore when we go to clone an SSA temporary variable, it's possible that we've already cloned it, but I'm not 100% certain about that). + + var = new(mem_ctx) ir_variable(this-type, this-name, (ir_variable_mode) this-data.mode); Let's fix up this line to align the opening paren with the initial t in this-type. + +ir_phi * +ir_phi::clone(void *mem_ctx, struct hash_table *ht) const +{ + /* shouldn't get here */ + assert(0); + return new(mem_ctx) ir_phi(); +} There's no need to implement this function. If we leave it out (and leave out the corresponding declaration of clone() in ir.h's declaration of the ir_phi class), then ir_phi will be an abstract base class, and the compiler will make sure we never instantiate it. +static ir_phi_jump_src * +clone_phi_jump_src(ir_phi_jump_src *src, void *mem_ctx, struct hash_table *ht) +{ + ir_phi_jump_src *new_src = new(mem_ctx) ir_phi_jump_src(); + new_src-src = src-src-clone(mem_ctx, ht); + new_src-jump = src-jump-clone(mem_ctx, ht); + return new_src; Let's make a constructor for ir_phi_jump_src that takes src and jump as arguments, so that we can just do: return new(mem_ctx) ir_phi_jump_src(src-src-clone(mem_ctx, ht), src-jump-clone(mem_ctx, ht)); + + +void +ir_print_visitor::visit(ir_phi *ir) +{ + printf(error); +} I think if we make ir_phi an abstract base class (like I suggested earlier) we can drop this function entirely, as well as all the other visit(class ir_phi *) functions introduced in this patch. + + +void +ir_print_visitor::visit(ir_phi_if *ir) +{ + printf((phi\n); For consistency with the class names, I'd prefer for this to be: printf((phi_if\n); + + indentation++; + indent(); + ir-dest-accept(this); + printf(\n); + + indent(); + if (ir-if_src) { + printf(%s , unique_name(ir-if_src)); + } else { + printf((undefined) ); + } + if (ir-else_src) { + printf(%s), unique_name(ir-else_src)); + } else { + printf((undefined))); + } Would it be worth modifying unique_name() so that it returns (undefined) if it's passed a null pointer? That would allow us to simplify the 10 lines above to: printf(%s %s, unique_name(ir-if_src), unique_name(ir-else_src)); As well as similar simplifications in the functions below. + indentation--; +} + +void +ir_print_visitor::print_phi_jump_src(ir_phi_jump_src *src) +{ + printf((%s , unique_name(src-jump)); I think this is going to make the output IR difficult to follow, since it won't be obvious that (break@1 ...) represents a phi node rather than a break instruction. I'd recommend changing this to: printf((phi_jump_src %s , unique_name(src-jump)); + if (src-src) { + printf( %s)\n, unique_name(src-src)); + } else { + printf( (undefined))\n); + } +} + +void +ir_print_visitor::visit(ir_phi_loop_begin *ir) +{ + printf((phi\n); As with ir_phi_if, I'd recommend changing this to: printf((phi_loop_begin\n); A similar comment applies to ir_print_visitor::visit(ir_phi_loop_end *). + + indentation++; + indent(); + ir-dest-accept(this); + printf(\n); + + indent(); + if (ir-enter_src) { + printf(%s , unique_name(ir-enter_src)); + } else { + printf((undefined) ); + } + if (ir-repeat_src) { + printf(%s\n, unique_name(ir-repeat_src)); + } else { + printf((undefined)\n); + } + + foreach_list(n, ir-continue_srcs) { + ir_phi_jump_src *src = (ir_phi_jump_src *) n; + indent(); + print_phi_jump_src(src); + } + + indentation--; + indent(); + printf()); +} + + + + +ir_visitor_status +ir_validate::visit(ir_phi_loop_begin *ir) +{ + validate_phi_dest(ir, this-ht); + + /* +* note: ir_phi_loop_begin is a special case in that variables may be, and in +* fact normally are, defined *after* they are used in everything except +* enter_src +*/ + + if (ir-enter_src) { + validate_var_use(ir-enter_src, this-ht, (void *) ir, ir_phi_loop_begin); + } +
Re: [Mesa-dev] [PATCH 1/2] mesa: Ensure that transform feedback refers to the correct program.
On 28 January 2014 13:15, Carl Worth cwo...@cworth.org wrote: Paul Berry stereotype...@gmail.com writes: Previous to this patch, the _mesa_{Begin,Resume}TransformFeedback functions were using ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX] I see that this commit was nominated for the stable branch when it was committed. However, it appears to depend at least on commit b22146dc714 which is on I think you mean 3b22146dc714 (mesa: Replace ctx-Shader.Current{Vertex,Fragment,Geometry}Program with an array.). master, but is not on the 10.0 branch. So this patch cannot be applied as-is to 10.0. My current inclination is to simply drop this patch from the list of candidates for 10.0. If you think differently, please recommend an alternate plan. -Carl (Also from IRC: cworth stereotype441: In my recent mail, I should have included 43e77215b13 as well. (It's another patch I plan to drop from stable unless you guide me differently.)) In principle we could make versions of these patches that don't depend on commit 3b22146dc714, but I don't think it's worth it--the bugs I've fixed in these patches are unlikely to arise until ARB_separate_shader_objects is supported. So yeah, I'm ok with dropping these two patches from the list of candidates for 10.0. I believe it was Ken who initially suggested marking these patches as candidates for stable. Cc'ing him in case he wants to express a dissenting opinion. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH RFC 04/11] glsl: add dead branch analysis
conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER + * DEALINGS IN THE SOFTWARE. + */ + +/** + * \file ir_dead_branches.h + * + * Provides a visitor which determines, for each if instruction, whether + * control will never flow the from the then-block or else-block + * to the next instruction due to jump statements (break, continue, return, + * discard). + */ + +#include ir.h +#include ir_visitor.h + +class ir_dead_branches It would be nice to have a comment above this class indicating that one ir_dead_branches object is created for each ir_if. (It's clear from context, but for a casual user of this visitor it might not be obvious). +{ +public: + ir_dead_branches(ir_if *ir); + + ir_if *ir; + + /** +* whether a jump statement is guarenteed to be hit when the +* then_instructions are run, making the branch from the then_instructions +* dead +*/ + bool then_dead; + bool else_dead; /** ditto for the else_instructions */ + + /** whether the then branch is dead due to a return or discard */ This comment is ambiguous. Some people interpret a boolean which is documented as whether A or B as meaning A if the bool is true; B if the bool is false. I'd recommend changing this comment to something like True if the then branch is dead due to a return or discard; false if the then branch is dead due to a continue. + bool then_dead_return; + bool else_dead_return; /** ditto for else branch */ +}; + +class ir_dead_branches_visitor : public ir_hierarchical_visitor It would be nice to have a short comment above this class giving a brief overview of how the class is intended to be used (i.e. construct it, visit the IR using it, then call get_dead_branches() to retrieve the dead branch information corresponding to a given ir_if block). +{ +public: + ir_dead_branches_visitor(); + ~ir_dead_branches_visitor(); + + virtual ir_visitor_status visit_enter(ir_if *); + virtual ir_visitor_status visit_enter(ir_loop *); + virtual ir_visitor_status visit(ir_loop_jump *); + virtual ir_visitor_status visit_enter(ir_return *); + virtual ir_visitor_status visit_enter(ir_discard *); + + ir_dead_branches *get_dead_branches(ir_if *ir); + +private: + void visit_return(); + + ir_if *outer_if; + bool in_loop; It would be nice to have a comment above in_loop explaining that it is only true if we're visiting code inside a loop that is contained within outer_if. (That is, when we visit an if that's nested inside a loop, we reset in_loop to false). Without such a comment, someone might erroneously think that in_loop is true whenever we're visiting code that is contained within a loop (to arbitrarily deep nesting), which isn't true. + bool in_then; Similarly, it would be nice to have a comment above in_loop explaining that it indicates whether we are traversing the then branch or the else branch of the if statement indicated by outer_if. Without this comment, someone might think that in_then is true whenever we're visiting code that is contained within the then branch of any if statement, to arbitrarily deep nesting, which isn't true. + + struct hash_table *ht; It would be nice to have a comment above this field explaining what the keys and the values in the hash table are. +}; -- 1.8.3.1 One final comment: I notice that the only way in which ir_dead_branches_visitor::outer_if is used (other than comparing it to NULL) is to pass it to get_dead_branches(). How about if instead of storing outer_if in the ir_dead_branches_visitor class, we simply store a pointer to the associated ir_dead_branches object. That way we would avoid a lot of unnecessary hashtable lookups. But I don't have a terribly strong feeling about that. With all the comment changes, this patch is: Reviewed-by: Paul Berry stereotype...@gmail.com ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH RFC 05/11] glsl: add loop jump visitor
On 22 January 2014 09:16, Connor Abbott cwabbo...@gmail.com wrote: +class ir_loop_jumps_visitor : public ir_hierarchical_visitor +{ +public: + ir_loop_jumps_visitor(); + ~ir_loop_jumps_visitor(); + + virtual ir_visitor_status visit_enter(ir_loop *); + virtual ir_visitor_status visit(ir_loop_jump *); + + ir_loop_jumps *get_loop_jumps(ir_loop *ir); + +private: + ir_loop *outer_loop; As with outer_if in the previous patch, I think we could save some unnecessary hashtable lookups if we simply stored the ir_loop_jumps * here instead of storing the ir_loop *. But I don't feel terribly strongly about it. Either way, the patch is: Reviewed-by: Paul Berry stereotype...@gmail.com ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glsl: Add image type to the GLSL IR.
On 22 January 2014 04:50, Francisco Jerez curroje...@riseup.net wrote: Paul Berry stereotype...@gmail.com writes: On 15 January 2014 11:42, Francisco Jerez curroje...@riseup.net wrote: v2: Reuse the glsl_sampler_dim enum for images. Reuse the glsl_type::sampler_* fields instead of creating new ones specific to image types. Reuse the same constructor as for samplers adding a new 'base_type' argument. --- Is this what you had in mind Paul? Yes, that seems reasonable. Thanks. Does your previous reviewed-by still apply for v2 of this patch? Thanks. Yes. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/2] mesa: Ensure that transform feedback refers to the correct program.
Previous to this patch, the _mesa_{Begin,Resume}TransformFeedback functions were using ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX] to find the program that would be the source of transform feedback data. This isn't correct--if there's a geometry shader present it should be ctx-Shader.CurrentProgram[MESA_SHADER_GEOMETRY]. (These might be different if separate shader objects are in use). This patch creates a function get_xfb_source(), which figures out the correct program to use based on GL state, and updates _mesa_{Begin,Resume}TransformFeedback to call it. get_xfb_source() is written in terms of the gl_shader_stage enum, so it should not need modification when we add tessellation shaders in the future. It also creates a new driver flag, NewTransformFeedbackProg, which is flagged whenever this program changes. To reduce future confusion, this patch also rewords some comments and error message text to avoid referring to vertex shaders. --- src/mesa/main/mtypes.h| 8 -- src/mesa/main/transformfeedback.c | 52 +-- 2 files changed, 45 insertions(+), 15 deletions(-) diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h index 3dd9678..7fd3298 100644 --- a/src/mesa/main/mtypes.h +++ b/src/mesa/main/mtypes.h @@ -1815,8 +1815,9 @@ struct gl_transform_feedback_object /** * The shader program active when BeginTransformFeedback() was called. -* When active and unpaused, this equals -* ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX]. +* When active and unpaused, this equals ctx-Shader.CurrentProgram[stage], +* where stage is the pipeline stage that is the source of data for +* transform feedback. */ struct gl_shader_program *shader_program; @@ -3779,6 +3780,9 @@ struct gl_driver_flags /** gl_context::TransformFeedback::CurrentObject */ GLbitfield NewTransformFeedback; + /** gl_context::TransformFeedback::CurrentObject::shader_program */ + GLbitfield NewTransformFeedbackProg; + /** gl_context::RasterDiscard */ GLbitfield NewRasterizerDiscard; diff --git a/src/mesa/main/transformfeedback.c b/src/mesa/main/transformfeedback.c index 74897ba..9376a9e 100644 --- a/src/mesa/main/transformfeedback.c +++ b/src/mesa/main/transformfeedback.c @@ -24,7 +24,7 @@ /* - * Vertex transform feedback support. + * Transform feedback support. * * Authors: * Brian Paul @@ -376,25 +376,48 @@ _mesa_compute_max_transform_feedback_vertices( **/ +/** + * Figure out which stage of the pipeline is the source of transform feedback + * data given the current context state, and return its gl_shader_program. + * + * If no active program can generate transform feedback data (i.e. no vertex + * shader is active), returns NULL. + */ +static struct gl_shader_program * +get_xfb_source(struct gl_context *ctx) +{ + int i; + for (i = MESA_SHADER_FRAGMENT - 1; i = MESA_SHADER_VERTEX; i--) { + if (ctx-Shader.CurrentProgram[i] != NULL) + return ctx-Shader.CurrentProgram[i]; + } + return NULL; +} + + void GLAPIENTRY _mesa_BeginTransformFeedback(GLenum mode) { struct gl_transform_feedback_object *obj; - struct gl_transform_feedback_info *info; + struct gl_transform_feedback_info *info = NULL; + struct gl_shader_program *source; GLuint i; unsigned vertices_per_prim; GET_CURRENT_CONTEXT(ctx); obj = ctx-TransformFeedback.CurrentObject; - if (ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX] == NULL) { + /* Figure out what pipeline stage is the source of data for transform +* feedback. +*/ + source = get_xfb_source(ctx); + if (source == NULL) { _mesa_error(ctx, GL_INVALID_OPERATION, glBeginTransformFeedback(no program active)); return; } - info = - ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX]-LinkedTransformFeedback; + info = source-LinkedTransformFeedback; if (info-NumOutputs == 0) { _mesa_error(ctx, GL_INVALID_OPERATION, @@ -452,7 +475,10 @@ _mesa_BeginTransformFeedback(GLenum mode) obj-GlesRemainingPrims = max_vertices / vertices_per_prim; } - obj-shader_program = ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX]; + if (obj-shader_program != source) { + ctx-NewDriverState |= ctx-DriverFlags.NewTransformFeedbackProg; + obj-shader_program = source; + } assert(ctx-Driver.BeginTransformFeedback); ctx-Driver.BeginTransformFeedback(ctx, mode, obj); @@ -519,7 +545,7 @@ bind_buffer_range(struct gl_context *ctx, GLuint index, /** - * Specify a buffer object to receive vertex shader results. Plus, + * Specify a buffer object to receive transform feedback results. Plus, * specify the starting offset to place the results, and max size. * Called from the glBindBufferRange() function. */ @@ -563,7 +589,7 @@ _mesa_bind_buffer_range_transform_feedback(struct gl_context *ctx, /** - * Specify a buffer object to receive vertex shader results. + * Specify a
[Mesa-dev] [PATCH 2/2] i965/gen7: Use to the correct program when uploading transform feedback state.
Transform feedback may come from either the geometry shader or the vertex shader, so we can't use ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX] to find the current post-link transform feedback information. Fortunately we can use ctx-TransformFeedback.CurrentObject-shader_program. --- src/mesa/drivers/dri/i965/brw_state_upload.c | 1 + src/mesa/drivers/dri/i965/gen7_sol_state.c | 15 +-- 2 files changed, 6 insertions(+), 10 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c b/src/mesa/drivers/dri/i965/brw_state_upload.c index af554eb..ca256c2 100644 --- a/src/mesa/drivers/dri/i965/brw_state_upload.c +++ b/src/mesa/drivers/dri/i965/brw_state_upload.c @@ -305,6 +305,7 @@ void brw_init_state( struct brw_context *brw ) STATIC_ASSERT(BRW_NUM_STATE_BITS = 8 * sizeof(brw-state.dirty.brw)); ctx-DriverFlags.NewTransformFeedback = BRW_NEW_TRANSFORM_FEEDBACK; + ctx-DriverFlags.NewTransformFeedbackProg = BRW_NEW_TRANSFORM_FEEDBACK; ctx-DriverFlags.NewRasterizerDiscard = BRW_NEW_RASTERIZER_DISCARD; ctx-DriverFlags.NewUniformBuffer = BRW_NEW_UNIFORM_BUFFER; ctx-DriverFlags.NewAtomicBuffer = BRW_NEW_ATOMIC_BUFFER; diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c b/src/mesa/drivers/dri/i965/gen7_sol_state.c index 1ca2b74..5d2f243 100644 --- a/src/mesa/drivers/dri/i965/gen7_sol_state.c +++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c @@ -39,14 +39,11 @@ static void upload_3dstate_so_buffers(struct brw_context *brw) { struct gl_context *ctx = brw-ctx; - /* BRW_NEW_VERTEX_PROGRAM */ - const struct gl_shader_program *vs_prog = - ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX]; - const struct gl_transform_feedback_info *linked_xfb_info = - vs_prog-LinkedTransformFeedback; /* BRW_NEW_TRANSFORM_FEEDBACK */ struct gl_transform_feedback_object *xfb_obj = ctx-TransformFeedback.CurrentObject; + const struct gl_transform_feedback_info *linked_xfb_info = + xfb_obj-shader_program-LinkedTransformFeedback; int i; /* Set up the up to 4 output buffers. These are the ranges defined in the @@ -102,12 +99,11 @@ gen7_upload_3dstate_so_decl_list(struct brw_context *brw, const struct brw_vue_map *vue_map) { struct gl_context *ctx = brw-ctx; - /* BRW_NEW_VERTEX_PROGRAM */ - const struct gl_shader_program *vs_prog = - ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX]; /* BRW_NEW_TRANSFORM_FEEDBACK */ + struct gl_transform_feedback_object *xfb_obj = + ctx-TransformFeedback.CurrentObject; const struct gl_transform_feedback_info *linked_xfb_info = - vs_prog-LinkedTransformFeedback; + xfb_obj-shader_program-LinkedTransformFeedback; uint16_t so_decl[128]; int buffer_mask = 0; int next_offset[4] = {0, 0, 0, 0}; @@ -260,7 +256,6 @@ const struct brw_tracked_state gen7_sol_state = { .dirty = { .mesa = (_NEW_LIGHT), .brw = (BRW_NEW_BATCH | - BRW_NEW_VERTEX_PROGRAM | BRW_NEW_VUE_MAP_GEOM_OUT | BRW_NEW_TRANSFORM_FEEDBACK) }, -- 1.8.5.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] mesa: Ensure that transform feedback refers to the correct program.
On 22 January 2014 08:20, Kenneth Graunke kenn...@whitecape.org wrote: On 01/22/2014 06:07 AM, Paul Berry wrote: @@ -376,25 +376,48 @@ _mesa_compute_max_transform_feedback_vertices( **/ +/** + * Figure out which stage of the pipeline is the source of transform feedback + * data given the current context state, and return its gl_shader_program. + * + * If no active program can generate transform feedback data (i.e. no vertex + * shader is active), returns NULL. + */ +static struct gl_shader_program * +get_xfb_source(struct gl_context *ctx) +{ + int i; + for (i = MESA_SHADER_FRAGMENT - 1; i = MESA_SHADER_VERTEX; i--) { I think this would be clearer as: for (i = MESA_SHADER_GEOMETRY; i = MESA_SHADER_VERTEX; i--) { ... } Note that the pipeline ordering is: Vertex - Tess. Control - Tess. Eval - Geometry - Transform Feedback (http://www.opengl.org/wiki/Rendering_Pipeline_Overview) So either implementation would work even with tessellation shaders. Either way, this series is: Reviewed-by: Kenneth Graunke kenn...@whitecape.org Cc: 10.0 mesa-sta...@lists.freedesktop.org That's a good point--I like your suggestion. Thanks for the review! ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [v2 23/23] i965/blorp: switch eu-emitter to use FS IR and fs_generator
On 22 January 2014 09:17, Topi Pohjolainen topi.pohjolai...@intel.comwrote: No regressions on IVB (piglit quick + unit tests). v2 (Paul): - no need to patch the unit tests anymore. Original logic was altered and unit tests updated to match the fs-generator - lrp emission moves from the blorp compiler core into the emitter here (previously there was a separate refactoring patch which is not really needed anymore as the lrp logic got refactored when the original lrp logic got fixed). - pass 'BRW_BLORP_RENDERBUFFER_BINDING_TABLE_INDEX' to the generator in fs_inst::target instead of hardcoding it CC: Paul Berry stereotype...@gmail.com Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com Thanks, Topi. The whole series is now: Reviewed-by: Paul Berry stereotype...@gmail.com ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH V3 7/8] glsl: remove remaining is_array variables
On 22 January 2014 03:33, Timothy Arceri t_arc...@yahoo.com.au wrote: Previously the reason we needed is_array was because we used array_size == NULL to represent both non-arrays and unsized arrays. Now that we use a non-NULL array_specifier to represent an unsized array, is_array is redundant. Signed-off-by: Timothy Arceri t_arc...@yahoo.com.au --- src/glsl/ast.h | 32 +++- src/glsl/ast_to_hir.cpp | 10 +- src/glsl/ast_type.cpp | 6 ++ src/glsl/glsl_parser.yy | 30 +++--- src/glsl/glsl_parser_extras.cpp | 15 ++- 5 files changed, 35 insertions(+), 58 deletions(-) When I try to build this patch I get the compile error: CXX ast_to_hir.lo ../../src/glsl/ast_to_hir.cpp: In member function 'virtual ir_rvalue* ast_declarator_list::hir(exec_list*, _mesa_glsl_parse_state*)': ../../src/glsl/ast_to_hir.cpp:2877:12: error: 'class ast_declaration' has no member named 'is_array' assert(!decl-is_array); ^ I'm guessing you missed this because you are building in release mode, so asserts didn't get compiled. If that's the case, I'd like to encourage you to do a debug build and double-check that piglit tests still pass. You can do that by adding the --enable-debug option when you run ./autogen.sh. With that fixed, the series is: Reviewed-by: Paul Berry stereotype...@gmail.com I'll try to push my patch glsl: Simplify aggregate type inference to prepare for ARB_arrays_of_arrays. within the next hour. Do you have commit access to Mesa or would you like me to push the series for you? ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] i965: Replace *_generator::shader with is_glsl boolean.
The shader field in fs_generator, vec4_generator, and gen8_generator was only used for one purpose; to figure out if we were compiling an assembly shader or a GLSL shader. And it wasn't being used properly: in vec4 shaders we were always initializing it based on prog-_LinkedShaders[MESA_SHADER_FRAGMENT], regardless of whether we were compiling a geometry shader or a vertex shader. This was a fairly benign problem, since it's unlikely that a real-world program will try to mix and match GLSL and assembly shaders using separate shader objects. But it seems worth fixing. This patch replaces the shader field with a new is_glsl boolean, and initializes it based on information from the caller, so that it always refers to the correct shader stage. --- src/mesa/drivers/dri/i965/brw_fs.cpp | 6 -- src/mesa/drivers/dri/i965/brw_fs.h| 8 +--- src/mesa/drivers/dri/i965/brw_fs_generator.cpp| 12 ++-- src/mesa/drivers/dri/i965/brw_vec4.cpp| 4 ++-- src/mesa/drivers/dri/i965/brw_vec4.h | 8 +--- src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 11 +-- src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 7 +-- src/mesa/drivers/dri/i965/gen8_fs_generator.cpp | 13 ++--- src/mesa/drivers/dri/i965/gen8_generator.cpp | 6 -- src/mesa/drivers/dri/i965/gen8_generator.h| 5 +++-- src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp | 10 +- 11 files changed, 50 insertions(+), 40 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index a0e4830..c0d65d5 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -3512,11 +3512,13 @@ brw_wm_fs_emit(struct brw_context *brw, struct brw_wm_compile *c, const unsigned *assembly = NULL; if (brw-gen = 8) { - gen8_fs_generator g(brw, c, prog, fp, v.dual_src_output.file != BAD_FILE); + gen8_fs_generator g(brw, c, prog, fp, v.dual_src_output.file != BAD_FILE, + shader != NULL); assembly = g.generate_assembly(v.instructions, simd16_instructions, final_assembly_size); } else { - fs_generator g(brw, c, prog, fp, v.dual_src_output.file != BAD_FILE); + fs_generator g(brw, c, prog, fp, v.dual_src_output.file != BAD_FILE, + shader != NULL); assembly = g.generate_assembly(v.instructions, simd16_instructions, final_assembly_size); } diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h index a903908..ad0aa99 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.h +++ b/src/mesa/drivers/dri/i965/brw_fs.h @@ -512,7 +512,8 @@ public: struct brw_wm_compile *c, struct gl_shader_program *prog, struct gl_fragment_program *fp, -bool dual_source_output); +bool dual_source_output, +bool is_glsl); ~fs_generator(); const unsigned *generate_assembly(exec_list *simd8_instructions, @@ -615,7 +616,6 @@ private: struct brw_wm_compile *c; struct gl_shader_program *prog; - struct gl_shader *shader; const struct gl_fragment_program *fp; unsigned dispatch_width; /** 8 or 16 */ @@ -623,6 +623,7 @@ private: exec_list discard_halt_patches; bool dual_source_output; void *mem_ctx; + const bool is_glsl; }; /** @@ -637,7 +638,8 @@ public: struct brw_wm_compile *c, struct gl_shader_program *prog, struct gl_fragment_program *fp, - bool dual_source_output); + bool dual_source_output, + bool is_glsl); ~gen8_fs_generator(); const unsigned *generate_assembly(exec_list *simd8_instructions, diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp index e701fc5..a8e81b8 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp @@ -40,14 +40,14 @@ fs_generator::fs_generator(struct brw_context *brw, struct brw_wm_compile *c, struct gl_shader_program *prog, struct gl_fragment_program *fp, - bool dual_source_output) + bool dual_source_output, + bool is_glsl) - : brw(brw), c(c), prog(prog), fp(fp), dual_source_output(dual_source_output) + : brw(brw), c(c), prog(prog), fp(fp), + dual_source_output(dual_source_output), is_glsl(is_glsl) { ctx = brw-ctx; - shader = prog ? prog-_LinkedShaders[MESA_SHADER_FRAGMENT] : NULL; - mem_ctx = c; p = rzalloc(mem_ctx, struct brw_compile); @@ -1301,7 +1301,7 @@ fs_generator::generate_code(exec_list *instructions)
Re: [Mesa-dev] Two small blorp fixes found during refactoring
On 21 January 2014 08:07, Topi Pohjolainen topi.pohjolai...@intel.comwrote: These are not known to have caused any problems but they would get fixed when switching into fs-generator. Both fixes need corresponding patching of the unit tests and fixing them aforehand makes the switch possible without any changes in the tests. Topi Pohjolainen (2): i965/blorp: patch jump counters also for endif i965/blorp: use BRW_COMPRESSION_2NDHALF for second half LPR src/mesa/drivers/dri/i965/brw_blorp_blit.cpp | 48 ++ .../drivers/dri/i965/test_blorp_blit_eu_gen.cpp| 30 +++--- 2 files changed, 47 insertions(+), 31 deletions(-) Series is: Reviewed-by: Paul Berry stereotype...@gmail.com ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH V2 1/8] mesa: Add ARB_arrays_of_arrays
On 21 January 2014 04:19, Timothy Arceri t_arc...@yahoo.com.au wrote: Signed-off-by: Timothy Arceri t_arc...@yahoo.com.au --- src/mesa/main/extensions.c | 1 + src/mesa/main/mtypes.h | 1 + 2 files changed, 2 insertions(+) Since this is a GLSL extension, you also need to update glcpp_parser_create() (in glcpp-parse.y) to call add_builtin_define(parser, GL_ARB_arrays_of_arrays) if extensions-ARB_arrays_of_arrays is set. With that fixed, this patch is: Reviewed-by: Paul Berry stereotype...@gmail.com Personally I would also add the extension to the _mesa_glsl_supported_extensions[] table in this patch, as well as add the bools ARB_arrays_of_arrays_enable and ARB_arrays_of_arrays_warn to _mesa_glsl_parse_state. I see you've done that as part of patch 3/8 instead. I suppose that's a matter of personal preference though. diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c index 2e0ccc3..95eb7ca 100644 --- a/src/mesa/main/extensions.c +++ b/src/mesa/main/extensions.c @@ -80,6 +80,7 @@ static const struct extension extension_table[] = { /* ARB Extensions */ { GL_ARB_ES2_compatibility, o(ARB_ES2_compatibility), GL, 2009 }, { GL_ARB_ES3_compatibility, o(ARB_ES3_compatibility), GL, 2012 }, + { GL_ARB_arrays_of_arrays, o(ARB_arrays_of_arrays),GL, 2012 }, { GL_ARB_base_instance, o(ARB_base_instance), GL, 2011 }, { GL_ARB_blend_func_extended, o(ARB_blend_func_extended), GL, 2009 }, { GL_ARB_clear_buffer_object, o(dummy_true), GL, 2012 }, diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h index 33df682..eef67a8 100644 --- a/src/mesa/main/mtypes.h +++ b/src/mesa/main/mtypes.h @@ -3366,6 +3366,7 @@ struct gl_extensions GLboolean ANGLE_texture_compression_dxt; GLboolean ARB_ES2_compatibility; GLboolean ARB_ES3_compatibility; + GLboolean ARB_arrays_of_arrays; GLboolean ARB_base_instance; GLboolean ARB_blend_func_extended; GLboolean ARB_color_buffer_float; -- 1.8.3.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH V2 2/8] glsl: Add arrays_of_arrays to yacc definition
On 21 January 2014 04:19, Timothy Arceri t_arc...@yahoo.com.au wrote: Signed-off-by: Timothy Arceri t_arc...@yahoo.com.au --- src/glsl/glsl_parser.yy | 153 +--- 1 file changed, 66 insertions(+), 87 deletions(-) diff --git a/src/glsl/glsl_parser.yy b/src/glsl/glsl_parser.yy index 1c56d6f..6d63668 100644 --- a/src/glsl/glsl_parser.yy +++ b/src/glsl/glsl_parser.yy @@ -1611,19 +1565,55 @@ storage_qualifier: } ; -type_specifier: - type_specifier_nonarray - | type_specifier_nonarray '[' ']' +array_specifier: + '[' ']' { - $$ = $1; - $$-is_array = true; - $$-array_size = NULL; + void *ctx = state; + $$ = new(ctx) ast_array_specifier(); + $$-is_unsized_array = true; + $$-set_location(yylloc); + } + | '[' constant_expression ']' + { + void *ctx = state; + $$ = new(ctx) ast_array_specifier(); + $$-array_dimensions.push_tail( $2-link); + $$-is_unsized_array = false; + $$-set_location(yylloc); + } + | array_specifier '[' ']' + { + if (!state-ARB_arrays_of_arrays_enable) { + _mesa_glsl_error( @1, state, + #version 120 / GL_ARB_arrays_of_arrays + required for defining arrays of arrays); This error message is confusing. It could be reasonably interpreted as saying that either version 120 or GL_ARB_arrays_of_arrays is sufficient to allow arrays of arrays, which isn't true. I would just drop the mention of version 120 entirely--the only reason it's significant at all is that version 120 is the first version of GLSL to mention that arrays of arrays aren't allowed, and that was just because it was left out of version 110 as an oversight. + } else { + _mesa_glsl_error( @1, state, + only the outermost array dimension can + be unsized); + } To prevent uninitialized data in the error condition, I'd recommend doing $$ = $1; here. Otherwse we might crash before the user can see the error message. } - | type_specifier_nonarray '[' constant_expression ']' + | array_specifier '[' constant_expression ']' + { + if (!state-ARB_arrays_of_arrays_enable) { + _mesa_glsl_error( @1, state, + #version 120 / GL_ARB_arrays_of_arrays + required for defining arrays of arrays); Similar comment about dropping #version 120 from the error message here. + } + I think we need to add: $$ = $1; here. + $$-array_dimensions.push_tail( $3-link); + $$-dimension_count++; + $$-set_location(yylloc); Assuming I'm right about adding $$ = $1; above, there's no need to call set_location() here. The location was already set when the ast_array_specifier was created. + } + ; + +type_specifier: + type_specifier_nonarray + | type_specifier_nonarray array_specifier { $$ = $1; $$-is_array = true; - $$-array_size = $3; + $$-array_specifier = $2; } ; With those changes, this patch is: Reviewed-by: Paul Berry stereotype...@gmail.com Note, however, that some of my review comments on patch 3/8 will probably affect the content of this patch. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH V2 4/8] glsl: only call mark_max_array if we are assigning an array
On 21 January 2014 04:19, Timothy Arceri t_arc...@yahoo.com.au wrote: Signed-off-by: Timothy Arceri t_arc...@yahoo.com.au --- src/glsl/ast_to_hir.cpp | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) Can you say more in the commit message about why you're making this change? Is this fixing an existing bug? Is it preparing for a later change (which would otherwise create a bug if the change weren't made)? Or are you doing it just because it seems reasonable to do? diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp index 326aa58..d02c9ff 100644 --- a/src/glsl/ast_to_hir.cpp +++ b/src/glsl/ast_to_hir.cpp @@ -830,8 +830,10 @@ do_assignment(exec_list *instructions, struct _mesa_glsl_parse_state *state, rhs-type-array_size()); d-type = var-type; } - mark_whole_array_access(rhs); - mark_whole_array_access(lhs); + if (lhs-type-is_array()) { + mark_whole_array_access(rhs); + mark_whole_array_access(lhs); + } } /* Most callers of do_assignment (assign, add_assign, pre_inc/dec, -- 1.8.3.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] glsl: Simplify aggregate type inference to prepare for ARB_arrays_of_arrays.
Most of the time it is not necessary to perform type inference to compile GLSL; the type of every expression can be inferred from the contents of the expression itself (and previous type declarations). The exception is aggregate initializers: their type is determined by the LHS of the variable being assigned to. For example, in the statement: mat2 foo = { { 1, 2 }, { 3, 4 } }; the type of { 1, 2 } is only known to be vec2 (as opposed to, say, ivec2, uvec2, int[2], or a struct) because of the fact that the result is being assigned to a mat2. Previous to this patch, we handled this situation by doing some type inference during parsing: when parsing a declaration like the one above, we would call _mesa_set_aggregate_type(), which would infer the type of each aggregate initializer and store it in the corresponding ast_aggregate_initializer::constructor_type field. Since this happened at parse time, we couldn't do the type inference using glsl_type objects; we had to use ast_type_specifiers, which are much more awkward to work with. Things are about to get more complicated when we add support for ARB_arrays_of_arrays. This patch simplifies things by postponing the call to _mesa_set_aggregate_type() until ast-to-hir time, when we have access to glsl_type objects. As a side benefit, we only need to have one call to _mesa_set_aggregate_type() now, instead of six. --- Timothy: I was inspired to write this patch by the complexities you encountered during [PATCH V2 5/8] glsl: Aggregate initializer support for arrays of array. Can you try rebasing your series on top of this patch to see if it simplifies things? I believe that with these changes, you should be able to drop patch 5/8 entirely. src/glsl/ast.h | 16 +++-- src/glsl/ast_function.cpp | 8 +-- src/glsl/ast_to_hir.cpp | 7 +++ src/glsl/glsl_parser.yy | 27 src/glsl/glsl_parser_extras.cpp | 136 +++- 5 files changed, 46 insertions(+), 148 deletions(-) diff --git a/src/glsl/ast.h b/src/glsl/ast.h index 76911f0..b24052b 100644 --- a/src/glsl/ast.h +++ b/src/glsl/ast.h @@ -296,7 +296,16 @@ public: /* empty */ } - ast_type_specifier *constructor_type; + /** +* glsl_type of the aggregate, which is inferred from the LHS of whatever +* the aggregate is being used to initialize. This can't be inferred at +* parse time (since the parser deals with ast_type_specifiers, not +* glsl_types), so the parser leaves it NULL. However, the ast-to-hir +* conversion code makes sure to fill it in with the appropriate type +* before hir() is called. +*/ + const glsl_type *constructor_type; + virtual ir_rvalue *hir(exec_list *instructions, struct _mesa_glsl_parse_state *state); }; @@ -978,9 +987,8 @@ _mesa_ast_array_index_to_hir(void *mem_ctx, YYLTYPE loc, YYLTYPE idx_loc); extern void -_mesa_ast_set_aggregate_type(const ast_type_specifier *type, - ast_expression *expr, - _mesa_glsl_parse_state *state); +_mesa_ast_set_aggregate_type(const glsl_type *type, + ast_expression *expr); void emit_function(_mesa_glsl_parse_state *state, ir_function *f); diff --git a/src/glsl/ast_function.cpp b/src/glsl/ast_function.cpp index 2d05d07..4c5b0e4 100644 --- a/src/glsl/ast_function.cpp +++ b/src/glsl/ast_function.cpp @@ -1687,14 +1687,12 @@ ast_aggregate_initializer::hir(exec_list *instructions, { void *ctx = state; YYLTYPE loc = this-get_location(); - const char *name; if (!this-constructor_type) { _mesa_glsl_error(loc, state, type of C-style initializer unknown); return ir_rvalue::error_value(ctx); } - const glsl_type *const constructor_type = - this-constructor_type-glsl_type(name, state); + const glsl_type *const constructor_type = this-constructor_type; if (!state-ARB_shading_language_420pack_enable) { _mesa_glsl_error(loc, state, C-style initialization requires the @@ -1702,12 +1700,12 @@ ast_aggregate_initializer::hir(exec_list *instructions, return ir_rvalue::error_value(ctx); } - if (this-constructor_type-is_array) { + if (constructor_type-is_array()) { return process_array_constructor(instructions, constructor_type, loc, this-expressions, state); } - if (this-constructor_type-structure) { + if (constructor_type-is_record()) { return process_record_constructor(instructions, constructor_type, loc, this-expressions, state); } diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp index 4cc8eb1..8d13610 100644 --- a/src/glsl/ast_to_hir.cpp +++ b/src/glsl/ast_to_hir.cpp @@ -2593,6 +2593,13 @@ process_initializer(ir_variable *var, ast_declaration *decl, ? attribute : varying);
Re: [Mesa-dev] [PATCH V2 5/8] glsl: Aggregate initializer support for arrays of array
On 21 January 2014 04:19, Timothy Arceri t_arc...@yahoo.com.au wrote: Signed-off-by: Timothy Arceri t_arc...@yahoo.com.au --- src/glsl/ast.h | 19 +++- src/glsl/ast_function.cpp | 14 +++-- src/glsl/ast_to_hir.cpp | 29 +- src/glsl/glsl_parser.yy | 36 +- src/glsl/glsl_parser_extras.cpp | 68 +++-- 5 files changed, 139 insertions(+), 27 deletions(-) I believe I've found an easier solution to this: please see [PATCH] glsl: Simplify aggregate type inference to prepare for ARB_arrays_of_arrays., which I believe makes this patch unnecessary. diff --git a/src/glsl/ast.h b/src/glsl/ast.h index 4dda32e..8bccca7 100644 --- a/src/glsl/ast.h +++ b/src/glsl/ast.h @@ -308,10 +308,16 @@ public: : ast_expression(ast_aggregate, NULL, NULL, NULL), constructor_type(NULL) { - /* empty */ + array_dimension = 1; + is_array = false; } ast_type_specifier *constructor_type; + unsigned array_dimension; + + ast_array_specifier *array_specifier; + bool is_array; + virtual ir_rvalue *hir(exec_list *instructions, struct _mesa_glsl_parse_state *state); }; @@ -588,6 +594,11 @@ public: struct _mesa_glsl_parse_state *state) const; + const struct glsl_type *glsl_type(const char **name, + struct _mesa_glsl_parse_state *state, + bool skip_outer_dimension) + const; + virtual void print(void) const; ir_rvalue *hir(exec_list *, struct _mesa_glsl_parse_state *); @@ -1005,4 +1016,10 @@ extern void check_builtin_array_max_size(const char *name, unsigned size, YYLTYPE loc, struct _mesa_glsl_parse_state *state); +extern const glsl_type * +process_array_type(YYLTYPE *loc, const glsl_type *base, + ast_array_specifier *array_specifier, + struct _mesa_glsl_parse_state *state, + bool skip_first_dim); + #endif /* AST_H */ diff --git a/src/glsl/ast_function.cpp b/src/glsl/ast_function.cpp index 2d05d07..3b38cb6 100644 --- a/src/glsl/ast_function.cpp +++ b/src/glsl/ast_function.cpp @@ -1693,8 +1693,18 @@ ast_aggregate_initializer::hir(exec_list *instructions, _mesa_glsl_error(loc, state, type of C-style initializer unknown); return ir_rvalue::error_value(ctx); } - const glsl_type *const constructor_type = - this-constructor_type-glsl_type(name, state); + const glsl_type *constructor_type = + this-constructor_type-glsl_type(name, state, +this-array_dimension 1 !this-is_array); + + /* This only handles vec4 foo[..]. The earlier constructor_type-glsl_type(...) +* call already handled the vec4[..] foo case. +*/ + if (this-is_array) { + constructor_type = + process_array_type(loc, constructor_type, this-array_specifier, +state, this-array_dimension 1); + } if (!state-ARB_shading_language_420pack_enable) { _mesa_glsl_error(loc, state, C-style initialization requires the diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp index d02c9ff..226d128 100644 --- a/src/glsl/ast_to_hir.cpp +++ b/src/glsl/ast_to_hir.cpp @@ -1817,10 +1817,11 @@ process_array_size(exec_node *node, return result; } -static const glsl_type * +const glsl_type * process_array_type(YYLTYPE *loc, const glsl_type *base, ast_array_specifier *array_specifier, - struct _mesa_glsl_parse_state *state) + struct _mesa_glsl_parse_state *state, + bool skip_first_dim) { const glsl_type *array_type = NULL; const glsl_type *element_type = base; @@ -1865,6 +1866,8 @@ process_array_type(YYLTYPE *loc, const glsl_type *base, unsigned array_size; for (/* nothing */; !node-is_head_sentinel(); node = node-prev) { + if (skip_first_dim node-prev-is_head_sentinel()) +break; array_size = process_array_size(node, state); array_type_temp = glsl_type::get_array_instance(array_type_temp, array_size); @@ -1891,6 +1894,14 @@ const glsl_type * ast_type_specifier::glsl_type(const char **name, struct _mesa_glsl_parse_state *state) const { + return this-glsl_type(name, state, false); +} + +const glsl_type * +ast_type_specifier::glsl_type(const char **name, + struct _mesa_glsl_parse_state *state, + bool skip_outer_dim) const +{ const struct glsl_type *type; type = state-symbols-get_type(this-type_name); @@ -1898,7 +1909,8 @@
Re: [Mesa-dev] [PATCH V2 6/8] glsl: Allow arrays of arrays as input to vertex shader
On 21 January 2014 04:19, Timothy Arceri t_arc...@yahoo.com.au wrote: Signed-off-by: Timothy Arceri t_arc...@yahoo.com.au --- src/glsl/ast_to_hir.cpp | 4 1 file changed, 4 insertions(+) diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp index 226d128..62b7ec2 100644 --- a/src/glsl/ast_to_hir.cpp +++ b/src/glsl/ast_to_hir.cpp @@ -3176,6 +3176,10 @@ ast_declarator_list::hir(exec_list *instructions, if (state-is_version(120, 300)) break; /* FALLTHROUGH */ +case GLSL_TYPE_ARRAY: + if (state-ARB_arrays_of_arrays_enable) + break; + /* FALLTHROUGH */ default: _mesa_glsl_error( loc, state, vertex shader input / attribute cannot have I see two problems with this: 1. Since the case above has a fall-through, this has an unintended side effect: if ARB_arrays_of_arrays is enabled, then integer vertex attributes will be allowed prior to GLSL 1.20. Although it's unlikely that anyone will ever try to write a GLSL 1.10 shader using ARB_arrays_of_arrays, it would be nice to get the logic right. 2. When the type of the input is an array of arrays, we need to recurse through it to make sure it doesn't contain any prohibited types (e.g. an array of arrays of structs should still be prohibited). I think the easiest solution to this would be to replace the line: const glsl_type *check_type = var-type-is_array() ? var-type-fields.array : var-type; with something like this: const glsl_type *check_type = var-type; while (check_type-is_array()) check_type = check_type-element_type(); Then we can leave the switch statement alone. With that fixed, this patch is: Reviewed-by: Paul Berry stereotype...@gmail.com ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH V2 6/8] glsl: Allow arrays of arrays as input to vertex shader
On 21 January 2014 18:23, Paul Berry stereotype...@gmail.com wrote: On 21 January 2014 04:19, Timothy Arceri t_arc...@yahoo.com.au wrote: Signed-off-by: Timothy Arceri t_arc...@yahoo.com.au --- src/glsl/ast_to_hir.cpp | 4 1 file changed, 4 insertions(+) diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp index 226d128..62b7ec2 100644 --- a/src/glsl/ast_to_hir.cpp +++ b/src/glsl/ast_to_hir.cpp @@ -3176,6 +3176,10 @@ ast_declarator_list::hir(exec_list *instructions, if (state-is_version(120, 300)) break; /* FALLTHROUGH */ +case GLSL_TYPE_ARRAY: + if (state-ARB_arrays_of_arrays_enable) + break; + /* FALLTHROUGH */ default: _mesa_glsl_error( loc, state, vertex shader input / attribute cannot have I see two problems with this: 1. Since the case above has a fall-through, this has an unintended side effect: if ARB_arrays_of_arrays is enabled, then integer vertex attributes will be allowed prior to GLSL 1.20. Although it's unlikely that anyone will ever try to write a GLSL 1.10 shader using ARB_arrays_of_arrays, it would be nice to get the logic right. 2. When the type of the input is an array of arrays, we need to recurse through it to make sure it doesn't contain any prohibited types (e.g. an array of arrays of structs should still be prohibited). I think the easiest solution to this would be to replace the line: const glsl_type *check_type = var-type-is_array() ? var-type-fields.array : var-type; with something like this: const glsl_type *check_type = var-type; while (check_type-is_array()) check_type = check_type-element_type(); Then we can leave the switch statement alone. With that fixed, this patch is: Reviewed-by: Paul Berry stereotype...@gmail.com As an aside, this is just one example of what I suspect are many places where the GLSL compiler makes the implicit assumption that an array can't contain an array. Once we think ARB_arrays_of_arrays support is complete, it would probably be worth it for someone with a lot of familiarity with the GLSL front end to grep for is_array() to make sure we haven't missed anything :) ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH V2 7/8] glsl: create type name for arrays of arrays
On 21 January 2014 04:19, Timothy Arceri t_arc...@yahoo.com.au wrote: We need to insert outermost dimensions in the correct spot otherwise the dimension order will be backwards Signed-off-by: Timothy Arceri t_arc...@yahoo.com.au --- src/glsl/glsl_types.cpp | 14 -- 1 file changed, 12 insertions(+), 2 deletions(-) diff --git a/src/glsl/glsl_types.cpp b/src/glsl/glsl_types.cpp index 12d4ac0..f7f3117 100644 --- a/src/glsl/glsl_types.cpp +++ b/src/glsl/glsl_types.cpp @@ -300,8 +300,18 @@ glsl_type::glsl_type(const glsl_type *array, unsigned length) : if (length == 0) snprintf(n, name_length, %s[], array-name); - else - snprintf(n, name_length, %s[%u], array-name, length); + else { + /* insert outermost dimensions in the correct spot */ + const char *pos = strchr(array-name, '['); + if (pos) { + int idx = pos - array-name; + snprintf(n, idx+1, %s, array-name); + snprintf(n + idx, name_length, [%u], length); + snprintf(n + strlen(n), name_length, %s, array-name + idx); Technically the last two snprintf's are unsafe, since the second argument is a length, so we need to subtract the offset we supplied to the first argument. Also, there's no need to split the second snprintf from the third; both of them can be combined to: snprintf(n + idx, name_length - idx, [%u]%s, length, array-name + idx); Good catch on noticing that it's necessary to reverse the order of the dimensions. With that fixed, this patch is: Reviewed-by: Paul Berry stereotype...@gmail.com + } else { + snprintf(n, name_length, %s[%u], array-name, length); + } + } this-name = n; } -- 1.8.3.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH V2 8/8] docs: Mark ARB_arrays_of_arrays as started
On 21 January 2014 04:19, Timothy Arceri t_arc...@yahoo.com.au wrote: Signed-off-by: Timothy Arceri t_arc...@yahoo.com.au --- docs/GL3.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) Reviewed-by: Paul Berry stereotype...@gmail.com ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH V2 3/8] glsl: Add array specifier to ast code
On 21 January 2014 14:54, Timothy Arceri t_arc...@yahoo.com.au wrote: Hi Paul Thanks for the thorough review and taking the time to explain how to make improvements. I can already see a number of improvements that can be made to patch 5 based on the suggestions you have already made so you could hold off reviewing that one if you havent started already and I've post a new series in a couple of days. No problem. As you can see, I already sent out my own ideas about patch 5/8. Sorry about the mid-air collision on that--I didn't notice your reply until I had already finished my review of the whole series. Hopefully my patch will do what you need, and you won't need to work on patch 5 any further. Thanks for working on this! ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glsl: Add image type to the GLSL IR.
On 15 January 2014 11:42, Francisco Jerez curroje...@riseup.net wrote: v2: Reuse the glsl_sampler_dim enum for images. Reuse the glsl_type::sampler_* fields instead of creating new ones specific to image types. Reuse the same constructor as for samplers adding a new 'base_type' argument. --- Is this what you had in mind Paul? Yes, that seems reasonable. Thanks. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 10/23] i965/fs: Remove fs_reg::sechalf.
On 15 January 2014 13:53, Francisco Jerez curroje...@riseup.net wrote: Paul Berry stereotype...@gmail.com writes: On 2 December 2013 11:31, Francisco Jerez curroje...@riseup.net wrote: +/** + * Get either of the 8-component halves of a 16-component register. + */ +static inline fs_reg +half(const fs_reg reg, unsigned idx) +{ + assert(idx == 0 || (reg.file != HW_REG reg.file != IMM)); + return byte_offset(reg, 8 * idx * reg.stride * type_sz(reg.type)); +} + I'd like to see a comment explaining that it's safe to call this function on a register with 32 bits per component, since brw_reg.h's byte_offset() handles byte offsets greater than 32 by incrementing the register number. I've added a comment on the allowed range of fs_reg::subreg_offset here [1]. Do you think that's clear enough? The comment you added is good, but it doesn't address my concern. My concern is that callers of half() might see the phrase Get either of the 8-component halves of a 16-component register and reasonably think this means get either of the 8-components halves of a register that stores 16 components in its 256 bits. Since most 16-wide FS registers are actually register pairs that store 16 components in 512 bits, they might be misled into thinking that half() doesn't work on these register pairs. Adding a comment to fs_reg::subreg_offset doesn't help that situation because it doesn't change the misleading text above the half() function. How about changing the comment above half() to say this: /** * Get either of the 8-component halves of a 16-component register. * * Note: this also works if \c reg represents a SIMD16 pair of registers. */ not realize that is is safe to call it on a register with 32 bits per component, because they might erroneously think oh, this function only works on 16-component registers; that is, registers that Thanks. Also, for sanity's sake, it would be nice for this function to include the assertion: assert(idx 2); With those changes, this patch is: Reviewed-by: Paul Berry stereotype...@gmail.com [1] http://cgit.freedesktop.org/~currojerez/mesa/commit/?h=image-load-storeid=14ec9dfd42fec1ee5e5e7f48f98a36fe620cf4e6 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 10/23] i965/fs: Remove fs_reg::sechalf.
On 21 January 2014 19:21, Paul Berry stereotype...@gmail.com wrote: not realize that is is safe to call it on a register with 32 bits per component, because they might erroneously think oh, this function only works on 16-component registers; that is, registers that Oops, sorry about the extraneous sentence fragment there. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 13/23] i965/fs: Take into account reg_offset consistently for MRF regs.
On 15 January 2014 14:01, Francisco Jerez curroje...@riseup.net wrote: Paul Berry stereotype...@gmail.com writes: On 2 December 2013 11:31, Francisco Jerez curroje...@riseup.net wrote: Until now it was only being taken into account in the VEC4 back-end but not in the FS back-end. Do it in both cases. --- src/mesa/drivers/dri/i965/brw_fs.h | 2 +- src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 10 ++ src/mesa/drivers/dri/i965/brw_shader.h | 7 --- 3 files changed, 11 insertions(+), 8 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h index 2c36d9f..f918f7e 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.h +++ b/src/mesa/drivers/dri/i965/brw_fs.h @@ -615,4 +615,4 @@ bool brw_do_channel_expressions(struct exec_list *instructions); bool brw_do_vector_splitting(struct exec_list *instructions); bool brw_fs_precompile(struct gl_context *ctx, struct gl_shader_program *prog); -struct brw_reg brw_reg_from_fs_reg(fs_reg *reg); +struct brw_reg brw_reg_from_fs_reg(fs_reg *reg, unsigned dispatch_width); diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp index 8d310a1..1de59eb 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp @@ -981,8 +981,9 @@ static uint32_t brw_file_from_reg(fs_reg *reg) } struct brw_reg -brw_reg_from_fs_reg(fs_reg *reg) +brw_reg_from_fs_reg(fs_reg *reg, unsigned dispatch_width) { + const int reg_size = 4 * dispatch_width; What happens when reg.type is UW and dispatch_width is 16? In that case, we would compute reg_size == 64, but the correct value seems like it's actually 32 in this case. Are we perhaps relying on reg.type being a 32-bit type? If so, maybe we should add an assertion: assert(type_sz(reg.type) == 4); Nope, reg_size is supposed to be the size in bytes of a ::reg_offset unit, i.e. one hardware register in SIMD8 mode and two hardware registers in SIMD16 mode as the comment at the definition of ::reg_offset explains. The fixed factor of four is intentional and correct no matter what the register type is. Thanks. Ok, I see. It appears that both this function *and* the comment above reg_offset are assuming that the data type is 32 bit. The comment above reg_offset says: * For pre-register-allocation GRFs and MRFs, this is in units of a * float per pixel (1 hardware register for SIMD8 or SIMD4x2 mode, * or 2 registers for SIMD16 mode). For uniforms, this is in units * of 1 float. but when we get around to adding support for double-precision floats (a feature of GL 4.0), this will no longer work; for double precision types we'll need reg_offset to be measured in units of at least 4 hardware registers in SIMD16 mode to avoid overlap. Similarly, for types that are 16 bits, if we consider reg_offset to be measured in units of 2 hardware registers in SIMD16 mode, we're actually wasting registers, since all 16 values actually fit in a single hardware register. That's not really a big deal right now, since we use 16-bit types so rarely in the FS back-end. In fact, I bet we never use a nonzero reg_offset on a 16-bit type. It still seems to me that it would be worth putting an assertion here to help alert us to what needs to change when we add double precision support (or if someday we have hardware that supports half float computation). I'm not 100% sure what the assertion could be. assert(type_sz(reg-type) == 4); was an optimistic guess. We might have to do assert(type_sz(reg-type) == 4 || reg-reg_offset == 0); in order to avoid tripping on the rare cases where we currently use 16-bit types in fragment shaders. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 24/42] i965/blorp: move emission of sample combining into eu-emitter
On 20 December 2013 06:38, Topi Pohjolainen topi.pohjolai...@intel.comwrote: diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp index b189aa2..dcfd82b 100644 --- a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp +++ b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp @@ -148,3 +148,15 @@ brw_blorp_eu_emitter::emit_render_target_write(const struct brw_reg src0, true /* eot */, use_header); } + +void +brw_blorp_eu_emitter::emit_combine(unsigned texture_data_type, + const struct brw_reg dst, + const struct brw_reg src_1, + const struct brw_reg src_2) +{ + if (texture_data_type == BRW_REGISTER_TYPE_F) + brw_ADD(func, dst, src_1, src_2); + else + brw_AVG(func, dst, src_1, src_2); +} It's a bit of an awkward split to have most of the algorithm for combining samples in brw_blorp_blit_program::manual_blend_average(), but the choice of whether to use ADD or AVG is here in brw_blorp_eu_emitter::emit_combine(). How about if we replace texture_data_type with a bool called combine_using_add? That way someone reading manual_blend_average() won't have to refer to emit_combine() to understand what the algorithm does; and similarly someone reading emt_combine() won't have to look at manual_blend_average() to understand why we use ADD for floats and AVG for ints. With that change, this patch is: Reviewed-by: Paul Berry stereotype...@gmail.com ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 27/42] i965/blorp: wrap LRP
On 20 December 2013 06:38, Topi Pohjolainen topi.pohjolai...@intel.comwrote: The split of the emission of the two halfs into single emission call prapares for fs_generator support that already does similar thing. No regressions seen on IVB (unit tests and piglit quick). Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com --- src/mesa/drivers/dri/i965/brw_blorp_blit.cpp | 26 ++ src/mesa/drivers/dri/i965/brw_blorp_blit_eu.h | 13 + 2 files changed, 23 insertions(+), 16 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp index 1b7310b..b95104e 100644 --- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp +++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp @@ -1679,29 +1679,23 @@ brw_blorp_blit_program::manual_blend_bilinear(unsigned num_samples) } #define SAMPLE(x, y) offset(texture_data[x], y) - brw_set_access_mode(func, BRW_ALIGN_16); - brw_set_compression_control(func, BRW_COMPRESSION_NONE); for (int index = 3; index 0; ) { /* Since we're doing SIMD16, 4 color channels fits in to 8 registers. * Counter value of 8 in 'for' loop below is used to interpolate all * the color components. */ - for (int k = 0; k 8; ++k) - brw_LRP(func, - vec8(SAMPLE(index - 1, k)), - offset(x_frac, k 1), - SAMPLE(index, k), - SAMPLE(index - 1, k)); + for (int k = 0; k 8; k += 2) + emit_lrp(vec8(SAMPLE(index - 1, k)), + offset(x_frac, k 1), This line can just be x_frac now, because k is always even, so (k 1) is always zero. + SAMPLE(index, k), + SAMPLE(index - 1, k)); index -= 2; } - for (int k = 0; k 8; ++k) - brw_LRP(func, - vec8(SAMPLE(0, k)), - offset(y_frac, k 1), - vec8(SAMPLE(2, k)), - vec8(SAMPLE(0, k))); - brw_set_compression_control(func, BRW_COMPRESSION_COMPRESSED); - brw_set_access_mode(func, BRW_ALIGN_1); + for (int k = 0; k 8; k += 2) + emit_lrp(vec8(SAMPLE(0, k)), + offset(y_frac, k 1), Same comment applies here. + vec8(SAMPLE(2, k)), + vec8(SAMPLE(0, k))); #undef SAMPLE } With those two things fixed, this patch is: Reviewed-by: Paul Berry stereotype...@gmail.com ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 37/42] i965/blorp: wrap brw_IF/ELSE/ENDIF() into eu-emitter
On 20 December 2013 06:39, Topi Pohjolainen topi.pohjolai...@intel.comwrote: diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.h b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.h index 1ecf076..3f2301c 100644 --- a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.h +++ b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.h @@ -162,6 +162,24 @@ protected: brw_RNDD(func, dst, src); } + inline void emit_if(int op, + const struct brw_reg x, + const struct brw_reg y) + { + brw_CMP(func, vec16(brw_null_reg()), op, x, y); + brw_IF(func, BRW_EXECUTE_16); + } Can we maybe call tihs emit_cmp_if(), so that it's clear when looking at the caller that it performs both a CMP and an IF? With that change, this patch is: Reviewed-by: Paul Berry stereotype...@gmail.com ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 38/42] i965/fs: allow unit tests to dump the final patched assembly
On 20 December 2013 06:39, Topi Pohjolainen topi.pohjolai...@intel.comwrote: Unit tests comparing generated blorp programs to known good need to have the dump in designated file instead of in default standard output. The comparison also expects the jump counters of if-else-instructions to be correctly set and hence the dump needs to be taken _after_ 'patch_IF_ELSE()' is run (the default dump of the fs_generator does this before). Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com --- src/mesa/drivers/dri/i965/brw_fs.h | 7 +-- src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 15 +-- 2 files changed, 14 insertions(+), 8 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h index 9bef07c..d40d0a8 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.h +++ b/src/mesa/drivers/dri/i965/brw_fs.h @@ -510,10 +510,13 @@ public: const unsigned *generate_assembly(exec_list *simd8_instructions, exec_list *simd16_instructions, - unsigned *assembly_size); + unsigned *assembly_size, + bool dump_enabled = false, + FILE *dump_file = stdout); Nit pick: rather than add 2 args, wouldn't it be easier to just add the dump_file arg, with a default value of NULL, and only dump instructions if it is non-NULL? Either way, the patch is: Reviewed-by: Paul Berry stereotype...@gmail.com ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 39/42] i965/fs: introduce blorp specific rt-write for fs_generator
On 20 December 2013 06:39, Topi Pohjolainen topi.pohjolai...@intel.comwrote: diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp index df91235..4c159e6 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp @@ -190,6 +190,21 @@ fs_generator::generate_fb_write(fs_inst *inst) mark_surface_used(surf_index); } +void +fs_generator::generate_blorp_fb_write(fs_inst *inst) +{ + brw_fb_WRITE(p, +16 /* dispatch_width */, +inst-base_mrf, +brw_reg_from_fs_reg(inst-src[0]), +BRW_DATAPORT_RENDER_TARGET_WRITE_SIMD16_SINGLE_SOURCE, +1 /* BRW_BLORP_RENDERBUFFER_BINDING_TABLE_INDEX */, This seems like it would lead to a maintenance burden if anyone ever decides to change BRW_BLORP_RENDERBUFFER_BINDING_TABLE_INDEX to a number other than 1. Can we move the declaration of BRW_BLORP_RENDERBUFFER_BINDING_TABLE_INDEX to somewhere that's accessible to this file so that we can just pass it in directly? Alternatively, we could store it in inst-target. With that issue addressed, this patch is: Reviewed-by: Paul Berry stereotype...@gmail.com +inst-mlen, +0, +true, +inst-header_present); +} + ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 41/42] i965/eu: introduce blorp specific flavour of lrp
On 20 December 2013 06:39, Topi Pohjolainen topi.pohjolai...@intel.comwrote: This is rather ugly but as I couldn't think of anything better for now and wanted to get the rest of the series under review, I left it as it is. Even though immediately surrounding code has tabs this piece is written space-indented. Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com This patch is unnecessary. The blorp specific flavour of lrp is actually a previously undiscovered bug in blorp: blorp should be using BRW_COMPRESSION_2NDHALF for its second half LRPs. I believe the bug may be benign, but we should fix it anyhow, and I think we can fix it by just dropping this patch entirely. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 24/42] i965/blorp: move emission of sample combining into eu-emitter
On 20 January 2014 19:36, Paul Berry stereotype...@gmail.com wrote: On 20 December 2013 06:38, Topi Pohjolainen topi.pohjolai...@intel.comwrote: diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp index b189aa2..dcfd82b 100644 --- a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp +++ b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp @@ -148,3 +148,15 @@ brw_blorp_eu_emitter::emit_render_target_write(const struct brw_reg src0, true /* eot */, use_header); } + +void +brw_blorp_eu_emitter::emit_combine(unsigned texture_data_type, + const struct brw_reg dst, + const struct brw_reg src_1, + const struct brw_reg src_2) +{ + if (texture_data_type == BRW_REGISTER_TYPE_F) + brw_ADD(func, dst, src_1, src_2); + else + brw_AVG(func, dst, src_1, src_2); +} It's a bit of an awkward split to have most of the algorithm for combining samples in brw_blorp_blit_program::manual_blend_average(), but the choice of whether to use ADD or AVG is here in brw_blorp_eu_emitter::emit_combine(). How about if we replace texture_data_type with a bool called combine_using_add? That way someone reading manual_blend_average() won't have to refer to emit_combine() to understand what the algorithm does; and similarly someone reading emt_combine() won't have to look at manual_blend_average() to understand why we use ADD for floats and AVG for ints. On further reflection, I think it would be even better to replace the texture_data_type argument with an opcode argument--that way the caller can pass in BRW_OPCODE_ADD or BRW_OPCODE_AVG. Once we get to patch 42/42, this function can be changed to just plumb the opcode straight through into the fs_inst constructor. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 42/42] i965/blorp: switch eu-emitter to use FS IR and fs_generator
On 20 December 2013 06:39, Topi Pohjolainen topi.pohjolai...@intel.comwrote: Unfortunately the unit tests need to be patched as well. This is because the direct eu-emitter only patches jump counters for if-else (see patch_IF_ELSE()) while the fs_generator patches the endif as well (see brw_set_uip_jip()). I think the fact that blorp wasn't patching jump counters for if-else was a previously undiscovered bug in blorp. I suspect that the reason we never discovered it was because blorp doesn't contain complex enough flow control to cause an endif to ever execute a jump. I'd recommend splitting this patch into two patches: 1. fix the previously undiscovered blorp bug (by having blorp call brw_set_uip_jip() at the appropriate time) and update the unit tests. 2. switch eu-emitter to use FS IR and fs_generator. That will make it a lot easier to reassure ourselves that the changes are correct, and in the unlikely event that we later discover that they aren't, it'll make it easier to bisect to the problem. With this patch split, it is: Reviewed-by: Paul Berry stereotype...@gmail.com I've sent comments on patches 24, 27, 37, 38, 39, and 41. All the others are: Reviewed-by: Paul Berry stereotype...@gmail.com I really like how you structured this series, Topi. It was easy to follow what you did and to reassure myself that it was correct. And I'm really glad you did it, since it led us to discover two preexisting bugs! Nice work. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Reserve space for Vertex Count in GS outputs.
On 20 January 2014 22:56, Kenneth Graunke kenn...@whitecape.org wrote: v2: Also increment ir-offset in the GS visitor, rather than at the final assembly generation stage (requested by Paul). Signed-off-by: Kenneth Graunke kenn...@whitecape.org --- src/mesa/drivers/dri/i965/brw_vec4_gs.c | 6 ++ src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 7 +++ 2 files changed, 13 insertions(+) Hey Paul, I think I implemented all of your feedback from last time. (Previously, I'd done this in gen8_vec4_generator, and you asked me to move it to the visitor. This patch does that - and it is indeed much nicer.) --Ken Looks good, thanks! Reviewed-by: Paul Berry stereotype...@gmail.com diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs.c b/src/mesa/drivers/dri/i965/brw_vec4_gs.c index 018b0b6..c40112b 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_gs.c +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs.c @@ -194,6 +194,12 @@ do_gs_prog(struct brw_context *brw, c.prog_data.output_vertex_size_hwords * 32 * gp-program.VerticesOut; output_size_bytes += 32 * c.prog_data.control_data_header_size_hwords; + /* Broadwell stores Vertex Count as a full 8 DWord (32 byte) URB output, +* which comes before the control header. +*/ + if (brw-gen = 8) + output_size_bytes += 32; + assert(output_size_bytes = 1); if (output_size_bytes GEN7_MAX_GS_URB_ENTRY_SIZE_BYTES) return false; diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp index f33c80d..f0351ea 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp @@ -267,6 +267,13 @@ vec4_gs_visitor::emit_urb_write_opcode(bool complete) vec4_instruction *inst = emit(GS_OPCODE_URB_WRITE); inst-offset = c-prog_data.control_data_header_size_hwords; + + /* We need to increment Global Offset by 1 to make room for Broadwell's +* extra Vertex Count payload at the beginning of the URB entry. +*/ + if (brw-gen = 8) + inst-offset++; + inst-urb_write_flags = BRW_URB_WRITE_PER_SLOT_OFFSET; return inst; } -- 1.8.5.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [wip 9/9] mesa: OES_get_program_binary extension functionality
On 16 January 2014 05:37, Tapani Pälli tapani.pa...@intel.com wrote: On 01/15/2014 06:13 PM, Paul Berry wrote: On 2 January 2014 03:58, Tapani Pälli tapani.pa...@intel.com wrote: Signed-off-by: Tapani Pälli tapani.pa...@intel.com --- src/mesa/main/shaderapi.c | 44 ++-- 1 file changed, 38 insertions(+), 6 deletions(-) + char *data = mesa_program_serialize(shProg, size); + + /* we have more data that can fit to user given buffer */ + if (size bufSize) { + _mesa_error(ctx, GL_INVALID_OPERATION, __FUNCTION__); + if (data) + free(data); Why would we ever expect mesa_program_serialize to set size to a nonzero value but return NULL? It seems like this could only happen if there's a bug, in which case this really ought to be assert(data !=NULL); Also, it's safe to call free() on a NULL pointer. According to the C standard, freeing a NULL pointer does nothing. sure, will fix + return; + } + + if (data) { Similarly, this if-statement is unnecessary. it is required for memcpy but not for free Oops, you're right. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [wip 8/9] glsl: functions to serialize gl_shader and gl_shader_program
On 15 January 2014 02:46, Tapani tapani.pa...@intel.com wrote: On 01/14/2014 07:53 PM, Paul Berry wrote: On 2 January 2014 03:58, Tapani Pälli tapani.pa...@intel.com wrote: +static void +serialize_uniform_storage(gl_uniform_storage *uni, memory_writer blob) I don't think this is right. The ARB_get_program_binary spec says: 8. How does ProgramBinary interact with uniform values, including shader-specified defaults? RESOLVED: All uniforms specified in the binary are reset to their shader- specified defaults, or to zero if they aren't specified, when the program binary is loaded. The spec language has been updated to specify this behavior. The OES_get_program_binary spec doesn't mention uniforms at all, but I believe this is not intended to indicate a behavioural difference--it's just a consequence of the fact that ARB_get_program_binary was written later, so they had a chance to clarify things. In fact, issue #7 in ARB_get_program_binary specifically says: 7. How does this spec differ from the OES_get_program_binary and why? ... There are other areas where language was clarified, but this is meant to be consistent with the intent of the original specification and not to alter previously established behavior. So I believe we shouldn't be serializing uniform values. For me this seemed much easier way to serialize than recreate it though. Would it be enough if I reset the default values in place? Yes, I think that would be reasonable. Thanks! ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [wip 1/9] glsl: memory_writer helper class for data serialization
On 14 January 2014 02:35, Tapani Pälli tapani.pa...@intel.com wrote: On 01/13/2014 08:27 PM, Paul Berry wrote: On 2 January 2014 03:58, Tapani Pälli tapani.pa...@intel.com wrote: Class will be used by the shader binary cache implementation. Signed-off-by: Tapani Pälli tapani.pa...@intel.com --- src/glsl/memory_writer.h | 147 +++ 1 file changed, 147 insertions(+) create mode 100644 src/glsl/memory_writer.h diff --git a/src/glsl/memory_writer.h b/src/glsl/memory_writer.h new file mode 100644 index 000..a6c6b55 --- /dev/null +++ b/src/glsl/memory_writer.h @@ -0,0 +1,147 @@ +/* -*- c++ -*- */ +/* + * Copyright © 2013 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER + * DEALINGS IN THE SOFTWARE. + */ + +#pragma once +#ifndef MEMORY_WRITER_H +#define MEMORY_WRITER_H + +#include stdlib.h +#include unistd.h +#include string.h + +#ifdef __cplusplus +/** + * Helper class for writing data to memory + * + * This class maintains a dynamically-sized memory buffer and allows + * for data to be efficiently appended to it with automatic resizing. + */ +class memory_writer +{ +public: + memory_writer() : + memory(NULL), + curr_size(0), + pos(0) {} + + ~memory_writer() + { + free(memory); + } + + /* user wants to claim the memory */ + char *release_memory(size_t *size) + { + /* final realloc to free allocated but unused memory */ + char *result = (char *) realloc(memory, pos); + *size = pos; + memory = NULL; + curr_size = 0; + pos = 0; + return result; + } + +/** + * write functions per type + */ +#define DECL_WRITER(type) int write_ ##type (const type data) {\ + return write(data, sizeof(type));\ +} + + DECL_WRITER(int32_t); + DECL_WRITER(int64_t); + DECL_WRITER(uint8_t); + DECL_WRITER(uint32_t); + + int write_bool(bool data) + { + uint8_t val = data; + return write_uint8_t(val); + } + + /* write function that reallocates more memory if required */ + int write(const void *data, int32_t size) + { + if (!memory || pos (int32_t)(curr_size - size)) + if (grow(size)) +return -1; + + memcpy(memory + pos, data, size); + + pos += size; + return 0; + } + + int overwrite(const void *data, int32_t size, int32_t offset) + { + if (offset 0 || offset + size pos) + return -1; + memcpy(memory + offset, data, size); + return 0; + } + + int write_string(const char *str) + { + if (!str) + return -1; + char terminator = '\0'; + write(str, strlen(str)); + write(terminator, 1); C strings include a terminator, so there's no reason to write out the string contents and the terminator separtely. You can just do: write(str, strlen(str) + 1); Also, don't forget to propagate the return code to the caller: return write(str, strlen(str) + 1); Ah right, will do. It could be that this needs to be modified back to how it was though (to write to length of the string too). I think it is otherwise impossible for the reader side to know if terminator really exists and it might continue to read forever (or call strlen for unterminated string). I don't know of any C library function that does exactly what you want. Of course you could always code up your own loop. But I think I agree that storing the length first is the best approach. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [wip 6/9] glsl: ir_deserializer class for the binary shader cache
On 14 January 2014 03:35, Tapani Pälli tapani.pa...@intel.com wrote: On 01/14/2014 01:24 AM, Paul Berry wrote: On 2 January 2014 03:58, Tapani Pälli tapani.pa...@intel.com wrote: + var-state_slots = NULL; + + if (var-num_state_slots 0) { + var-state_slots = ralloc_array(var, ir_state_slot, + var-num_state_slots); Security issue: if the shader blob is corrupted, num_state_slots could be outrageously large, causing this memory allocation to fail (which would then cause the loop below to perform illegal memory accesses). We should validate that num_state_slots is within a reasonable range, and if it isn't, abort reading the binary blob. What is the expected reasonable range for state_slots? I believe the uniform with the most state slots is gl_LightSource, which has 96. What I'd suggest doing (in a separate patch) is: add a #define MAX_NUM_STATE_SLOTS 96 somewhere, and then in builtin_variable_generator::add_uniform(), right after setting uni-num_state_slots, add: assert(uni-num_state_slots = MAX_NUM_STATE_SLOTS); That way, in the unlikely event that we ever add a GL feature that requires more than this number of state slots, the assertion failure will alert us that we need to increase MAX_NUM_STATE_SLOTS. + /* fill parse state from shader header information */ + switch (shader-Type) { + case GL_VERTEX_SHADER: + state-target = MESA_SHADER_VERTEX; + break; + case GL_FRAGMENT_SHADER: + state-target = MESA_SHADER_FRAGMENT; + break; + case GL_GEOMETRY_SHADER_ARB: + state-target = MESA_SHADER_GEOMETRY; + break; There should be a default case here to handle corrupted blobs with an illegal type. I'm now doing _mesa_shader_enum_to_shader_stage(shader-Type) here which will assert if type is illegal. An assertion isn't good enough, because assertions don't get checked in release builds. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] i965: Add GS support to INTEL_DEBUG=shader_time.
Previously, time spent in geometry shaders would be counted as part of the vertex shader time. --- src/mesa/drivers/dri/i965/brw_context.h | 3 +++ src/mesa/drivers/dri/i965/brw_program.c | 10 +- src/mesa/drivers/dri/i965/brw_vec4.cpp| 6 +++--- src/mesa/drivers/dri/i965/brw_vec4.h | 9 - src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 3 ++- src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp| 10 -- src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp | 3 ++- src/mesa/drivers/dri/i965/test_vec4_register_coalesce.cpp | 3 ++- 8 files changed, 37 insertions(+), 10 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h index 9c51646..972eb9f 100644 --- a/src/mesa/drivers/dri/i965/brw_context.h +++ b/src/mesa/drivers/dri/i965/brw_context.h @@ -783,6 +783,9 @@ enum shader_time_shader_type { ST_VS, ST_VS_WRITTEN, ST_VS_RESET, + ST_GS, + ST_GS_WRITTEN, + ST_GS_RESET, ST_FS8, ST_FS8_WRITTEN, ST_FS8_RESET, diff --git a/src/mesa/drivers/dri/i965/brw_program.c b/src/mesa/drivers/dri/i965/brw_program.c index abfc921..a6a2403 100644 --- a/src/mesa/drivers/dri/i965/brw_program.c +++ b/src/mesa/drivers/dri/i965/brw_program.c @@ -288,7 +288,7 @@ get_written_and_reset(struct brw_context *brw, int i, uint64_t *written, uint64_t *reset) { enum shader_time_shader_type type = brw-shader_time.types[i]; - assert(type == ST_VS || type == ST_FS8 || type == ST_FS16); + assert(type == ST_VS || type == ST_GS || type == ST_FS8 || type == ST_FS16); /* Find where we recorded written and reset. */ int wi, ri; @@ -340,6 +340,8 @@ brw_report_shader_time(struct brw_context *brw) switch (type) { case ST_VS_WRITTEN: case ST_VS_RESET: + case ST_GS_WRITTEN: + case ST_GS_RESET: case ST_FS8_WRITTEN: case ST_FS8_RESET: case ST_FS16_WRITTEN: @@ -349,6 +351,7 @@ brw_report_shader_time(struct brw_context *brw) continue; case ST_VS: + case ST_GS: case ST_FS8: case ST_FS16: get_written_and_reset(brw, i, written, reset); @@ -372,6 +375,7 @@ brw_report_shader_time(struct brw_context *brw) switch (type) { case ST_VS: + case ST_GS: case ST_FS8: case ST_FS16: total_by_type[type] += scaled[i]; @@ -432,6 +436,9 @@ brw_report_shader_time(struct brw_context *brw) case ST_VS: stage = vs; break; + case ST_GS: + stage = gs; + break; case ST_FS8: stage = fs8; break; @@ -449,6 +456,7 @@ brw_report_shader_time(struct brw_context *brw) printf(\n); print_shader_time_line(total, vs, -1, total_by_type[ST_VS], total); + print_shader_time_line(total, gs, -1, total_by_type[ST_GS], total); print_shader_time_line(total, fs8, -1, total_by_type[ST_FS8], total); print_shader_time_line(total, fs16, -1, total_by_type[ST_FS16], total); } diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index 9d3735a..2e319e38 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp @@ -1494,10 +1494,10 @@ vec4_visitor::emit_shader_time_end() */ emit(ADD(diff, src_reg(diff), src_reg(-2u))); - emit_shader_time_write(ST_VS, src_reg(diff)); - emit_shader_time_write(ST_VS_WRITTEN, src_reg(1u)); + emit_shader_time_write(st_base, src_reg(diff)); + emit_shader_time_write(st_written, src_reg(1u)); emit(BRW_OPCODE_ELSE); - emit_shader_time_write(ST_VS_RESET, src_reg(1u)); + emit_shader_time_write(st_reset, src_reg(1u)); emit(BRW_OPCODE_ENDIF); } diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h index d4029d8..f2ecb2b 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.h +++ b/src/mesa/drivers/dri/i965/brw_vec4.h @@ -231,7 +231,10 @@ public: struct brw_shader *shader, void *mem_ctx, bool debug_flag, -bool no_spills); +bool no_spills, +shader_time_shader_type st_base, +shader_time_shader_type st_written, +shader_time_shader_type st_reset); ~vec4_visitor(); dst_reg dst_null_f() @@ -547,6 +550,10 @@ private: * If true, then register allocation should fail instead of spilling. */ const bool no_spills; + + const shader_time_shader_type st_base; + const shader_time_shader_type st_written; + const shader_time_shader_type st_reset; }; diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp index 5d5b169..8d8db5e 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp @@ -41,7 +41,8 @@
Re: [Mesa-dev] [wip 9/9] mesa: OES_get_program_binary extension functionality
On 2 January 2014 03:58, Tapani Pälli tapani.pa...@intel.com wrote: Signed-off-by: Tapani Pälli tapani.pa...@intel.com --- src/mesa/main/shaderapi.c | 44 ++-- 1 file changed, 38 insertions(+), 6 deletions(-) diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c index 57511e8..c07b226 100644 --- a/src/mesa/main/shaderapi.c +++ b/src/mesa/main/shaderapi.c @@ -57,6 +57,7 @@ #include ../glsl/ir.h #include ../glsl/ir_uniform.h #include ../glsl/program.h +#include ../glsl/shader_cache.h /** Define this to enable shader substitution (see below) */ #define SHADER_SUBST 0 @@ -1632,8 +1633,26 @@ _mesa_GetProgramBinary(GLuint program, GLsizei bufSize, GLsizei *length, if (length != NULL) *length = 0; - (void) binaryFormat; - (void) binary; + size_t size = 0; + char *data = mesa_program_serialize(shProg, size); + + /* we have more data that can fit to user given buffer */ + if (size bufSize) { + _mesa_error(ctx, GL_INVALID_OPERATION, __FUNCTION__); + if (data) + free(data); Why would we ever expect mesa_program_serialize to set size to a nonzero value but return NULL? It seems like this could only happen if there's a bug, in which case this really ought to be assert(data !=NULL); Also, it's safe to call free() on a NULL pointer. According to the C standard, freeing a NULL pointer does nothing. + return; + } + + if (data) { Similarly, this if-statement is unnecessary. + memcpy(binary, data, size); + free(data); + } + + if (length != NULL) + *length = size; + + *binaryFormat = 0; } void GLAPIENTRY @@ -1647,10 +1666,23 @@ _mesa_ProgramBinary(GLuint program, GLenum binaryFormat, if (!shProg) return; - (void) binaryFormat; - (void) binary; - (void) length; - _mesa_error(ctx, GL_INVALID_OPERATION, __FUNCTION__); + if (length = 0) + return; In the case of an invalid length, we need to make sure to set the program's LinkStatus to false. Also, the information log needs to be cleared, in accordance with this text from OES_get_program_binary: If ProgramBinaryOES failed, any information about a previous link or load of that program object is lost. Thus, a failed load does not restore the old state of program. + + /* free possible existing data and initialize structure */ + _mesa_free_shader_program_data(ctx, shProg); + _mesa_init_shader_program(ctx, shProg); + + /* fill structure from a binary blob */ + if (mesa_program_deserialize(shProg, binary, length)) { + _mesa_error(ctx, GL_INVALID_VALUE, glProgramBinary(binary incompatible)); This seems wrong to me. From the OES_get_program_binary spec: ... binaryFormat and binary must be those returned by a previous call to GetProgramBinaryOES, and length must be the length of the program binary as returned by GetProgramBinaryOES or GetProgramiv with pname PROGRAM_BINARY_LENGTH_OES. The program binary will fail to load if these conditions are not met. ... A program object's program binary is replaced by calls to LinkProgram or ProgramBinaryOES. Either command sets the program object's LINK_STATUS to TRUE or FALSE, as queried with GetProgramiv, to reflect success or failure. Either command also updates its information log, queried with GetProgramInfoLog, to provide details about warnings or errors. I believe this means that if deserialization fails, it should not be a GL error. It should simply be treated as a link failure. + return; + } + + /* driver specific link, optimizations and what not */ + ctx-Driver.LinkShader(ctx, shProg); Now I'm confused. I thought a major part of the purpose of this extension was that it would store the post-link program, so that not only does glProgramBinary() avoid the runtime penalty of parsing and compiling, it also avoids the runtime penalty of link-time optimizations. Calling ctx-Driver.LinkShader seems like it defeats that purpose. It seems like what we ought to be doing is store the data that ctx-Driver.LinkShader *produces* in the binary blob, so that once it's loaded there's no further linking necessary. If there is a small amount of driver-specific hook necessary, that should be placed in a new ctx-Driver function rather than incurring the overhead of another link. + + _mesa_ValidateProgram(program); I don't think this is correct. ValidateProgram() doesn't mean check that the program is well-formed. It means check whether the program can execute given the current GL state. A lot of GL apps compile all their shaders during startup, long before they've established the correct GL state for running those programs. It's reasonable to assume that apps using this extension will make their calls to glProgramBinary() during startup as well. So calling ValidateProgram() from glProgramBinary() will
Re: [Mesa-dev] [wip 0/9] GL_OES_get_program_binary extension
On 2 January 2014 03:58, Tapani Pälli tapani.pa...@intel.com wrote: Hi; Here's another take on the get_program_binary extension. I've rewritten big chunks of it based on Paul's review and comments earlier. Here's a brief list out of my head of things changed: - uses mesa compilation time as verification method for cache, not git sha - much smaller serialization code size, now as part of IR classes - smaller binary blob size - does not dump separate 'prototypes block' for functions but instead deserialization iterates over the whole blob for functions first - hashtables written faster than previously (iterated only once) - no 'unique_id' for ir_variables but using the address as identifier - patch set broken down to more individual patches/changes Mesa branch with these patches applied on top: http://cgit.freedesktop.org/~tpalli/mesa/log/?h=oes_get_program_binary I have plans for the addition of driver backend data but I was hoping this to be as a separate addition on top. Basically I extend gl_shader_program struct and offer API for drivers to dump their binary data there, some proto here: http://cgit.freedesktop.org/~tpalli/mesa/log/?h=driver_interface (in order to work more data needed like the keys but this is just test) I have also 'automatic' cache implementation here that can be used to verify the cache functionality with any application (without extension): http://cgit.freedesktop.org/~tpalli/mesa/log/?h=new_serialization Any comments appreciated; Tapani Pälli (9): glsl: memory_writer helper class for data serialization glsl: serialize methods for IR instructions glsl: memory_map helper class for data deserialization glsl: add MESA_SHADER_CACHE_MAGIC string for shader binary cache glsl: export populate_symbol_table function glsl: ir_deserializer class for the binary shader cache mesa: iterate method for string_to_uint_map glsl: functions to serialize gl_shader and gl_shader_program mesa: OES_get_program_binary extension functionality I sent comments on all patches but 4, 5, and 7. Those patches are: Reviewed-by: Paul Berry stereotype...@gmail.com I see that you've already responded to some of my comments. I'm on a business trip this week so I have limited time to answer email, but I'll try to respond to your responses as soon as I can. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [wip 8/9] glsl: functions to serialize gl_shader and gl_shader_program
On 2 January 2014 03:58, Tapani Pälli tapani.pa...@intel.com wrote: diff --git a/src/glsl/shader_cache.cpp b/src/glsl/shader_cache.cpp new file mode 100644 index 000..4b5de45 --- /dev/null +++ b/src/glsl/shader_cache.cpp @@ -0,0 +1,489 @@ +/* -*- c++ -*- */ +/* + * Copyright © 2013 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER + * DEALINGS IN THE SOFTWARE. + */ + +#include main/shaderobj.h +#include main/uniforms.h +#include main/macros.h +#include program/hash_table.h +#include ir_serialize.h +#include ir_deserializer.h +#include shader_cache_magic.h + +/** + * It is currently unknown if we need these to be available. + * There can be calls in mesa/main (like glGetAttachedShaders) that use + * the Shaders list. + */ +const bool STORE_UNLINKED_SHADERS = false; I think it really exaggerates our level of uncertainty to say that it is currently unknown whether we need unlinked shaders to be available. Speaking for myself at least, I'm quite convinced that we don't. AFAICT, OES_get_program_binary was purposefully architected so taht we don't need to store the unlinked shaders. GL has always maintained two independent collections of state about any given program: pre-link state and post-link state. There's a one-way flow of information from the pre-link state to the post-link state--once you've linked a program using glLinkShader(), there's no way to get the unlinked information back. This isn't a problem for most users of GL because most people don't try to modify the pre-link state after linking the program. However it's perfectly permissible for someone to create a program, attach shaders, link, and then detatch all the shaders. Since detatching the shaders affects pre-link state, the program would still work after this point (since the post-link information would not have changed since the link) but it would be impossible to query information about the individual shaders anymore. A similar situation exists when the client uses OES_get_attached_shaders. glProgramBinaryOES() bypasses the pre-link state completely, and drops the compiled binary directly into the post-link state. From the OES_get_program_binary spec: Note that ProgramBinaryOES disregards any shader objects attached to the program object, as these shader objects are used only by LinkProgram. And later: 7. Can BindAttribLocation be called after ProgramBinaryOES to remap an attribute location used by the program binary? RESOLVED: No. BindAttribLocation only affects the result of a subsequent call to LinkProgram. LinkProgram operates on the attached shader objects and replaces any program binary loaded prior to LinkProgram. So there is no mechanism to remap an attribute location after loading a program binary. However, an application is free to remap an attribute location prior to retrieving the program binary. By calling BindAttribLocation followed by LinkProgram, an application can remap the attribute location. If this is followed by a call to GetProgramBinaryOES, the retrieved program binary will include the desired attribute location assignment. So if the user creates a program and calls glProgramBinaryOES() followed by glGetAttachedShaders(), they will see no shaders, since the pre-link state is still in its initial configuration of having no shaders attached. IMHO, trying to plan for the contingency case where we're wrong about this is just going to lead to confusion and make the code hard to maintain. I think we should drop this const, and remove the code that would have been executed if STORE_UNLINKED_SHADERS were true. In the unlikely event that it turns out we were wrong about this (or Khronos makes a change to the spec that makes it necessary to store unlinked shaders) we can always change the code in a future patch. Also, incidentally, the way you've
Re: [Mesa-dev] [wip 1/9] glsl: memory_writer helper class for data serialization
On 2 January 2014 03:58, Tapani Pälli tapani.pa...@intel.com wrote: Class will be used by the shader binary cache implementation. Signed-off-by: Tapani Pälli tapani.pa...@intel.com --- src/glsl/memory_writer.h | 147 +++ 1 file changed, 147 insertions(+) create mode 100644 src/glsl/memory_writer.h diff --git a/src/glsl/memory_writer.h b/src/glsl/memory_writer.h new file mode 100644 index 000..a6c6b55 --- /dev/null +++ b/src/glsl/memory_writer.h @@ -0,0 +1,147 @@ +/* -*- c++ -*- */ +/* + * Copyright © 2013 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER + * DEALINGS IN THE SOFTWARE. + */ + +#pragma once +#ifndef MEMORY_WRITER_H +#define MEMORY_WRITER_H + +#include stdlib.h +#include unistd.h +#include string.h + +#ifdef __cplusplus +/** + * Helper class for writing data to memory + * + * This class maintains a dynamically-sized memory buffer and allows + * for data to be efficiently appended to it with automatic resizing. + */ +class memory_writer +{ +public: + memory_writer() : + memory(NULL), + curr_size(0), + pos(0) {} + + ~memory_writer() + { + free(memory); + } + + /* user wants to claim the memory */ + char *release_memory(size_t *size) + { + /* final realloc to free allocated but unused memory */ + char *result = (char *) realloc(memory, pos); + *size = pos; + memory = NULL; + curr_size = 0; + pos = 0; + return result; + } + +/** + * write functions per type + */ +#define DECL_WRITER(type) int write_ ##type (const type data) {\ + return write(data, sizeof(type));\ +} + + DECL_WRITER(int32_t); + DECL_WRITER(int64_t); + DECL_WRITER(uint8_t); + DECL_WRITER(uint32_t); + + int write_bool(bool data) + { + uint8_t val = data; + return write_uint8_t(val); + } + + /* write function that reallocates more memory if required */ + int write(const void *data, int32_t size) + { + if (!memory || pos (int32_t)(curr_size - size)) + if (grow(size)) +return -1; + + memcpy(memory + pos, data, size); + + pos += size; + return 0; + } + + int overwrite(const void *data, int32_t size, int32_t offset) + { + if (offset 0 || offset + size pos) + return -1; + memcpy(memory + offset, data, size); + return 0; + } + + int write_string(const char *str) + { + if (!str) + return -1; + char terminator = '\0'; + write(str, strlen(str)); + write(terminator, 1); C strings include a terminator, so there's no reason to write out the string contents and the terminator separtely. You can just do: write(str, strlen(str) + 1); Also, don't forget to propagate the return code to the caller: return write(str, strlen(str) + 1); + return 0; + } + + inline int32_t position() { return pos; } + + +private: + + /* reallocate more memory */ + int grow(int32_t size) + { + int32_t new_size = 2 * (curr_size + size); + char *more_mem = (char *) realloc(memory, new_size); + if (more_mem == NULL) { + free(memory); + memory = NULL; + return -1; + } else { + memory = more_mem; + curr_size = new_size; + return 0; + } + } + + /* allocated memory */ + char *memory; + + /* current size of the whole allocation */ + int32_t curr_size; + + /* write position / size of the data written */ + int32_t pos; +}; + +#endif /* ifdef __cplusplus */ + +#endif /* MEMORY_WRITER_H */ -- 1.8.3.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [wip 2/9] glsl: serialize methods for IR instructions
On 2 January 2014 03:58, Tapani Pälli tapani.pa...@intel.com wrote: diff --git a/src/glsl/ir_serialize.cpp b/src/glsl/ir_serialize.cpp new file mode 100644 index 000..30ca018 --- /dev/null +++ b/src/glsl/ir_serialize.cpp @@ -0,0 +1,392 @@ +/* -*- c++ -*- */ +/* + * Copyright © 2013 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER + * DEALINGS IN THE SOFTWARE. + */ + +#include ir_serialize.h + + +/** + * Wraps serialization of an ir instruction, writes ir_type + * and length of each instruction package as a header for it + */ +void +ir_instruction::serialize(memory_writer mem) +{ + uint32_t data_len = 0; + uint8_t ir_type = this-ir_type; + mem.write_uint8_t(ir_type); + + int32_t start_pos = mem.position(); + mem.write_uint32_t(data_len); + + this-serialize_data(mem); + + data_len = mem.position() - start_pos - sizeof(data_len); + mem.overwrite(data_len, sizeof(data_len), start_pos); This function isn't checking the return values from mem.write_*(), so there's no way for it to detect failure. Also, since this function returns void, there's no way for it to notify the caller of failure. A similar comment applies to all of the other serialize*() functions in this patch. (Of course, considering our previous discussion about potentially removing these int return values, this issue may be moot). +} + + + + +static void +serialize_glsl_type(const glsl_type *type, memory_writer mem) The last time I reviewed this series, I mentioned the idea of making a hashtable that maps each glsl_type to a small integer, so that we could serialize each type just once (see http://lists.freedesktop.org/archives/mesa-dev/2013-November/047740.html). At the time, it sounded like you liked that idea. Have you made that change? It looks to me like you've stopped serializing the built-in types, but user-defined types are still serialized each time they occur. With those two issues addressed, the patch is: Reviewed-by: Paul Berry stereotype...@gmail.com ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [wip 3/9] glsl: memory_map helper class for data deserialization
On 2 January 2014 03:58, Tapani Pälli tapani.pa...@intel.com wrote: Class will be used by the shader binary cache implementation. Signed-off-by: Tapani Pälli tapani.pa...@intel.com --- src/glsl/memory_map.h | 174 ++ 1 file changed, 174 insertions(+) create mode 100644 src/glsl/memory_map.h diff --git a/src/glsl/memory_map.h b/src/glsl/memory_map.h new file mode 100644 index 000..1b68b72 --- /dev/null +++ b/src/glsl/memory_map.h @@ -0,0 +1,174 @@ +/* -*- c++ -*- */ +/* + * Copyright © 2013 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER + * DEALINGS IN THE SOFTWARE. + */ + +#pragma once +#ifndef MEMORY_MAP_H +#define MEMORY_MAP_H + +#include fcntl.h +#include unistd.h +#include sys/mman.h +#include sys/stat.h + +#ifdef __cplusplus + +/** + * Helper class to read data + * + * Class can read either from user given memory or from a file. On Linux + * file reading wraps around the Posix functions for mapping a file into + * the process's address space. Other OS may need different implementation. + */ +class memory_map +{ +public: + memory_map() : + mode(memory_map::READ_MEM), + fd(0), + cache_size(0), + cache_mmap(NULL), + cache_mmap_p(NULL) + { + /* only used by read_string() */ + mem_ctx = ralloc_context(NULL); + } + + /* read from disk */ + int map(const char *path) + { + struct stat stat_info; + if (stat(path, stat_info) != 0) + return -1; As before, I'm not thrilled with the use of -1 to mean failure and 0 to mean success, because it forces the caller to use counterintuitive if statements. I'd prefer for map() to return a bool with true meaning success and false meaning failure. + + mode = memory_map::READ_MAP; + cache_size = stat_info.st_size; + + fd = open(path, O_RDONLY); + if (fd) { + cache_mmap_p = cache_mmap = (char *) +mmap(NULL, cache_size, PROT_READ, MAP_PRIVATE, fd, 0); + return (cache_mmap == MAP_FAILED) ? -1 : 0; MAP_FAILED is a nonzero value, so if this error condition ever occurs, the destructor will errneously try to call munmap(). What I'd recommend doing instead is: void *mmap_result = mmap(...); if (mmap_result == MAP_FAILED) { close(fd); return -1; } cache_mmap_p = cache_mmap = (char *) mmap_result; return 0; + } + return -1; + } + + /* read from memory */ + int map(const void *memory, size_t size) + { + cache_mmap_p = cache_mmap = (char *) memory; + cache_size = size; + return 0; + } IMHO, functions that cannot fail should return void. + + /* wrap a portion from another map */ + int map(memory_map map, size_t size) + { + cache_mmap_p = cache_mmap = map.cache_mmap_p; + cache_size = size; + map.ffwd(size); + return 0; + } + + ~memory_map() { + if (cache_mmap mode == READ_MAP) { + munmap(cache_mmap, cache_size); + close(fd); + } + ralloc_free(mem_ctx); + } + + /* move read pointer forward */ + inline void ffwd(int len) + { + cache_mmap_p += len; + } + + inline void jump(unsigned pos) + { + cache_mmap_p = cache_mmap + pos; + } + + + /* position of read pointer */ + inline uint32_t position() + { + return cache_mmap_p - cache_mmap; + } + + inline char *read_string() + { + char *str = ralloc_strdup(mem_ctx, cache_mmap_p); + ffwd(strlen(str)+1); + return str; This is problematic from a security perspective. If the client provides corrupted data that ends in a truncated string (lacking a null terminator) that could cause ralloc_strdup() to try to read beyond the end of the file. We need to make sure the code doesn't try to read beyond the end of file, even if
Re: [Mesa-dev] [wip 6/9] glsl: ir_deserializer class for the binary shader cache
On 2 January 2014 03:58, Tapani Pälli tapani.pa...@intel.com wrote: + + +/** + * Reads header part of the binary blob. Main purpose of this header is to + * validate that cached shader was produced with same Mesa driver version. + */ +int +ir_deserializer::read_header(struct gl_shader *shader) +{ + char *cache_magic_id = map-read_string(); + char *driver_vendor = map-read_string(); + char *driver_renderer = map-read_string(); + + /* only used or debug output, silence compiler warning */ + (void) driver_vendor; + (void) driver_renderer; A single version of Mesa potentially supports many different hardware types, and those different hardware types may define different values of GLSL built-in constants. They also may require core Mesa to do different sets of lowering passes during compilation. So we can't just ignore driver_vendor and driver_renderer. We need to reject the binary blob if they don't match. + + shader-Version = map-read_uint32_t(); + shader-Type = map-read_uint32_t(); + shader-IsES = map-read_uint8_t(); + + CACHE_DEBUG(%s: version %d, type 0x%x, %s (mesa %s)\n[%s %s]\n, + __func__, shader-Version, shader-Type, + (shader-IsES) ? glsl es : desktop glsl, + cache_magic_id, driver_vendor, driver_renderer); + + const char *magic = mesa_get_shader_cache_magic(); + + if (memcmp(cache_magic_id, magic, strlen(magic))) + return DIFFERENT_MESA_VER; If cache_magic_id is foobar and magic is foo, this will erroneusly consider them equal. The correct way to do this is to use strcmp(). + + /* post-link data */ + shader-num_samplers = map-read_uint32_t(); + shader-active_samplers = map-read_uint32_t(); + shader-shadow_samplers = map-read_uint32_t(); + shader-num_uniform_components = map-read_uint32_t(); + shader-num_combined_uniform_components = map-read_uint32_t(); + shader-uses_builtin_functions = map-read_uint8_t(); + + map-read(shader-Geom, sizeof(shader-Geom)); + + for (unsigned i = 0; i MAX_SAMPLERS; i++) + shader-SamplerUnits[i] = map-read_uint8_t(); + + for (unsigned i = 0; i MAX_SAMPLERS; i++) + shader-SamplerTargets[i] = (gl_texture_index) map-read_int32_t(); + + return 0; +} + + +const glsl_type * +ir_deserializer::read_glsl_type() +{ + char *name = map-read_string(); + uint32_t type_size = map-read_uint32_t(); + + const glsl_type *existing_type = + state-symbols-get_type(name); + + /* if type exists, move read pointer forward and return type */ + if (existing_type) { + map-ffwd(type_size); + return existing_type; + } + + uint8_t base_type = map-read_uint8_t(); + uint32_t length = map-read_uint32_t(); + uint8_t vector_elms = map-read_uint8_t(); + uint8_t matrix_cols = map-read_uint8_t(); + uint8_t interface_packing = map-read_uint8_t(); + + /* array type has additional element_type information */ + if (base_type == GLSL_TYPE_ARRAY) { + const glsl_type *element_type = read_glsl_type(); + if (!element_type) { + CACHE_DEBUG(error reading array element type\n); + return NULL; + } + return glsl_type::get_array_instance(element_type, length); + } + + /* structures have fields containing of names and types */ + else if (base_type == GLSL_TYPE_STRUCT || + base_type == GLSL_TYPE_INTERFACE) { + glsl_struct_field *fields = ralloc_array(mem_ctx, + glsl_struct_field, length); + + if (!fields) + return glsl_type::error_type; + + for (unsigned k = 0; k length; k++) { + uint8_t row_major, interpolation, centroid; + int32_t location; + char *field_name = map-read_string(); + fields[k].name = _mesa_strdup(field_name); + fields[k].type = read_glsl_type(); + row_major = map-read_uint8_t(); + location = map-read_int32_t(); + interpolation = map-read_uint8_t(); + centroid = map-read_uint8_t(); + fields[k].row_major = row_major; + fields[k].location = location; + fields[k].interpolation = interpolation; + fields[k].centroid = centroid; Another security issue: if the binary blob is corrupted, length may be outrageously large (e.g. 0x). We need a way for this loop to bail out and exit if it tries to read past the end of the binary blob. + } + + const glsl_type *ret_type = NULL; + + if (base_type == GLSL_TYPE_STRUCT) + ret_type = glsl_type::get_record_instance(fields, length, name); + else if (base_type == GLSL_TYPE_INTERFACE) + ret_type = glsl_type::get_interface_instance(fields, +length, (glsl_interface_packing) interface_packing, name); + + /* free allocated memory */ + for (unsigned k = 0; k length; k++) + free((void *)fields[k].name); + ralloc_free(fields); + + return ret_type;
Re: [Mesa-dev] [PATCH 05/30] glsl/linker: Refactor in preparation for adding more shader stages.
On 9 January 2014 22:17, Chris Forbes chr...@ijw.co.nz wrote: This is a nice cleanup; I like that this brings both writes to prog-LastClipDistanceArraySize together -- but it looks like the behavior changes slightly. Previously, if there was no VS and no GS, then we would never write prog-LastClipDistanceArraySize. Now we'll read an old junk value (potentially from a previous linking of the same program object with different shaders attached) from prog-Vert.ClipDistanceArraySize (since we never called validate_vertex_shader_executable) -- but we'll never end up actually using it, since it's only used for transform feedback of gl_ClipDistance. Yeah, that's a good point. I agree that it's completely benign, but I think I could have made things clearer. I've changed it to this: if (num_shaders[MESA_SHADER_GEOMETRY] 0) prog-LastClipDistanceArraySize = prog-Geom.ClipDistanceArraySize; else if (num_shaders[MESA_SHADER_VERTEX] 0) prog-LastClipDistanceArraySize = prog-Vert.ClipDistanceArraySize; else prog-LastClipDistanceArraySize = 0; /* Not used */ ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 11/30] mesa: Change redundant code into loops in shaderapi.c.
On 9 January 2014 20:03, Chris Forbes chr...@ijw.co.nz wrote: This is a slightly odd construction (although copied from the existing code): + if ((shProg == NULL) || (shProg-_LinkedShaders[stage] == NULL)) +shProg = NULL; You're right. This would be much better as: if ((shProg != NULL) (shProg-_LinkedShaders[stage] == NULL)) shProg = NULL; I'll update it in a follow-up patch. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 19/30] mesa/cs: Handle compute shaders in _mesa_use_program().
On 9 January 2014 22:32, Chris Forbes chr...@ijw.co.nz wrote: Minor nit, but could CS be done after the ordered pipeline stages, for consistency? Sure, no problem. In fact, I just realized that if I change the type parameter of use_shader_program to gl_shader_stage, I can call it in a loop. I'll do that in a follow-up patch. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 12/30] mesa: Change redundant code into loops in texstate.c.
On 10 January 2014 11:41, Kenneth Graunke kenn...@whitecape.org wrote: On 01/09/2014 06:19 PM, Paul Berry wrote: This is possible now that ctx-Shader.CurrentProgram is an array. --- src/mesa/main/texstate.c | 75 +++- 1 file changed, 29 insertions(+), 46 deletions(-) diff --git a/src/mesa/main/texstate.c b/src/mesa/main/texstate.c index b9c76da..905a9d5 100644 --- a/src/mesa/main/texstate.c +++ b/src/mesa/main/texstate.c @@ -526,27 +526,20 @@ static void update_texture_state( struct gl_context *ctx ) { GLuint unit; - struct gl_program *fprog = NULL; - struct gl_program *vprog = NULL; - struct gl_program *gprog = NULL; + struct gl_program *prog[MESA_SHADER_STAGES]; GLbitfield enabledFragUnits = 0x0; - - if (ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX] - ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX]-LinkStatus) { - vprog = ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX]-_LinkedShaders[MESA_SHADER_VERTEX]-Program; - } - - if (ctx-Shader.CurrentProgram[MESA_SHADER_GEOMETRY] - ctx-Shader.CurrentProgram[MESA_SHADER_GEOMETRY]-LinkStatus) { - gprog = ctx-Shader.CurrentProgram[MESA_SHADER_GEOMETRY]-_LinkedShaders[MESA_SHADER_GEOMETRY]-Program; - } - - if (ctx-Shader.CurrentProgram[MESA_SHADER_FRAGMENT] - ctx-Shader.CurrentProgram[MESA_SHADER_FRAGMENT]-LinkStatus) { - fprog = ctx-Shader.CurrentProgram[MESA_SHADER_FRAGMENT]-_LinkedShaders[MESA_SHADER_FRAGMENT]-Program; - } - else if (ctx-FragmentProgram._Enabled) { - fprog = ctx-FragmentProgram.Current-Base; + int i; + + for (i = 0; i MESA_SHADER_STAGES; i++) { + if (ctx-Shader.CurrentProgram[i] + ctx-Shader.CurrentProgram[i]-LinkStatus) { + prog[i] = ctx-Shader.CurrentProgram[i]-_LinkedShaders[i]-Program; + } else { + if (i == MESA_SHADER_FRAGMENT ctx-FragmentProgram._Enabled) +prog[i] = ctx-FragmentProgram.Current-Base; + else +prog[i] = NULL; + } } /* TODO: only set this if there are actual changes */ @@ -562,9 +555,7 @@ update_texture_state( struct gl_context *ctx ) */ for (unit = 0; unit ctx-Const.MaxCombinedTextureImageUnits; unit++) { struct gl_texture_unit *texUnit = ctx-Texture.Unit[unit]; - GLbitfield enabledVertTargets = 0x0; - GLbitfield enabledFragTargets = 0x0; - GLbitfield enabledGeomTargets = 0x0; + GLbitfield enabledTargetsByStage[MESA_SHADER_STAGES]; GLbitfield enabledTargets = 0x0; GLuint texIndex; @@ -574,25 +565,16 @@ update_texture_state( struct gl_context *ctx ) * by a fragment program/program. When multiple flags are set, we'll * settle on the one with highest priority (see below). */ - if (vprog) { - enabledVertTargets |= vprog-TexturesUsed[unit]; + for (i = 0; i MESA_SHADER_STAGES; i++) { + if (prog[i]) +enabledTargetsByStage[i] = prog[i]-TexturesUsed[unit]; + else if (i == MESA_SHADER_FRAGMENT) +enabledTargetsByStage[i] = texUnit-Enabled; + else +enabledTargetsByStage[i] = 0; + enabledTargets |= enabledTargetsByStage[i]; This doesn't look equivalent. Previously, enabled{Vert,Frag,Geom}Targets would accumulate bits over every iteration through units. Now, enabledTargetsByStage only reflects the most recent unit. enabledTargets still accumulates properly, but you use enabledTargetsByStage below, so I think it needs to as well. You just need to use |= in both places, like the old code. It's definitely possible that I got something wrong in this patch--I didn't completely grok what the code was doing and instead strived to just preserve the existing semantics. But I'm pretty sure I got the enabledTargetsByStage part right. Here's what the code used to look like--the enabled{Vert,Frag,Geom}Targets variables were all reset to 0 every time through the loop: for (unit = 0; unit ctx-Const.MaxCombinedTextureImageUnits; unit++) { struct gl_texture_unit *texUnit = ctx-Texture.Unit[unit]; GLbitfield enabledVertTargets = 0x0; GLbitfield enabledFragTargets = 0x0; GLbitfield enabledGeomTargets = 0x0; GLbitfield enabledTargets = 0x0; GLuint texIndex; /* Get the bitmask of texture target enables. * enableBits will be a mask of the TEXTURE_*_BIT flags indicating * which texture targets are enabled (fixed function) or referenced * by a fragment program/program. When multiple flags are set, we'll * settle on the one with highest priority (see below). */ if (vprog) { enabledVertTargets |= vprog-TexturesUsed[unit]; } if (gprog) { enabledGeomTargets |= gprog-TexturesUsed[unit]; } if (fprog
[Mesa-dev] [PATCH] i965: Ensure that all necessary state is re-emitted if we run out of aperture.
Prior to this patch, if we ran out of aperture space during brw_try_draw_prims(), we would rewind the batch buffer pointer (potentially throwing some state that may have been emitted by brw_upload_state()), flush the batch, and then try again. However, we wouldn't reset the dirty bits to the state they had before the call to brw_upload_state(). As a result, when we tried again, there was a danger that we wouldn't re-emit all the necessary state. (Note: prior to the introduction of hardware contexts, this wasn't a problem because flushing the batch forced all state to be re-emitted). This patch fixes the problem by leaving the dirty bits set at the end of brw_upload_state(); we only clear them after we have determined that we don't need to rewind the batch buffer. Cc: 10.0 9.2 mesa-sta...@lists.freedesktop.org --- src/mesa/drivers/dri/i965/brw_draw.c | 6 ++ src/mesa/drivers/dri/i965/brw_state.h| 1 + src/mesa/drivers/dri/i965/brw_state_upload.c | 14 ++ 3 files changed, 21 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c index b898cd3..ec56a21 100644 --- a/src/mesa/drivers/dri/i965/brw_draw.c +++ b/src/mesa/drivers/dri/i965/brw_draw.c @@ -499,6 +499,12 @@ retry: } } } + + /* Now that we know we haven't run out of aperture space, we can safely + * reset the dirty bits. + */ + if (brw-state.dirty.brw) + brw_clear_dirty_bits(brw); } if (brw-always_flush_batch) diff --git a/src/mesa/drivers/dri/i965/brw_state.h b/src/mesa/drivers/dri/i965/brw_state.h index a148125..e9428fb 100644 --- a/src/mesa/drivers/dri/i965/brw_state.h +++ b/src/mesa/drivers/dri/i965/brw_state.h @@ -143,6 +143,7 @@ brw_depthbuffer_format(struct brw_context *brw); * brw_state.c */ void brw_upload_state(struct brw_context *brw); +void brw_clear_dirty_bits(struct brw_context *brw); void brw_init_state(struct brw_context *brw); void brw_destroy_state(struct brw_context *brw); diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c b/src/mesa/drivers/dri/i965/brw_state_upload.c index 1eb3a79..7490df4 100644 --- a/src/mesa/drivers/dri/i965/brw_state_upload.c +++ b/src/mesa/drivers/dri/i965/brw_state_upload.c @@ -573,6 +573,20 @@ void brw_upload_state(struct brw_context *brw) fprintf(stderr, \n); } } +} + +/** + * Clear dirty bits to account for the fact that the state emitted by + * brw_upload_state() has been committed to the hardware. This is a separate + * call from brw_upload_state() because it's possible that after the call to + * brw_upload_state(), we will discover that we've run out of aperture space, + * and need to rewind the batch buffer to the state it had before the + * brw_upload_state() call. + */ +void +brw_clear_dirty_bits(struct brw_context *brw) +{ + struct brw_state_flags *state = brw-state.dirty; memset(state, 0, sizeof(*state)); } -- 1.8.5.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 5/5] mesa: Use functions to convert gl_shader_stage to PROGRAM enum or pipe target.
On 8 January 2014 12:21, Emil Velikov emil.l.veli...@gmail.com wrote: On 08/01/14 19:20, Paul Berry wrote: diff --git a/src/mesa/program/program.h b/src/mesa/program/program.h index 4015b4c..648233c 100644 --- a/src/mesa/program/program.h +++ b/src/mesa/program/program.h @@ -207,6 +207,24 @@ _mesa_program_enum_to_shader_stage(GLenum v) } } + +static inline GLenum +_mesa_shader_stage_to_program(gl_shader_stage stage) +{ + switch (stage) { + case MESA_SHADER_VERTEX: + return GL_VERTEX_PROGRAM_ARB; + case MESA_SHADER_FRAGMENT: + return GL_FRAGMENT_PROGRAM_ARB; + case MESA_SHADER_GEOMETRY: + return GL_GEOMETRY_PROGRAM_NV; + } + + ASSERT(0); Hi Paul Can you use a normal assert that prints a somewhat informative message - similar to what you did in shader_stage_to_ptarget()? Thanks Emil Sure, no problem. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 01/30] mesa: Replace _mesa_program_index_to_target with _mesa_shader_stage_to_program.
In my recent zeal to refactor Mesa's handling of the gl_shader_stage enum, I accidentally wound up with two functions that do the same thing: _mesa_program_index_to_target(), and _mesa_shader_stage_to_program(). This patch keeps _mesa_shader_stage_to_program(), since its name is more consistent with other related functions. However, it changes the signature so that it accepts an unsigned integer instead of a gl_shader_stage--this avoids awkward casts when the function is called from C++ code. --- src/mesa/drivers/dri/i965/brw_shader.cpp | 2 +- src/mesa/program/ir_to_mesa.cpp| 2 +- src/mesa/program/program.h | 19 +-- src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 2 +- 4 files changed, 4 insertions(+), 21 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp b/src/mesa/drivers/dri/i965/brw_shader.cpp index cf9ca4b..141f8a4 100644 --- a/src/mesa/drivers/dri/i965/brw_shader.cpp +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp @@ -127,7 +127,7 @@ brw_link_shader(struct gl_context *ctx, struct gl_shader_program *shProg) continue; struct gl_program *prog = -ctx-Driver.NewProgram(ctx, _mesa_program_index_to_target(stage), +ctx-Driver.NewProgram(ctx, _mesa_shader_stage_to_program(stage), shader-base.Name); if (!prog) return false; diff --git a/src/mesa/program/ir_to_mesa.cpp b/src/mesa/program/ir_to_mesa.cpp index f6c229c..af6f59f 100644 --- a/src/mesa/program/ir_to_mesa.cpp +++ b/src/mesa/program/ir_to_mesa.cpp @@ -3053,7 +3053,7 @@ _mesa_ir_link_shader(struct gl_context *ctx, struct gl_shader_program *prog) _mesa_reference_program(ctx, prog-_LinkedShaders[i]-Program, linked_prog); if (!ctx-Driver.ProgramStringNotify(ctx, - _mesa_program_index_to_target(i), + _mesa_shader_stage_to_program(i), linked_prog)) { return GL_FALSE; } diff --git a/src/mesa/program/program.h b/src/mesa/program/program.h index 0e350cd..f666e30 100644 --- a/src/mesa/program/program.h +++ b/src/mesa/program/program.h @@ -209,7 +209,7 @@ _mesa_program_enum_to_shader_stage(GLenum v) static inline GLenum -_mesa_shader_stage_to_program(gl_shader_stage stage) +_mesa_shader_stage_to_program(unsigned stage) { switch (stage) { case MESA_SHADER_VERTEX: @@ -225,23 +225,6 @@ _mesa_shader_stage_to_program(gl_shader_stage stage) } -static inline GLenum -_mesa_program_index_to_target(GLuint i) -{ - static const GLenum enums[] = { - GL_VERTEX_PROGRAM_ARB, - GL_GEOMETRY_PROGRAM_NV, - GL_FRAGMENT_PROGRAM_ARB - }; - STATIC_ASSERT(Elements(enums) == MESA_SHADER_STAGES); - if(i = MESA_SHADER_STAGES) { - assert(!Unexpected program index); - return 0; - } else - return enums[i]; -} - - /* Cast wrappers from gl_program to gl_vertex/geometry/fragment_program */ static inline struct gl_fragment_program * diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp index 73c39eb..a7dfa67 100644 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp @@ -5319,7 +5319,7 @@ st_link_shader(struct gl_context *ctx, struct gl_shader_program *prog) _mesa_reference_program(ctx, prog-_LinkedShaders[i]-Program, linked_prog); if (!ctx-Driver.ProgramStringNotify(ctx, - _mesa_program_index_to_target(i), + _mesa_shader_stage_to_program(i), linked_prog)) { _mesa_reference_program(ctx, prog-_LinkedShaders[i]-Program, NULL); -- 1.8.5.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 03/30] main: Allow ctx == NULL in _mesa_validate_shader_target().
This will allow this function to be used in circumstances where there is no context available, such as when building built-in GLSL functions. --- src/mesa/main/shaderapi.c | 13 ++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c index 716e659..2ab0a0c 100644 --- a/src/mesa/main/shaderapi.c +++ b/src/mesa/main/shaderapi.c @@ -174,13 +174,20 @@ _mesa_copy_string(GLchar *dst, GLsizei maxLength, bool _mesa_validate_shader_target(const struct gl_context *ctx, GLenum type) { + /* Note: when building built-in GLSL functions, this function may be +* invoked with ctx == NULL. In that case, we can only validate that it's +* a shader target we recognize, not that it's supported in the current +* context. But that's fine--we don't need any further validation than +* that when building built-in GLSL functions. +*/ + switch (type) { case GL_FRAGMENT_SHADER: - return ctx-Extensions.ARB_fragment_shader; + return ctx == NULL || ctx-Extensions.ARB_fragment_shader; case GL_VERTEX_SHADER: - return ctx-Extensions.ARB_vertex_shader; + return ctx == NULL || ctx-Extensions.ARB_vertex_shader; case GL_GEOMETRY_SHADER_ARB: - return _mesa_has_geometry_shaders(ctx); + return ctx == NULL || _mesa_has_geometry_shaders(ctx); default: return false; } -- 1.8.5.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 02/30] mesa: Make validate_shader_target() non-static.
--- src/mesa/main/shaderapi.c | 8 src/mesa/main/shaderapi.h | 3 +++ 2 files changed, 7 insertions(+), 4 deletions(-) diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c index 6042fa8..716e659 100644 --- a/src/mesa/main/shaderapi.c +++ b/src/mesa/main/shaderapi.c @@ -171,8 +171,8 @@ _mesa_copy_string(GLchar *dst, GLsizei maxLength, * \param type Shader target * */ -static bool -validate_shader_target(const struct gl_context *ctx, GLenum type) +bool +_mesa_validate_shader_target(const struct gl_context *ctx, GLenum type) { switch (type) { case GL_FRAGMENT_SHADER: @@ -273,7 +273,7 @@ create_shader(struct gl_context *ctx, GLenum type) struct gl_shader *sh; GLuint name; - if (!validate_shader_target(ctx, type)) { + if (!_mesa_validate_shader_target(ctx, type)) { _mesa_error(ctx, GL_INVALID_ENUM, CreateShader(type)); return 0; } @@ -1739,7 +1739,7 @@ _mesa_UseShaderProgramEXT(GLenum type, GLuint program) GET_CURRENT_CONTEXT(ctx); struct gl_shader_program *shProg = NULL; - if (!validate_shader_target(ctx, type)) { + if (!_mesa_validate_shader_target(ctx, type)) { _mesa_error(ctx, GL_INVALID_ENUM, glUseShaderProgramEXT(type)); return; } diff --git a/src/mesa/main/shaderapi.h b/src/mesa/main/shaderapi.h index 4822e32..10f810c 100644 --- a/src/mesa/main/shaderapi.h +++ b/src/mesa/main/shaderapi.h @@ -215,6 +215,9 @@ _mesa_copy_linked_program_data(gl_shader_stage type, const struct gl_shader_program *src, struct gl_program *dst); +extern bool +_mesa_validate_shader_target(const struct gl_context *ctx, GLenum type); + #ifdef __cplusplus } -- 1.8.5.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 10/30] mesa: Remove ad-hoc arrays of gl_shader_program.
Now that we have a ctx-Shader.CurrentProgram array, we can just use it directly. --- src/mesa/main/context.c | 6 +- src/mesa/state_tracker/st_draw.c| 6 +- src/mesa/state_tracker/st_program.c | 6 +- 3 files changed, 3 insertions(+), 15 deletions(-) diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c index 026d7aa..5855f15 100644 --- a/src/mesa/main/context.c +++ b/src/mesa/main/context.c @@ -1853,13 +1853,9 @@ _mesa_valid_to_render(struct gl_context *ctx, const char *where) #ifdef DEBUG if (ctx-Shader.Flags GLSL_LOG) { - struct gl_shader_program *shProg[MESA_SHADER_STAGES]; + struct gl_shader_program **shProg = ctx-Shader.CurrentProgram; gl_shader_stage i; - shProg[MESA_SHADER_VERTEX] = ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX]; - shProg[MESA_SHADER_GEOMETRY] = ctx-Shader.CurrentProgram[MESA_SHADER_GEOMETRY]; - shProg[MESA_SHADER_FRAGMENT] = ctx-Shader.CurrentProgram[MESA_SHADER_FRAGMENT]; - for (i = 0; i MESA_SHADER_STAGES; i++) { if (shProg[i] == NULL || shProg[i]-_Used || shProg[i]-_LinkedShaders[i] == NULL) diff --git a/src/mesa/state_tracker/st_draw.c b/src/mesa/state_tracker/st_draw.c index 75a71f1..85677c3 100644 --- a/src/mesa/state_tracker/st_draw.c +++ b/src/mesa/state_tracker/st_draw.c @@ -131,11 +131,7 @@ setup_index_buffer(struct st_context *st, static void check_uniforms(struct gl_context *ctx) { - struct gl_shader_program *shProg[3] = { - ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX], - ctx-Shader.CurrentProgram[MESA_SHADER_GEOMETRY], - ctx-Shader.CurrentProgram[MESA_SHADER_FRAGMENT], - }; + struct gl_shader_program **shProg = ctx-Shader.CurrentProgram; unsigned j; for (j = 0; j 3; j++) { diff --git a/src/mesa/state_tracker/st_program.c b/src/mesa/state_tracker/st_program.c index fe3dafe..ef2abdb 100644 --- a/src/mesa/state_tracker/st_program.c +++ b/src/mesa/state_tracker/st_program.c @@ -1195,11 +1195,7 @@ st_get_gp_variant(struct st_context *st, void st_print_shaders(struct gl_context *ctx) { - struct gl_shader_program *shProg[3] = { - ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX], - ctx-Shader.CurrentProgram[MESA_SHADER_GEOMETRY], - ctx-Shader.CurrentProgram[MESA_SHADER_FRAGMENT], - }; + struct gl_shader_program **shProg = ctx-Shader.CurrentProgram; unsigned j; for (j = 0; j 3; j++) { -- 1.8.5.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 08/30] i965: Fix comments to refer to the new ctx-Shader.CurrentProgram array.
--- src/mesa/drivers/dri/i965/brw_wm_state.c | 4 ++-- src/mesa/drivers/dri/i965/gen6_wm_state.c | 4 ++-- src/mesa/drivers/dri/i965/gen7_wm_state.c | 4 ++-- 3 files changed, 6 insertions(+), 6 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_wm_state.c b/src/mesa/drivers/dri/i965/brw_wm_state.c index 303a2eb..514dfb2 100644 --- a/src/mesa/drivers/dri/i965/brw_wm_state.c +++ b/src/mesa/drivers/dri/i965/brw_wm_state.c @@ -112,8 +112,8 @@ brw_upload_wm_unit(struct brw_context *brw) wm-thread1.depth_coef_urb_read_offset = 1; /* Use ALT floating point mode for ARB fragment programs, because they * require 0^0 == 1. Even though _CurrentFragmentProgram is used for -* rendering, CurrentFragmentProgram is used for this check to -* differentiate between the GLSL and non-GLSL cases. +* rendering, CurrentProgram[MESA_SHADER_FRAGMENT] is used for this check +* to differentiate between the GLSL and non-GLSL cases. */ if (ctx-Shader.CurrentProgram[MESA_SHADER_FRAGMENT] == NULL) wm-thread1.floating_point_mode = BRW_FLOATING_POINT_NON_IEEE_754; diff --git a/src/mesa/drivers/dri/i965/gen6_wm_state.c b/src/mesa/drivers/dri/i965/gen6_wm_state.c index 5188aa8..585c0c5 100644 --- a/src/mesa/drivers/dri/i965/gen6_wm_state.c +++ b/src/mesa/drivers/dri/i965/gen6_wm_state.c @@ -137,8 +137,8 @@ upload_wm_state(struct brw_context *brw) /* Use ALT floating point mode for ARB fragment programs, because they * require 0^0 == 1. Even though _CurrentFragmentProgram is used for -* rendering, CurrentFragmentProgram is used for this check to -* differentiate between the GLSL and non-GLSL cases. +* rendering, CurrentProgram[MESA_SHADER_FRAGMENT] is used for this check +* to differentiate between the GLSL and non-GLSL cases. */ if (ctx-Shader.CurrentProgram[MESA_SHADER_FRAGMENT] == NULL) dw2 |= GEN6_WM_FLOATING_POINT_MODE_ALT; diff --git a/src/mesa/drivers/dri/i965/gen7_wm_state.c b/src/mesa/drivers/dri/i965/gen7_wm_state.c index 7f2a50c..284f8b6 100644 --- a/src/mesa/drivers/dri/i965/gen7_wm_state.c +++ b/src/mesa/drivers/dri/i965/gen7_wm_state.c @@ -170,8 +170,8 @@ upload_ps_state(struct brw_context *brw) /* Use ALT floating point mode for ARB fragment programs, because they * require 0^0 == 1. Even though _CurrentFragmentProgram is used for -* rendering, CurrentFragmentProgram is used for this check to -* differentiate between the GLSL and non-GLSL cases. +* rendering, CurrentProgram[MESA_SHADER_FRAGMENT] is used for this check +* to differentiate between the GLSL and non-GLSL cases. */ /* BRW_NEW_FRAGMENT_PROGRAM */ if (ctx-Shader.CurrentProgram[MESA_SHADER_FRAGMENT] == NULL) -- 1.8.5.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 15/30] glsl/cs: Change some linker loops to use MESA_SHADER_FRAGMENT as a bound.
Linker loops that iterate through all the stages in the pipeline need to use MESA_SHADER_FRAGMENT as a bound, so that we can add an additional MESA_SHADER_COMPUTE stage, without it being erroneously included in the pipeline. --- src/glsl/linker.cpp | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp index f3fd66f..7461b17 100644 --- a/src/glsl/linker.cpp +++ b/src/glsl/linker.cpp @@ -2094,7 +2094,7 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog) unsigned prev; - for (prev = 0; prev MESA_SHADER_STAGES; prev++) { + for (prev = 0; prev = MESA_SHADER_FRAGMENT; prev++) { if (prog-_LinkedShaders[prev] != NULL) break; } @@ -2102,7 +2102,7 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog) /* Validate the inputs of each stage with the output of the preceding * stage. */ - for (unsigned i = prev + 1; i MESA_SHADER_STAGES; i++) { + for (unsigned i = prev + 1; i = MESA_SHADER_FRAGMENT; i++) { if (prog-_LinkedShaders[i] == NULL) continue; @@ -2197,7 +2197,7 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog) } unsigned first; - for (first = 0; first MESA_SHADER_STAGES; first++) { + for (first = 0; first = MESA_SHADER_FRAGMENT; first++) { if (prog-_LinkedShaders[first] != NULL) break; } @@ -2229,7 +2229,7 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog) * eliminated if they are (transitively) not used in a later stage. */ int last, next; - for (last = MESA_SHADER_STAGES-1; last = 0; last--) { + for (last = MESA_SHADER_FRAGMENT; last = 0; last--) { if (prog-_LinkedShaders[last] != NULL) break; } -- 1.8.5.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 05/30] glsl/linker: Refactor in preparation for adding more shader stages.
Rather than maintain separately named arrays and counts for vertex, geometry, and fragment shaders, just maintain these as arrays indexed by the gl_shader_type enum. --- src/glsl/linker.cpp | 114 ++-- 1 file changed, 39 insertions(+), 75 deletions(-) diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp index e820f0f..f3fd66f 100644 --- a/src/glsl/linker.cpp +++ b/src/glsl/linker.cpp @@ -1994,19 +1994,14 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog) /* Separate the shaders into groups based on their type. */ - struct gl_shader **vert_shader_list; - unsigned num_vert_shaders = 0; - struct gl_shader **frag_shader_list; - unsigned num_frag_shaders = 0; - struct gl_shader **geom_shader_list; - unsigned num_geom_shaders = 0; - - vert_shader_list = (struct gl_shader **) - calloc(prog-NumShaders, sizeof(struct gl_shader *)); - frag_shader_list = (struct gl_shader **) - calloc(prog-NumShaders, sizeof(struct gl_shader *)); - geom_shader_list = (struct gl_shader **) - calloc(prog-NumShaders, sizeof(struct gl_shader *)); + struct gl_shader **shader_list[MESA_SHADER_STAGES]; + unsigned num_shaders[MESA_SHADER_STAGES]; + + for (int i = 0; i MESA_SHADER_STAGES; i++) { + shader_list[i] = (struct gl_shader **) + calloc(prog-NumShaders, sizeof(struct gl_shader *)); + num_shaders[i] = 0; + } unsigned min_version = UINT_MAX; unsigned max_version = 0; @@ -2022,20 +2017,9 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog) goto done; } - switch (prog-Shaders[i]-Stage) { - case MESA_SHADER_VERTEX: -vert_shader_list[num_vert_shaders] = prog-Shaders[i]; -num_vert_shaders++; -break; - case MESA_SHADER_FRAGMENT: -frag_shader_list[num_frag_shaders] = prog-Shaders[i]; -num_frag_shaders++; -break; - case MESA_SHADER_GEOMETRY: -geom_shader_list[num_geom_shaders] = prog-Shaders[i]; -num_geom_shaders++; -break; - } + gl_shader_stage shader_type = prog-Shaders[i]-Stage; + shader_list[shader_type][num_shaders[shader_type]] = prog-Shaders[i]; + num_shaders[shader_type]++; } /* In desktop GLSL, different shader versions may be linked together. In @@ -2052,7 +2036,8 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog) /* Geometry shaders have to be linked with vertex shaders. */ - if (num_geom_shaders 0 num_vert_shaders == 0) { + if (num_shaders[MESA_SHADER_GEOMETRY] 0 + num_shaders[MESA_SHADER_VERTEX] == 0) { linker_error(prog, Geometry shader must be linked with vertex shader\n); goto done; @@ -2067,55 +2052,37 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog) /* Link all shaders for a particular stage and validate the result. */ - if (num_vert_shaders 0) { - gl_shader *const sh = -link_intrastage_shaders(mem_ctx, ctx, prog, vert_shader_list, -num_vert_shaders); - - if (!prog-LinkStatus) -goto done; - - validate_vertex_shader_executable(prog, sh); - if (!prog-LinkStatus) -goto done; - prog-LastClipDistanceArraySize = prog-Vert.ClipDistanceArraySize; + for (int stage = 0; stage MESA_SHADER_STAGES; stage++) { + if (num_shaders[stage] 0) { + gl_shader *const sh = +link_intrastage_shaders(mem_ctx, ctx, prog, shader_list[stage], +num_shaders[stage]); - _mesa_reference_shader(ctx, prog-_LinkedShaders[MESA_SHADER_VERTEX], -sh); - } - - if (num_frag_shaders 0) { - gl_shader *const sh = -link_intrastage_shaders(mem_ctx, ctx, prog, frag_shader_list, -num_frag_shaders); - - if (!prog-LinkStatus) -goto done; + if (!prog-LinkStatus) +goto done; - validate_fragment_shader_executable(prog, sh); - if (!prog-LinkStatus) -goto done; + switch (stage) { + case MESA_SHADER_VERTEX: +validate_vertex_shader_executable(prog, sh); +break; + case MESA_SHADER_GEOMETRY: +validate_geometry_shader_executable(prog, sh); +break; + case MESA_SHADER_FRAGMENT: +validate_fragment_shader_executable(prog, sh); +break; + } + if (!prog-LinkStatus) +goto done; - _mesa_reference_shader(ctx, prog-_LinkedShaders[MESA_SHADER_FRAGMENT], -sh); + _mesa_reference_shader(ctx, prog-_LinkedShaders[stage], sh); + } } - if (num_geom_shaders 0) { - gl_shader *const sh = -link_intrastage_shaders(mem_ctx, ctx, prog, geom_shader_list, -num_geom_shaders); - -
[Mesa-dev] [PATCH 07/30] mesa: Fold long lines introduced by the previous patch.
--- src/mesa/drivers/dri/i965/brw_gs_surface_state.c | 6 -- src/mesa/drivers/dri/i965/brw_vec4_gs.c | 5 +++-- src/mesa/drivers/dri/i965/brw_vs.c | 5 +++-- src/mesa/drivers/dri/i965/brw_vs_surface_state.c | 6 -- src/mesa/main/context.c | 6 -- src/mesa/main/state.c| 9 ++--- src/mesa/main/transformfeedback.c| 3 ++- src/mesa/state_tracker/st_atom_constbuf.c| 9 ++--- 8 files changed, 32 insertions(+), 17 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_gs_surface_state.c b/src/mesa/drivers/dri/i965/brw_gs_surface_state.c index 1ccf2e2..0795e56 100644 --- a/src/mesa/drivers/dri/i965/brw_gs_surface_state.c +++ b/src/mesa/drivers/dri/i965/brw_gs_surface_state.c @@ -69,7 +69,8 @@ brw_upload_gs_ubo_surfaces(struct brw_context *brw) struct gl_context *ctx = brw-ctx; /* _NEW_PROGRAM */ - struct gl_shader_program *prog = ctx-Shader.CurrentProgram[MESA_SHADER_GEOMETRY]; + struct gl_shader_program *prog = + ctx-Shader.CurrentProgram[MESA_SHADER_GEOMETRY]; if (!prog) return; @@ -93,7 +94,8 @@ brw_upload_gs_abo_surfaces(struct brw_context *brw) { struct gl_context *ctx = brw-ctx; /* _NEW_PROGRAM */ - struct gl_shader_program *prog = ctx-Shader.CurrentProgram[MESA_SHADER_GEOMETRY]; + struct gl_shader_program *prog = + ctx-Shader.CurrentProgram[MESA_SHADER_GEOMETRY]; if (prog) { /* CACHE_NEW_GS_PROG */ diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs.c b/src/mesa/drivers/dri/i965/brw_vec4_gs.c index 7a803cb..e2a4a38 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_gs.c +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs.c @@ -290,8 +290,9 @@ brw_upload_gs_prog(struct brw_context *brw) if (!brw_search_cache(brw-cache, BRW_GS_PROG, key, sizeof(key), stage_state-prog_offset, brw-gs.prog_data)) { - bool success = do_gs_prog(brw, ctx-Shader.CurrentProgram[MESA_SHADER_GEOMETRY], -gp, key); + bool success = + do_gs_prog(brw, ctx-Shader.CurrentProgram[MESA_SHADER_GEOMETRY], gp, +key); assert(success); } brw-gs.base.prog_data = brw-gs.prog_data-base.base; diff --git a/src/mesa/drivers/dri/i965/brw_vs.c b/src/mesa/drivers/dri/i965/brw_vs.c index 351a83e..971b2d1 100644 --- a/src/mesa/drivers/dri/i965/brw_vs.c +++ b/src/mesa/drivers/dri/i965/brw_vs.c @@ -481,8 +481,9 @@ static void brw_upload_vs_prog(struct brw_context *brw) if (!brw_search_cache(brw-cache, BRW_VS_PROG, key, sizeof(key), brw-vs.base.prog_offset, brw-vs.prog_data)) { - bool success = do_vs_prog(brw, ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX], - vp, key); + bool success = + do_vs_prog(brw, ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX], vp, +key); (void) success; assert(success); } diff --git a/src/mesa/drivers/dri/i965/brw_vs_surface_state.c b/src/mesa/drivers/dri/i965/brw_vs_surface_state.c index 2fbcdf9..e707d32 100644 --- a/src/mesa/drivers/dri/i965/brw_vs_surface_state.c +++ b/src/mesa/drivers/dri/i965/brw_vs_surface_state.c @@ -130,7 +130,8 @@ brw_upload_vs_ubo_surfaces(struct brw_context *brw) { struct gl_context *ctx = brw-ctx; /* _NEW_PROGRAM */ - struct gl_shader_program *prog = ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX]; + struct gl_shader_program *prog = + ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX]; if (!prog) return; @@ -154,7 +155,8 @@ brw_upload_vs_abo_surfaces(struct brw_context *brw) { struct gl_context *ctx = brw-ctx; /* _NEW_PROGRAM */ - struct gl_shader_program *prog = ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX]; + struct gl_shader_program *prog = + ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX]; if (prog) { /* CACHE_NEW_VS_PROG */ diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c index 8e978eb..026d7aa 100644 --- a/src/mesa/main/context.c +++ b/src/mesa/main/context.c @@ -1780,7 +1780,8 @@ _mesa_valid_to_render(struct gl_context *ctx, const char *where) ctx-Shader.CurrentProgram[MESA_SHADER_GEOMETRY], errMsg)) { _mesa_warning(ctx, Shader program %u is invalid: %s, - ctx-Shader.CurrentProgram[MESA_SHADER_GEOMETRY]-Name, errMsg); + ctx-Shader.CurrentProgram[MESA_SHADER_GEOMETRY]-Name, + errMsg); } } #endif @@ -1801,7 +1802,8 @@ _mesa_valid_to_render(struct gl_context *ctx, const char *where) ctx-Shader.CurrentProgram[MESA_SHADER_FRAGMENT], errMsg)) { _mesa_warning(ctx, Shader program %u is
[Mesa-dev] [PATCH 18/30] glsl/cs: update main.cpp to use the .comp extension for compute shaders.
--- src/glsl/main.cpp | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/glsl/main.cpp b/src/glsl/main.cpp index afc15cb..864c929 100644 --- a/src/glsl/main.cpp +++ b/src/glsl/main.cpp @@ -364,6 +364,8 @@ main(int argc, char **argv) shader-Type = GL_GEOMETRY_SHADER; else if (strncmp(.frag, ext, 5) == 0) shader-Type = GL_FRAGMENT_SHADER; + else if (strncmp(.comp, ext, 5) == 0) + shader-Type = GL_COMPUTE_SHADER; else usage_fail(argv[0]); shader-Stage = _mesa_shader_enum_to_shader_stage(shader-Type); -- 1.8.5.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 06/30] mesa: Replace ctx-Shader.Current{Vertex, Fragment, Geometry}Program with an array.
These are replaced with ctx-Shader.CurrentProgram[MESA_SHADER_{VERTEX,FRAGMENT,GEOMETRY}]. In patches to follow, this will allow us to replace a lot of ad-hoc logic with a variable index into the array. With the exception of the changes to mtypes.h, this patch was generated entirely by the command: find src -type f '(' -iname '*.c' -o -iname '*.cpp' ')' \ -print0 | xargs -0 sed -i \ -e 's/\.CurrentVertexProgram/.CurrentProgram[MESA_SHADER_VERTEX]/g' \ -e 's/\.CurrentGeometryProgram/.CurrentProgram[MESA_SHADER_GEOMETRY]/g' \ -e 's/\.CurrentFragmentProgram/.CurrentProgram[MESA_SHADER_FRAGMENT]/g' --- src/mesa/drivers/common/meta.c | 6 ++--- src/mesa/drivers/dri/i965/brw_gs.c | 2 +- src/mesa/drivers/dri/i965/brw_gs_surface_state.c | 4 ++-- src/mesa/drivers/dri/i965/brw_vec4_gs.c | 2 +- src/mesa/drivers/dri/i965/brw_vs.c | 4 ++-- src/mesa/drivers/dri/i965/brw_vs_state.c | 2 +- src/mesa/drivers/dri/i965/brw_vs_surface_state.c | 4 ++-- src/mesa/drivers/dri/i965/brw_wm_state.c | 2 +- src/mesa/drivers/dri/i965/gen6_sol.c | 6 ++--- src/mesa/drivers/dri/i965/gen6_vs_state.c| 2 +- src/mesa/drivers/dri/i965/gen6_wm_state.c| 2 +- src/mesa/drivers/dri/i965/gen7_sol_state.c | 4 ++-- src/mesa/drivers/dri/i965/gen7_vs_state.c| 2 +- src/mesa/drivers/dri/i965/gen7_wm_state.c| 2 +- src/mesa/main/api_validate.c | 10 src/mesa/main/context.c | 30 src/mesa/main/ff_fragment_shader.cpp | 8 +++ src/mesa/main/mtypes.h | 7 +++--- src/mesa/main/shaderapi.c| 12 +- src/mesa/main/state.c| 8 +++ src/mesa/main/texstate.c | 18 +++--- src/mesa/main/transformfeedback.c| 8 +++ src/mesa/state_tracker/st_atom_clip.c| 2 +- src/mesa/state_tracker/st_atom_constbuf.c| 6 ++--- src/mesa/state_tracker/st_cb_drawpixels.c| 2 +- src/mesa/state_tracker/st_draw.c | 6 ++--- src/mesa/state_tracker/st_program.c | 6 ++--- src/mesa/swrast/s_fragprog.c | 2 +- 28 files changed, 84 insertions(+), 85 deletions(-) diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c index 7b41876..5643e3c 100644 --- a/src/mesa/drivers/common/meta.c +++ b/src/mesa/drivers/common/meta.c @@ -618,11 +618,11 @@ _mesa_meta_begin(struct gl_context *ctx, GLbitfield state) } _mesa_reference_shader_program(ctx, save-VertexShader, - ctx-Shader.CurrentVertexProgram); + ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX]); _mesa_reference_shader_program(ctx, save-GeometryShader, - ctx-Shader.CurrentGeometryProgram); + ctx-Shader.CurrentProgram[MESA_SHADER_GEOMETRY]); _mesa_reference_shader_program(ctx, save-FragmentShader, - ctx-Shader.CurrentFragmentProgram); + ctx-Shader.CurrentProgram[MESA_SHADER_FRAGMENT]); _mesa_reference_shader_program(ctx, save-ActiveShader, ctx-Shader.ActiveProgram); diff --git a/src/mesa/drivers/dri/i965/brw_gs.c b/src/mesa/drivers/dri/i965/brw_gs.c index faa8f94..1ba26de 100644 --- a/src/mesa/drivers/dri/i965/brw_gs.c +++ b/src/mesa/drivers/dri/i965/brw_gs.c @@ -187,7 +187,7 @@ static void populate_key(struct brw_context *brw, /* BRW_NEW_TRANSFORM_FEEDBACK */ if (_mesa_is_xfb_active_and_unpaused(ctx)) { const struct gl_shader_program *shaderprog = -ctx-Shader.CurrentVertexProgram; +ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX]; const struct gl_transform_feedback_info *linked_xfb_info = shaderprog-LinkedTransformFeedback; int i; diff --git a/src/mesa/drivers/dri/i965/brw_gs_surface_state.c b/src/mesa/drivers/dri/i965/brw_gs_surface_state.c index 5661941..1ccf2e2 100644 --- a/src/mesa/drivers/dri/i965/brw_gs_surface_state.c +++ b/src/mesa/drivers/dri/i965/brw_gs_surface_state.c @@ -69,7 +69,7 @@ brw_upload_gs_ubo_surfaces(struct brw_context *brw) struct gl_context *ctx = brw-ctx; /* _NEW_PROGRAM */ - struct gl_shader_program *prog = ctx-Shader.CurrentGeometryProgram; + struct gl_shader_program *prog = ctx-Shader.CurrentProgram[MESA_SHADER_GEOMETRY]; if (!prog) return; @@ -93,7 +93,7 @@ brw_upload_gs_abo_surfaces(struct brw_context *brw) { struct gl_context *ctx = brw-ctx; /* _NEW_PROGRAM */ - struct gl_shader_program *prog = ctx-Shader.CurrentGeometryProgram; + struct gl_shader_program *prog =
[Mesa-dev] [PATCH 20/30] mesa/cs: Create the gl_compute_program struct, and the code to initialize it.
--- src/mesa/main/mtypes.h | 7 +++ src/mesa/program/program.c | 20 src/mesa/program/program.h | 5 + 3 files changed, 32 insertions(+) diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h index 8b88d75..e6c3a22 100644 --- a/src/mesa/main/mtypes.h +++ b/src/mesa/main/mtypes.h @@ -2167,6 +2167,13 @@ struct gl_fragment_program }; +/** Compute program object */ +struct gl_compute_program +{ + struct gl_program Base; /** base class */ +}; + + /** * State common to vertex and fragment programs. */ diff --git a/src/mesa/program/program.c b/src/mesa/program/program.c index 3c19e8c..d21bfa0 100644 --- a/src/mesa/program/program.c +++ b/src/mesa/program/program.c @@ -279,6 +279,21 @@ _mesa_init_vertex_program( struct gl_context *ctx, struct gl_vertex_program *pro /** + * Initialize a new compute program object. + */ +struct gl_program * +_mesa_init_compute_program(struct gl_context *ctx, + struct gl_compute_program *prog, GLenum target, + GLuint id) +{ + if (prog) + return _mesa_init_program_struct( ctx, prog-Base, target, id ); + else + return NULL; +} + + +/** * Initialize a new geometry program object. */ struct gl_program * @@ -324,6 +339,11 @@ _mesa_new_program(struct gl_context *ctx, GLenum target, GLuint id) CALLOC_STRUCT(gl_geometry_program), target, id); break; + case GL_COMPUTE_PROGRAM_NV: + prog = _mesa_init_compute_program(ctx, +CALLOC_STRUCT(gl_compute_program), +target, id); + break; default: _mesa_problem(ctx, bad target in _mesa_new_program); prog = NULL; diff --git a/src/mesa/program/program.h b/src/mesa/program/program.h index 84aa8cb..cab7d71 100644 --- a/src/mesa/program/program.h +++ b/src/mesa/program/program.h @@ -84,6 +84,11 @@ _mesa_init_geometry_program(struct gl_context *ctx, GLenum target, GLuint id); extern struct gl_program * +_mesa_init_compute_program(struct gl_context *ctx, + struct gl_compute_program *prog, + GLenum target, GLuint id); + +extern struct gl_program * _mesa_new_program(struct gl_context *ctx, GLenum target, GLuint id); extern void -- 1.8.5.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 12/30] mesa: Change redundant code into loops in texstate.c.
This is possible now that ctx-Shader.CurrentProgram is an array. --- src/mesa/main/texstate.c | 75 +++- 1 file changed, 29 insertions(+), 46 deletions(-) diff --git a/src/mesa/main/texstate.c b/src/mesa/main/texstate.c index b9c76da..905a9d5 100644 --- a/src/mesa/main/texstate.c +++ b/src/mesa/main/texstate.c @@ -526,27 +526,20 @@ static void update_texture_state( struct gl_context *ctx ) { GLuint unit; - struct gl_program *fprog = NULL; - struct gl_program *vprog = NULL; - struct gl_program *gprog = NULL; + struct gl_program *prog[MESA_SHADER_STAGES]; GLbitfield enabledFragUnits = 0x0; - - if (ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX] - ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX]-LinkStatus) { - vprog = ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX]-_LinkedShaders[MESA_SHADER_VERTEX]-Program; - } - - if (ctx-Shader.CurrentProgram[MESA_SHADER_GEOMETRY] - ctx-Shader.CurrentProgram[MESA_SHADER_GEOMETRY]-LinkStatus) { - gprog = ctx-Shader.CurrentProgram[MESA_SHADER_GEOMETRY]-_LinkedShaders[MESA_SHADER_GEOMETRY]-Program; - } - - if (ctx-Shader.CurrentProgram[MESA_SHADER_FRAGMENT] - ctx-Shader.CurrentProgram[MESA_SHADER_FRAGMENT]-LinkStatus) { - fprog = ctx-Shader.CurrentProgram[MESA_SHADER_FRAGMENT]-_LinkedShaders[MESA_SHADER_FRAGMENT]-Program; - } - else if (ctx-FragmentProgram._Enabled) { - fprog = ctx-FragmentProgram.Current-Base; + int i; + + for (i = 0; i MESA_SHADER_STAGES; i++) { + if (ctx-Shader.CurrentProgram[i] + ctx-Shader.CurrentProgram[i]-LinkStatus) { + prog[i] = ctx-Shader.CurrentProgram[i]-_LinkedShaders[i]-Program; + } else { + if (i == MESA_SHADER_FRAGMENT ctx-FragmentProgram._Enabled) +prog[i] = ctx-FragmentProgram.Current-Base; + else +prog[i] = NULL; + } } /* TODO: only set this if there are actual changes */ @@ -562,9 +555,7 @@ update_texture_state( struct gl_context *ctx ) */ for (unit = 0; unit ctx-Const.MaxCombinedTextureImageUnits; unit++) { struct gl_texture_unit *texUnit = ctx-Texture.Unit[unit]; - GLbitfield enabledVertTargets = 0x0; - GLbitfield enabledFragTargets = 0x0; - GLbitfield enabledGeomTargets = 0x0; + GLbitfield enabledTargetsByStage[MESA_SHADER_STAGES]; GLbitfield enabledTargets = 0x0; GLuint texIndex; @@ -574,25 +565,16 @@ update_texture_state( struct gl_context *ctx ) * by a fragment program/program. When multiple flags are set, we'll * settle on the one with highest priority (see below). */ - if (vprog) { - enabledVertTargets |= vprog-TexturesUsed[unit]; + for (i = 0; i MESA_SHADER_STAGES; i++) { + if (prog[i]) +enabledTargetsByStage[i] = prog[i]-TexturesUsed[unit]; + else if (i == MESA_SHADER_FRAGMENT) +enabledTargetsByStage[i] = texUnit-Enabled; + else +enabledTargetsByStage[i] = 0; + enabledTargets |= enabledTargetsByStage[i]; } - if (gprog) { - enabledGeomTargets |= gprog-TexturesUsed[unit]; - } - - if (fprog) { - enabledFragTargets |= fprog-TexturesUsed[unit]; - } - else { - /* fixed-function fragment program */ - enabledFragTargets |= texUnit-Enabled; - } - - enabledTargets = enabledVertTargets | enabledFragTargets | - enabledGeomTargets; - texUnit-_ReallyEnabled = 0x0; if (enabledTargets == 0x0) { @@ -624,7 +606,7 @@ update_texture_state( struct gl_context *ctx ) } if (!texUnit-_ReallyEnabled) { - if (fprog) { + if (prog[MESA_SHADER_FRAGMENT]) { /* If we get here it means the shader is expecting a texture * object, but there isn't one (or it's incomplete). Use the * fallback texture. @@ -654,25 +636,26 @@ update_texture_state( struct gl_context *ctx ) ctx-Texture._EnabledUnits |= (1 unit); - if (enabledFragTargets) + if (enabledTargetsByStage[MESA_SHADER_FRAGMENT]) enabledFragUnits |= (1 unit); - if (!fprog) + if (!prog[MESA_SHADER_FRAGMENT]) update_tex_combine(ctx, texUnit); } /* Determine which texture coordinate sets are actually needed */ - if (fprog) { + if (prog[MESA_SHADER_FRAGMENT]) { const GLuint coordMask = (1 MAX_TEXTURE_COORD_UNITS) - 1; ctx-Texture._EnabledCoordUnits - = (fprog-InputsRead VARYING_SLOT_TEX0) coordMask; + = (prog[MESA_SHADER_FRAGMENT]-InputsRead VARYING_SLOT_TEX0) + coordMask; } else { ctx-Texture._EnabledCoordUnits = enabledFragUnits; } - if (!fprog || !vprog) + if (!prog[MESA_SHADER_FRAGMENT] || !prog[MESA_SHADER_VERTEX]) update_texgen(ctx); } -- 1.8.5.2 ___ mesa-dev mailing list
[Mesa-dev] [PATCH 16/30] mesa/cs: Add a MESA_SHADER_COMPUTE stage and update switch statements.
This patch adds MESA_SHADER_COMPUTE to the gl_shader_stage enum. Also, where it is trivial to do so, it adds a compute shader case to switch statements that switch based on the type of shader. This avoids unhandled switch case compiler warnings. --- src/glsl/ast_to_hir.cpp | 9 + src/glsl/builtin_variables.cpp| 17 + src/glsl/standalone_scaffolding.h | 2 ++ src/mesa/main/context.c | 8 src/mesa/main/mtypes.h| 3 ++- src/mesa/main/shaderapi.c | 5 + src/mesa/main/shaderobj.h | 2 ++ src/mesa/program/prog_print.c | 3 +++ src/mesa/program/program.h| 4 9 files changed, 52 insertions(+), 1 deletion(-) diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp index ef8e699..a26745d 100644 --- a/src/glsl/ast_to_hir.cpp +++ b/src/glsl/ast_to_hir.cpp @@ -2093,6 +2093,12 @@ validate_explicit_location(const struct ast_type_qualifier *qual, fail = true; break; + + case MESA_SHADER_COMPUTE: + _mesa_glsl_error(loc, state, + compute shader variables cannot be given + explicit locations); + return; }; if (fail) { @@ -2275,6 +2281,9 @@ apply_type_qualifier_to_variable(const struct ast_type_qualifier *qual, if (var-data.mode == ir_var_shader_in) var-data.invariant = true; break; + case MESA_SHADER_COMPUTE: + /* Invariance isn't meaningful in compute shaders. */ + break; } } diff --git a/src/glsl/builtin_variables.cpp b/src/glsl/builtin_variables.cpp index f630923..17ae087 100644 --- a/src/glsl/builtin_variables.cpp +++ b/src/glsl/builtin_variables.cpp @@ -356,6 +356,7 @@ public: void generate_vs_special_vars(); void generate_gs_special_vars(); void generate_fs_special_vars(); + void generate_cs_special_vars(); void generate_varyings(); private: @@ -866,6 +867,16 @@ builtin_variable_generator::generate_fs_special_vars() /** + * Generate variables which only exist in compute shaders. + */ +void +builtin_variable_generator::generate_cs_special_vars() +{ + /* TODO: finish this. */ +} + + +/** * Add a single varying variable. The variable's type and direction (input * or output) are adjusted as appropriate for the type of shader being * compiled. For geometry shaders using {ARB,EXT}_geometry_shader4, @@ -886,6 +897,9 @@ builtin_variable_generator::add_varying(int slot, const glsl_type *type, case MESA_SHADER_FRAGMENT: add_input(slot, type, name); break; + case MESA_SHADER_COMPUTE: + /* Compute shaders don't have varyings. */ + break; } } @@ -973,5 +987,8 @@ _mesa_glsl_initialize_variables(exec_list *instructions, case MESA_SHADER_FRAGMENT: gen.generate_fs_special_vars(); break; + case MESA_SHADER_COMPUTE: + gen.generate_cs_special_vars(); + break; } } diff --git a/src/glsl/standalone_scaffolding.h b/src/glsl/standalone_scaffolding.h index 327fef2..df783af 100644 --- a/src/glsl/standalone_scaffolding.h +++ b/src/glsl/standalone_scaffolding.h @@ -58,6 +58,8 @@ _mesa_shader_enum_to_shader_stage(GLenum v) return MESA_SHADER_FRAGMENT; case GL_GEOMETRY_SHADER: return MESA_SHADER_GEOMETRY; + case GL_COMPUTE_SHADER: + return MESA_SHADER_COMPUTE; default: assert(!bad value in _mesa_shader_enum_to_shader_stage()); return MESA_SHADER_VERTEX; diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c index 5855f15..b0cf5da 100644 --- a/src/mesa/main/context.c +++ b/src/mesa/main/context.c @@ -498,6 +498,14 @@ init_program_limits(struct gl_context *ctx, gl_shader_stage stage, prog-MaxInputComponents = 16 * 4; /* old limit not to break tnl and swrast */ prog-MaxOutputComponents = 16 * 4; /* old limit not to break tnl and swrast */ break; + case MESA_SHADER_COMPUTE: + prog-MaxParameters = 0; /* not meaningful for compute shaders */ + prog-MaxAttribs = 0; /* not meaningful for compute shaders */ + prog-MaxAddressRegs = 0; /* not meaningful for compute shaders */ + prog-MaxUniformComponents = 4 * MAX_UNIFORMS; + prog-MaxInputComponents = 0; /* not meaningful for compute shaders */ + prog-MaxOutputComponents = 0; /* not meaningful for compute shaders */ + break; default: assert(0 Bad shader stage in init_program_limits()); } diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h index 7ba7b10..8b88d75 100644 --- a/src/mesa/main/mtypes.h +++ b/src/mesa/main/mtypes.h @@ -404,9 +404,10 @@ typedef enum MESA_SHADER_VERTEX = 0, MESA_SHADER_GEOMETRY = 1, MESA_SHADER_FRAGMENT = 2, + MESA_SHADER_COMPUTE = 3, } gl_shader_stage; -#define MESA_SHADER_STAGES (MESA_SHADER_FRAGMENT + 1) +#define MESA_SHADER_STAGES (MESA_SHADER_COMPUTE + 1) /** diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c index 61ac0e3..519b200
[Mesa-dev] [PATCH 19/30] mesa/cs: Handle compute shaders in _mesa_use_program().
--- src/mesa/main/shaderapi.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c index 519b200..5188e9c 100644 --- a/src/mesa/main/shaderapi.c +++ b/src/mesa/main/shaderapi.c @@ -993,6 +993,7 @@ _mesa_use_program(struct gl_context *ctx, struct gl_shader_program *shProg) { use_shader_program(ctx, GL_VERTEX_SHADER, shProg); use_shader_program(ctx, GL_GEOMETRY_SHADER_ARB, shProg); + use_shader_program(ctx, GL_COMPUTE_SHADER, shProg); use_shader_program(ctx, GL_FRAGMENT_SHADER, shProg); _mesa_active_program(ctx, shProg, glUseProgram); -- 1.8.5.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 17/30] glsl/cs: Populate default values for ctx-Const.Program[MESA_SHADER_COMPUTE].
--- src/glsl/main.cpp | 4 src/glsl/standalone_scaffolding.cpp | 4 2 files changed, 8 insertions(+) diff --git a/src/glsl/main.cpp b/src/glsl/main.cpp index 03b7c78..afc15cb 100644 --- a/src/glsl/main.cpp +++ b/src/glsl/main.cpp @@ -50,6 +50,10 @@ initialize_context(struct gl_context *ctx, gl_api api) */ ctx-Const.GLSLVersion = glsl_version; ctx-Extensions.ARB_ES3_compatibility = true; + ctx-Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits = 16; + ctx-Const.Program[MESA_SHADER_COMPUTE].MaxUniformComponents = 1024; + ctx-Const.Program[MESA_SHADER_COMPUTE].MaxInputComponents = 0; /* not used */ + ctx-Const.Program[MESA_SHADER_COMPUTE].MaxOutputComponents = 0; /* not used */ switch (ctx-Const.GLSLVersion) { case 100: diff --git a/src/glsl/standalone_scaffolding.cpp b/src/glsl/standalone_scaffolding.cpp index fe66067..ab92da8 100644 --- a/src/glsl/standalone_scaffolding.cpp +++ b/src/glsl/standalone_scaffolding.cpp @@ -140,6 +140,10 @@ void initialize_context_to_defaults(struct gl_context *ctx, gl_api api) ctx-Const.Program[MESA_SHADER_FRAGMENT].MaxInputComponents = 32; ctx-Const.MaxDrawBuffers = 1; + ctx-Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits = 16; + ctx-Const.Program[MESA_SHADER_COMPUTE].MaxUniformComponents = 1024; + ctx-Const.Program[MESA_SHADER_COMPUTE].MaxInputComponents = 0; /* not used */ + ctx-Const.Program[MESA_SHADER_COMPUTE].MaxOutputComponents = 0; /* not used */ /* Set up default shader compiler options. */ struct gl_shader_compiler_options options; -- 1.8.5.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 29/30] i965/cs: Create the brw_compute_program struct, and the code to initialize it.
--- src/mesa/drivers/dri/i965/brw_context.h | 8 src/mesa/drivers/dri/i965/brw_program.c | 11 +++ 2 files changed, 19 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h index df32ccb..abc1783 100644 --- a/src/mesa/drivers/dri/i965/brw_context.h +++ b/src/mesa/drivers/dri/i965/brw_context.h @@ -316,6 +316,14 @@ struct brw_fragment_program { GLuint id; /** serial no. to identify frag progs, never re-used */ }; + +/** Subclass of Mesa compute program */ +struct brw_compute_program { + struct gl_compute_program program; + unsigned id; /** serial no. to identify frag progs, never re-used */ +}; + + struct brw_shader { struct gl_shader base; diff --git a/src/mesa/drivers/dri/i965/brw_program.c b/src/mesa/drivers/dri/i965/brw_program.c index 90844e5..2d92acb 100644 --- a/src/mesa/drivers/dri/i965/brw_program.c +++ b/src/mesa/drivers/dri/i965/brw_program.c @@ -113,6 +113,17 @@ static struct gl_program *brwNewProgram( struct gl_context *ctx, } } + case GL_COMPUTE_PROGRAM_NV: { + struct brw_compute_program *prog = CALLOC_STRUCT(brw_compute_program); + if (prog) { + prog-id = get_new_program_id(brw-intelScreen); + + return _mesa_init_compute_program(ctx, prog-program, target, id); + } else { + return NULL; + } + } + default: assert(!Unsupported target in brwNewProgram()); return NULL; -- 1.8.5.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 09/30] meta: Replace save_state::{Vertex, Geometry, Fragment}Shader with an array.
Since ctx-Shader.Current{Vertex,Geometry,Fragment}Program is an array, this allows some meta code to be rolled up into loops. --- src/mesa/drivers/common/meta.c | 30 ++ 1 file changed, 14 insertions(+), 16 deletions(-) diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c index 5643e3c..2eeb09b 100644 --- a/src/mesa/drivers/common/meta.c +++ b/src/mesa/drivers/common/meta.c @@ -138,9 +138,7 @@ struct save_state GLboolean FragmentProgramEnabled; struct gl_fragment_program *FragmentProgram; GLboolean ATIFragmentShaderEnabled; - struct gl_shader_program *VertexShader; - struct gl_shader_program *GeometryShader; - struct gl_shader_program *FragmentShader; + struct gl_shader_program *Shader[MESA_SHADER_STAGES]; struct gl_shader_program *ActiveShader; /** MESA_META_STENCIL_TEST */ @@ -617,12 +615,10 @@ _mesa_meta_begin(struct gl_context *ctx, GLbitfield state) _mesa_set_enable(ctx, GL_FRAGMENT_SHADER_ATI, GL_FALSE); } - _mesa_reference_shader_program(ctx, save-VertexShader, - ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX]); - _mesa_reference_shader_program(ctx, save-GeometryShader, - ctx-Shader.CurrentProgram[MESA_SHADER_GEOMETRY]); - _mesa_reference_shader_program(ctx, save-FragmentShader, - ctx-Shader.CurrentProgram[MESA_SHADER_FRAGMENT]); + for (int i = 0; i MESA_SHADER_STAGES; i++) { + _mesa_reference_shader_program(ctx, save-Shader[i], + ctx-Shader.CurrentProgram[i]); + } _mesa_reference_shader_program(ctx, save-ActiveShader, ctx-Shader.ActiveProgram); @@ -829,6 +825,7 @@ _mesa_meta_end(struct gl_context *ctx) { struct save_state *save = ctx-Meta-Save[ctx-Meta-SaveStackDepth - 1]; const GLbitfield state = save-SavedState; + int i; /* After starting a new occlusion query, initialize the results to the * values saved previously. The driver will then continue to increment @@ -960,23 +957,24 @@ _mesa_meta_end(struct gl_context *ctx) save-ATIFragmentShaderEnabled); } - if (ctx-Extensions.ARB_vertex_shader) -_mesa_use_shader_program(ctx, GL_VERTEX_SHADER, save-VertexShader); + if (ctx-Extensions.ARB_vertex_shader) { +_mesa_use_shader_program(ctx, GL_VERTEX_SHADER, + save-Shader[MESA_SHADER_VERTEX]); + } if (_mesa_has_geometry_shaders(ctx)) _mesa_use_shader_program(ctx, GL_GEOMETRY_SHADER_ARB, - save-GeometryShader); + save-Shader[MESA_SHADER_GEOMETRY]); if (ctx-Extensions.ARB_fragment_shader) _mesa_use_shader_program(ctx, GL_FRAGMENT_SHADER, - save-FragmentShader); + save-Shader[MESA_SHADER_FRAGMENT]); _mesa_reference_shader_program(ctx, ctx-Shader.ActiveProgram, save-ActiveShader); - _mesa_reference_shader_program(ctx, save-VertexShader, NULL); - _mesa_reference_shader_program(ctx, save-GeometryShader, NULL); - _mesa_reference_shader_program(ctx, save-FragmentShader, NULL); + for (i = 0; i MESA_SHADER_STAGES; i++) + _mesa_reference_shader_program(ctx, save-Shader[i], NULL); _mesa_reference_shader_program(ctx, save-ActiveShader, NULL); } -- 1.8.5.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 14/30] mesa/cs: Add dispatch API stubs for ARB_compute_shader.
--- src/mapi/glapi/gen/ARB_compute_shader.xml | 40 +++ src/mapi/glapi/gen/Makefile.am| 1 + src/mapi/glapi/gen/gl_API.xml | 4 ++- src/mapi/glapi/gen/gl_genexec.py | 1 + src/mesa/Makefile.sources | 1 + src/mesa/SConscript | 1 + src/mesa/main/compute.c | 54 +++ src/mesa/main/compute.h | 38 ++ src/mesa/main/tests/dispatch_sanity.cpp | 4 +-- 9 files changed, 141 insertions(+), 3 deletions(-) create mode 100644 src/mapi/glapi/gen/ARB_compute_shader.xml create mode 100644 src/mesa/main/compute.c create mode 100644 src/mesa/main/compute.h diff --git a/src/mapi/glapi/gen/ARB_compute_shader.xml b/src/mapi/glapi/gen/ARB_compute_shader.xml new file mode 100644 index 000..1db373e --- /dev/null +++ b/src/mapi/glapi/gen/ARB_compute_shader.xml @@ -0,0 +1,40 @@ +?xml version=1.0? +!DOCTYPE OpenGLAPI SYSTEM gl_API.dtd + +!-- Note: no GLX protocol info yet. -- + + +OpenGLAPI + +category name=GL_ARB_compute_shader number=122 + enum name=COMPUTE_SHADER value=0x91B9/ + enum name=MAX_COMPUTE_UNIFORM_BLOCKS value=0x91BB/ + enum name=MAX_COMPUTE_TEXTURE_IMAGE_UNITS value=0x91BC/ + enum name=MAX_COMPUTE_IMAGE_UNIFORMS value=0x91BD/ + enum name=MAX_COMPUTE_SHARED_MEMORY_SIZE value=0x8262/ + enum name=MAX_COMPUTE_UNIFORM_COMPONENTS value=0x8263/ + enum name=MAX_COMPUTE_ATOMIC_COUNTER_BUFFERS value=0x8264/ + enum name=MAX_COMPUTE_ATOMIC_COUNTERS value=0x8265/ + enum name=MAX_COMBINED_COMPUTE_UNIFORM_COMPONENTS value=0x8266/ + enum name=MAX_COMPUTE_WORK_GROUP_INVOCATIONS value=0x90EB/ + enum name=MAX_COMPUTE_WORK_GROUP_COUNTvalue=0x91BE/ + enum name=MAX_COMPUTE_WORK_GROUP_SIZE value=0x91BF/ + enum name=COMPUTE_WORK_GROUP_SIZE value=0x8267/ + enum name=UNIFORM_BLOCK_REFERENCED_BY_COMPUTE_SHADER value=0x90EC/ + enum name=ATOMIC_COUNTER_BUFFER_REFERENCED_BY_COMPUTE_SHADER value=0x90ED/ + enum name=DISPATCH_INDIRECT_BUFFERvalue=0x90EE/ + enum name=DISPATCH_INDIRECT_BUFFER_BINDINGvalue=0x90EF/ + enum name=COMPUTE_SHADER_BIT value=0x0020/ + + function name=DispatchCompute offset=assign +param name=num_groups_x type=GLuint/ +param name=num_groups_y type=GLuint/ +param name=num_groups_z type=GLuint/ + /function + + function name=DispatchComputeIndirect offset=assign +param name=indirect type=GLintptr/ + /function +/category + +/OpenGLAPI diff --git a/src/mapi/glapi/gen/Makefile.am b/src/mapi/glapi/gen/Makefile.am index 65bd913..a9cf2a31 100644 --- a/src/mapi/glapi/gen/Makefile.am +++ b/src/mapi/glapi/gen/Makefile.am @@ -91,6 +91,7 @@ API_XML = \ ARB_base_instance.xml \ ARB_blend_func_extended.xml \ ARB_color_buffer_float.xml \ + ARB_compute_shader.xml \ ARB_copy_buffer.xml \ ARB_debug_output.xml \ ARB_depth_buffer_float.xml \ diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml index 697b2ec..d6cd1f5 100644 --- a/src/mapi/glapi/gen/gl_API.xml +++ b/src/mapi/glapi/gen/gl_API.xml @@ -8464,7 +8464,9 @@ xi:include href=ARB_clear_buffer_object.xml xmlns:xi=http://www.w3.org/2001/XInclude/ -!-- ARB extensions #122...#123 -- +xi:include href=ARB_compute_shader.xml xmlns:xi=http://www.w3.org/2001/XInclude/ + +!-- ARB extension #123 -- xi:include href=ARB_texture_view.xml xmlns:xi=http://www.w3.org/2001/XInclude/ diff --git a/src/mapi/glapi/gen/gl_genexec.py b/src/mapi/glapi/gen/gl_genexec.py index b557b3b..e376da3 100644 --- a/src/mapi/glapi/gen/gl_genexec.py +++ b/src/mapi/glapi/gen/gl_genexec.py @@ -57,6 +57,7 @@ header = /** #include main/clear.h #include main/clip.h #include main/colortab.h +#include main/compute.h #include main/condrender.h #include main/context.h #include main/convolve.h diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources index 39525bc..2fdc2b6 100644 --- a/src/mesa/Makefile.sources +++ b/src/mesa/Makefile.sources @@ -23,6 +23,7 @@ MAIN_FILES = \ $(SRCDIR)main/clear.c \ $(SRCDIR)main/clip.c \ $(SRCDIR)main/colortab.c \ + $(SRCDIR)main/compute.c \ $(SRCDIR)main/condrender.c \ $(SRCDIR)main/context.c \ $(SRCDIR)main/convolve.c \ diff --git a/src/mesa/SConscript b/src/mesa/SConscript index bb9b304..24264c8 100644 --- a/src/mesa/SConscript +++ b/src/mesa/SConscript @@ -51,6 +51,7 @@ main_sources = [ 'main/clear.c', 'main/clip.c', 'main/colortab.c', +'main/compute.c', 'main/condrender.c', 'main/context.c', 'main/convolve.c', diff --git a/src/mesa/main/compute.c
[Mesa-dev] [PATCH 28/30] glsl/cs: Prohibit mixing of compute and non-compute shaders.
Fixes piglit test: spec/ARB_compute_shader/linker/mix_compute_and_non_compute --- src/glsl/linker.cpp | 7 +++ 1 file changed, 7 insertions(+) diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp index 11e0651..f1344ea 100644 --- a/src/glsl/linker.cpp +++ b/src/glsl/linker.cpp @@ -2107,6 +2107,13 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog) goto done; } + /* Compute shaders have additional restrictions. */ + if (num_shaders[MESA_SHADER_COMPUTE] 0 + num_shaders[MESA_SHADER_COMPUTE] != prog-NumShaders) { + linker_error(prog, Compute shaders may not be linked with any other + type of shader\n); + } + for (unsigned int i = 0; i MESA_SHADER_STAGES; i++) { if (prog-_LinkedShaders[i] != NULL) ctx-Driver.DeleteShader(ctx, prog-_LinkedShaders[i]); -- 1.8.5.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 00/30] mesa: Start implementing compute shaders.
This is the first of several planned patch series to implement the extension ARB_compute_shader in Mesa. This series allows the Mesa front-end to parse and compile a do-nothing compute shader--that is, one which contains nothing but a compute shader input layout declaration and an empty main() function. Since compute shader support is not yet complete, I haven't enabled it. To try it out, set the environment variable INTEL_COMPUTE_SHADER=1. With that environment variable set, the series passes all of the ARB_compute_shader tests I recently sent to the Piglit mailing list, except for the minmax test. Patches 01-12 do preparatory refactoring in order to make adding compute shaders (and other future shader stages) easier. Patch 13 adds the extension enable flags for compute shaders, both in the context and in the GLSL compiler. Patch 14 adds the ARB_compute_shader functions and enums to Mesa's dispatch code generation logic. The functions don't do anything yet. Patch 15 changes the linker so that once the compute shader stage has been added, it won't mistakenly consider it to be part of the rest of the graphics pipeline. Patches 16-20 add MESA_SHADER_COMPUTE to the gl_shader_stage enum, and update the rest of the code base to accommodate its presence in the enum. Patches 21-23 implement some compute shader constants. Patches 24-26 implement logic to handle compute shader work group sizes (also known as local sizes). Patches 27-28 implement two other basic rules of compute shaders (they don't have user-defined ins/outs, and they can't be mixed with other shader types). Finally, patches 29-30 implement the necessary back-end code for the i965 driver to allow compute shaders to be turned on with an environment variable. Once these patches land, I plan to start working on the i965 back end. [PATCH 01/30] mesa: Replace _mesa_program_index_to_target with _mesa_shader_stage_to_program. [PATCH 02/30] mesa: Make validate_shader_target() non-static. [PATCH 03/30] main: Allow ctx == NULL in _mesa_validate_shader_target(). [PATCH 04/30] mesa: use _mesa_validate_shader_target() more frequently. [PATCH 05/30] glsl/linker: Refactor in preparation for adding more shader stages. [PATCH 06/30] mesa: Replace ctx-Shader.Current{Vertex,Fragment,Geometry}Program with an array. [PATCH 07/30] mesa: Fold long lines introduced by the previous patch. [PATCH 08/30] i965: Fix comments to refer to the new ctx-Shader.CurrentProgram array. [PATCH 09/30] meta: Replace save_state::{Vertex,Geometry,Fragment}Shader with an array. [PATCH 10/30] mesa: Remove ad-hoc arrays of gl_shader_program. [PATCH 11/30] mesa: Change redundant code into loops in shaderapi.c. [PATCH 12/30] mesa: Change redundant code into loops in texstate.c. [PATCH 13/30] mesa/cs: Add extension enable flags for ARB_compute_shader. [PATCH 14/30] mesa/cs: Add dispatch API stubs for ARB_compute_shader. [PATCH 15/30] glsl/cs: Change some linker loops to use MESA_SHADER_FRAGMENT as a bound. [PATCH 16/30] mesa/cs: Add a MESA_SHADER_COMPUTE stage and update statements. [PATCH 17/30] glsl/cs: Populate default values for ctx-Const.Program[MESA_SHADER_COMPUTE]. [PATCH 18/30] glsl/cs: update main.cpp to use the .comp extension for compute shaders. [PATCH 19/30] mesa/cs: Handle compute shaders in _mesa_use_program(). [PATCH 20/30] mesa/cs: Create the gl_compute_program struct, and the code to initialize it. [PATCH 21/30] mesa/cs: Implement MAX_COMPUTE_WORK_GROUP_SIZE constant. [PATCH 22/30] mesa/cs: Implement MAX_COMPUTE_WORK_GROUP_INVOCATIONS constant. [PATCH 23/30] mesa/cs: Implement MAX_COMPUTE_WORK_GROUP_COUNT constant. [PATCH 24/30] glsl/cs: Handle compute shader local_size_{x,y,z} declaration. [PATCH 25/30] mesa/cs: Handle compute shader local size during linking. [PATCH 26/30] main/cs: Implement query for COMPUTE_WORK_GROUP_SIZE. [PATCH 27/30] glsl/cs: Prohibit user-defined ins/outs in compute shaders. [PATCH 28/30] glsl/cs: Prohibit mixing of compute and non-compute shaders. [PATCH 29/30] i965/cs: Create the brw_compute_program struct, and the code to initialize it. [PATCH 30/30] i965/cs: Allow ARB_compute_shader to be enabled via env var. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 21/30] mesa/cs: Implement MAX_COMPUTE_WORK_GROUP_SIZE constant.
--- src/glsl/builtin_variables.cpp | 27 +++ src/glsl/glsl_parser_extras.cpp | 4 src/glsl/glsl_parser_extras.h | 3 +++ src/glsl/main.cpp | 3 +++ src/glsl/standalone_scaffolding.cpp | 3 +++ src/mesa/main/context.c | 5 + src/mesa/main/get.c | 8 src/mesa/main/mtypes.h | 3 +++ 8 files changed, 56 insertions(+) diff --git a/src/glsl/builtin_variables.cpp b/src/glsl/builtin_variables.cpp index 17ae087..171bf08 100644 --- a/src/glsl/builtin_variables.cpp +++ b/src/glsl/builtin_variables.cpp @@ -390,6 +390,7 @@ private: enum ir_variable_mode mode, int slot); ir_variable *add_uniform(const glsl_type *type, const char *name); ir_variable *add_const(const char *name, int value); + ir_variable *add_const_ivec3(const char *name, int x, int y, int z); void add_varying(int slot, const glsl_type *type, const char *name, const char *name_as_gs_input); @@ -530,6 +531,25 @@ builtin_variable_generator::add_const(const char *name, int value) } +ir_variable * +builtin_variable_generator::add_const_ivec3(const char *name, int x, int y, +int z) +{ + ir_variable *const var = add_variable(name, glsl_type::ivec3_type, + ir_var_auto, -1); + ir_constant_data data; + memset(data, 0, sizeof(data)); + data.i[0] = x; + data.i[1] = y; + data.i[2] = z; + var-constant_value = new(var) ir_constant(glsl_type::ivec3_type, data); + var-constant_initializer = + new(var) ir_constant(glsl_type::ivec3_type, data); + var-data.has_initializer = true; + return var; +} + + void builtin_variable_generator::generate_constants() { @@ -660,6 +680,13 @@ builtin_variable_generator::generate_constants() add_const(gl_MaxTessControlAtomicCounters, 0); add_const(gl_MaxTessEvaluationAtomicCounters, 0); } + + if (state-is_version(430, 0) || state-ARB_compute_shader_enable) { + add_const_ivec3(gl_MaxComputeWorkGroupSize, + state-Const.MaxComputeWorkGroupSize[0], + state-Const.MaxComputeWorkGroupSize[1], + state-Const.MaxComputeWorkGroupSize[2]); + } } diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp index 33a43c5..8524fc6 100644 --- a/src/glsl/glsl_parser_extras.cpp +++ b/src/glsl/glsl_parser_extras.cpp @@ -123,6 +123,10 @@ _mesa_glsl_parse_state::_mesa_glsl_parse_state(struct gl_context *_ctx, this-Const.MaxCombinedAtomicCounters = ctx-Const.MaxCombinedAtomicCounters; this-Const.MaxAtomicBufferBindings = ctx-Const.MaxAtomicBufferBindings; + /* Compute shader constants */ + for (unsigned i = 0; i Elements(this-Const.MaxComputeWorkGroupSize); i++) + this-Const.MaxComputeWorkGroupSize[i] = ctx-Const.MaxComputeWorkGroupSize[i]; + this-current_function = NULL; this-toplevel_ir = NULL; this-found_return = false; diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h index 468707c..be34fd9 100644 --- a/src/glsl/glsl_parser_extras.h +++ b/src/glsl/glsl_parser_extras.h @@ -250,6 +250,9 @@ struct _mesa_glsl_parse_state { unsigned MaxFragmentAtomicCounters; unsigned MaxCombinedAtomicCounters; unsigned MaxAtomicBufferBindings; + + /* ARB_compute_shader */ + unsigned MaxComputeWorkGroupSize[3]; } Const; /** diff --git a/src/glsl/main.cpp b/src/glsl/main.cpp index 864c929..bb2054f 100644 --- a/src/glsl/main.cpp +++ b/src/glsl/main.cpp @@ -50,6 +50,9 @@ initialize_context(struct gl_context *ctx, gl_api api) */ ctx-Const.GLSLVersion = glsl_version; ctx-Extensions.ARB_ES3_compatibility = true; + ctx-Const.MaxComputeWorkGroupSize[0] = 1024; + ctx-Const.MaxComputeWorkGroupSize[1] = 1024; + ctx-Const.MaxComputeWorkGroupSize[2] = 64; ctx-Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits = 16; ctx-Const.Program[MESA_SHADER_COMPUTE].MaxUniformComponents = 1024; ctx-Const.Program[MESA_SHADER_COMPUTE].MaxInputComponents = 0; /* not used */ diff --git a/src/glsl/standalone_scaffolding.cpp b/src/glsl/standalone_scaffolding.cpp index ab92da8..e8eb529 100644 --- a/src/glsl/standalone_scaffolding.cpp +++ b/src/glsl/standalone_scaffolding.cpp @@ -140,6 +140,9 @@ void initialize_context_to_defaults(struct gl_context *ctx, gl_api api) ctx-Const.Program[MESA_SHADER_FRAGMENT].MaxInputComponents = 32; ctx-Const.MaxDrawBuffers = 1; + ctx-Const.MaxComputeWorkGroupSize[0] = 1024; + ctx-Const.MaxComputeWorkGroupSize[1] = 1024; + ctx-Const.MaxComputeWorkGroupSize[2] = 64; ctx-Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits = 16; ctx-Const.Program[MESA_SHADER_COMPUTE].MaxUniformComponents = 1024; ctx-Const.Program[MESA_SHADER_COMPUTE].MaxInputComponents = 0; /* not used */ diff --git
[Mesa-dev] [PATCH 25/30] mesa/cs: Handle compute shader local size during linking.
--- src/glsl/linker.cpp | 64 +++ src/mesa/main/mtypes.h| 17 + src/mesa/main/shaderapi.c | 7 ++ 3 files changed, 88 insertions(+) diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp index 7461b17..11e0651 100644 --- a/src/glsl/linker.cpp +++ b/src/glsl/linker.cpp @@ -1284,6 +1284,69 @@ link_gs_inout_layout_qualifiers(struct gl_shader_program *prog, prog-Geom.VerticesOut = linked_shader-Geom.VerticesOut; } + +/** + * Perform cross-validation of compute shader local_size_{x,y,z} layout + * qualifiers for the attached compute shaders, and propagate them to the + * linked CS and linked shader program. + */ +static void +link_cs_input_layout_qualifiers(struct gl_shader_program *prog, +struct gl_shader *linked_shader, +struct gl_shader **shader_list, +unsigned num_shaders) +{ + for (int i = 0; i 3; i++) + linked_shader-Comp.LocalSize[i] = 0; + + /* This function is called for all shader stages, but it only has an effect +* for compute shaders. +*/ + if (linked_shader-Stage != MESA_SHADER_COMPUTE) + return; + + /* From the ARB_compute_shader spec, in the section describing local size +* declarations: +* +* If multiple compute shaders attached to a single program object +* declare local work-group size, the declarations must be identical; +* otherwise a link-time error results. Furthermore, if a program +* object contains any compute shaders, at least one must contain an +* input layout qualifier specifying the local work sizes of the +* program, or a link-time error will occur. +*/ + for (unsigned sh = 0; sh num_shaders; sh++) { + struct gl_shader *shader = shader_list[sh]; + + if (shader-Comp.LocalSize[0] != 0) { + if (linked_shader-Comp.LocalSize[0] != 0) { +for (int i = 0; i 3; i++) { + if (linked_shader-Comp.LocalSize[i] != + shader-Comp.LocalSize[i]) { + linker_error(prog, compute shader defined with conflicting + local sizes\n); + return; + } +} + } + for (int i = 0; i 3; i++) +linked_shader-Comp.LocalSize[i] = shader-Comp.LocalSize[i]; + } + } + + /* Just do the intrastage - interstage propagation right now, +* since we already know we're in the right type of shader program +* for doing it. +*/ + if (linked_shader-Comp.LocalSize[0] == 0) { + linker_error(prog, compute shader didn't declare local size\n); + return; + } + for (int i = 0; i 3; i++) + prog-Comp.LocalSize[i] = linked_shader-Comp.LocalSize[i]; +} + + /** * Combine a group of shaders for a single stage to generate a linked shader * @@ -1389,6 +1452,7 @@ link_intrastage_shaders(void *mem_ctx, ralloc_steal(linked, linked-UniformBlocks); link_gs_inout_layout_qualifiers(prog, linked, shader_list, num_shaders); + link_cs_input_layout_qualifiers(prog, linked, shader_list, num_shaders); populate_symbol_table(linked); diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h index e0c88a7..56e120b 100644 --- a/src/mesa/main/mtypes.h +++ b/src/mesa/main/mtypes.h @@ -2171,6 +2171,11 @@ struct gl_fragment_program struct gl_compute_program { struct gl_program Base; /** base class */ + + /** +* Size specified using local_size_{x,y,z}. +*/ + unsigned LocalSize[3]; }; @@ -2606,6 +2611,18 @@ struct gl_shader_program 0 if not present. */ } Vert; + /** +* Compute shader state - copied into gl_compute_program by +* _mesa_copy_linked_program_data(). +*/ + struct { + /** + * If this shader contains a compute stage, size specified using + * local_size_{x,y,z}. Otherwise undefined. + */ + unsigned LocalSize[3]; + } Comp; + /* post-link info: */ unsigned NumUserUniformStorage; struct gl_uniform_storage *UniformStorage; diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c index 5188e9c..053f27b 100644 --- a/src/mesa/main/shaderapi.c +++ b/src/mesa/main/shaderapi.c @@ -1846,6 +1846,13 @@ _mesa_copy_linked_program_data(gl_shader_stage type, dst_gp-UsesEndPrimitive = src-Geom.UsesEndPrimitive; } break; + case MESA_SHADER_COMPUTE: { + struct gl_compute_program *dst_cp = (struct gl_compute_program *) dst; + int i; + for (i = 0; i 3; i++) + dst_cp-LocalSize[i] = src-Comp.LocalSize[i]; + } + break; default: break; } -- 1.8.5.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev