Re: [Mesa-dev] RFC: mesa/st dynamic sampler support in tgsi
On Mon, Aug 4, 2014 at 11:54 PM, Ilia Mirkin imir...@alum.mit.edu wrote: Another apporach I've tried is to just use a TEMP register as the indirect offset here. Unfortunately this gets destroyed by st_glsl_to_tgsi's various optimizations which assume that temp registers can't be reladdr's and so messes up the values. I started adding support for that, but then quickly realized that was probably not the right thing to do. So... should I increase the number of address registers to 1? Or is there some other simple approach that I'm missing? Thanks, -ilia The problem you are having with the optimizations in glsl_to_tgsi is because, as you say, it doesn't check the reladdr fields when calculating live intervals. I'm currently in the middle of reworking the live interval calculation - if you don't mind waiting a day or two for me to finish the patch set and send it to the list, I can add that change to the reworked function in the patch set. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] RFC: mesa/st dynamic sampler support in tgsi
On Tue, Aug 5, 2014 at 3:23 PM, Ilia Mirkin imir...@alum.mit.edu wrote: On Tue, Aug 5, 2014 at 4:14 PM, Bryan Cain bryanca...@gmail.com wrote: On Mon, Aug 4, 2014 at 11:54 PM, Ilia Mirkin imir...@alum.mit.edu wrote: Another apporach I've tried is to just use a TEMP register as the indirect offset here. Unfortunately this gets destroyed by st_glsl_to_tgsi's various optimizations which assume that temp registers can't be reladdr's and so messes up the values. I started adding support for that, but then quickly realized that was probably not the right thing to do. So... should I increase the number of address registers to 1? Or is there some other simple approach that I'm missing? Thanks, -ilia The problem you are having with the optimizations in glsl_to_tgsi is because, as you say, it doesn't check the reladdr fields when calculating live intervals. I'm currently in the middle of reworking the live interval calculation - if you don't mind waiting a day or two for me to finish the patch set and send it to the list, I can add that change to the reworked function in the patch set. I can certainly wait a little while. However from everything I can tell, all current reladdr's are in the ADDR register file, so perhaps I should just stick with that? Either way is workable from glsl_to_tgsi's standpoint. I assume the gallium interface people have a better idea of what the preferred solution is. [Resending because I forgot to CC the list before.] ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] glsl_to_tgsi: remove unnecessary dead code elimination pass
With the more advanced dead code elimination pass already being run, eliminate_dead_code was making no difference in instruction count, and had an undesirable O(n^2) runtime. So remove it and rename eliminate_dead_code_advanced to eliminate_dead_code. --- src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 50 +++- 1 file changed, 5 insertions(+), 45 deletions(-) diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp index 6eb6c8a..b0e0782 100644 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp @@ -460,8 +460,7 @@ public: int get_last_temp_write(int index); void copy_propagate(void); - void eliminate_dead_code(void); - int eliminate_dead_code_advanced(void); + int eliminate_dead_code(void); void merge_registers(void); void renumber_registers(void); @@ -3663,7 +3662,8 @@ glsl_to_tgsi_visitor::copy_propagate(void) } /* - * Tracks available PROGRAM_TEMPORARY registers for dead code elimination. + * On a basic block basis, tracks available PROGRAM_TEMPORARY registers for dead + * code elimination. * * The glsl_to_tgsi_visitor lazily produces code assuming that this pass * will occur. As an example, a TXP production after copy propagation but @@ -3676,48 +3676,9 @@ glsl_to_tgsi_visitor::copy_propagate(void) * and after this pass: * * 0: TXP TEMP[2], INPUT[4].xyyw, texture[0], 2D; - * - * FIXME: assumes that all functions are inlined (no support for BGNSUB/ENDSUB) - * FIXME: doesn't eliminate all dead code inside of loops; it steps around them - */ -void -glsl_to_tgsi_visitor::eliminate_dead_code(void) -{ - int i; - - for (i=0; i this-next_temp; i++) { - int last_read = get_last_temp_read(i); - int j = 0; - - foreach_list_safe(node, this-instructions) { - glsl_to_tgsi_instruction *inst = (glsl_to_tgsi_instruction *) node; - - if (inst-dst.file == PROGRAM_TEMPORARY inst-dst.index == i - j last_read) - { -inst-remove(); -delete inst; - } - - j++; - } - } -} - -/* - * On a basic block basis, tracks available PROGRAM_TEMPORARY registers for dead - * code elimination. This is less primitive than eliminate_dead_code(), as it - * is per-channel and can detect consecutive writes without a read between them - * as dead code. However, there is some dead code that can be eliminated by - * eliminate_dead_code() but not this function - for example, this function - * cannot eliminate an instruction writing to a register that is never read and - * is the only instruction writing to that register. - * - * The glsl_to_tgsi_visitor lazily produces code assuming that this pass - * will occur. */ int -glsl_to_tgsi_visitor::eliminate_dead_code_advanced(void) +glsl_to_tgsi_visitor::eliminate_dead_code(void) { glsl_to_tgsi_instruction **writes = rzalloc_array(mem_ctx, glsl_to_tgsi_instruction *, @@ -5245,9 +5206,8 @@ get_mesa_program(struct gl_context *ctx, /* Perform optimizations on the instructions in the glsl_to_tgsi_visitor. */ v-simplify_cmp(); v-copy_propagate(); - while (v-eliminate_dead_code_advanced()); + while (v-eliminate_dead_code()); - v-eliminate_dead_code(); v-merge_registers(); v-renumber_registers(); -- 1.7.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] nv50: implement multisample textures
On 10/25/2013 01:35 PM, Emil Velikov wrote: On 21/10/13 23:23, Bryan Cain wrote: This is a port of 4da54c91d24da (nvc0: implement multisample textures) to nv50. When coupled with the patch to only report 16 texture samplers (to fix crashes), all of the Piglit tests in spec/arb_texture_multisample pass. Hello Bryan, Big thanks for your work. As promised here is a quick piglit summary on my nv96 pass/fail/crash 69/32/27 * dmesg does not spit anything nouveau related during the tests * any geometry shader related tests were skipped (piglit: info: Failed to create GL 3.2 core context) * all the crashes are due to the following assert codegen/nv50_ir_emit_nv50.cpp:1393:emitTEX: Assertion `argc = 4' failed. PASSarb_texture_multisample-* PASSfb-completeness/* FAILsample-position/* FAILtexelFetch fs sampler2DMS 4* CRASH texelFetch fs sampler2DMSArray 4* FAILtexelFetch/*-*s-isampler2DMS CRASH texelFetch/*-*s-isampler2DMSArray PASStextureSize/* Hope you find this useful :) No real world apps that use multisample textures were tested, yet. Cheers Emil Hi Emil, Thanks for testing on nv96. It seems, though, that I messed up my piglit-run command and didn't include all of the relevant tests as a result. Now that I've fixed that, I'm seeing the same failures and crashes on my nva5. It seems that multisampling is broken with texelFetch (both the texelFetch and sample-position tests use it) but works otherwise, unless it turns out not to produce the right results in real world applications for pre-nva3 cards. I'm going to take some time this weekend to see what's going on with multisampling and texelFetch. Thanks again, Bryan ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] nv50: implement multisample textures
On 10/25/2013 04:11 PM, Christoph Bumiller wrote: On 25.10.2013 20:35, Emil Velikov wrote: On 21/10/13 23:23, Bryan Cain wrote: This is a port of 4da54c91d24da (nvc0: implement multisample textures) to nv50. When coupled with the patch to only report 16 texture samplers (to fix crashes), all of the Piglit tests in spec/arb_texture_multisample pass. Hello Bryan, Big thanks for your work. As promised here is a quick piglit summary on my nv96 pass/fail/crash 69/32/27 * dmesg does not spit anything nouveau related during the tests * any geometry shader related tests were skipped (piglit: info: Failed to create GL 3.2 core context) * all the crashes are due to the following assert codegen/nv50_ir_emit_nv50.cpp:1393:emitTEX: Assertion `argc = 4' failed. I'm not sure how you'd get 4 arguments there (x y layer sample ?). There's no mip maps for multisample textures. But either way you're probably going to have to do things by hand: E.g. MS8 textures contain contiguous 4x2 rectangles of samples for each pixel, so you multiply x by 4 and y by 2 to arrive at the sub-rectangle and then add the correct offsets for the sample id as seen in get_sample_position (store the info in a constant buffer, that has to be updated when texture changes). You might want to use a lookup table like in nve4 compute (look for MS sample coordinate offsets) to map sample id to coordinate offset, that one works for any sample count as long as you don't use the ALT modes (nve4 doesn't need to for textures, but for images/surfaces/UAVs/RATs where the whole VM address calculation is done by hand). You're probably right. I don't know why MSAA appears to work for me, but there's probably something wrong with the output that I haven't noticed. I'll work on implementing it properly this weekend. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] nv50: implement multisample textures
On 10/25/2013 05:05 PM, Christoph Bumiller wrote: On 25.10.2013 23:51, Bryan Cain wrote: On 10/25/2013 04:11 PM, Christoph Bumiller wrote: On 25.10.2013 20:35, Emil Velikov wrote: On 21/10/13 23:23, Bryan Cain wrote: This is a port of 4da54c91d24da (nvc0: implement multisample textures) to nv50. When coupled with the patch to only report 16 texture samplers (to fix crashes), all of the Piglit tests in spec/arb_texture_multisample pass. Hello Bryan, Big thanks for your work. As promised here is a quick piglit summary on my nv96 pass/fail/crash 69/32/27 * dmesg does not spit anything nouveau related during the tests * any geometry shader related tests were skipped (piglit: info: Failed to create GL 3.2 core context) * all the crashes are due to the following assert codegen/nv50_ir_emit_nv50.cpp:1393:emitTEX: Assertion `argc = 4' failed. I'm not sure how you'd get 4 arguments there (x y layer sample ?). There's no mip maps for multisample textures. But either way you're probably going to have to do things by hand: E.g. MS8 textures contain contiguous 4x2 rectangles of samples for each pixel, so you multiply x by 4 and y by 2 to arrive at the sub-rectangle and then add the correct offsets for the sample id as seen in get_sample_position (store the info in a constant buffer, that has to be updated when texture changes). You might want to use a lookup table like in nve4 compute (look for MS sample coordinate offsets) to map sample id to coordinate offset, that one works for any sample count as long as you don't use the ALT modes (nve4 doesn't need to for textures, but for images/surfaces/UAVs/RATs where the whole VM address calculation is done by hand). You're probably right. I don't know why MSAA appears to work for me, but there's probably something wrong with the output that I haven't noticed. I'll work on implementing it properly this weekend. MSAA itself (rendering and resolving) has been working before, the only thing that ARB_texture_multisample adds is texelFetch from MS resources. I really should read an extension's spec carefully before trying to implement it so that I don't waste other people's time. Sorry. Bryan ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] nv50: implement multisample textures
This is a port of 4da54c91d24da (nvc0: implement multisample textures) to nv50. When coupled with the patch to only report 16 texture samplers (to fix crashes), all of the Piglit tests in spec/arb_texture_multisample pass. --- .../nouveau/codegen/nv50_ir_lowering_nv50.cpp |5 ++- src/gallium/drivers/nouveau/nv50/nv50_context.c| 46 src/gallium/drivers/nouveau/nv50/nv50_miptree.c|2 + src/gallium/drivers/nouveau/nv50/nv50_screen.c |3 +- src/gallium/drivers/nouveau/nv50/nv50_tex.c| 20 +++-- 5 files changed, 70 insertions(+), 6 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp index caaf09f..d5d1f1e 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp @@ -569,6 +569,7 @@ NV50LoweringPreSSA::handleTEX(TexInstruction *i) const int arg = i-tex.target.getArgCount(); const int dref = arg; const int lod = i-tex.target.isShadow() ? (arg + 1) : arg; + const int lyr = arg - (i-tex.target.isMS() ? 2 : 1); // dref comes before bias/lod if (i-tex.target.isShadow()) @@ -577,11 +578,11 @@ NV50LoweringPreSSA::handleTEX(TexInstruction *i) // array index must be converted to u32 if (i-tex.target.isArray()) { - Value *layer = i-getSrc(arg - 1); + Value *layer = i-getSrc(lyr); LValue *src = new_LValue(func, FILE_GPR); bld.mkCvt(OP_CVT, TYPE_U32, src, TYPE_F32, layer); bld.mkOp2(OP_MIN, TYPE_U32, src, src, bld.loadImm(NULL, 511)); - i-setSrc(arg - 1, src); + i-setSrc(lyr, src); if (i-tex.target.isCube()) { std::vectorValue * acube, a2d; diff --git a/src/gallium/drivers/nouveau/nv50/nv50_context.c b/src/gallium/drivers/nouveau/nv50/nv50_context.c index b6bdf79..45e3f5d 100644 --- a/src/gallium/drivers/nouveau/nv50/nv50_context.c +++ b/src/gallium/drivers/nouveau/nv50/nv50_context.c @@ -194,6 +194,10 @@ nv50_invalidate_resource_storage(struct nouveau_context *ctx, return ref; } +static void +nv50_context_get_sample_position(struct pipe_context *, unsigned, unsigned, + float *); + struct pipe_context * nv50_create(struct pipe_screen *pscreen, void *priv) { @@ -237,6 +241,7 @@ nv50_create(struct pipe_screen *pscreen, void *priv) pipe-flush = nv50_flush; pipe-texture_barrier = nv50_texture_barrier; + pipe-get_sample_position = nv50_context_get_sample_position; if (!screen-cur_ctx) { screen-cur_ctx = nv50; @@ -315,3 +320,44 @@ nv50_bufctx_fence(struct nouveau_bufctx *bufctx, boolean on_flush) nv50_resource_validate(res, (unsigned)ref-priv_data); } } + +static void +nv50_context_get_sample_position(struct pipe_context *pipe, + unsigned sample_count, unsigned sample_index, + float *xy) +{ + static const uint8_t ms1[1][2] = { { 0x8, 0x8 } }; + static const uint8_t ms2[2][2] = { + { 0x4, 0x4 }, { 0xc, 0xc } }; /* surface coords (0,0), (1,0) */ + static const uint8_t ms4[4][2] = { + { 0x6, 0x2 }, { 0xe, 0x6 }, /* (0,0), (1,0) */ + { 0x2, 0xa }, { 0xa, 0xe } }; /* (0,1), (1,1) */ + static const uint8_t ms8[8][2] = { + { 0x1, 0x7 }, { 0x5, 0x3 }, /* (0,0), (1,0) */ + { 0x3, 0xd }, { 0x7, 0xb }, /* (0,1), (1,1) */ + { 0x9, 0x5 }, { 0xf, 0x1 }, /* (2,0), (3,0) */ + { 0xb, 0xf }, { 0xd, 0x9 } }; /* (2,1), (3,1) */ +#if 0 + /* NOTE: NVA3+ has alternative modes for MS2 and MS8, currently not used */ + static const uint8_t ms8_alt[8][2] = { + { 0x9, 0x5 }, { 0x7, 0xb }, /* (2,0), (1,1) */ + { 0xd, 0x9 }, { 0x5, 0x3 }, /* (3,1), (1,0) */ + { 0x3, 0xd }, { 0x1, 0x7 }, /* (0,1), (0,0) */ + { 0xb, 0xf }, { 0xf, 0x1 } }; /* (2,1), (3,0) */ +#endif + + const uint8_t (*ptr)[2]; + + switch (sample_count) { + case 0: + case 1: ptr = ms1; break; + case 2: ptr = ms2; break; + case 4: ptr = ms4; break; + case 8: ptr = ms8; break; + default: + assert(0); + return; /* bad sample count - undefined locations */ + } + xy[0] = ptr[sample_index][0] * 0.0625f; + xy[1] = ptr[sample_index][1] * 0.0625f; +} diff --git a/src/gallium/drivers/nouveau/nv50/nv50_miptree.c b/src/gallium/drivers/nouveau/nv50/nv50_miptree.c index 513d8f9..1963a4a 100644 --- a/src/gallium/drivers/nouveau/nv50/nv50_miptree.c +++ b/src/gallium/drivers/nouveau/nv50/nv50_miptree.c @@ -277,6 +277,8 @@ nv50_miptree_init_layout_tiled(struct nv50_miptree *mt) */ d = mt-layout_3d ? pt-depth0 : 1; + assert(!mt-ms_mode || !pt-last_level); + for (l = 0; l = pt-last_level; ++l) { struct nv50_miptree_level *lvl = mt-level[l]; unsigned tsx, tsy, tsz; diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c b/src/gallium/drivers/nouveau/nv50/nv50_screen.c index
Re: [Mesa-dev] [PATCH 06/34] draw/gs: fix allocation of buffer for GS output vertices
On Tue, Jul 30, 2013 at 8:46 PM, Paul Berry stereotype...@gmail.com wrote: On 29 July 2013 11:09, Zack Rusin za...@vmware.com wrote: That looks wrong to me. We already account for the other fields in the vertex_size. This patch came from Bryan Cain's original geometry shader patch series--I admit I'm not familiar enough with Gallium code to know how to fix it. Anyone want to step in and address Zack's comment? Or the Gallium-related comments on patches 08 and 24? If no one has time to work on Gallium geometry shaders right now, that's ok. I can pull the Gallium stuff out of this series and archive it in a branch until someone has time to pick it up. This patch is a year old, and I don't remember what it was supposed to fix. The Gallium geometry shader code has changed significantly in the last year, and it should be safe to leave this patch unmerged. If any problems come up as a result, they can be addressed then. - Original Message - From: Bryan Cain bryanca...@gmail.com Before, it accounted for the size of the vertices but not the other fields in the vertex_header struct, which caused memory corruption. --- src/gallium/auxiliary/draw/draw_gs.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/gallium/auxiliary/draw/draw_gs.c b/src/gallium/auxiliary/draw/draw_gs.c index cd63e2b..78727c6 100644 --- a/src/gallium/auxiliary/draw/draw_gs.c +++ b/src/gallium/auxiliary/draw/draw_gs.c @@ -560,7 +560,8 @@ int draw_geometry_shader_run(struct draw_geometry_shader *shader, /* we allocate exactly one extra vertex per primitive to allow the GS to emit * overflown vertices into some area where they won't harm anyone */ output_verts-verts = - (struct vertex_header *)MALLOC(output_verts-vertex_size * + (struct vertex_header *)MALLOC(sizeof(struct vertex_header) + + output_verts-vertex_size * max_out_prims * shader-primitive_boundary); -- 1.8.3.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] gallium/tgsi: clarify (possibly change) TGSI_OPCODE_UCMP definition
On 05/08/2013 03:26 PM, Roland Scheidegger wrote: There's some other somewhat bogus code with comments saying UCMP should be used instead of some poor emulation with i2f instructions. Yeah, sorry about that. I wrote that code (and comment) a few days before the UCMP opcode was added to TGSI, and didn't notice until today that it was never updated to actually use UCMP. There are a few changes and clean-ups I've been wanting to make to glsl_to_tgsi anyway - I'll include an update to emit UCMP if someone hasn't already done so by the time I finish that and send out a patch set. Bryan ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] nouveau: emit and flush fence in fence_signalled if needed
The Mesa state tracker expects us to emit the fence even if it doesn't call fence_finish. Notably, this occurs when glClientWaitSync is called with timeout 0. Fixes Portal and Left 4 Dead 2, which were both stalling on startup by repeatedly calling glClientWaitSync with timeout 0 while waiting for commands to complete. --- src/gallium/drivers/nouveau/nouveau_fence.c | 36 ++- src/gallium/drivers/nouveau/nouveau_fence.h |1 + 2 files changed, 25 insertions(+), 12 deletions(-) diff --git a/src/gallium/drivers/nouveau/nouveau_fence.c b/src/gallium/drivers/nouveau/nouveau_fence.c index dea146c..722be01 100644 --- a/src/gallium/drivers/nouveau/nouveau_fence.c +++ b/src/gallium/drivers/nouveau/nouveau_fence.c @@ -167,6 +167,25 @@ nouveau_fence_update(struct nouveau_screen *screen, boolean flushed) } } +boolean +nouveau_fence_ensure_flushed(struct nouveau_fence *fence) +{ + struct nouveau_screen *screen = fence-screen; + + if (fence-state NOUVEAU_FENCE_STATE_EMITTED) { + nouveau_fence_emit(fence); + + if (fence == screen-fence.current) + nouveau_fence_new(screen, screen-fence.current, FALSE); + } + if (fence-state NOUVEAU_FENCE_STATE_FLUSHED) { + if (nouveau_pushbuf_kick(screen-pushbuf, screen-pushbuf-channel)) + return FALSE; + } + + return TRUE; +} + #define NOUVEAU_FENCE_MAX_SPINS (1 31) boolean @@ -174,8 +193,9 @@ nouveau_fence_signalled(struct nouveau_fence *fence) { struct nouveau_screen *screen = fence-screen; - if (fence-state = NOUVEAU_FENCE_STATE_EMITTED) - nouveau_fence_update(screen, FALSE); + if (!nouveau_fence_ensure_flushed(fence)) + return FALSE; + nouveau_fence_update(screen, FALSE); return fence-state == NOUVEAU_FENCE_STATE_SIGNALLED; } @@ -189,16 +209,8 @@ nouveau_fence_wait(struct nouveau_fence *fence) /* wtf, someone is waiting on a fence in flush_notify handler? */ assert(fence-state != NOUVEAU_FENCE_STATE_EMITTING); - if (fence-state NOUVEAU_FENCE_STATE_EMITTED) { - nouveau_fence_emit(fence); - - if (fence == screen-fence.current) - nouveau_fence_new(screen, screen-fence.current, FALSE); - } - if (fence-state NOUVEAU_FENCE_STATE_FLUSHED) { - if (nouveau_pushbuf_kick(screen-pushbuf, screen-pushbuf-channel)) - return FALSE; - } + if (!nouveau_fence_ensure_flushed(fence)) + return FALSE; do { nouveau_fence_update(screen, FALSE); diff --git a/src/gallium/drivers/nouveau/nouveau_fence.h b/src/gallium/drivers/nouveau/nouveau_fence.h index 3984a9a..d497c7f 100644 --- a/src/gallium/drivers/nouveau/nouveau_fence.h +++ b/src/gallium/drivers/nouveau/nouveau_fence.h @@ -34,6 +34,7 @@ boolean nouveau_fence_new(struct nouveau_screen *, struct nouveau_fence **, boolean nouveau_fence_work(struct nouveau_fence *, void (*)(void *), void *); voidnouveau_fence_update(struct nouveau_screen *, boolean flushed); voidnouveau_fence_next(struct nouveau_screen *); +boolean nouveau_fence_ensure_flushed(struct nouveau_fence *); boolean nouveau_fence_wait(struct nouveau_fence *); boolean nouveau_fence_signalled(struct nouveau_fence *); -- 1.7.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] mesa/st: Don't copy propagate from swizzles.
On 04/20/2013 12:40 PM, Fabian Bieler wrote: Do not propagate a copy if source and destination are identical. Otherwise code like MOV TEMP[0].xyzw, TEMP[0].wzyx mov TEMP[1].xyzw, TEMP[0].xyzw is changed to MOV TEMP[0].xyzw, TEMP[0].wzyx mov TEMP[1].xyzw, TEMP[0].wzyx --- src/mesa/state_tracker/st_glsl_to_tgsi.cpp |2 ++ 1 file changed, 2 insertions(+) diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp index f2eb3e7..b5d0534 100644 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp @@ -3544,6 +3544,8 @@ glsl_to_tgsi_visitor::copy_propagate(void) /* If this is a copy, add it to the ACP. */ if (inst-op == TGSI_OPCODE_MOV inst-dst.file == PROGRAM_TEMPORARY + !(inst-dst.file == inst-src[0].file + inst-dst.index == inst-src[0].index) !inst-dst.reladdr !inst-saturate !inst-src[0].reladdr Nice catch. FYI, it seems that the almost identical copy_progagate function in ir_to_mesa also needs this fix. Bryan ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] Geometry shader support for nv50
The following patch set makes the necessary changes to support geometry shaders in the nv50 driver. There are no piglit tests yet for geometry shader corner cases yet, so these changes were tested with all of the GS demos in mesa/demos and several corner case tests, using Paul Berry's gs branch with support for GL_ARB_geometry_shader4. I have also confirmed that the set does not regress any of the piglit tests in tests/shaders. Although this set updates the nvc0 driver to handle the indirect addressing changes in nv50_ir_from_tgsi, it hasn't been tested yet since I don't have the hardware. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/3] nv50/ir: fix PFETCH and add RDSV to get VSTRIDE for GPs
From: Christoph Bumiller e0425...@student.tuwien.ac.at v2 (Bryan Cain bryanca...@gmail.com): Fix emission of PFETCH instructions using the SHL form. --- src/gallium/drivers/nv50/codegen/nv50_ir.h |1 + .../drivers/nv50/codegen/nv50_ir_emit_nv50.cpp | 62 ++-- src/gallium/drivers/nv50/codegen/nv50_ir_print.cpp |1 + 3 files changed, 59 insertions(+), 5 deletions(-) diff --git a/src/gallium/drivers/nv50/codegen/nv50_ir.h b/src/gallium/drivers/nv50/codegen/nv50_ir.h index ae365af..9d29b34 100644 --- a/src/gallium/drivers/nv50/codegen/nv50_ir.h +++ b/src/gallium/drivers/nv50/codegen/nv50_ir.h @@ -366,6 +366,7 @@ enum SVSemantic SV_CLOCK, SV_LBASE, SV_SBASE, + SV_VERTEX_STRIDE, SV_UNDEFINED, SV_LAST }; diff --git a/src/gallium/drivers/nv50/codegen/nv50_ir_emit_nv50.cpp b/src/gallium/drivers/nv50/codegen/nv50_ir_emit_nv50.cpp index bc5a833..67b6298 100644 --- a/src/gallium/drivers/nv50/codegen/nv50_ir_emit_nv50.cpp +++ b/src/gallium/drivers/nv50/codegen/nv50_ir_emit_nv50.cpp @@ -87,6 +87,7 @@ private: void emitLOAD(const Instruction *); void emitSTORE(const Instruction *); void emitMOV(const Instruction *); + void emitRDSV(const Instruction *); void emitNOP(); void emitINTERP(const Instruction *); void emitPFETCH(const Instruction *); @@ -772,6 +773,29 @@ CodeEmitterNV50::emitMOV(const Instruction *i) } } +static inline uint8_t getSRegEncoding(const ValueRef ref) +{ + switch (SDATA(ref).sv.sv) { + case SV_PHYSID:return 0; + case SV_CLOCK: return 1; + case SV_VERTEX_STRIDE: return 3; +// case SV_PM_COUNTER:return 4 + SDATA(ref).sv.index; + case SV_SAMPLE_INDEX: return 8; + default: + assert(!no sreg for system value); + return 0; + } +} + +void +CodeEmitterNV50::emitRDSV(const Instruction *i) +{ + code[0] = 0x0001; + code[1] = 0x6000 | (getSRegEncoding(i-src(0)) 14); + defId(i-def(0), 2); + emitFlagsRd(i); +} + void CodeEmitterNV50::emitNOP() { @@ -794,15 +818,40 @@ CodeEmitterNV50::emitQUADOP(const Instruction *i, uint8_t lane, uint8_t quOp) srcId(i-src(0), 32 + 14); } +/* NOTE: This returns the base address of a vertex inside the primitive. + * src0 is an immediate, the index (not offset) of the vertex + * inside the primitive. XXX: signed or unsigned ? + * src1 (may be NULL) should use whatever units the hardware requires + * (on nv50 this is bytes, so, relative index * 4; signed 16 bit value). + */ void CodeEmitterNV50::emitPFETCH(const Instruction *i) { - code[0] = 0x1181; - code[1] = 0x0420 | (0xf 14); + const uint32_t prim = i-src(0).get()-reg.data.u32; + assert(prim = 127); - defId(i-def(0), 2); - srcAddr8(i-src(0), 9); - setAReg16(i, 0); + if (i-def(0).getFile() == FILE_ADDRESS) { + // shl $aX a[] 0 + code[0] = 0x0001 | ((DDATA(i-def(0)).id + 1) 2); + code[1] = 0xc020; + code[0] |= prim 7; + assert(!i-srcExists(1)); + } else + if (i-srcExists(1)) { + // ld b32 $rX a[$aX+base] + code[0] = 0x0001; + code[1] = 0x0420 | (0xf 14); + defId(i-def(0), 2); + code[0] |= prim 9; + setARegBits(SDATA(i-src(1)).id + 1); + } else { + // mov b32 $rX a[] + code[0] = 0x1001; + code[1] = 0x0420 | (0xf 14); + defId(i-def(0), 2); + code[0] |= prim 9; + } + emitFlagsRd(i); } void @@ -1620,6 +1669,9 @@ CodeEmitterNV50::emitInstruction(Instruction *insn) case OP_PFETCH: emitPFETCH(insn); break; + case OP_RDSV: + emitRDSV(insn); + break; case OP_LINTERP: case OP_PINTERP: emitINTERP(insn); diff --git a/src/gallium/drivers/nv50/codegen/nv50_ir_print.cpp b/src/gallium/drivers/nv50/codegen/nv50_ir_print.cpp index c7121bf..743e566 100644 --- a/src/gallium/drivers/nv50/codegen/nv50_ir_print.cpp +++ b/src/gallium/drivers/nv50/codegen/nv50_ir_print.cpp @@ -265,6 +265,7 @@ static const char *SemanticStr[SV_LAST + 1] = CLOCK, LBASE, SBASE, + VERTEX_STRIDE, ?, (INVALID) }; -- 1.7.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/3] nv50/ir: delay calculation of indirect addresses
Instead of emitting an SHL 4 io an address register on the TGSI ARL and UARL instructions, emit the shift when the loaded address is actually used. This is necessary because input vertex and attribute indices in geometry shaders on nv50 need to be shifted left by 2 instead of 4. --- .../drivers/nv50/codegen/nv50_ir_from_tgsi.cpp | 34 +-- .../drivers/nv50/codegen/nv50_ir_lowering_nv50.cpp | 97 +++- .../drivers/nvc0/codegen/nv50_ir_lowering_nvc0.cpp | 13 +++ 3 files changed, 131 insertions(+), 13 deletions(-) diff --git a/src/gallium/drivers/nv50/codegen/nv50_ir_from_tgsi.cpp b/src/gallium/drivers/nv50/codegen/nv50_ir_from_tgsi.cpp index d8abccd..786328a 100644 --- a/src/gallium/drivers/nv50/codegen/nv50_ir_from_tgsi.cpp +++ b/src/gallium/drivers/nv50/codegen/nv50_ir_from_tgsi.cpp @@ -1114,6 +1114,7 @@ private: }; Value *getVertexBase(int s); + Value *getIndirectAddress(Value *ptr); DataArray *getArrayForFile(unsigned file, int idx); Value *fetchSrc(int s, int c); Value *acquireDst(int d, int c); @@ -1331,7 +1332,8 @@ Converter::getVertexBase(int s) if (tgsi.getSrc(s).isIndirect(1)) rel = fetchSrc(tgsi.getSrc(s).getIndirect(1), 0, NULL); vtxBaseValid |= 1 s; - vtxBase[s] = mkOp2v(OP_PFETCH, TYPE_U32, getSSA(), mkImm(index), rel); + vtxBase[s] = mkOp2v(OP_PFETCH, TYPE_U32, getSSA(2, FILE_ADDRESS), + mkImm(index), rel); } return vtxBase[s]; } @@ -1390,6 +1392,14 @@ Converter::getArrayForFile(unsigned file, int idx) } Value * +Converter::getIndirectAddress(Value *ptr) +{ + if(!ptr) + return NULL; + return mkOp2v(OP_SHL, TYPE_U32, getSSA(2, FILE_ADDRESS), ptr, mkImm(4)); +} + +Value * Converter::fetchSrc(tgsi::Instruction::SrcRegister src, int c, Value *ptr) { const int idx2d = src.is2D() ? src.getIndex(1) : 0; @@ -1401,7 +1411,7 @@ Converter::fetchSrc(tgsi::Instruction::SrcRegister src, int c, Value *ptr) assert(!ptr); return loadImm(NULL, info-immd.data[idx * 4 + swz]); case TGSI_FILE_CONSTANT: - return mkLoadv(TYPE_U32, srcToSym(src, c), ptr); + return mkLoadv(TYPE_U32, srcToSym(src, c), getIndirectAddress(ptr)); case TGSI_FILE_INPUT: if (prog-getType() == Program::TYPE_FRAGMENT) { // don't load masked inputs, won't be assigned a slot @@ -1409,9 +1419,13 @@ Converter::fetchSrc(tgsi::Instruction::SrcRegister src, int c, Value *ptr) return loadImm(NULL, swz == TGSI_SWIZZLE_W ? 1.0f : 0.0f); if (!ptr info-in[idx].sn == TGSI_SEMANTIC_FACE) return mkOp1v(OP_RDSV, TYPE_F32, getSSA(), mkSysVal(SV_FACE, 0)); - return interpolate(src, c, ptr); + return interpolate(src, c, getIndirectAddress(ptr)); } - return mkLoadv(TYPE_U32, srcToSym(src, c), ptr); + else if (ptr prog-getType() == Program::TYPE_GEOMETRY) + // nv50 and nvc0 need different things here, so let the lowering + // passes decide what to do with the address + return mkLoadv(TYPE_U32, srcToSym(src, c), ptr); + return mkLoadv(TYPE_U32, srcToSym(src, c), getIndirectAddress(ptr)); case TGSI_FILE_OUTPUT: assert(!load from output file); return NULL; @@ -1420,7 +1434,7 @@ Converter::fetchSrc(tgsi::Instruction::SrcRegister src, int c, Value *ptr) return mkOp1v(OP_RDSV, TYPE_U32, getSSA(), srcToSym(src, c)); default: return getArrayForFile(src.getFile(), idx2d)-load( - sub.cur-values, idx, swz, ptr); + sub.cur-values, idx, swz, getIndirectAddress(ptr)); } } @@ -1463,8 +1477,9 @@ Converter::storeDst(int d, int c, Value *val) break; } - Value *ptr = dst.isIndirect(0) ? - fetchSrc(dst.getIndirect(0), 0, NULL) : NULL; + Value *ptr = NULL; + if (dst.isIndirect(0)) + ptr = getIndirectAddress(fetchSrc(dst.getIndirect(0), 0, NULL)); if (info-io.genUserClip 0 dst.getFile() == TGSI_FILE_OUTPUT @@ -2166,12 +2181,11 @@ Converter::handleInstruction(const struct tgsi_full_instruction *insn) FOR_EACH_DST_ENABLED_CHANNEL(0, c, tgsi) { src0 = fetchSrc(0, c); mkCvt(OP_CVT, TYPE_S32, dst0[c], TYPE_F32, src0)-rnd = ROUND_M; - mkOp2(OP_SHL, TYPE_U32, dst0[c], dst0[c], mkImm(4)); } break; case TGSI_OPCODE_UARL: FOR_EACH_DST_ENABLED_CHANNEL(0, c, tgsi) - mkOp2(OP_SHL, TYPE_U32, dst0[c], fetchSrc(0, c), mkImm(4)); + mkOp1(OP_MOV, TYPE_U32, dst0[c], fetchSrc(0, c)); break; case TGSI_OPCODE_EX2: case TGSI_OPCODE_LG2: @@ -2704,7 +2718,7 @@ Converter::Converter(Program *ir, const tgsi::Source *code) : BuildUtil(ir), tData.setup(TGSI_FILE_TEMPORARY, 0, 0, tSize, 4, 4, tFile, 0); pData.setup(TGSI_FILE_PREDICATE, 0, 0, pSize, 4, 4, FILE_PREDICATE, 0); - aData.setup(TGSI_FILE_ADDRESS, 0, 0, aSize, 4, 4, FILE_ADDRESS, 0); + aData.setup(TGSI_FILE_ADDRESS, 0, 0, aSize, 4, 4, FILE_GPR, 0);
[Mesa-dev] [PATCH 3/3] nv50: add support for geometry shaders
Layer output probably doesn't work yet, but other than that everything seems to be working. --- .../drivers/nv50/codegen/nv50_ir_emit_nv50.cpp | 25 +++- src/gallium/drivers/nv50/nv50_program.c| 17 + src/gallium/drivers/nv50/nv50_shader_state.c |2 ++ src/gallium/drivers/nv50/nv50_tex.c|2 ++ 4 files changed, 40 insertions(+), 6 deletions(-) diff --git a/src/gallium/drivers/nv50/codegen/nv50_ir_emit_nv50.cpp b/src/gallium/drivers/nv50/codegen/nv50_ir_emit_nv50.cpp index 67b6298..d1d8b52 100644 --- a/src/gallium/drivers/nv50/codegen/nv50_ir_emit_nv50.cpp +++ b/src/gallium/drivers/nv50/codegen/nv50_ir_emit_nv50.cpp @@ -493,7 +493,12 @@ CodeEmitterNV50::emitForm_MAD(const Instruction *i) setSrc(i, 1, 1); setSrc(i, 2, 2); - setAReg16(i, 1); + if (i-getIndirect(0, 0)) { + assert(!i-getIndirect(1, 0)); + setAReg16(i, 0); + } + else + setAReg16(i, 1); } // like default form, but 2nd source in slot 2, and no 3rd source @@ -512,7 +517,12 @@ CodeEmitterNV50::emitForm_ADD(const Instruction *i) setSrc(i, 0, 0); setSrc(i, 1, 2); - setAReg16(i, 1); + if (i-getIndirect(0, 0)) { + assert(!i-getIndirect(1, 0)); + setAReg16(i, 0); + } + else + setAReg16(i, 1); } // default short form (rr, ar, rc, gr) @@ -602,8 +612,11 @@ CodeEmitterNV50::emitLOAD(const Instruction *i) switch (sf) { case FILE_SHADER_INPUT: - // use 'mov' where we can - code[0] = i-src(0).isIndirect(0) ? 0x0001 : 0x1001; + if(progType == Program::TYPE_GEOMETRY) + code[0] = 0x1181; + else + // use 'mov' where we can + code[0] = i-src(0).isIndirect(0) ? 0x0001 : 0x1001; code[1] = 0x0020 | (i-lanes 14); if (typeSizeof(i-dType) == 4) code[1] |= 0x0400; @@ -1399,8 +1412,8 @@ CodeEmitterNV50::emitShift(const Instruction *i) void CodeEmitterNV50::emitOUT(const Instruction *i) { - code[0] = (i-op == OP_EMIT) ? 0xf200 : 0xf400; - code[1] = 0xc001; + code[0] = (i-op == OP_EMIT) ? 0xf201 : 0xf401; + code[1] = 0xc000; emitFlagsRd(i); } diff --git a/src/gallium/drivers/nv50/nv50_program.c b/src/gallium/drivers/nv50/nv50_program.c index c17ffdc..1229002 100644 --- a/src/gallium/drivers/nv50/nv50_program.c +++ b/src/gallium/drivers/nv50/nv50_program.c @@ -359,6 +359,23 @@ nv50_program_translate(struct nv50_program *prog, uint16_t chipset) if (info-prop.fp.usesDiscard) prog-fp.flags[0] |= NV50_3D_FP_CONTROL_USES_KIL; } + else if (prog-type == PIPE_SHADER_GEOMETRY) { + switch(info-prop.gp.outputPrim) { + case PIPE_PRIM_POINTS: + prog-gp.prim_type = NV50_3D_GP_OUTPUT_PRIMITIVE_TYPE_POINTS; + break; + case PIPE_PRIM_LINE_STRIP: + prog-gp.prim_type = NV50_3D_GP_OUTPUT_PRIMITIVE_TYPE_LINE_STRIP; + break; + case PIPE_PRIM_TRIANGLE_STRIP: + prog-gp.prim_type = NV50_3D_GP_OUTPUT_PRIMITIVE_TYPE_TRIANGLE_STRIP; + break; + default: + assert(0); + break; + } + prog-gp.vert_count = info-prop.gp.maxVertices; + } if (prog-pipe.stream_output.num_outputs) prog-so = nv50_program_create_strmout_state(info, diff --git a/src/gallium/drivers/nv50/nv50_shader_state.c b/src/gallium/drivers/nv50/nv50_shader_state.c index 7f05243..613d257 100644 --- a/src/gallium/drivers/nv50/nv50_shader_state.c +++ b/src/gallium/drivers/nv50/nv50_shader_state.c @@ -193,6 +193,8 @@ nv50_gmtyprog_validate(struct nv50_context *nv50) struct nv50_program *gp = nv50-gmtyprog; if (gp) { + if(!nv50_program_validate(nv50, gp)) + return; BEGIN_NV04(push, NV50_3D(GP_REG_ALLOC_TEMP), 1); PUSH_DATA (push, gp-max_gpr); BEGIN_NV04(push, NV50_3D(GP_REG_ALLOC_RESULT), 1); diff --git a/src/gallium/drivers/nv50/nv50_tex.c b/src/gallium/drivers/nv50/nv50_tex.c index 40b264d..be70fda 100644 --- a/src/gallium/drivers/nv50/nv50_tex.c +++ b/src/gallium/drivers/nv50/nv50_tex.c @@ -293,6 +293,7 @@ void nv50_validate_textures(struct nv50_context *nv50) boolean need_flush; need_flush = nv50_validate_tic(nv50, 0); + need_flush = nv50_validate_tic(nv50, 1); need_flush |= nv50_validate_tic(nv50, 2); if (need_flush) { @@ -343,6 +344,7 @@ void nv50_validate_samplers(struct nv50_context *nv50) boolean need_flush; need_flush = nv50_validate_tsc(nv50, 0); + need_flush = nv50_validate_tsc(nv50, 1); need_flush |= nv50_validate_tsc(nv50, 2); if (need_flush) { -- 1.7.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] GS plans?
On 01/29/2013 12:44 PM, Paul Berry wrote: On 29 January 2013 09:33, Ian Romanick i...@freedesktop.org mailto:i...@freedesktop.org wrote: On 01/29/2013 09:02 AM, Brian Paul wrote: Hi Bryan, Back in July you announced your work on geometry shaders: http://lists.freedesktop.org/archives/mesa-dev/2012-July/024792.html But it looks like it hasn't been touched since October. I was wondering what plans you might have for that. I believe the Intel guys also were planning on tackling GS support in Mesa this year. Ian, Eric, do you have some idea of when you'll be working on that? Were you going to use Bryan's work? I believe Paul is going to chip away at a bit of it. I think he's planning to use some of Bryan's work. I believe they've been in contact, but I'm not 100% sure. Yes, we have. It looks like Bryan made a lot of headway before he set the branch aside in October, so I'm planning to use that as a starting point. My impression from talking to Bryan is that he doesn't have too much time/interest to put into geometry shaders these days, so for the moment I'm planning to take over responsibility for the branch myself. My first order of business is to write some piglit tests for geometry shaders, so that I don't have to push anything to Mesa master that isn't well-tested. I have an nVidia system that I can use as a reference platform to make sure my tests are correct. I also want to spend a little bit of time prototyping some i965 back-end code just to increase my familiarity with the problem domain and to find out if i965 hardware has any sharp corners I need to watch out for. Once I've done those two things, my plan is to start rebasing Bryan's patch series onto master and sending it out for review piece by piece. It's true that I don't have too much time/interest to put into geometry shaders anymore, but before you get to the rebasing part of your plan, I do hope to have reorganized my the commits from my geometry-shaders-rebase branch into a more modular form which would be suitable for review on the mailing list as a patch set. I've already started the process of figuring out what this should look like, so if all goes well it will be ready by the time it's needed. I'll notify you (and the mailing list) when it's ready. Bryan ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glsl_to_tgsi: set correct register type for array and structure elements
On 10/29/2012 06:22 AM, Andreas Boll wrote: 2012/10/24 Andreas Boll andreas.boll@gmail.com: 2012/10/24 Marek Olšák mar...@gmail.com: On Tue, Oct 23, 2012 at 11:52 PM, Bryan Cain bryanca...@gmail.com wrote: On 10/23/2012 04:50 PM, Brian Paul wrote: On 10/23/2012 10:58 AM, Bryan Cain wrote: This fixes an issue where glsl_to_tgsi_visior::get_opcode() would emit the wrong opcode because the register type was GLSL_TYPE_ARRAY/STRUCT instead of GLSL_TYPE_FLOAT/INT/UINT/BOOL, so the function would use the float opcodes for operations on integer or boolean values dereferenced from an array or structure. Assertions have been added to get_opcode() to prevent this bug from reappearing in the future. --- src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 21 +++-- 1 file changed, 19 insertions(+), 2 deletions(-) diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp index 9146f24..cefc568 100644 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp @@ -633,6 +633,11 @@ glsl_to_tgsi_visitor::get_opcode(ir_instruction *ir, unsigned op, { int type = GLSL_TYPE_FLOAT; + assert(src0.type != GLSL_TYPE_ARRAY); + assert(src0.type != GLSL_TYPE_STRUCT); + assert(src1.type != GLSL_TYPE_ARRAY); + assert(src1.type != GLSL_TYPE_STRUCT); + if (src0.type == GLSL_TYPE_FLOAT || src1.type == GLSL_TYPE_FLOAT) type = GLSL_TYPE_FLOAT; else if (native_integers) @@ -1074,8 +1079,12 @@ glsl_to_tgsi_visitor::visit(ir_variable *ir) assert(index == storage-index + (int)i); } } else { -st_src_reg src(PROGRAM_STATE_VAR, index, - native_integers ? ir-type-base_type : GLSL_TYPE_FLOAT); + /* We use GLSL_TYPE_FLOAT here regardless of the actual type of + * the data being moved since MOV does not care about the type of + * data it is moving, and we don't want to declare registers with + * array or struct types. + */ +st_src_reg src(PROGRAM_STATE_VAR, index, GLSL_TYPE_FLOAT); src.swizzle = slots[i].swizzle; emit(ir, TGSI_OPCODE_MOV, dst, src); /* even a float takes up a whole vec4 reg in a struct/array. */ @@ -2042,6 +2051,9 @@ glsl_to_tgsi_visitor::visit(ir_dereference_array *ir) else src.swizzle = SWIZZLE_NOOP; + /* Change the register type to the element type of the array. */ + src.type = ir-type-base_type; + this-result = src; } @@ -2067,6 +2079,7 @@ glsl_to_tgsi_visitor::visit(ir_dereference_record *ir) this-result.swizzle = SWIZZLE_NOOP; this-result.index += offset; + this-result.type = ir-type-base_type; } /** @@ -2286,6 +2299,10 @@ glsl_to_tgsi_visitor::visit(ir_assignment *ir) inst-dead_mask = inst-dst.writemask; } else { for (i = 0; i type_size(ir-lhs-type); i++) { + if (ir-rhs-type-is_array()) + r.type = ir-rhs-type-element_type()-base_type; + else if (ir-rhs-type-is_record()) + r.type = ir-rhs-type-fields.structure[i].type-base_type; emit(ir, TGSI_OPCODE_MOV, l, r); l.index++; r.index++; Thanks. That seems to fix the test program that Jose posted to the piglit list (vs-all-equal-bool-array). Did you do a full piglit regression test? If so: Reviewed-by: Brian Paul bri...@vmware.com But we should probably note this as a candidate for the stable branches. -Brian I did a piglit regression test with -t shaders, which I think should cover everything. And yes, this should be a candidate for the stable branches. I just forgot to mention that in the commit message. actually is -t shaders actually covers a very small subset of all GLSL tests. Most of them are in the spec group. Marek ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev Tested-by: Andreas Boll andreas.boll@gmail.com I found no regression in piglit (quick-driver) on r600g (rv770), llvmpipe and softpipe. This patch fixes the following piglit tests: {fs,vs}-bool-array on r600g and softpipe {fs,vs}-int-array on r600g and softpipe vs-all-equal-bool-array on llvmpipe actually isThose tests pass for me on mesa 8.0.4. So this patch actually fixes regressions introduced somewhere on the road. Nice work Bryan! Sorry I've pushed this commit accidentally. I've reverted it for now. But I think this patch is ready to land on master. (with the stable note, the reviewed-by and tested-by) Andreas. Yes, it is. I just haven't gotten around to pushing it yet. You can do it if you'd like. Bryan ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http
[Mesa-dev] [PATCH] glsl_to_tgsi: set correct register type for array and structure elements
This fixes an issue where glsl_to_tgsi_visior::get_opcode() would emit the wrong opcode because the register type was GLSL_TYPE_ARRAY/STRUCT instead of GLSL_TYPE_FLOAT/INT/UINT/BOOL, so the function would use the float opcodes for operations on integer or boolean values dereferenced from an array or structure. Assertions have been added to get_opcode() to prevent this bug from reappearing in the future. --- src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 21 +++-- 1 file changed, 19 insertions(+), 2 deletions(-) diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp index 9146f24..cefc568 100644 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp @@ -633,6 +633,11 @@ glsl_to_tgsi_visitor::get_opcode(ir_instruction *ir, unsigned op, { int type = GLSL_TYPE_FLOAT; + assert(src0.type != GLSL_TYPE_ARRAY); + assert(src0.type != GLSL_TYPE_STRUCT); + assert(src1.type != GLSL_TYPE_ARRAY); + assert(src1.type != GLSL_TYPE_STRUCT); + if (src0.type == GLSL_TYPE_FLOAT || src1.type == GLSL_TYPE_FLOAT) type = GLSL_TYPE_FLOAT; else if (native_integers) @@ -1074,8 +1079,12 @@ glsl_to_tgsi_visitor::visit(ir_variable *ir) assert(index == storage-index + (int)i); } } else { -st_src_reg src(PROGRAM_STATE_VAR, index, - native_integers ? ir-type-base_type : GLSL_TYPE_FLOAT); + /* We use GLSL_TYPE_FLOAT here regardless of the actual type of +* the data being moved since MOV does not care about the type of +* data it is moving, and we don't want to declare registers with +* array or struct types. +*/ +st_src_reg src(PROGRAM_STATE_VAR, index, GLSL_TYPE_FLOAT); src.swizzle = slots[i].swizzle; emit(ir, TGSI_OPCODE_MOV, dst, src); /* even a float takes up a whole vec4 reg in a struct/array. */ @@ -2042,6 +2051,9 @@ glsl_to_tgsi_visitor::visit(ir_dereference_array *ir) else src.swizzle = SWIZZLE_NOOP; + /* Change the register type to the element type of the array. */ + src.type = ir-type-base_type; + this-result = src; } @@ -2067,6 +2079,7 @@ glsl_to_tgsi_visitor::visit(ir_dereference_record *ir) this-result.swizzle = SWIZZLE_NOOP; this-result.index += offset; + this-result.type = ir-type-base_type; } /** @@ -2286,6 +2299,10 @@ glsl_to_tgsi_visitor::visit(ir_assignment *ir) inst-dead_mask = inst-dst.writemask; } else { for (i = 0; i type_size(ir-lhs-type); i++) { + if (ir-rhs-type-is_array()) + r.type = ir-rhs-type-element_type()-base_type; + else if (ir-rhs-type-is_record()) + r.type = ir-rhs-type-fields.structure[i].type-base_type; emit(ir, TGSI_OPCODE_MOV, l, r); l.index++; r.index++; -- 1.7.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] mishandling of bool comparisons in GLSL-TGSI translator
Hi. I'm still around to some extent, although I still need to submit the changes to the ML for my geometry shader work from a couple of months ago. I don't have much free time anymore, but I'll try to take a look at this soon. -Bryan On 10/15/2012 03:35 PM, Brian Paul wrote: Jose found a problem in the GLSL-TGSI translater with respect to boolean array comparisions when PIPE_SHADER_CAP_INTEGERS=1. I dug into it a bit. The attached shader_runner case shows the issue. Basically, the comparison of boolean values (which are unsigned integers) winds up being done with TGSI_OPCODE_SEQ instead of USEQ: VERT DCL IN[0] DCL OUT[0], POSITION DCL OUT[1], GENERIC[12] DCL CONST[0..1] DCL TEMP[0], LOCAL DCL TEMP[1], LOCAL IMM UINT32 {4294967295, 0, 0, 0} IMM FLT32 {1., 0., 0., 0.} 0: SEQ TEMP[0].x, CONST[0]., IMM[0].UINT operands! 1: F2I TEMP[0].x, -TEMP[0] 2: SEQ TEMP[1].x, CONST[1]., IMM[0].UINT operands! 3: F2I TEMP[1].x, -TEMP[1] 4: AND TEMP[0].x, TEMP[0]., TEMP[1]. 5: IF TEMP[0]. :0 6: MOV TEMP[0], IMM[1].xyyx 7: ELSE :0 8: MOV TEMP[0], IMM[1].yxyx 9: ENDIF 10: MOV OUT[1], TEMP[0] 11: MOV OUT[0], IN[0] 12: END The correct code sequence should be: 0: USEQ TEMP[0].x, CONST[0]., IMM[0]. 1: USEQ TEMP[1].x, CONST[1]., IMM[0]. 2: AND TEMP[0].x, TEMP[0]., TEMP[1]. 3: IF TEMP[0]. :0 4: MOV TEMP[0], IMM[1].xyyx 5: ELSE :0 6: MOV TEMP[0], IMM[1].yxyx 7: ENDIF 8: MOV OUT[1], TEMP[0] 9: MOV OUT[0], IN[0] 10: END The attached patch for glsl_to_tgsi_visitor::get_opcode() fixes the issue but the first assert which I added fails if it's enabled. AFAICT, by time we get into glsl_to_tgsi_visitor::get_opcode() we should have scalar operands, not arrays. It's not clear to me how to fix that. I'm hoping someone who's spent more time on glsl_to_tgsi.cpp (is Bryan Cain still around?) can take a look at this. Thanks. -Brian ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] glsl_to_tgsi: set correct register type for dereferenced array elements
--- src/mesa/state_tracker/st_glsl_to_tgsi.cpp |8 1 file changed, 8 insertions(+) diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp index 9146f24..caa7e94 100644 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp @@ -633,6 +633,11 @@ glsl_to_tgsi_visitor::get_opcode(ir_instruction *ir, unsigned op, { int type = GLSL_TYPE_FLOAT; + assert(src0.type != GLSL_TYPE_ARRAY); + assert(src0.type != GLSL_TYPE_STRUCT); + assert(src1.type != GLSL_TYPE_ARRAY); + assert(src1.type != GLSL_TYPE_STRUCT); + if (src0.type == GLSL_TYPE_FLOAT || src1.type == GLSL_TYPE_FLOAT) type = GLSL_TYPE_FLOAT; else if (native_integers) @@ -2042,6 +2047,9 @@ glsl_to_tgsi_visitor::visit(ir_dereference_array *ir) else src.swizzle = SWIZZLE_NOOP; + /* Change the register type to the element type of the array. */ + src.type = ir-array-type-element_type()-base_type; + this-result = src; } -- 1.7.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glsl_to_tgsi: set correct register type for dereferenced array elements
On 10/15/2012 06:47 PM, Brian Paul wrote: On 10/15/2012 04:06 PM, Bryan Cain wrote: --- src/mesa/state_tracker/st_glsl_to_tgsi.cpp |8 1 file changed, 8 insertions(+) diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp index 9146f24..caa7e94 100644 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp @@ -633,6 +633,11 @@ glsl_to_tgsi_visitor::get_opcode(ir_instruction *ir, unsigned op, { int type = GLSL_TYPE_FLOAT; + assert(src0.type != GLSL_TYPE_ARRAY); + assert(src0.type != GLSL_TYPE_STRUCT); + assert(src1.type != GLSL_TYPE_ARRAY); + assert(src1.type != GLSL_TYPE_STRUCT); + if (src0.type == GLSL_TYPE_FLOAT || src1.type == GLSL_TYPE_FLOAT) type = GLSL_TYPE_FLOAT; else if (native_integers) @@ -2042,6 +2047,9 @@ glsl_to_tgsi_visitor::visit(ir_dereference_array *ir) else src.swizzle = SWIZZLE_NOOP; + /* Change the register type to the element type of the array. */ + src.type = ir-array-type-element_type()-base_type; + this-result = src; } Thanks, but this causes a ton of piglit regressions (many crashes or failures on the new assertions). -Brian That's not surprising, since it was a quick and untested hack. I will look at the rest of the assertion failures the next time I have free time. --Bryan ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 0/16] Assorted gallium shader, sampler, sampler_view changes
On 08/09/2012 10:11 PM, Brian Paul wrote: The following patches are steps toward some gallium API clean-ups. 1. Eventually, replace pipe_context::bind_fragment/vertex/geometry/compute_sampler_states() with a single bind_sampler_states() entrypoint which takes a PIPE_SHADER_x to identify the shader stage and a 'start_slot' value like bind_compute_sampler_states() has. The later seemed to be agreed upon a few weeks ago. 2. Similarly for pipe_context::set_{shader}_sampler_views(). 3. Where possible, replace parallel arrays for vertex/fragment/geometry objects with a 2D array indexed by shader type. For example: replace: struct pipe_sampler_state *vert_samplers[MAX_SAMPLERS]; struct pipe_sampler_state *geom_samplers[MAX_SAMPLERS]; struct pipe_sampler_state *frag_samplers[MAX_SAMPLERS]; with: struct pipe_sampler_state *samplers[PIPE_SHADER_TYPES][MAX_SAMPLERS]; 4. Add support for geometry shader stuff in a few places like the state tracker. I've touched about half the drivers so far. There's a fair bit of work to be done before actually changing p_context.h -Brian Hi, I just wanted to say thank you for working on this. Getting sampling to work in my geometry shaders branch was something I was really not looking forward to, and once this work lands in master, it should allow me to finish my geometry shader work at least a couple of weeks before I could otherwise. -Bryan ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] Support for EXT/ARB_geometry_shader4
Some of you already know about this from IRC, but recently I've been working on adding support for GL_EXT_geometry_shader4 and GL_ARB_geometry_shader4 (which are essentially identical) to Mesa. A significant amount of the required code was already there from Zack Rusin's work on geometry shaders about 2 years ago (before the new GLSL compiler was merged), so the bulk of my changes are adding support to the GLSL compiler and glsl_to_tgsi, which is still not a small task. Anyway, as of yesterday, all of the GLSL demos in mesa/demos that use EXT/ARB_geometry_shader4 are working fully in my branch with softpipe. There are still some things that need doing and/or fixing before it should be considered for merging into master, but for now, my geometry shader work (including future updates) can be found in a branch of my Mesa repository on GitHub at: https://github.com/Plombo/mesa/tree/geometry-shaders . It's worth noting that geometry shaders in GL 3.2 (GLSL 1.50) core are slightly different than the geometry shaders in the EXT and ARB extensions. However, the main change in the core version is a change in the way inputs are accessed which is rather TGSI-unfriendly, so when GLSL 1.50 is implemented, we will have to lower GS inputs to the extension way of doing things. So all of this code is in fact necessary for GL 3.2 core even though it only implements the extensions. -Bryan ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] nv50/ir: set position before i instead of i-next in NV50LoweringPreSSA::visit
Fixes rendering glitches in Psychonauts such as Raz's eyes flickering white. Fixes https://bugs.freedesktop.org/show_bug.cgi?id=51962. --- .../drivers/nv50/codegen/nv50_ir_lowering_nv50.cpp |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/gallium/drivers/nv50/codegen/nv50_ir_lowering_nv50.cpp b/src/gallium/drivers/nv50/codegen/nv50_ir_lowering_nv50.cpp index 39e0cfa..b688172 100644 --- a/src/gallium/drivers/nv50/codegen/nv50_ir_lowering_nv50.cpp +++ b/src/gallium/drivers/nv50/codegen/nv50_ir_lowering_nv50.cpp @@ -1034,7 +1034,7 @@ NV50LoweringPreSSA::visit(Instruction *i) bld.setPosition(i-prev, true); else if (i-next) - bld.setPosition(i-next, false); + bld.setPosition(i, false); else bld.setPosition(i-bb, true); -- 1.7.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Mesa (master): docs: Update GL3.txt.
On 07/11/2012 12:24 AM, Eric Anholt wrote: Kenneth Graunke k...@kemper.freedesktop.org writes: inverse() has been done for a while. Does the inverse() builtin constant expression handling work for you? It doesn't here. None of us know what highp change means; GLSL 1.40 spec: Make the default precision qualification for fragment shader be high. -- it was also on our task list. Like the commit message said, precision qualifiers are entirely ignored by the GLSL compiler - they don't even make it to the IR stage. So there's no such thing as a default here since it doesn't have a value at all for any variable in the IR. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] st/mesa: fix transform feedback of unsubscripted gl_ClipDistance array
On 6/16/2012 5:43 PM, Marcin Slusarz wrote: gl_ClipDistance needs special treatment in form of lowering pass which transforms gl_ClipDistance representation from float[] to vec4[]. There are 2 implementations - at glsl linker level (enabled by LowerClipDistance option) and at glsl_to_tgsi level (enabled unconditionally for gallium drivers). Second implementation is incomplete - it does not take into account transform feedback (see commit 642e5b413e0890b2070ba78fde42db381eaf02e5 mesa: Fix transform feedback of unsubscripted gl_ClipDistance array for details). There are 2 possible fixes: - adding transform feedback support into glsl_to_tgsi version - ripping gl_ClipDistance support from glsl_to_tgsi and enabling gl_ClipDistance lowering on glsl linker side This patch implements 2nd option. All it does is: - reverts most of the commit 59be691638200797583bce39a83f641d30d97492 st/mesa: add support for gl_ClipDistance - changes LowerClipDistance to true Fixes Piglit tests EXT_transform_feedback/builtin-varyings gl_ClipDistance[{2,3,4,5,6,7,8}]-no-subscript on nv50. --- I can't say I know much about how transform feedback works, but is there a reason that the first fix would be difficult? It seems like a waste to not take advantage of hardware support for clip distances because the current implementation isn't complete. -Bryan ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] gallium: add an IABS opcode to TGSI
This is a necessary operation that is missing from TGSI. --- src/gallium/auxiliary/tgsi/tgsi_exec.c |4 src/gallium/auxiliary/tgsi/tgsi_info.c |1 + src/gallium/docs/source/tgsi.rst | 13 + src/gallium/include/pipe/p_shader_tokens.h |3 ++- 4 files changed, 20 insertions(+), 1 deletions(-) diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c b/src/gallium/auxiliary/tgsi/tgsi_exec.c index 7ea8511..3e2b899 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_exec.c +++ b/src/gallium/auxiliary/tgsi/tgsi_exec.c @@ -4193,6 +4193,10 @@ exec_instruction( exec_vector_trinary(mach, inst, micro_ucmp, TGSI_EXEC_DATA_UINT, TGSI_EXEC_DATA_UINT); break; + case TGSI_OPCODE_IABS: + exec_vector_unary(mach, inst, micro_iabs, TGSI_EXEC_DATA_INT, TGSI_EXEC_DATA_INT); + break; + default: assert( 0 ); } diff --git a/src/gallium/auxiliary/tgsi/tgsi_info.c b/src/gallium/auxiliary/tgsi/tgsi_info.c index 6cd580a..c9acdb9 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_info.c +++ b/src/gallium/auxiliary/tgsi/tgsi_info.c @@ -192,6 +192,7 @@ static const struct tgsi_opcode_info opcode_info[TGSI_OPCODE_LAST] = { 1, 1, 0, 0, 0, 0, UARL, TGSI_OPCODE_UARL }, { 1, 3, 0, 0, 0, 0, UCMP, TGSI_OPCODE_UCMP }, + { 1, 1, 0, 0, 0, 0, IABS, TGSI_OPCODE_IABS }, }; const struct tgsi_opcode_info * diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst index 45af528..7e7010f 100644 --- a/src/gallium/docs/source/tgsi.rst +++ b/src/gallium/docs/source/tgsi.rst @@ -1043,6 +1043,19 @@ XXX so let's discuss it, yeah? destination register, which is assumed to be an address (ADDR) register. +.. opcode:: IABS - Integer Absolute Value + +.. math:: + + dst.x = |src.x| + + dst.y = |src.y| + + dst.z = |src.z| + + dst.w = |src.w| + + .. opcode:: SAD - Sum Of Absolute Differences .. math:: diff --git a/src/gallium/include/pipe/p_shader_tokens.h b/src/gallium/include/pipe/p_shader_tokens.h index b24b64c..75e17a1 100644 --- a/src/gallium/include/pipe/p_shader_tokens.h +++ b/src/gallium/include/pipe/p_shader_tokens.h @@ -376,8 +376,9 @@ struct tgsi_property_data { #define TGSI_OPCODE_UARL157 #define TGSI_OPCODE_UCMP158 +#define TGSI_OPCODE_IABS159 -#define TGSI_OPCODE_LAST159 +#define TGSI_OPCODE_LAST160 #define TGSI_SAT_NONE0 /* do not saturate */ #define TGSI_SAT_ZERO_ONE1 /* clamp to [0,1] */ -- 1.7.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] softpipe GL3 status
On 01/07/2012 02:44 AM, Dave Airlie wrote: Let's add new opcodes for things like this. I'll add IABS. I sent a patch to the mailing list that adds IABS about an hour ago, before reading this message. Bryan ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/3] glsl_to_tgsi: Create a new variable_store class replacing variables field in glsl_to_tgsi_visitor
This is good work. I just have a few suggested changes. On 01/07/2012 12:26 PM, Vincent Lejeune wrote: --- src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 414 +--- 1 files changed, 309 insertions(+), 105 deletions(-) diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp index cecceca..17ae525 100644 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp @@ -230,14 +230,16 @@ public: class variable_storage : public exec_node { public: variable_storage(ir_variable *var, gl_register_file file, int index) - : file(file), index(index), var(var) + : file(file), index(index), type(var-type), is_array(var-type-is_array() || var-type-is_record() || var-type-is_matrix()), is_reladdressed(false) { /* empty */ } gl_register_file file; int index; - ir_variable *var; /* variable that maps to this, if any */ + const glsl_type *type; /* variable that maps to this, if any */ The comment there doesn't make sense anymore. + bool is_array; + bool is_reladdressed; }; class immediate_storage : public exec_node { @@ -286,6 +288,242 @@ public: st_src_reg return_reg; }; +static int type_size(const glsl_type *type); +static int swizzle_for_size(int size); + + +/** + * Single place to store all temporary values (either explicit - ir_variable* - + * or implicit - returned by get_temp -). I don't think the last - is supposed to be there. + * + * Explicit temps are stored in variables hash_table. + * Implicit temps are stored in rvalue_regs array. + */ +class variable_store { + friend class glsl_to_tgsi_variable_allocator; +protected: + void *mem_ctx; + hash_table* variables; + unsigned next_temp; + unsigned next_temp_array; + static void reindex_reladdress(const void *, void *, void *); + static void reindex_non_reladdress(const void *, void *, void *); + void reindex_rvalue(); + void reindex_rvalue_reladdressed(); + variable_storage* rvalue_regs; + unsigned rvalue_regs_count; + +public: + bool native_integers; + unsigned temp_amount() const; + unsigned temp_array_amount() const; + variable_store(); + ~variable_store(); + variable_storage *find_variable_storage(class ir_variable *var) const; + variable_storage *push(class ir_variable *, gl_register_file, int); + variable_storage *push(class ir_variable *); + variable_storage *retrieve_anonymous_temp(unsigned); + st_src_reg get_temp(const glsl_type *type); + void optimise_access(void); + unsigned *reindex_table; +}; + +unsigned +variable_store::temp_amount() const +{ + return next_temp; +} + +unsigned +variable_store::temp_array_amount() const +{ + return next_temp_array; +} + +variable_store::variable_store():mem_ctx(ralloc_context(NULL)),next_temp(1),next_temp_array(1),rvalue_regs(NULL),rvalue_regs_count(0) +{ + variables = hash_table_ctor(0,hash_table_pointer_hash,hash_table_pointer_compare); +} + +variable_store::~variable_store() +{ + hash_table_dtor(variables); + ralloc_free(mem_ctx); +} + +variable_storage * +variable_store::find_variable_storage(ir_variable *var) const +{ + return (class variable_storage *) hash_table_find(variables,var); The convention in the rest of glsl_to_tgsi (and Mesa) is to put a space after the comma separating arguments. I'd prefer it if this patch followed the same convention. +} + +variable_storage* +variable_store::push(class ir_variable *var, gl_register_file file, int index) +{ + variable_storage *storage = new (mem_ctx) variable_storage(var,file,index); + hash_table_insert(variables,storage,var); + return storage; +} + +variable_storage* +variable_store::push(ir_variable *ir) +{ + variable_storage* retval = push(ir, PROGRAM_TEMPORARY, next_temp); + next_temp += type_size(ir-type); + if (ir-type-is_array() || ir-type-is_record() || ir-type-is_matrix()) { + retval-is_array = true; + } + return retval; +} + +variable_storage* +variable_store::retrieve_anonymous_temp(unsigned reg) +{ + for (unsigned i = 0; i rvalue_regs_count; i++) { + unsigned range_start = rvalue_regs[i].index; + unsigned range_end = range_start + type_size(rvalue_regs[i].type) - 1; + if (reg = range_start reg = range_end) { + return rvalue_regs + i; + } + } + printf (Failed to get storage); + exit(1); I don't think writing to stdout and exiting cleanly like this is the right way to handle a fatal error. It should probably write the error message to stderr (making clear that it's a fatal error in Mesa, not an error in the application) and assert (so that a debugger can catch it), then use call exit(1). +} + +/** + * In the initial pass of codegen, we assign temporary numbers to + * intermediate results. (not SSA -- variable
Re: [Mesa-dev] softpipe GL3 status
On 01/06/2012 01:26 PM, Ian Romanick wrote: On 01/06/2012 09:04 AM, Dave Airlie wrote: Hi guys, Just a quick note, I've just spent a week or so trying to see where gallium and softpipe were w.r.t GL3.0 support. I've pushed a branch to my repo called softpipe-gl3. It contains patches in various state of usefulness but it brings the piglit results to 220 failures in 7623 tests, which isn't bad. Outstanding known problems (stuff I've dug into). smooth interpolation is broken in softpipe, worth about 70-100 fixes at a quick guess. integer abs - we have no TGSI representation for this, should we lower it to something? Or just generate some TGSI instructions to implement it. You should be able to fake it with a CMP-like instruction. I think that's how i915 does it in hardware. Depends on whether there's any hardware with a native integer abs instruciton. If there is, we should just add a new IABS instruction to TGSI and let drivers implement it how they want. Otherwise, your suggestion should work. integer SSG (set sign) - no TGSI for this, lower it? Where is SSG being generated? I thought ir_to_mesa was the only thing that generated it, and Gallium shouldn't hit that path. glsl_to_tgsi still generates the TGSI equivalent; that part hasn't been changed from ir_to_mesa. Bryan ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 0/2 v4] Add support for clip distances in Gallium
On 01/02/2012 02:58 PM, Bryan Cain wrote: This is the fourth revision of my changes to add support for gl_ClipDistance with Gallium. The first three revisions were sent in closer succession about two weeks ago. This revision is identical to v3 except for the inclusion of the changes suggested by Brian Paul in reply to the v3 patches. Does anyone mind if I push this? It's been on the list for two full days with no objections. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 0/2 v4] Add support for clip distances in Gallium
This is the fourth revision of my changes to add support for gl_ClipDistance with Gallium. The first three revisions were sent in closer succession about two weeks ago. This revision is identical to v3 except for the inclusion of the changes suggested by Brian Paul in reply to the v3 patches. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/2] gallium: add support for clip distances
--- src/gallium/auxiliary/tgsi/tgsi_dump.c |3 +- src/gallium/auxiliary/tgsi/tgsi_text.c |3 +- src/gallium/auxiliary/tgsi/tgsi_ureg.c | 38 +-- src/gallium/auxiliary/tgsi/tgsi_ureg.h |6 src/gallium/include/pipe/p_shader_tokens.h |3 +- 5 files changed, 41 insertions(+), 12 deletions(-) diff --git a/src/gallium/auxiliary/tgsi/tgsi_dump.c b/src/gallium/auxiliary/tgsi/tgsi_dump.c index e830aa5..bd299b0 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_dump.c +++ b/src/gallium/auxiliary/tgsi/tgsi_dump.c @@ -129,7 +129,8 @@ static const char *semantic_names[] = PRIM_ID, INSTANCEID, VERTEXID, - STENCIL + STENCIL, + CLIPDIST }; static const char *immediate_type_names[] = diff --git a/src/gallium/auxiliary/tgsi/tgsi_text.c b/src/gallium/auxiliary/tgsi/tgsi_text.c index eb9190c..f46ba19 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_text.c +++ b/src/gallium/auxiliary/tgsi/tgsi_text.c @@ -1024,7 +1024,8 @@ static const char *semantic_names[TGSI_SEMANTIC_COUNT] = PRIM_ID, INSTANCEID, VERTEXID, - STENCIL + STENCIL, + CLIPDIST }; static const char *interpolate_names[TGSI_INTERPOLATE_COUNT] = diff --git a/src/gallium/auxiliary/tgsi/tgsi_ureg.c b/src/gallium/auxiliary/tgsi/tgsi_ureg.c index 17f9ce2..0f9aa3a 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_ureg.c +++ b/src/gallium/auxiliary/tgsi/tgsi_ureg.c @@ -122,6 +122,7 @@ struct ureg_program struct { unsigned semantic_name; unsigned semantic_index; + unsigned usage_mask; /* = TGSI_WRITEMASK_* */ } output[UREG_MAX_OUTPUT]; unsigned nr_outputs; @@ -396,21 +397,27 @@ ureg_DECL_system_value(struct ureg_program *ureg, struct ureg_dst -ureg_DECL_output( struct ureg_program *ureg, - unsigned name, - unsigned index ) +ureg_DECL_output_masked( struct ureg_program *ureg, + unsigned name, + unsigned index, + unsigned usage_mask ) { unsigned i; + assert(usage_mask != 0); + for (i = 0; i ureg-nr_outputs; i++) { if (ureg-output[i].semantic_name == name - ureg-output[i].semantic_index == index) + ureg-output[i].semantic_index == index) { + ureg-output[i].usage_mask |= usage_mask; goto out; + } } if (ureg-nr_outputs UREG_MAX_OUTPUT) { ureg-output[i].semantic_name = name; ureg-output[i].semantic_index = index; + ureg-output[i].usage_mask = usage_mask; ureg-nr_outputs++; } else { @@ -422,6 +429,15 @@ out: } +struct ureg_dst +ureg_DECL_output( struct ureg_program *ureg, + unsigned name, + unsigned index ) +{ + return ureg_DECL_output_masked(ureg, name, index, TGSI_WRITEMASK_XYZW); +} + + /* Returns a new constant register. Keep track of which have been * referred to so that we can emit decls later. * @@ -1181,7 +1197,8 @@ emit_decl_semantic(struct ureg_program *ureg, unsigned file, unsigned index, unsigned semantic_name, - unsigned semantic_index) + unsigned semantic_index, + unsigned usage_mask) { union tgsi_any_token *out = get_tokens(ureg, DOMAIN_DECL, 3); @@ -1189,7 +1206,7 @@ emit_decl_semantic(struct ureg_program *ureg, out[0].decl.Type = TGSI_TOKEN_TYPE_DECLARATION; out[0].decl.NrTokens = 3; out[0].decl.File = file; - out[0].decl.UsageMask = TGSI_WRITEMASK_XYZW; /* FIXME! */ + out[0].decl.UsageMask = usage_mask; out[0].decl.Semantic = 1; out[1].value = 0; @@ -1427,7 +1444,8 @@ static void emit_decls( struct ureg_program *ureg ) TGSI_FILE_INPUT, ureg-gs_input[i].index, ureg-gs_input[i].semantic_name, -ureg-gs_input[i].semantic_index); +ureg-gs_input[i].semantic_index, +TGSI_WRITEMASK_XYZW); } } @@ -1436,7 +1454,8 @@ static void emit_decls( struct ureg_program *ureg ) TGSI_FILE_SYSTEM_VALUE, ureg-system_value[i].index, ureg-system_value[i].semantic_name, - ureg-system_value[i].semantic_index); + ureg-system_value[i].semantic_index, + TGSI_WRITEMASK_XYZW); } for (i = 0; i ureg-nr_outputs; i++) { @@ -1444,7 +1463,8 @@ static void emit_decls( struct ureg_program *ureg ) TGSI_FILE_OUTPUT, i, ureg-output[i].semantic_name, - ureg-output[i].semantic_index); + ureg-output[i].semantic_index, + ureg-output[i].usage_mask); } for (i = 0; i
[Mesa-dev] [PATCH 2/2] st/mesa: add support for gl_ClipDistance
--- src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 48 +--- src/mesa/state_tracker/st_program.c| 18 ++ 2 files changed, 61 insertions(+), 5 deletions(-) diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp index 1515fc1..bc3005e 100644 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp @@ -304,6 +304,7 @@ public: int samplers_used; bool indirect_addr_temps; bool indirect_addr_consts; + int num_clip_distances; int glsl_version; bool native_integers; @@ -4627,9 +4628,17 @@ st_translate_program( } for (i = 0; i numOutputs; i++) { - t-outputs[i] = ureg_DECL_output(ureg, - outputSemanticName[i], - outputSemanticIndex[i]); + if (outputSemanticName[i] == TGSI_SEMANTIC_CLIPDIST) { +int mask = ((1 (program-num_clip_distances - 4*outputSemanticIndex[i])) - 1) TGSI_WRITEMASK_XYZW; +t-outputs[i] = ureg_DECL_output_masked(ureg, +outputSemanticName[i], +outputSemanticIndex[i], +mask); + } else { +t-outputs[i] = ureg_DECL_output(ureg, + outputSemanticName[i], + outputSemanticIndex[i]); + } if ((outputSemanticName[i] == TGSI_SEMANTIC_PSIZE) proginfo-Id) { /* Writing to the point size result register requires special * handling to implement clamping. @@ -4806,7 +4815,8 @@ out: static struct gl_program * get_mesa_program(struct gl_context *ctx, struct gl_shader_program *shader_program, -struct gl_shader *shader) + struct gl_shader *shader, + int num_clip_distances) { glsl_to_tgsi_visitor* v = new glsl_to_tgsi_visitor(); struct gl_program *prog; @@ -4851,6 +4861,7 @@ get_mesa_program(struct gl_context *ctx, v-options = options; v-glsl_version = ctx-Const.GLSLVersion; v-native_integers = ctx-Const.NativeIntegers; + v-num_clip_distances = num_clip_distances; _mesa_generate_parameters_list_for_uniforms(shader_program, shader, prog-Parameters); @@ -4980,6 +4991,25 @@ get_mesa_program(struct gl_context *ctx, return prog; } +/** + * Searches through the IR for a declaration of gl_ClipDistance and returns the + * declared size of the gl_ClipDistance array. Returns 0 if gl_ClipDistance is + * not declared in the IR. + */ +int get_clip_distance_size(exec_list *ir) +{ + foreach_iter (exec_list_iterator, iter, *ir) { + ir_instruction *inst = (ir_instruction *)iter.get(); + ir_variable *var = inst-as_variable(); + if (var == NULL) continue; + if (!strcmp(var-name, gl_ClipDistance)) { + return var-type-length; + } + } + + return 0; +} + extern C { struct gl_shader * @@ -5018,6 +5048,7 @@ st_new_shader_program(struct gl_context *ctx, GLuint name) GLboolean st_link_shader(struct gl_context *ctx, struct gl_shader_program *prog) { + int num_clip_distances[MESA_SHADER_TYPES]; assert(prog-LinkStatus); for (unsigned i = 0; i MESA_SHADER_TYPES; i++) { @@ -5029,6 +5060,11 @@ st_link_shader(struct gl_context *ctx, struct gl_shader_program *prog) const struct gl_shader_compiler_options *options = ctx-ShaderCompilerOptions[_mesa_shader_type_to_index(prog-_LinkedShaders[i]-Type)]; + /* We have to determine the length of the gl_ClipDistance array before + * the array is lowered to two vec4s by lower_clip_distance(). + */ + num_clip_distances[i] = get_clip_distance_size(ir); + do { progress = false; @@ -5045,6 +5081,7 @@ st_link_shader(struct gl_context *ctx, struct gl_shader_program *prog) || progress; progress = lower_quadop_vector(ir, false) || progress; + progress = lower_clip_distance(ir) || progress; if (options-MaxIfDepth == 0) progress = lower_discard(ir) || progress; @@ -5079,7 +5116,8 @@ st_link_shader(struct gl_context *ctx, struct gl_shader_program *prog) if (prog-_LinkedShaders[i] == NULL) continue; - linked_prog = get_mesa_program(ctx, prog, prog-_LinkedShaders[i]); + linked_prog = get_mesa_program(ctx, prog, prog-_LinkedShaders[i], + num_clip_distances[i]); if (linked_prog) { static const GLenum targets[] = { diff --git a/src/mesa/state_tracker/st_program.c b/src/mesa/state_tracker/st_program.c index d62bfcd..aceaaf8 100644 --- a/src/mesa/state_tracker/st_program.c +++ b/src/mesa/state_tracker/st_program.c @@ -244,6
[Mesa-dev] [PATCH 1/2] gallium: add support for clip distances
--- src/gallium/auxiliary/tgsi/tgsi_dump.c |3 +- src/gallium/auxiliary/tgsi/tgsi_text.c |3 +- src/gallium/auxiliary/tgsi/tgsi_ureg.c | 36 +--- src/gallium/auxiliary/tgsi/tgsi_ureg.h |6 src/gallium/include/pipe/p_shader_tokens.h |3 +- 5 files changed, 39 insertions(+), 12 deletions(-) diff --git a/src/gallium/auxiliary/tgsi/tgsi_dump.c b/src/gallium/auxiliary/tgsi/tgsi_dump.c index e830aa5..bd299b0 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_dump.c +++ b/src/gallium/auxiliary/tgsi/tgsi_dump.c @@ -129,7 +129,8 @@ static const char *semantic_names[] = PRIM_ID, INSTANCEID, VERTEXID, - STENCIL + STENCIL, + CLIPDIST }; static const char *immediate_type_names[] = diff --git a/src/gallium/auxiliary/tgsi/tgsi_text.c b/src/gallium/auxiliary/tgsi/tgsi_text.c index eb9190c..f46ba19 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_text.c +++ b/src/gallium/auxiliary/tgsi/tgsi_text.c @@ -1024,7 +1024,8 @@ static const char *semantic_names[TGSI_SEMANTIC_COUNT] = PRIM_ID, INSTANCEID, VERTEXID, - STENCIL + STENCIL, + CLIPDIST }; static const char *interpolate_names[TGSI_INTERPOLATE_COUNT] = diff --git a/src/gallium/auxiliary/tgsi/tgsi_ureg.c b/src/gallium/auxiliary/tgsi/tgsi_ureg.c index 17f9ce2..56c4492 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_ureg.c +++ b/src/gallium/auxiliary/tgsi/tgsi_ureg.c @@ -122,6 +122,7 @@ struct ureg_program struct { unsigned semantic_name; unsigned semantic_index; + unsigned usage_mask; } output[UREG_MAX_OUTPUT]; unsigned nr_outputs; @@ -396,21 +397,25 @@ ureg_DECL_system_value(struct ureg_program *ureg, struct ureg_dst -ureg_DECL_output( struct ureg_program *ureg, - unsigned name, - unsigned index ) +ureg_DECL_output_masked( struct ureg_program *ureg, + unsigned name, + unsigned index, + unsigned usage_mask ) { unsigned i; for (i = 0; i ureg-nr_outputs; i++) { if (ureg-output[i].semantic_name == name - ureg-output[i].semantic_index == index) + ureg-output[i].semantic_index == index) { + ureg-output[i].usage_mask |= usage_mask; goto out; + } } if (ureg-nr_outputs UREG_MAX_OUTPUT) { ureg-output[i].semantic_name = name; ureg-output[i].semantic_index = index; + ureg-output[i].usage_mask = usage_mask; ureg-nr_outputs++; } else { @@ -422,6 +427,15 @@ out: } +struct ureg_dst +ureg_DECL_output( struct ureg_program *ureg, + unsigned name, + unsigned index ) +{ + return ureg_DECL_output_masked(ureg, name, index, TGSI_WRITEMASK_XYZW); +} + + /* Returns a new constant register. Keep track of which have been * referred to so that we can emit decls later. * @@ -1181,7 +1195,8 @@ emit_decl_semantic(struct ureg_program *ureg, unsigned file, unsigned index, unsigned semantic_name, - unsigned semantic_index) + unsigned semantic_index, + unsigned usage_mask) { union tgsi_any_token *out = get_tokens(ureg, DOMAIN_DECL, 3); @@ -1189,7 +1204,7 @@ emit_decl_semantic(struct ureg_program *ureg, out[0].decl.Type = TGSI_TOKEN_TYPE_DECLARATION; out[0].decl.NrTokens = 3; out[0].decl.File = file; - out[0].decl.UsageMask = TGSI_WRITEMASK_XYZW; /* FIXME! */ + out[0].decl.UsageMask = usage_mask; out[0].decl.Semantic = 1; out[1].value = 0; @@ -1427,7 +1442,8 @@ static void emit_decls( struct ureg_program *ureg ) TGSI_FILE_INPUT, ureg-gs_input[i].index, ureg-gs_input[i].semantic_name, -ureg-gs_input[i].semantic_index); +ureg-gs_input[i].semantic_index, +TGSI_WRITEMASK_XYZW); } } @@ -1436,7 +1452,8 @@ static void emit_decls( struct ureg_program *ureg ) TGSI_FILE_SYSTEM_VALUE, ureg-system_value[i].index, ureg-system_value[i].semantic_name, - ureg-system_value[i].semantic_index); + ureg-system_value[i].semantic_index, + TGSI_WRITEMASK_XYZW); } for (i = 0; i ureg-nr_outputs; i++) { @@ -1444,7 +1461,8 @@ static void emit_decls( struct ureg_program *ureg ) TGSI_FILE_OUTPUT, i, ureg-output[i].semantic_name, - ureg-output[i].semantic_index); + ureg-output[i].semantic_index, + ureg-output[i].usage_mask); } for (i = 0; i ureg-nr_samplers; i++) { diff --git
[Mesa-dev] [PATCH 2/2] st/mesa: add support for gl_ClipDistance
--- src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 49 +--- src/mesa/state_tracker/st_program.c| 18 ++ 2 files changed, 62 insertions(+), 5 deletions(-) diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp index b929806..3e8df78 100644 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp @@ -304,6 +304,7 @@ public: int samplers_used; bool indirect_addr_temps; bool indirect_addr_consts; + int num_clip_distances; int glsl_version; bool native_integers; @@ -4618,9 +4619,16 @@ st_translate_program( } for (i = 0; i numOutputs; i++) { - t-outputs[i] = ureg_DECL_output(ureg, - outputSemanticName[i], - outputSemanticIndex[i]); + if (outputSemanticName[i] == TGSI_SEMANTIC_CLIPDIST) { +int mask = ((1 (program-num_clip_distances - 4*outputSemanticIndex[i])) - 1) 0xf; +t-outputs[i] = ureg_DECL_output_masked(ureg, +outputSemanticName[i], +outputSemanticIndex[i], +mask); + } else +t-outputs[i] = ureg_DECL_output(ureg, + outputSemanticName[i], + outputSemanticIndex[i]); if ((outputSemanticName[i] == TGSI_SEMANTIC_PSIZE) proginfo-Id) { /* Writing to the point size result register requires special * handling to implement clamping. @@ -4797,7 +4805,8 @@ out: static struct gl_program * get_mesa_program(struct gl_context *ctx, struct gl_shader_program *shader_program, -struct gl_shader *shader) + struct gl_shader *shader, + int num_clip_distances) { glsl_to_tgsi_visitor* v = new glsl_to_tgsi_visitor(); struct gl_program *prog; @@ -4842,6 +4851,7 @@ get_mesa_program(struct gl_context *ctx, v-options = options; v-glsl_version = ctx-Const.GLSLVersion; v-native_integers = ctx-Const.NativeIntegers; + v-num_clip_distances = num_clip_distances; _mesa_generate_parameters_list_for_uniforms(shader_program, shader, prog-Parameters); @@ -4971,6 +4981,27 @@ get_mesa_program(struct gl_context *ctx, return prog; } +/** + * Searches through the IR for a declaration of gl_ClipDistance and returns the + * declared size of the gl_ClipDistance array. Returns 0 if gl_ClipDistance is + * not declared in the IR. + */ +int get_clip_distance_size(exec_list *ir) +{ + foreach_iter (exec_list_iterator, iter, *ir) { + ir_instruction *inst = (ir_instruction *)iter.get(); + ir_variable *var = inst-as_variable(); + if (var == NULL) continue; + if (!strcmp(var-name, gl_ClipDistance)) + { + fprintf(stderr, gl_ClipDistance found with size %i\n, var-type-length); + return var-type-length; + } + } + + return 0; +} + extern C { struct gl_shader * @@ -5009,6 +5040,7 @@ st_new_shader_program(struct gl_context *ctx, GLuint name) GLboolean st_link_shader(struct gl_context *ctx, struct gl_shader_program *prog) { + int num_clip_distances[MESA_SHADER_TYPES]; assert(prog-LinkStatus); for (unsigned i = 0; i MESA_SHADER_TYPES; i++) { @@ -5020,6 +5052,11 @@ st_link_shader(struct gl_context *ctx, struct gl_shader_program *prog) const struct gl_shader_compiler_options *options = ctx-ShaderCompilerOptions[_mesa_shader_type_to_index(prog-_LinkedShaders[i]-Type)]; + /* We have to determine the length of the gl_ClipDistance array before + * the array is lowered to two vec4s by lower_clip_distance(). + */ + num_clip_distances[i] = get_clip_distance_size(ir); + do { progress = false; @@ -5036,6 +5073,7 @@ st_link_shader(struct gl_context *ctx, struct gl_shader_program *prog) || progress; progress = lower_quadop_vector(ir, false) || progress; + progress = lower_clip_distance(ir) || progress; if (options-MaxIfDepth == 0) progress = lower_discard(ir) || progress; @@ -5070,7 +5108,8 @@ st_link_shader(struct gl_context *ctx, struct gl_shader_program *prog) if (prog-_LinkedShaders[i] == NULL) continue; - linked_prog = get_mesa_program(ctx, prog, prog-_LinkedShaders[i]); + linked_prog = get_mesa_program(ctx, prog, prog-_LinkedShaders[i], + num_clip_distances[i]); if (linked_prog) { static const GLenum targets[] = { diff --git a/src/mesa/state_tracker/st_program.c b/src/mesa/state_tracker/st_program.c index b83c561..b404503 100644 ---
[Mesa-dev] [PATCH 1/2] gallium: add TGSI_SEMANTIC_CLIPDIST for clip distance
--- src/gallium/auxiliary/tgsi/tgsi_dump.c |3 ++- src/gallium/include/pipe/p_shader_tokens.h |3 ++- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/src/gallium/auxiliary/tgsi/tgsi_dump.c b/src/gallium/auxiliary/tgsi/tgsi_dump.c index e830aa5..bd299b0 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_dump.c +++ b/src/gallium/auxiliary/tgsi/tgsi_dump.c @@ -129,7 +129,8 @@ static const char *semantic_names[] = PRIM_ID, INSTANCEID, VERTEXID, - STENCIL + STENCIL, + CLIPDIST }; static const char *immediate_type_names[] = diff --git a/src/gallium/include/pipe/p_shader_tokens.h b/src/gallium/include/pipe/p_shader_tokens.h index 10cfaf6..330e0ba 100644 --- a/src/gallium/include/pipe/p_shader_tokens.h +++ b/src/gallium/include/pipe/p_shader_tokens.h @@ -146,7 +146,8 @@ struct tgsi_declaration_dimension #define TGSI_SEMANTIC_INSTANCEID 10 #define TGSI_SEMANTIC_VERTEXID 11 #define TGSI_SEMANTIC_STENCIL12 -#define TGSI_SEMANTIC_COUNT 13 /** number of semantic values */ +#define TGSI_SEMANTIC_CLIPDIST 13 +#define TGSI_SEMANTIC_COUNT 14 /** number of semantic values */ struct tgsi_declaration_semantic { -- 1.7.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] st/mesa: add support for gl_ClipDistance
--- src/mesa/state_tracker/st_glsl_to_tgsi.cpp |1 + src/mesa/state_tracker/st_program.c| 18 ++ 2 files changed, 19 insertions(+), 0 deletions(-) diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp index 9ef65c8..d50176d 100644 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp @@ -5036,6 +5036,7 @@ st_link_shader(struct gl_context *ctx, struct gl_shader_program *prog) || progress; progress = lower_quadop_vector(ir, false) || progress; + progress = lower_clip_distance(ir) || progress; if (options-MaxIfDepth == 0) progress = lower_discard(ir) || progress; diff --git a/src/mesa/state_tracker/st_program.c b/src/mesa/state_tracker/st_program.c index 04d3ef6..73581dd 100644 --- a/src/mesa/state_tracker/st_program.c +++ b/src/mesa/state_tracker/st_program.c @@ -244,6 +244,14 @@ st_prepare_vertex_program(struct gl_context *ctx, stvp-output_semantic_name[slot] = TGSI_SEMANTIC_PSIZE; stvp-output_semantic_index[slot] = 0; break; + case VERT_RESULT_CLIP_DIST0: +stvp-output_semantic_name[slot] = TGSI_SEMANTIC_CLIPDIST; +stvp-output_semantic_index[slot] = 0; +break; + case VERT_RESULT_CLIP_DIST1: +stvp-output_semantic_name[slot] = TGSI_SEMANTIC_CLIPDIST; +stvp-output_semantic_index[slot] = 1; +break; case VERT_RESULT_EDGE: assert(0); break; @@ -541,6 +549,16 @@ st_translate_fragment_program(struct st_context *st, input_semantic_index[slot] = 0; interpMode[slot] = TGSI_INTERPOLATE_CONSTANT; break; +case FRAG_ATTRIB_CLIP_DIST0: + input_semantic_name[slot] = TGSI_SEMANTIC_CLIPDIST; + input_semantic_index[slot] = 0; + interpMode[slot] = TGSI_INTERPOLATE_LINEAR; + break; +case FRAG_ATTRIB_CLIP_DIST1: + input_semantic_name[slot] = TGSI_SEMANTIC_CLIPDIST; + input_semantic_index[slot] = 1; + interpMode[slot] = TGSI_INTERPOLATE_LINEAR; + break; /* In most cases, there is nothing special about these * inputs, so adopt a convention to use the generic * semantic name and the mesa FRAG_ATTRIB_ number as the -- 1.7.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 0/2 v2] Add support for clip distances in Gallium
This is an updated version of the patch set I sent to the list a few hours ago. There is now a TGSI property called TGSI_PROPERTY_NUM_CLIP_DISTANCES that drivers can use to determine how many of the 8 available clip distances are actually used by a shader. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] st/mesa: add support for gl_ClipDistance
--- src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 39 ++- src/mesa/state_tracker/st_program.c| 18 + 2 files changed, 55 insertions(+), 2 deletions(-) diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp index 9ef65c8..5e54465 100644 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp @@ -304,6 +304,7 @@ public: int samplers_used; bool indirect_addr_temps; bool indirect_addr_consts; + int num_clip_distances; int glsl_version; bool native_integers; @@ -4738,6 +4739,9 @@ st_translate_program( t-samplers[i] = ureg_DECL_sampler(ureg, i); } } + + /* Set the number of clip distances used in the shader. */ + ureg_property_num_clip_distances(ureg, program-num_clip_distances); /* Emit each instruction in turn: */ @@ -4797,7 +4801,8 @@ out: static struct gl_program * get_mesa_program(struct gl_context *ctx, struct gl_shader_program *shader_program, -struct gl_shader *shader) + struct gl_shader *shader, + int num_clip_distances) { glsl_to_tgsi_visitor* v = new glsl_to_tgsi_visitor(); struct gl_program *prog; @@ -4842,6 +4847,7 @@ get_mesa_program(struct gl_context *ctx, v-options = options; v-glsl_version = ctx-Const.GLSLVersion; v-native_integers = ctx-Const.NativeIntegers; + v-num_clip_distances = num_clip_distances; _mesa_generate_parameters_list_for_uniforms(shader_program, shader, prog-Parameters); @@ -4971,6 +4977,27 @@ get_mesa_program(struct gl_context *ctx, return prog; } +/** + * Searches through the IR for a declaration of gl_ClipDistance and returns the + * declared size of the gl_ClipDistance array. Returns 0 if gl_ClipDistance is + * not declared in the IR. + */ +int get_clip_distance_size(exec_list *ir) +{ + foreach_iter (exec_list_iterator, iter, *ir) { + ir_instruction *inst = (ir_instruction *)iter.get(); + ir_variable *var = inst-as_variable(); + if (var == NULL) continue; + if (!strcmp(var-name, gl_ClipDistance)) + { + fprintf(stderr, gl_ClipDistance found with size %i\n, var-type-length); + return var-type-length; + } + } + + return 0; +} + extern C { struct gl_shader * @@ -5009,6 +5036,7 @@ st_new_shader_program(struct gl_context *ctx, GLuint name) GLboolean st_link_shader(struct gl_context *ctx, struct gl_shader_program *prog) { + int num_clip_distances[MESA_SHADER_TYPES]; assert(prog-LinkStatus); for (unsigned i = 0; i MESA_SHADER_TYPES; i++) { @@ -5020,6 +5048,11 @@ st_link_shader(struct gl_context *ctx, struct gl_shader_program *prog) const struct gl_shader_compiler_options *options = ctx-ShaderCompilerOptions[_mesa_shader_type_to_index(prog-_LinkedShaders[i]-Type)]; + /* We have to determine the length of the gl_ClipDistance array before the + * array is lowered to two vec4s by lower_clip_distance(). + */ + num_clip_distances[i] = get_clip_distance_size(ir); + do { progress = false; @@ -5036,6 +5069,7 @@ st_link_shader(struct gl_context *ctx, struct gl_shader_program *prog) || progress; progress = lower_quadop_vector(ir, false) || progress; + progress = lower_clip_distance(ir) || progress; if (options-MaxIfDepth == 0) progress = lower_discard(ir) || progress; @@ -5070,7 +5104,8 @@ st_link_shader(struct gl_context *ctx, struct gl_shader_program *prog) if (prog-_LinkedShaders[i] == NULL) continue; - linked_prog = get_mesa_program(ctx, prog, prog-_LinkedShaders[i]); + linked_prog = get_mesa_program(ctx, prog, prog-_LinkedShaders[i], + num_clip_distances[i]); if (linked_prog) { static const GLenum targets[] = { diff --git a/src/mesa/state_tracker/st_program.c b/src/mesa/state_tracker/st_program.c index 04d3ef6..73581dd 100644 --- a/src/mesa/state_tracker/st_program.c +++ b/src/mesa/state_tracker/st_program.c @@ -244,6 +244,14 @@ st_prepare_vertex_program(struct gl_context *ctx, stvp-output_semantic_name[slot] = TGSI_SEMANTIC_PSIZE; stvp-output_semantic_index[slot] = 0; break; + case VERT_RESULT_CLIP_DIST0: +stvp-output_semantic_name[slot] = TGSI_SEMANTIC_CLIPDIST; +stvp-output_semantic_index[slot] = 0; +break; + case VERT_RESULT_CLIP_DIST1: +stvp-output_semantic_name[slot] = TGSI_SEMANTIC_CLIPDIST; +stvp-output_semantic_index[slot] = 1; +break; case VERT_RESULT_EDGE: assert(0); break; @@ -541,6 +549,16 @@ st_translate_fragment_program(struct st_context *st, input_semantic_index[slot]
Re: [Mesa-dev] [PATCH 0/2 v2] Add support for clip distances in Gallium
On 12/13/2011 02:11 PM, Jose Fonseca wrote: - Original Message - This is an updated version of the patch set I sent to the list a few hours ago. There is now a TGSI property called TGSI_PROPERTY_NUM_CLIP_DISTANCES that drivers can use to determine how many of the 8 available clip distances are actually used by a shader. Can't the info in TGSI_PROPERTY_NUM_CLIP_DISTANCES be easily derived from the shader, and queried through src/gallium/auxiliary/tgsi/tgsi_scan.h ? No. The clip distances can be indirectly addressed (there are up to 2 of them in vec4 form for a total of 8 floats), which makes it impossible to determine which ones are used by analyzing the shader. Could you also elaborate on why TGSI_SEMANTIC_CLIPDIST is useful for the drivers? I personally don't have nothing against it, but just like to understand why it makes a difference. Jose Unless my understanding of clip distances is wrong, the GPU uses clip distances to decide whether a vertex should be clipped, and thus needs to know which of the vertex shader outputs are clip distances. Bryan ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 0/2 v2] Add support for clip distances in Gallium
On 12/13/2011 03:09 PM, Jose Fonseca wrote: - Original Message - On 12/13/2011 12:26 PM, Bryan Cain wrote: On 12/13/2011 02:11 PM, Jose Fonseca wrote: - Original Message - This is an updated version of the patch set I sent to the list a few hours ago. There is now a TGSI property called TGSI_PROPERTY_NUM_CLIP_DISTANCES that drivers can use to determine how many of the 8 available clip distances are actually used by a shader. Can't the info in TGSI_PROPERTY_NUM_CLIP_DISTANCES be easily derived from the shader, and queried through src/gallium/auxiliary/tgsi/tgsi_scan.h ? No. The clip distances can be indirectly addressed (there are up to 2 of them in vec4 form for a total of 8 floats), which makes it impossible to determine which ones are used by analyzing the shader. The description is almost complete. :) The issue is that the shader may declare out float gl_ClipDistance[4]; the use non-constant addressing of the array. The compiler knows that gl_ClipDistance has at most 4 elements, but post-hoc analysis would not be able to determine that. Often the fixed-function hardware (see below) needs to know which clip distance values are actually written. But don't all the clip distances written by the shader need to be declared? E.g.: DCL OUT[0], CLIPDIST[0] DCL OUT[1], CLIPDIST[1] DCL OUT[2], CLIPDIST[2] DCL OUT[3], CLIPDIST[3] therefore a trivial analysis of the declarations convey that? No. Clip distance is an array of up to 8 floats in GLSL, but it's represented in the hardware as 2 vec4s. You can tell by analyzing the declarations whether there are more than 4 clip distances in use, but not which components the shader writes to. TGSI_PROPERTY_NUM_CLIP_DISTANCES is the number of components in use, not the number of full vectors. Bryan ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] gallium: add TGSI_SEMANTIC_CLIPDIST for clip distance
On 12/13/2011 03:07 PM, Brian Paul wrote: On Tue, Dec 13, 2011 at 9:59 AM, Bryan Cain bryanca...@gmail.com wrote: --- src/gallium/auxiliary/tgsi/tgsi_dump.c |3 ++- src/gallium/include/pipe/p_shader_tokens.h |3 ++- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/src/gallium/auxiliary/tgsi/tgsi_dump.c b/src/gallium/auxiliary/tgsi/tgsi_dump.c index e830aa5..bd299b0 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_dump.c +++ b/src/gallium/auxiliary/tgsi/tgsi_dump.c @@ -129,7 +129,8 @@ static const char *semantic_names[] = PRIM_ID, INSTANCEID, VERTEXID, - STENCIL + STENCIL, + CLIPDIST }; static const char *immediate_type_names[] = diff --git a/src/gallium/include/pipe/p_shader_tokens.h b/src/gallium/include/pipe/p_shader_tokens.h index 10cfaf6..330e0ba 100644 --- a/src/gallium/include/pipe/p_shader_tokens.h +++ b/src/gallium/include/pipe/p_shader_tokens.h @@ -146,7 +146,8 @@ struct tgsi_declaration_dimension #define TGSI_SEMANTIC_INSTANCEID 10 #define TGSI_SEMANTIC_VERTEXID 11 #define TGSI_SEMANTIC_STENCIL12 -#define TGSI_SEMANTIC_COUNT 13 /** number of semantic values */ +#define TGSI_SEMANTIC_CLIPDIST 13 +#define TGSI_SEMANTIC_COUNT 14 /** number of semantic values */ struct tgsi_declaration_semantic { Reviewed-by: Brian Paul bri...@vmware.com Thanks for reviewing, but this is the first version of this patch. There's an updated one that was sent a few hours after this one. Bryan ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 0/2 v2] Add support for clip distances in Gallium
On 12/13/2011 03:25 PM, Jose Fonseca wrote: - Original Message - On 12/13/2011 03:09 PM, Jose Fonseca wrote: - Original Message - On 12/13/2011 12:26 PM, Bryan Cain wrote: On 12/13/2011 02:11 PM, Jose Fonseca wrote: - Original Message - This is an updated version of the patch set I sent to the list a few hours ago. There is now a TGSI property called TGSI_PROPERTY_NUM_CLIP_DISTANCES that drivers can use to determine how many of the 8 available clip distances are actually used by a shader. Can't the info in TGSI_PROPERTY_NUM_CLIP_DISTANCES be easily derived from the shader, and queried through src/gallium/auxiliary/tgsi/tgsi_scan.h ? No. The clip distances can be indirectly addressed (there are up to 2 of them in vec4 form for a total of 8 floats), which makes it impossible to determine which ones are used by analyzing the shader. The description is almost complete. :) The issue is that the shader may declare out float gl_ClipDistance[4]; the use non-constant addressing of the array. The compiler knows that gl_ClipDistance has at most 4 elements, but post-hoc analysis would not be able to determine that. Often the fixed-function hardware (see below) needs to know which clip distance values are actually written. But don't all the clip distances written by the shader need to be declared? E.g.: DCL OUT[0], CLIPDIST[0] DCL OUT[1], CLIPDIST[1] DCL OUT[2], CLIPDIST[2] DCL OUT[3], CLIPDIST[3] therefore a trivial analysis of the declarations convey that? No. Clip distance is an array of up to 8 floats in GLSL, but it's represented in the hardware as 2 vec4s. You can tell by analyzing the declarations whether there are more than 4 clip distances in use, but not which components the shader writes to. TGSI_PROPERTY_NUM_CLIP_DISTANCES is the number of components in use, not the number of full vectors. Lets imagine out float gl_ClipDistance[6]; Each a clip distance is a scalar float. Either all hardware represents the 8 clip distances as two 4 vectors, and we do: DCL OUT[0].xywz, CLIPDIST[0] DCL OUT[1].xy, CLIPDIST[1] using the full range of struct tgsi_declaration::UsageMask [1] or we represent them as as scalars: DCL OUT[0].x, CLIPDIST[0] DCL OUT[1].x, CLIPDIST[1] DCL OUT[2].x, CLIPDIST[2] DCL OUT[3].x, CLIPDIST[3] DCL OUT[4].x, CLIPDIST[4] DCL OUT[5].x, CLIPDIST[5] If indirect addressing is allowed as I read bore, then maybe the later is better. I confess my ignorance about clipping and maybe I'm being dense, but I still don't see the need for this TGSI_PROPERTY_NUM_CLIP_DISTANCES. Could you please draft an example TGSI shader showing this property (or just paste one generated with your change)? I think that would help a lot. Jose [1] I don't know if tgsi_dump pays much attention to tgsi_declaration::UsageMask, but it does exist. UsageMask might work, but before that can be considered a viable solution, someone will need to make it possible to actually declare it from ureg. As it is, ureg is hardcoded to set UsageMask to xyzw no matter what on all declared inputs and outputs. Bryan ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 0/2 v2] Add support for clip distances in Gallium
On 12/13/2011 03:48 PM, Jose Fonseca wrote: - Original Message - On 12/13/2011 03:25 PM, Jose Fonseca wrote: - Original Message - On 12/13/2011 03:09 PM, Jose Fonseca wrote: - Original Message - On 12/13/2011 12:26 PM, Bryan Cain wrote: On 12/13/2011 02:11 PM, Jose Fonseca wrote: - Original Message - This is an updated version of the patch set I sent to the list a few hours ago. There is now a TGSI property called TGSI_PROPERTY_NUM_CLIP_DISTANCES that drivers can use to determine how many of the 8 available clip distances are actually used by a shader. Can't the info in TGSI_PROPERTY_NUM_CLIP_DISTANCES be easily derived from the shader, and queried through src/gallium/auxiliary/tgsi/tgsi_scan.h ? No. The clip distances can be indirectly addressed (there are up to 2 of them in vec4 form for a total of 8 floats), which makes it impossible to determine which ones are used by analyzing the shader. The description is almost complete. :) The issue is that the shader may declare out float gl_ClipDistance[4]; the use non-constant addressing of the array. The compiler knows that gl_ClipDistance has at most 4 elements, but post-hoc analysis would not be able to determine that. Often the fixed-function hardware (see below) needs to know which clip distance values are actually written. But don't all the clip distances written by the shader need to be declared? E.g.: DCL OUT[0], CLIPDIST[0] DCL OUT[1], CLIPDIST[1] DCL OUT[2], CLIPDIST[2] DCL OUT[3], CLIPDIST[3] therefore a trivial analysis of the declarations convey that? No. Clip distance is an array of up to 8 floats in GLSL, but it's represented in the hardware as 2 vec4s. You can tell by analyzing the declarations whether there are more than 4 clip distances in use, but not which components the shader writes to. TGSI_PROPERTY_NUM_CLIP_DISTANCES is the number of components in use, not the number of full vectors. Lets imagine out float gl_ClipDistance[6]; Each a clip distance is a scalar float. Either all hardware represents the 8 clip distances as two 4 vectors, and we do: DCL OUT[0].xywz, CLIPDIST[0] DCL OUT[1].xy, CLIPDIST[1] using the full range of struct tgsi_declaration::UsageMask [1] or we represent them as as scalars: DCL OUT[0].x, CLIPDIST[0] DCL OUT[1].x, CLIPDIST[1] DCL OUT[2].x, CLIPDIST[2] DCL OUT[3].x, CLIPDIST[3] DCL OUT[4].x, CLIPDIST[4] DCL OUT[5].x, CLIPDIST[5] If indirect addressing is allowed as I read bore, then maybe the later is better. I confess my ignorance about clipping and maybe I'm being dense, but I still don't see the need for this TGSI_PROPERTY_NUM_CLIP_DISTANCES. Could you please draft an example TGSI shader showing this property (or just paste one generated with your change)? I think that would help a lot. Jose [1] I don't know if tgsi_dump pays much attention to tgsi_declaration::UsageMask, but it does exist. UsageMask might work, but before that can be considered a viable solution, someone will need to make it possible to actually declare it from ureg. As it is, ureg is hardcoded to set UsageMask to xyzw no matter what on all declared inputs and outputs. ureg automatically fills the UsageMask from the destionation register masks, since it easy to determine from the opcodes. Which leads me to my second point, if indirect addressing of CLIPDIST is allowed, then we can't really pack the clip distance as 4-elem vectors in TGSI: not only the syntax would be very weird, but it would create havoc on all tgsi-translating code that makes decisions based on indirect addressing of registers. That is, float gl_ClipDistance[6]; gl_ClipDistance[i] = foo; would become DCL OUT[0].x, CLIPDIST[0] DCL OUT[1].x, CLIPDIST[1] DCL OUT[2].x, CLIPDIST[2] DCL OUT[3].x, CLIPDIST[3] DCL OUT[4].x, CLIPDIST[4] DCL OUT[5].x, CLIPDIST[5] MOV OUT[ADDR[0].x].x, foo and the info from TGSI_PROPERTY_NUM_CLIP_DISTANCES can be obtained by walking the declaration (which can/should be done only once in tgsi_scan). But this just doesn't look like it would ever work: DCL OUT[0].xyzw, CLIPDIST[0] DCL OUT[1].xy , CLIPDIST[1] MOV OUT[ADDR[0].x]., foo Jose If ureg automatically fills the UsageMask from the accessed components, it's probably a better solution than the property. About the indirect addressing of components: the GLSL compiler lowers indirect addressing of the gl_ClipDistance array to indirect addressing of the 2 vec4s, combined with conditional moves to the different components. Which is okay, because although indirect addressing of gl_ClipDistance is allowed by the GLSL specification, meaning have to support it, it's not something that's actually useful in practical situations. Bryan ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org
Re: [Mesa-dev] [PATCH 0/2 v2] Add support for clip distances in Gallium
On 12/13/2011 03:48 PM, Jose Fonseca wrote: - Original Message - On 12/13/2011 03:25 PM, Jose Fonseca wrote: - Original Message - On 12/13/2011 03:09 PM, Jose Fonseca wrote: - Original Message - On 12/13/2011 12:26 PM, Bryan Cain wrote: On 12/13/2011 02:11 PM, Jose Fonseca wrote: - Original Message - This is an updated version of the patch set I sent to the list a few hours ago. There is now a TGSI property called TGSI_PROPERTY_NUM_CLIP_DISTANCES that drivers can use to determine how many of the 8 available clip distances are actually used by a shader. Can't the info in TGSI_PROPERTY_NUM_CLIP_DISTANCES be easily derived from the shader, and queried through src/gallium/auxiliary/tgsi/tgsi_scan.h ? No. The clip distances can be indirectly addressed (there are up to 2 of them in vec4 form for a total of 8 floats), which makes it impossible to determine which ones are used by analyzing the shader. The description is almost complete. :) The issue is that the shader may declare out float gl_ClipDistance[4]; the use non-constant addressing of the array. The compiler knows that gl_ClipDistance has at most 4 elements, but post-hoc analysis would not be able to determine that. Often the fixed-function hardware (see below) needs to know which clip distance values are actually written. But don't all the clip distances written by the shader need to be declared? E.g.: DCL OUT[0], CLIPDIST[0] DCL OUT[1], CLIPDIST[1] DCL OUT[2], CLIPDIST[2] DCL OUT[3], CLIPDIST[3] therefore a trivial analysis of the declarations convey that? No. Clip distance is an array of up to 8 floats in GLSL, but it's represented in the hardware as 2 vec4s. You can tell by analyzing the declarations whether there are more than 4 clip distances in use, but not which components the shader writes to. TGSI_PROPERTY_NUM_CLIP_DISTANCES is the number of components in use, not the number of full vectors. Lets imagine out float gl_ClipDistance[6]; Each a clip distance is a scalar float. Either all hardware represents the 8 clip distances as two 4 vectors, and we do: DCL OUT[0].xywz, CLIPDIST[0] DCL OUT[1].xy, CLIPDIST[1] using the full range of struct tgsi_declaration::UsageMask [1] or we represent them as as scalars: DCL OUT[0].x, CLIPDIST[0] DCL OUT[1].x, CLIPDIST[1] DCL OUT[2].x, CLIPDIST[2] DCL OUT[3].x, CLIPDIST[3] DCL OUT[4].x, CLIPDIST[4] DCL OUT[5].x, CLIPDIST[5] If indirect addressing is allowed as I read bore, then maybe the later is better. I confess my ignorance about clipping and maybe I'm being dense, but I still don't see the need for this TGSI_PROPERTY_NUM_CLIP_DISTANCES. Could you please draft an example TGSI shader showing this property (or just paste one generated with your change)? I think that would help a lot. Jose [1] I don't know if tgsi_dump pays much attention to tgsi_declaration::UsageMask, but it does exist. UsageMask might work, but before that can be considered a viable solution, someone will need to make it possible to actually declare it from ureg. As it is, ureg is hardcoded to set UsageMask to xyzw no matter what on all declared inputs and outputs. ureg automatically fills the UsageMask from the destionation register masks, since it easy to determine from the opcodes. Wait, where does it do that? When I search through tgsi_ureg.c for UsageMask, all it shows are assignments of TGSI_WRITEMASK_XYZW to the UsageMask property. Bryan ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 0/2 v2] Add support for clip distances in Gallium
On 12/13/2011 04:22 PM, Jose Fonseca wrote: - Original Message - - Original Message - On 12/13/2011 03:48 PM, Jose Fonseca wrote: - Original Message - On 12/13/2011 03:25 PM, Jose Fonseca wrote: - Original Message - On 12/13/2011 03:09 PM, Jose Fonseca wrote: - Original Message - On 12/13/2011 12:26 PM, Bryan Cain wrote: On 12/13/2011 02:11 PM, Jose Fonseca wrote: - Original Message - This is an updated version of the patch set I sent to the list a few hours ago. There is now a TGSI property called TGSI_PROPERTY_NUM_CLIP_DISTANCES that drivers can use to determine how many of the 8 available clip distances are actually used by a shader. Can't the info in TGSI_PROPERTY_NUM_CLIP_DISTANCES be easily derived from the shader, and queried through src/gallium/auxiliary/tgsi/tgsi_scan.h ? No. The clip distances can be indirectly addressed (there are up to 2 of them in vec4 form for a total of 8 floats), which makes it impossible to determine which ones are used by analyzing the shader. The description is almost complete. :) The issue is that the shader may declare out float gl_ClipDistance[4]; the use non-constant addressing of the array. The compiler knows that gl_ClipDistance has at most 4 elements, but post-hoc analysis would not be able to determine that. Often the fixed-function hardware (see below) needs to know which clip distance values are actually written. But don't all the clip distances written by the shader need to be declared? E.g.: DCL OUT[0], CLIPDIST[0] DCL OUT[1], CLIPDIST[1] DCL OUT[2], CLIPDIST[2] DCL OUT[3], CLIPDIST[3] therefore a trivial analysis of the declarations convey that? No. Clip distance is an array of up to 8 floats in GLSL, but it's represented in the hardware as 2 vec4s. You can tell by analyzing the declarations whether there are more than 4 clip distances in use, but not which components the shader writes to. TGSI_PROPERTY_NUM_CLIP_DISTANCES is the number of components in use, not the number of full vectors. Lets imagine out float gl_ClipDistance[6]; Each a clip distance is a scalar float. Either all hardware represents the 8 clip distances as two 4 vectors, and we do: DCL OUT[0].xywz, CLIPDIST[0] DCL OUT[1].xy, CLIPDIST[1] using the full range of struct tgsi_declaration::UsageMask [1] or we represent them as as scalars: DCL OUT[0].x, CLIPDIST[0] DCL OUT[1].x, CLIPDIST[1] DCL OUT[2].x, CLIPDIST[2] DCL OUT[3].x, CLIPDIST[3] DCL OUT[4].x, CLIPDIST[4] DCL OUT[5].x, CLIPDIST[5] If indirect addressing is allowed as I read bore, then maybe the later is better. I confess my ignorance about clipping and maybe I'm being dense, but I still don't see the need for this TGSI_PROPERTY_NUM_CLIP_DISTANCES. Could you please draft an example TGSI shader showing this property (or just paste one generated with your change)? I think that would help a lot. Jose [1] I don't know if tgsi_dump pays much attention to tgsi_declaration::UsageMask, but it does exist. UsageMask might work, but before that can be considered a viable solution, someone will need to make it possible to actually declare it from ureg. As it is, ureg is hardcoded to set UsageMask to xyzw no matter what on all declared inputs and outputs. ureg automatically fills the UsageMask from the destionation register masks, since it easy to determine from the opcodes. Wait, where does it do that? When I search through tgsi_ureg.c for UsageMask, all it shows are assignments of TGSI_WRITEMASK_XYZW to the UsageMask property. ah. I may be lying. But I'm pretty sure I wrote such code somewhere, sometime. Let me dig it. I was lying. I wrote tgsi_util_get_inst_usage_mask() in src/gallium/auxiliary/tgsi/tgsi_util.c , but it only analyses which registers are _read_, and never got hooked into ureg anyway. I don't want you to go over hoops just to pass a scalar quantity. So may be just add ability to ureg to specify declaration's output mask? One problem with this is that the output mask would have to be a parameter to a new declaration function in ureg, like ureg_DECL_output_with_mask() instead of just ureg_DECL_output(). In this case, ureg_DECL_output_with_mask() would be the only DECL function with a usage mask specified. If that asymmetry is okay with you, I think I could go the UsageMask route. Another approach would be just to add the property, and kill output mask. Two ways of doing the same is what I'd like to avoid. I'll need a day (it's late here) to think about this and see how output mask is being actually used in the code. Jose ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glsl_to_tgsi: Invalidate and revalidate uniform backing storage
It's unfortunately been a while since I've done anything with glsl_to_tgsi. What are the various functions that modify the TGSI IR and try to propagate changes about that up to the gl_program? If I can see where it is in the code, I can probably remember if there's a reason it was done that way. Bryan On 11/16/2011 01:13 PM, Ian Romanick wrote: From: Ian Romanick ian.d.roman...@intel.com If glUniform1i and friends are going to dump data directly in driver-allocated, the pointers have to be updated when the storage moves. This should fix the regressions seen with commit 7199096. I'm not sure if this is the only place that needs this treatment. I'm a little uncertain about the various functions in st_glsl_to_tgsi that modify the TGSI IR and try to propagate changes about that up to the gl_program. That seems sketchy to me. Signed-off-by: Ian Romanick ian.d.roman...@intel.com Cc: Vadim Girlin vadimgir...@gmail.com Cc: Bryan Cain bryanca...@gmail.com --- src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 14 ++ 1 files changed, 14 insertions(+), 0 deletions(-) diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp index 0bf6766..c8385bb 100644 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp @@ -4571,6 +4571,13 @@ st_translate_program( t-pointSizeOutIndex = -1; t-prevInstWrotePointSize = GL_FALSE; + for (i = 0; i program-shader_program-NumUserUniformStorage; i++) { + struct gl_uniform_storage *const storage = + program-shader_program-UniformStorage[i]; + + _mesa_uniform_detach_all_driver_storage(storage); + } + /* * Declare input attributes. */ @@ -4797,6 +4804,13 @@ st_translate_program( t-insn[t-labels[i].branch_target]); } + /* This has to be done last. Any operation the can cause +* prog-ParameterValues to get reallocated (e.g., anything that adds a +* program constant) has to happen before creating this linkage. +*/ + _mesa_associate_uniform_storage(ctx, program-shader_program, +proginfo-Parameters); + out: if (t) { FREE(t-insn); ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] TGSI declarations missing type info
On 11/13/2011 09:06 AM, Dave Airlie wrote: Hi guys, Just been looking at llvmpipe integer support and it seems like we lose some information about the type of data stored into temporaries, after st_glsl_to_cpp we no longer know what type the temporaries are, and llvm would really like to know and I can't see any reason that TGSI doesn't contain the info. Having untyped temp decls means we'd have to allocate some sort of union via aliases I guess in llvmpipe for all temps so we can store int/float in them. I've attached a run of glsl-vs-loop from llvmpipe with integer opcodes forced on. (llvmpipe-int-test branch of my repo). Dave. If you do add types to TGSI registers, it's worth noting that the internal IR used by glsl_to_tgsi (glsl_to_tgsi_instruction) already the types of all src and dst registers, and it's only lost when converting that to TGSI. However, it was only intended to be good enough to determine whether to emit an integer or float instruction, so there might be some mistakes remaining somewhere that would need to be corrected. Bryan ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] glsl: Add uniform_locations_assigned parameter to do_dead_code opt pass
If it's worth anything, the glsl_to_tgsi part is Reviewed-by: Bryan Cain bryanca...@gmail.com On 10/21/2011 01:49 PM, Ian Romanick wrote: From: Ian Romanick ian.d.roman...@intel.com Setting this flag prevents declarations of uniforms from being removed from the IR. Since the IR is directly used by several API functions that query uniforms in shaders, uniform declarations cannot be removed after the locations have been set. However, it should still be safe to reorder the declarations (this is not tested). Signed-off-by: Ian Romanick ian.d.roman...@intel.com Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=41980 Cc: Brian Paul bri...@vmware.com Cc: Bryan Cain bryanca...@gmail.com Cc: Vinson Lee v...@vmware.com Cc: José Fonseca jfons...@vmware.com Cc: Kenneth Graunke kenn...@whitecape.org --- src/glsl/glsl_parser_extras.cpp| 23 +-- src/glsl/ir_optimization.h |6 -- src/glsl/linker.cpp|2 +- src/glsl/main.cpp |2 +- src/glsl/opt_dead_code.cpp | 14 ++ src/glsl/test_optpass.cpp |4 ++-- src/mesa/drivers/dri/i965/brw_shader.cpp |3 ++- src/mesa/main/ff_fragment_shader.cpp |2 +- src/mesa/program/ir_to_mesa.cpp|6 -- src/mesa/state_tracker/st_glsl_to_tgsi.cpp |4 +++- 10 files changed, 49 insertions(+), 17 deletions(-) ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] glsl: Short-circuit lower_if_to_cond_assign when MaxIfDepth is UINT_MAX.
Looks good to me. Bryan On 10/18/2011 05:20 PM, Kenneth Graunke wrote: Setting MaxIfDepth to UINT_MAX effectively means don't lower anything. Explicitly checking for this common case allows us to avoid walking the IR, computing nesting levels, and so on. Signed-off-by: Kenneth Graunke kenn...@whitecape.org Cc: Bryan Cain bryanca...@gmail.com Cc: Ian Romanick i...@freedesktop.org --- src/glsl/lower_if_to_cond_assign.cpp |3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/src/glsl/lower_if_to_cond_assign.cpp b/src/glsl/lower_if_to_cond_assign.cpp index 7b89a15..2c5d561 100644 --- a/src/glsl/lower_if_to_cond_assign.cpp +++ b/src/glsl/lower_if_to_cond_assign.cpp @@ -79,6 +79,9 @@ public: bool lower_if_to_cond_assign(exec_list *instructions, unsigned max_depth) { + if (max_depth == UINT_MAX) + return false; + ir_if_to_cond_assign_visitor v(max_depth); visit_list_elements(v, instructions); ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] st/mesa: kill instruction if writemask=0 in eliminate_dead_code_advanced()
I don't think there's any reason we can't eliminate a dead instruction when the writemask is zero. I do wonder, though, why this patch actually makes a difference. There's an if (!inst-dead_mask || !inst-dst.writemask) three lines before the code visible in the patch that makes it not kill the instruction if the writemask is zero. I don't remember why that's there, but if it weren't there, and the writemask is zero, the dead_mask should also be zero, so it should be handled by the else if block. In short, I think that entire if/else if/else statement could use a look. Bryan On 10/07/2011 10:40 AM, Brian Paul wrote: From: Brian Paul bri...@vmware.com This fixes a bug where we'd wind up emitting an invalid instruction like MOVE R[0]., R[1]; - note the empty/zero writemask. If we don't write to any dest register channels, cull the instruction. --- src/mesa/state_tracker/st_glsl_to_tgsi.cpp |8 +++- 1 files changed, 7 insertions(+), 1 deletions(-) diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp index d8ef8a3..44b1149 100644 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp @@ -3776,8 +3776,14 @@ glsl_to_tgsi_visitor::eliminate_dead_code_advanced(void) iter.remove(); delete inst; removed++; - } else + } else { inst-dst.writemask = ~(inst-dead_mask); + if (inst-dst.writemask == 0) { +iter.remove(); +delete inst; +removed++; + } + } } ralloc_free(write_level); ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] st/mesa: kill instruction if writemask=0 in eliminate_dead_code_advanced()
What does it do if there's no destination register? In any case, I don't think glsl_to_tgsi emits any ARLs of that form, so it shouldn't be a problem. Bryan On 10/07/2011 01:06 PM, Marek Olšák wrote: I think ARL is allowed to have no destination register, right? In that case, there should be a special case not to eliminate ARLs. Marek On Fri, Oct 7, 2011 at 5:40 PM, Brian Paul brian.e.p...@gmail.com wrote: From: Brian Paul bri...@vmware.com This fixes a bug where we'd wind up emitting an invalid instruction like MOVE R[0]., R[1]; - note the empty/zero writemask. If we don't write to any dest register channels, cull the instruction. --- src/mesa/state_tracker/st_glsl_to_tgsi.cpp |8 +++- 1 files changed, 7 insertions(+), 1 deletions(-) diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp index d8ef8a3..44b1149 100644 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp @@ -3776,8 +3776,14 @@ glsl_to_tgsi_visitor::eliminate_dead_code_advanced(void) iter.remove(); delete inst; removed++; - } else + } else { inst-dst.writemask = ~(inst-dead_mask); + if (inst-dst.writemask == 0) { +iter.remove(); +delete inst; +removed++; + } + } } ralloc_free(write_level); -- 1.7.3.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 6/6] glsl_to_tgsi: Use _mesa_generate_parameters_list_for_uniforms
On 10/07/2011 07:06 PM, Ian Romanick wrote: From: Ian Romanick ian.d.roman...@intel.com Signed-off-by: Ian Romanick ian.d.roman...@intel.com Cc: Bryan Cain bryanca...@gmail.com --- src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 119 +--- 1 files changed, 2 insertions(+), 117 deletions(-) Reviewed-by: Bryan Cain bryanca...@gmail.com ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] glsl_to_tgsi: implement ir_binop_all_equal and ir_binop_any_nequal for native integers
--- src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 119 1 files changed, 85 insertions(+), 34 deletions(-) diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp index 8921698..f68270d 100644 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp @@ -1528,15 +1528,45 @@ glsl_to_tgsi_visitor::visit(ir_expression *ir) st_src_reg temp = get_temp(native_integers ? glsl_type::get_instance(ir-operands[0]-type-base_type, 4, 1) : glsl_type::vec4_type); - assert(ir-operands[0]-type-base_type == GLSL_TYPE_FLOAT); - emit(ir, TGSI_OPCODE_SNE, st_dst_reg(temp), op[0], op[1]); - - /* After the dot-product, the value will be an integer on the - * range [0,4]. Zero becomes 1.0, and positive values become zero. - */ - emit_dp(ir, result_dst, temp, temp, vector_elements); - if (result_dst.type == GLSL_TYPE_FLOAT) { + if (native_integers) { +st_dst_reg temp_dst = st_dst_reg(temp); +st_src_reg temp1 = st_src_reg(temp), temp2 = st_src_reg(temp); + +emit(ir, TGSI_OPCODE_SEQ, st_dst_reg(temp), op[0], op[1]); + +/* Emit 1-3 AND operations to combine the SEQ results. */ +switch (ir-operands[0]-type-vector_elements) { +case 2: + break; +case 3: + temp_dst.writemask = WRITEMASK_Y; + temp1.swizzle = SWIZZLE_; + temp2.swizzle = SWIZZLE_; + emit(ir, TGSI_OPCODE_AND, temp_dst, temp1, temp2); + break; +case 4: + temp_dst.writemask = WRITEMASK_X; + temp1.swizzle = SWIZZLE_; + temp2.swizzle = SWIZZLE_; + emit(ir, TGSI_OPCODE_AND, temp_dst, temp1, temp2); + temp_dst.writemask = WRITEMASK_Y; + temp1.swizzle = SWIZZLE_; + temp2.swizzle = SWIZZLE_; + emit(ir, TGSI_OPCODE_AND, temp_dst, temp1, temp2); +} + +temp1.swizzle = SWIZZLE_; +temp2.swizzle = SWIZZLE_; +emit(ir, TGSI_OPCODE_AND, result_dst, temp1, temp2); + } else { +emit(ir, TGSI_OPCODE_SNE, st_dst_reg(temp), op[0], op[1]); + +/* After the dot-product, the value will be an integer on the + * range [0,4]. Zero becomes 1.0, and positive values become zero. + */ +emit_dp(ir, result_dst, temp, temp, vector_elements); + /* Negating the result of the dot-product gives values on the range * [-4, 0]. Zero becomes 1.0, and negative values become zero. * This is achieved using SGE. @@ -1544,11 +1574,6 @@ glsl_to_tgsi_visitor::visit(ir_expression *ir) st_src_reg sge_src = result_src; sge_src.negate = ~sge_src.negate; emit(ir, TGSI_OPCODE_SGE, result_dst, sge_src, st_src_reg_for_float(0.0)); - } else { -/* The TGSI negate flag doesn't work for integers, so use SEQ 0 - * instead. - */ -emit(ir, TGSI_OPCODE_SEQ, result_dst, result_src, st_src_reg_for_int(0)); } } else { emit(ir, TGSI_OPCODE_SEQ, result_dst, op[0], op[1]); @@ -1561,30 +1586,56 @@ glsl_to_tgsi_visitor::visit(ir_expression *ir) st_src_reg temp = get_temp(native_integers ? glsl_type::get_instance(ir-operands[0]-type-base_type, 4, 1) : glsl_type::vec4_type); - assert(ir-operands[0]-type-base_type == GLSL_TYPE_FLOAT); emit(ir, TGSI_OPCODE_SNE, st_dst_reg(temp), op[0], op[1]); - /* After the dot-product, the value will be an integer on the - * range [0,4]. Zero stays zero, and positive values become 1.0. - */ - glsl_to_tgsi_instruction *const dp = - emit_dp(ir, result_dst, temp, temp, vector_elements); - if (this-prog-Target == GL_FRAGMENT_PROGRAM_ARB - result_dst.type == GLSL_TYPE_FLOAT) { -/* The clamping to [0,1] can be done for free in the fragment - * shader with a saturate. - */ -dp-saturate = true; - } else if (result_dst.type == GLSL_TYPE_FLOAT) { -/* Negating the result of the dot-product gives values on the range - * [-4, 0]. Zero stays zero, and negative values become 1.0. This - * achieved using SLT. - */ -st_src_reg slt_src = result_src; -slt_src.negate = ~slt_src.negate; -emit(ir, TGSI_OPCODE_SLT, result_dst, slt_src, st_src_reg_for_float(0.0)); + if (native_integers) { +st_dst_reg temp_dst = st_dst_reg(temp); +st_src_reg temp1 =
Re: [Mesa-dev] [PATCH v2 2/2] glsl_to_tgsi: use UARL instead of I2F and ARL
I don't see any reason why this patch would make a difference, but since it apparently does, I'll take a look at it and fix it. What program is that? Bryan On 09/14/2011 07:01 AM, Marek Olšák wrote: Bryan, This commit causes hardlocks on r600g (integers disabled). Maybe you can see better than me what's wrong. This is a side-by-side diff of the failing TGSI shader. The right-hand one is from Mesa master. The left-hand one is with the commit reverted. http://pastebin.com/raw.php?i=QXB3SZAE Thanks, Marek On Sat, Sep 10, 2011 at 7:36 PM, Bryan Cain bryanca...@gmail.com wrote: Since TGSI now has a UARL opcode that takes an integer as the source, it is no longer necessary to hack around the lack of an integer ARL opcode using I2F. UARL is only emitted when native integers are enabled; ARL is still used otherwise. Reviewed-by: Brian Paul bri...@vmware.com --- src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 18 +++--- 1 files changed, 7 insertions(+), 11 deletions(-) diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp index e2857ed..05d4d33 100644 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp @@ -512,7 +512,7 @@ glsl_to_tgsi_visitor::emit(ir_instruction *ir, unsigned op, inst-function = NULL; - if (op == TGSI_OPCODE_ARL) + if (op == TGSI_OPCODE_ARL || op == TGSI_OPCODE_UARL) this-num_address_regs = 1; /* Update indirect addressing status used by TGSI */ @@ -724,16 +724,12 @@ void glsl_to_tgsi_visitor::emit_arl(ir_instruction *ir, st_dst_reg dst, st_src_reg src0) { - st_src_reg tmp = get_temp(glsl_type::float_type); + int op = TGSI_OPCODE_ARL; - if (src0.type == GLSL_TYPE_INT) - emit(NULL, TGSI_OPCODE_I2F, st_dst_reg(tmp), src0); - else if (src0.type == GLSL_TYPE_UINT) - emit(NULL, TGSI_OPCODE_U2F, st_dst_reg(tmp), src0); - else - tmp = src0; - - emit(NULL, TGSI_OPCODE_ARL, dst, tmp); + if (src0.type == GLSL_TYPE_INT || src0.type == GLSL_TYPE_UINT) + op = TGSI_OPCODE_UARL; + + emit(NULL, op, dst, src0); } /** -- 1.7.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] glsl_to_tgsi: use UARL instead of I2F and ARL
Can one of the Gallium interface maintainers please review this patch so I can push it? Bryan On 09/02/2011 11:09 AM, Bryan Cain wrote: --- src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 18 +++--- 1 files changed, 7 insertions(+), 11 deletions(-) diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp index e2857ed..05d4d33 100644 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp @@ -512,7 +512,7 @@ glsl_to_tgsi_visitor::emit(ir_instruction *ir, unsigned op, inst-function = NULL; - if (op == TGSI_OPCODE_ARL) + if (op == TGSI_OPCODE_ARL || op == TGSI_OPCODE_UARL) this-num_address_regs = 1; /* Update indirect addressing status used by TGSI */ @@ -724,16 +724,12 @@ void glsl_to_tgsi_visitor::emit_arl(ir_instruction *ir, st_dst_reg dst, st_src_reg src0) { - st_src_reg tmp = get_temp(glsl_type::float_type); + int op = TGSI_OPCODE_ARL; - if (src0.type == GLSL_TYPE_INT) - emit(NULL, TGSI_OPCODE_I2F, st_dst_reg(tmp), src0); - else if (src0.type == GLSL_TYPE_UINT) - emit(NULL, TGSI_OPCODE_U2F, st_dst_reg(tmp), src0); - else - tmp = src0; - - emit(NULL, TGSI_OPCODE_ARL, dst, tmp); + if (src0.type == GLSL_TYPE_INT || src0.type == GLSL_TYPE_UINT) + op = TGSI_OPCODE_UARL; + + emit(NULL, op, dst, src0); } /** ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] gallium: add TGSI opcodes UARL and UCMP
Can one of the Gallium interface maintainers please review this patch so I can push it? Bryan On 09/02/2011 11:09 AM, Bryan Cain wrote: They are needed by glsl_to_tgsi for an efficient implementation using native integers. --- src/gallium/auxiliary/tgsi/tgsi_exec.c | 30 src/gallium/auxiliary/tgsi/tgsi_info.c |3 ++ src/gallium/include/pipe/p_shader_tokens.h |5 +++- 3 files changed, 37 insertions(+), 1 deletions(-) diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c b/src/gallium/auxiliary/tgsi/tgsi_exec.c index 38dc1ef..896d871 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_exec.c +++ b/src/gallium/auxiliary/tgsi/tgsi_exec.c @@ -3312,6 +3312,28 @@ micro_usne(union tgsi_exec_channel *dst, } static void +micro_uarl(union tgsi_exec_channel *dst, + const union tgsi_exec_channel *src) +{ + dst-i[0] = src-u[0]; + dst-i[1] = src-u[1]; + dst-i[2] = src-u[2]; + dst-i[3] = src-u[3]; +} + +static void +micro_ucmp(union tgsi_exec_channel *dst, + const union tgsi_exec_channel *src0, + const union tgsi_exec_channel *src1, + const union tgsi_exec_channel *src2) +{ + dst-f[0] = src0-u[0] ? src1-f[0] : src2-f[0]; + dst-f[1] = src0-u[1] ? src1-f[1] : src2-f[1]; + dst-f[2] = src0-u[2] ? src1-f[2] : src2-f[2]; + dst-f[3] = src0-u[3] ? src1-f[3] : src2-f[3]; +} + +static void exec_instruction( struct tgsi_exec_machine *mach, const struct tgsi_full_instruction *inst, @@ -4071,6 +4093,14 @@ exec_instruction( assert(0); break; + case TGSI_OPCODE_UARL: + exec_vector_unary(mach, inst, micro_uarl, TGSI_EXEC_DATA_INT, TGSI_EXEC_DATA_UINT); + break; + + case TGSI_OPCODE_UCMP: + exec_vector_trinary(mach, inst, micro_ucmp, TGSI_EXEC_DATA_FLOAT, TGSI_EXEC_DATA_UINT); + break; + default: assert( 0 ); } diff --git a/src/gallium/auxiliary/tgsi/tgsi_info.c b/src/gallium/auxiliary/tgsi/tgsi_info.c index 14ed56a..6cd580a 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_info.c +++ b/src/gallium/auxiliary/tgsi/tgsi_info.c @@ -189,6 +189,9 @@ static const struct tgsi_opcode_info opcode_info[TGSI_OPCODE_LAST] = { 1, 2, 0, 0, 0, 0, RESINFO, TGSI_OPCODE_RESINFO }, { 1, 2, 0, 0, 0, 0, SAMPLE_POS, TGSI_OPCODE_SAMPLE_POS }, { 1, 2, 0, 0, 0, 0, SAMPLE_INFO, TGSI_OPCODE_SAMPLE_INFO }, + + { 1, 1, 0, 0, 0, 0, UARL, TGSI_OPCODE_UARL }, + { 1, 3, 0, 0, 0, 0, UCMP, TGSI_OPCODE_UCMP }, }; const struct tgsi_opcode_info * diff --git a/src/gallium/include/pipe/p_shader_tokens.h b/src/gallium/include/pipe/p_shader_tokens.h index d3a3632..0a26e39 100644 --- a/src/gallium/include/pipe/p_shader_tokens.h +++ b/src/gallium/include/pipe/p_shader_tokens.h @@ -363,7 +363,10 @@ struct tgsi_property_data { #define TGSI_OPCODE_SAMPLE_POS 155 #define TGSI_OPCODE_SAMPLE_INFO 156 -#define TGSI_OPCODE_LAST157 +#define TGSI_OPCODE_UARL157 +#define TGSI_OPCODE_UCMP158 + +#define TGSI_OPCODE_LAST159 #define TGSI_SAT_NONE0 /* do not saturate */ #define TGSI_SAT_ZERO_ONE1 /* clamp to [0,1] */ ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] glsl_to_tgsi: use UARL instead of I2F and ARL
Disregard this, I meant to send this for patch 1/2 and not 2/2. This patch doesn't contain any Gallium interface changes. Bryan On 09/10/2011 11:43 AM, Bryan Cain wrote: Can one of the Gallium interface maintainers please review this patch so I can push it? Bryan On 09/02/2011 11:09 AM, Bryan Cain wrote: --- src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 18 +++--- 1 files changed, 7 insertions(+), 11 deletions(-) diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp index e2857ed..05d4d33 100644 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp @@ -512,7 +512,7 @@ glsl_to_tgsi_visitor::emit(ir_instruction *ir, unsigned op, inst-function = NULL; - if (op == TGSI_OPCODE_ARL) + if (op == TGSI_OPCODE_ARL || op == TGSI_OPCODE_UARL) this-num_address_regs = 1; /* Update indirect addressing status used by TGSI */ @@ -724,16 +724,12 @@ void glsl_to_tgsi_visitor::emit_arl(ir_instruction *ir, st_dst_reg dst, st_src_reg src0) { - st_src_reg tmp = get_temp(glsl_type::float_type); + int op = TGSI_OPCODE_ARL; - if (src0.type == GLSL_TYPE_INT) - emit(NULL, TGSI_OPCODE_I2F, st_dst_reg(tmp), src0); - else if (src0.type == GLSL_TYPE_UINT) - emit(NULL, TGSI_OPCODE_U2F, st_dst_reg(tmp), src0); - else - tmp = src0; - - emit(NULL, TGSI_OPCODE_ARL, dst, tmp); + if (src0.type == GLSL_TYPE_INT || src0.type == GLSL_TYPE_UINT) + op = TGSI_OPCODE_UARL; + + emit(NULL, op, dst, src0); } /** ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] glsl_to_tgsi: use UARL instead of I2F and ARL
It will only be emitted when the driver supports integer operations. The I2F+ARL combination is currently what is emitted when integer support is enabled (float targets only need ARL) but I can make that more clear in the commit message. On 09/10/2011 12:07 PM, Brian Paul wrote: I guess my question is: do the drivers need to be updated for TGSI_OPCODE_UARL? Or will this only be emitted when the driver supports integer operations? If that's the case, please say so in the commit message or code. Otherwise: Reviewed-by: Brian Paul bri...@vmware.com On 09/10/2011 10:48 AM, Bryan Cain wrote: Disregard this, I meant to send this for patch 1/2 and not 2/2. This patch doesn't contain any Gallium interface changes. Bryan On 09/10/2011 11:43 AM, Bryan Cain wrote: Can one of the Gallium interface maintainers please review this patch so I can push it? Bryan On 09/02/2011 11:09 AM, Bryan Cain wrote: --- src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 18 +++--- 1 files changed, 7 insertions(+), 11 deletions(-) diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp index e2857ed..05d4d33 100644 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp @@ -512,7 +512,7 @@ glsl_to_tgsi_visitor::emit(ir_instruction *ir, unsigned op, inst-function = NULL; - if (op == TGSI_OPCODE_ARL) + if (op == TGSI_OPCODE_ARL || op == TGSI_OPCODE_UARL) this-num_address_regs = 1; /* Update indirect addressing status used by TGSI */ @@ -724,16 +724,12 @@ void glsl_to_tgsi_visitor::emit_arl(ir_instruction *ir, st_dst_reg dst, st_src_reg src0) { - st_src_reg tmp = get_temp(glsl_type::float_type); + int op = TGSI_OPCODE_ARL; - if (src0.type == GLSL_TYPE_INT) - emit(NULL, TGSI_OPCODE_I2F, st_dst_reg(tmp), src0); - else if (src0.type == GLSL_TYPE_UINT) - emit(NULL, TGSI_OPCODE_U2F, st_dst_reg(tmp), src0); - else - tmp = src0; - - emit(NULL, TGSI_OPCODE_ARL, dst, tmp); + if (src0.type == GLSL_TYPE_INT || src0.type == GLSL_TYPE_UINT) + op = TGSI_OPCODE_UARL; + + emit(NULL, op, dst, src0); } /** ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 1/2] gallium: add TGSI opcodes UARL and UCMP
They are needed by glsl_to_tgsi for an efficient implementation using native integers. --- src/gallium/auxiliary/tgsi/tgsi_exec.c | 30 src/gallium/auxiliary/tgsi/tgsi_info.c |3 ++ src/gallium/docs/source/tgsi.rst | 19 + src/gallium/include/pipe/p_shader_tokens.h |5 +++- 4 files changed, 56 insertions(+), 1 deletions(-) diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c b/src/gallium/auxiliary/tgsi/tgsi_exec.c index d9de41b..ce6399c 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_exec.c +++ b/src/gallium/auxiliary/tgsi/tgsi_exec.c @@ -3367,6 +3367,28 @@ micro_usne(union tgsi_exec_channel *dst, } static void +micro_uarl(union tgsi_exec_channel *dst, + const union tgsi_exec_channel *src) +{ + dst-i[0] = src-u[0]; + dst-i[1] = src-u[1]; + dst-i[2] = src-u[2]; + dst-i[3] = src-u[3]; +} + +static void +micro_ucmp(union tgsi_exec_channel *dst, + const union tgsi_exec_channel *src0, + const union tgsi_exec_channel *src1, + const union tgsi_exec_channel *src2) +{ + dst-u[0] = src0-u[0] ? src1-u[0] : src2-u[0]; + dst-u[1] = src0-u[1] ? src1-u[1] : src2-u[1]; + dst-u[2] = src0-u[2] ? src1-u[2] : src2-u[2]; + dst-u[3] = src0-u[3] ? src1-u[3] : src2-u[3]; +} + +static void exec_instruction( struct tgsi_exec_machine *mach, const struct tgsi_full_instruction *inst, @@ -4126,6 +4148,14 @@ exec_instruction( assert(0); break; + case TGSI_OPCODE_UARL: + exec_vector_unary(mach, inst, micro_uarl, TGSI_EXEC_DATA_INT, TGSI_EXEC_DATA_UINT); + break; + + case TGSI_OPCODE_UCMP: + exec_vector_trinary(mach, inst, micro_ucmp, TGSI_EXEC_DATA_FLOAT, TGSI_EXEC_DATA_UINT); + break; + default: assert( 0 ); } diff --git a/src/gallium/auxiliary/tgsi/tgsi_info.c b/src/gallium/auxiliary/tgsi/tgsi_info.c index 14ed56a..6cd580a 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_info.c +++ b/src/gallium/auxiliary/tgsi/tgsi_info.c @@ -189,6 +189,9 @@ static const struct tgsi_opcode_info opcode_info[TGSI_OPCODE_LAST] = { 1, 2, 0, 0, 0, 0, RESINFO, TGSI_OPCODE_RESINFO }, { 1, 2, 0, 0, 0, 0, SAMPLE_POS, TGSI_OPCODE_SAMPLE_POS }, { 1, 2, 0, 0, 0, 0, SAMPLE_INFO, TGSI_OPCODE_SAMPLE_INFO }, + + { 1, 1, 0, 0, 0, 0, UARL, TGSI_OPCODE_UARL }, + { 1, 3, 0, 0, 0, 0, UCMP, TGSI_OPCODE_UCMP }, }; const struct tgsi_opcode_info * diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst index 5cf0875..d7f50b1 100644 --- a/src/gallium/docs/source/tgsi.rst +++ b/src/gallium/docs/source/tgsi.rst @@ -1013,6 +1013,25 @@ XXX so let's discuss it, yeah? dst.w = src0.w \oplus src1.w +.. opcode:: UCMP - Integer Conditional Move + +.. math:: + + dst.x = src0.x ? src1.x : src2.x + + dst.y = src0.y ? src1.y : src2.y + + dst.z = src0.z ? src1.z : src2.z + + dst.w = src0.w ? src1.w : src2.w + + +.. opcode:: UARL - Integer Address Register Load + + Moves the contents of the source register, assumed to be an integer, into the + destination register, which is assumed to be an address (ADDR) register. + + .. opcode:: SAD - Sum Of Absolute Differences .. math:: diff --git a/src/gallium/include/pipe/p_shader_tokens.h b/src/gallium/include/pipe/p_shader_tokens.h index b9e3dcf..7236c92 100644 --- a/src/gallium/include/pipe/p_shader_tokens.h +++ b/src/gallium/include/pipe/p_shader_tokens.h @@ -363,7 +363,10 @@ struct tgsi_property_data { #define TGSI_OPCODE_SAMPLE_POS 155 #define TGSI_OPCODE_SAMPLE_INFO 156 -#define TGSI_OPCODE_LAST157 +#define TGSI_OPCODE_UARL157 +#define TGSI_OPCODE_UCMP158 + +#define TGSI_OPCODE_LAST159 #define TGSI_SAT_NONE0 /* do not saturate */ #define TGSI_SAT_ZERO_ONE1 /* clamp to [0,1] */ -- 1.7.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] mesa: add a UniformBooleanTrue option
Drivers supporting native integers set UniformBooleanTrue to the integer value that should be used for true when uploading uniform booleans. This is ~0 for Gallium and 1 for i965. --- src/mesa/drivers/dri/i965/brw_context.c |4 +++- src/mesa/main/mtypes.h |6 ++ src/mesa/main/uniforms.c|5 - src/mesa/state_tracker/st_extensions.c |1 + 4 files changed, 14 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c index 6ef0fcb..5ea7385 100644 --- a/src/mesa/drivers/dri/i965/brw_context.c +++ b/src/mesa/drivers/dri/i965/brw_context.c @@ -253,8 +253,10 @@ GLboolean brwCreateContext( int api, /* If we're using the new shader backend, we require integer uniforms * stored as actual integers. */ - if (brw-new_vs_backend) + if (brw-new_vs_backend) { ctx-Const.NativeIntegers = true; + ctx-Const.UniformBooleanTrue = 1; + } return GL_TRUE; } diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h index 44ebf0a..ad59797 100644 --- a/src/mesa/main/mtypes.h +++ b/src/mesa/main/mtypes.h @@ -2722,6 +2722,12 @@ struct gl_constants */ GLboolean NativeIntegers; + /** +* If the driver supports real 32-bit integers, what integer value should be +* used for boolean true in uniform uploads? (Usually 1 or ~0.) +*/ + GLuint UniformBooleanTrue; + /** Which texture units support GL_ATI_envmap_bumpmap as targets */ GLbitfield SupportedBumpUnits; diff --git a/src/mesa/main/uniforms.c b/src/mesa/main/uniforms.c index fe1ce6d..b0f9c33 100644 --- a/src/mesa/main/uniforms.c +++ b/src/mesa/main/uniforms.c @@ -802,7 +802,10 @@ set_program_uniform(struct gl_context *ctx, struct gl_program *program, else uniformVal[i].b = uniformVal[i].u ? 1 : 0; - if (!ctx-Const.NativeIntegers) + if (ctx-Const.NativeIntegers) + uniformVal[i].u = +uniformVal[i].b ? ctx-Const.UniformBooleanTrue : 0; + else uniformVal[i].f = uniformVal[i].b ? 1.0f : 0.0f; } } diff --git a/src/mesa/state_tracker/st_extensions.c b/src/mesa/state_tracker/st_extensions.c index aa7f3b5..76e84eb 100644 --- a/src/mesa/state_tracker/st_extensions.c +++ b/src/mesa/state_tracker/st_extensions.c @@ -207,6 +207,7 @@ void st_init_limits(struct st_context *st) c-MaxProgramTexelOffset = screen-get_param(screen, PIPE_CAP_MAX_TEXEL_OFFSET); c-GLSLVersion = 120; + c-UniformBooleanTrue = ~0; } } -- 1.7.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2] glsl_to_tgsi, mesa: fixes for native integers and integer booleans
On 09/02/2011 06:13 PM, Eric Anholt wrote: On Wed, 31 Aug 2011 01:33:59 -0500, Bryan Cain bryanca...@gmail.com wrote: With this patch, there are no piglit regressions on softpipe with native integers enabled. Unlike my previous patch, this uses integer values of ~0 and 0 for true and false, respectively, instead of the float values 1.0 and 0.0. This will break b2* conversions on our driver, since we expect true to be 1, not ~0. The minimal change would require emit(AND(temp, uniform, 1)) when deferencing boolean components of uniform variables. Your driver already emits an AND instruction on every compare, according to this code from brw_fs_visitor.cpp: case ir_binop_less: case ir_binop_greater: case ir_binop_lequal: case ir_binop_gequal: case ir_binop_equal: case ir_binop_all_equal: case ir_binop_nequal: case ir_binop_any_nequal: temp = this-result; /* original gen4 does implicit conversion before comparison. */ if (intel-gen 5) temp.type = op[0].type; inst = emit(BRW_OPCODE_CMP, temp, op[0], op[1]); inst-conditional_mod = brw_conditional_for_comparison(ir-operation); emit(BRW_OPCODE_AND, this-result, this-result, fs_reg(0x1)); break; If the hardware sets ~0 on compares anyway, why not use that in your driver? ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/2] gallium: add TGSI opcodes UARL and UCMP
They are needed by glsl_to_tgsi for an efficient implementation using native integers. --- src/gallium/auxiliary/tgsi/tgsi_exec.c | 30 src/gallium/auxiliary/tgsi/tgsi_info.c |3 ++ src/gallium/include/pipe/p_shader_tokens.h |5 +++- 3 files changed, 37 insertions(+), 1 deletions(-) diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c b/src/gallium/auxiliary/tgsi/tgsi_exec.c index 38dc1ef..896d871 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_exec.c +++ b/src/gallium/auxiliary/tgsi/tgsi_exec.c @@ -3312,6 +3312,28 @@ micro_usne(union tgsi_exec_channel *dst, } static void +micro_uarl(union tgsi_exec_channel *dst, + const union tgsi_exec_channel *src) +{ + dst-i[0] = src-u[0]; + dst-i[1] = src-u[1]; + dst-i[2] = src-u[2]; + dst-i[3] = src-u[3]; +} + +static void +micro_ucmp(union tgsi_exec_channel *dst, + const union tgsi_exec_channel *src0, + const union tgsi_exec_channel *src1, + const union tgsi_exec_channel *src2) +{ + dst-f[0] = src0-u[0] ? src1-f[0] : src2-f[0]; + dst-f[1] = src0-u[1] ? src1-f[1] : src2-f[1]; + dst-f[2] = src0-u[2] ? src1-f[2] : src2-f[2]; + dst-f[3] = src0-u[3] ? src1-f[3] : src2-f[3]; +} + +static void exec_instruction( struct tgsi_exec_machine *mach, const struct tgsi_full_instruction *inst, @@ -4071,6 +4093,14 @@ exec_instruction( assert(0); break; + case TGSI_OPCODE_UARL: + exec_vector_unary(mach, inst, micro_uarl, TGSI_EXEC_DATA_INT, TGSI_EXEC_DATA_UINT); + break; + + case TGSI_OPCODE_UCMP: + exec_vector_trinary(mach, inst, micro_ucmp, TGSI_EXEC_DATA_FLOAT, TGSI_EXEC_DATA_UINT); + break; + default: assert( 0 ); } diff --git a/src/gallium/auxiliary/tgsi/tgsi_info.c b/src/gallium/auxiliary/tgsi/tgsi_info.c index 14ed56a..6cd580a 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_info.c +++ b/src/gallium/auxiliary/tgsi/tgsi_info.c @@ -189,6 +189,9 @@ static const struct tgsi_opcode_info opcode_info[TGSI_OPCODE_LAST] = { 1, 2, 0, 0, 0, 0, RESINFO, TGSI_OPCODE_RESINFO }, { 1, 2, 0, 0, 0, 0, SAMPLE_POS, TGSI_OPCODE_SAMPLE_POS }, { 1, 2, 0, 0, 0, 0, SAMPLE_INFO, TGSI_OPCODE_SAMPLE_INFO }, + + { 1, 1, 0, 0, 0, 0, UARL, TGSI_OPCODE_UARL }, + { 1, 3, 0, 0, 0, 0, UCMP, TGSI_OPCODE_UCMP }, }; const struct tgsi_opcode_info * diff --git a/src/gallium/include/pipe/p_shader_tokens.h b/src/gallium/include/pipe/p_shader_tokens.h index d3a3632..0a26e39 100644 --- a/src/gallium/include/pipe/p_shader_tokens.h +++ b/src/gallium/include/pipe/p_shader_tokens.h @@ -363,7 +363,10 @@ struct tgsi_property_data { #define TGSI_OPCODE_SAMPLE_POS 155 #define TGSI_OPCODE_SAMPLE_INFO 156 -#define TGSI_OPCODE_LAST157 +#define TGSI_OPCODE_UARL157 +#define TGSI_OPCODE_UCMP158 + +#define TGSI_OPCODE_LAST159 #define TGSI_SAT_NONE0 /* do not saturate */ #define TGSI_SAT_ZERO_ONE1 /* clamp to [0,1] */ -- 1.7.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] glsl_to_tgsi: use UARL instead of I2F and ARL
--- src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 18 +++--- 1 files changed, 7 insertions(+), 11 deletions(-) diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp index e2857ed..05d4d33 100644 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp @@ -512,7 +512,7 @@ glsl_to_tgsi_visitor::emit(ir_instruction *ir, unsigned op, inst-function = NULL; - if (op == TGSI_OPCODE_ARL) + if (op == TGSI_OPCODE_ARL || op == TGSI_OPCODE_UARL) this-num_address_regs = 1; /* Update indirect addressing status used by TGSI */ @@ -724,16 +724,12 @@ void glsl_to_tgsi_visitor::emit_arl(ir_instruction *ir, st_dst_reg dst, st_src_reg src0) { - st_src_reg tmp = get_temp(glsl_type::float_type); + int op = TGSI_OPCODE_ARL; - if (src0.type == GLSL_TYPE_INT) - emit(NULL, TGSI_OPCODE_I2F, st_dst_reg(tmp), src0); - else if (src0.type == GLSL_TYPE_UINT) - emit(NULL, TGSI_OPCODE_U2F, st_dst_reg(tmp), src0); - else - tmp = src0; - - emit(NULL, TGSI_OPCODE_ARL, dst, tmp); + if (src0.type == GLSL_TYPE_INT || src0.type == GLSL_TYPE_UINT) + op = TGSI_OPCODE_UARL; + + emit(NULL, op, dst, src0); } /** -- 1.7.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2] glsl_to_tgsi, mesa: fixes for native integers and integer booleans
Are there any objections to pushing this? Bryan On 08/31/2011 01:33 AM, Bryan Cain wrote: With this patch, there are no piglit regressions on softpipe with native integers enabled. Unlike my previous patch, this uses integer values of ~0 and 0 for true and false, respectively, instead of the float values 1.0 and 0.0. --- src/mesa/main/uniforms.c |6 +- src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 160 2 files changed, 116 insertions(+), 50 deletions(-) diff --git a/src/mesa/main/uniforms.c b/src/mesa/main/uniforms.c index cda840f..fa96fd3 100644 --- a/src/mesa/main/uniforms.c +++ b/src/mesa/main/uniforms.c @@ -777,12 +777,12 @@ set_program_uniform(struct gl_context *ctx, struct gl_program *program, if (isUniformBool) { for (i = 0; i elems; i++) { if (basicType == GL_FLOAT) - uniformVal[i].b = uniformVal[i].f != 0.0f ? 1 : 0; + uniformVal[i].u = uniformVal[i].f != 0.0f ? ~0 : 0; else - uniformVal[i].b = uniformVal[i].u ? 1 : 0; + uniformVal[i].u = uniformVal[i].u ? ~0 : 0; if (!ctx-Const.NativeIntegers) - uniformVal[i].f = uniformVal[i].b ? 1.0f : 0.0f; + uniformVal[i].f = uniformVal[i].u ? 1.0f : 0.0f; } } } diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp index 2266083..c8f790a 100644 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp @@ -385,6 +385,8 @@ public: void emit_scalar(ir_instruction *ir, unsigned op, st_dst_reg dst, st_src_reg src0, st_src_reg src1); + void try_emit_float_set(ir_instruction *ir, unsigned op, st_dst_reg dst); + void emit_arl(ir_instruction *ir, st_dst_reg dst, st_src_reg src0); void emit_scs(ir_instruction *ir, unsigned op, @@ -562,7 +564,10 @@ glsl_to_tgsi_visitor::emit(ir_instruction *ir, unsigned op, } this-instructions.push_tail(inst); - + + if (native_integers) + try_emit_float_set(ir, op, dst); + return inst; } @@ -588,6 +593,25 @@ glsl_to_tgsi_visitor::emit(ir_instruction *ir, unsigned op) return emit(ir, op, undef_dst, undef_src, undef_src, undef_src); } + /** + * Emits the code to convert the result of float SET instructions to integers. + */ +void +glsl_to_tgsi_visitor::try_emit_float_set(ir_instruction *ir, unsigned op, + st_dst_reg dst) +{ + if ((op == TGSI_OPCODE_SEQ || +op == TGSI_OPCODE_SNE || +op == TGSI_OPCODE_SGE || +op == TGSI_OPCODE_SLT)) + { + st_src_reg src = st_src_reg(dst); + src.negate = ~src.negate; + dst.type = GLSL_TYPE_FLOAT; + emit(ir, TGSI_OPCODE_F2I, dst, src); + } +} + /** * Determines whether to use an integer, unsigned integer, or float opcode * based on the operands and input opcode, then emits the result. @@ -604,7 +628,7 @@ glsl_to_tgsi_visitor::get_opcode(ir_instruction *ir, unsigned op, if (src0.type == GLSL_TYPE_FLOAT || src1.type == GLSL_TYPE_FLOAT) type = GLSL_TYPE_FLOAT; else if (native_integers) - type = src0.type; + type = src0.type == GLSL_TYPE_BOOL ? GLSL_TYPE_INT : src0.type; #define case4(c, f, i, u) \ case TGSI_OPCODE_##c: \ @@ -630,12 +654,7 @@ glsl_to_tgsi_visitor::get_opcode(ir_instruction *ir, unsigned op, case3(SGE, ISGE, USGE); case3(SLT, ISLT, USLT); - case2iu(SHL, SHL); case2iu(ISHR, USHR); - case2iu(NOT, NOT); - case2iu(AND, AND); - case2iu(OR, OR); - case2iu(XOR, XOR); default: break; } @@ -1389,7 +1408,7 @@ glsl_to_tgsi_visitor::visit(ir_expression *ir) switch (ir-operation) { case ir_unop_logic_not: if (result_dst.type != GLSL_TYPE_FLOAT) - emit(ir, TGSI_OPCODE_SEQ, result_dst, op[0], st_src_reg_for_type(result_dst.type, 0)); + emit(ir, TGSI_OPCODE_NOT, result_dst, op[0]); else { /* Previously 'SEQ dst, src, 0.0' was used for this. However, many * older GPUs implement SEQ using multiple instructions (i915 uses two @@ -1489,10 +1508,10 @@ glsl_to_tgsi_visitor::visit(ir_expression *ir) emit(ir, TGSI_OPCODE_SLT, result_dst, op[0], op[1]); break; case ir_binop_greater: - emit(ir, TGSI_OPCODE_SGT, result_dst, op[0], op[1]); + emit(ir, TGSI_OPCODE_SLT, result_dst, op[1], op[0]); break; case ir_binop_lequal: - emit(ir, TGSI_OPCODE_SLE, result_dst, op[0], op[1]); + emit(ir, TGSI_OPCODE_SGE, result_dst, op[1], op[0]); break; case ir_binop_gequal: emit(ir, TGSI_OPCODE_SGE, result_dst, op[0], op[1]); @@ -1605,41 +1624,52 @@ glsl_to_tgsi_visitor
Re: [Mesa-dev] [PATCH v2] glsl_to_tgsi, mesa: fixes for native integers and integer booleans
On 09/02/2011 06:13 PM, Eric Anholt wrote: On Wed, 31 Aug 2011 01:33:59 -0500, Bryan Cain bryanca...@gmail.com wrote: With this patch, there are no piglit regressions on softpipe with native integers enabled. Unlike my previous patch, this uses integer values of ~0 and 0 for true and false, respectively, instead of the float values 1.0 and 0.0. This will break b2* conversions on our driver, since we expect true to be 1, not ~0. The minimal change would require emit(AND(temp, uniform, 1)) when deferencing boolean components of uniform variables. Can you please change your driver if the hardware is not able to support it, then? If that's not possible, we need to come up with a way to upload booleans as ~0 for Gallium drivers and 1 for i965. Bryan ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2] glsl_to_tgsi, mesa: fixes for native integers and integer booleans
With this patch, there are no piglit regressions on softpipe with native integers enabled. Unlike my previous patch, this uses integer values of ~0 and 0 for true and false, respectively, instead of the float values 1.0 and 0.0. --- src/mesa/main/uniforms.c |6 +- src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 160 2 files changed, 116 insertions(+), 50 deletions(-) diff --git a/src/mesa/main/uniforms.c b/src/mesa/main/uniforms.c index cda840f..fa96fd3 100644 --- a/src/mesa/main/uniforms.c +++ b/src/mesa/main/uniforms.c @@ -777,12 +777,12 @@ set_program_uniform(struct gl_context *ctx, struct gl_program *program, if (isUniformBool) { for (i = 0; i elems; i++) { if (basicType == GL_FLOAT) - uniformVal[i].b = uniformVal[i].f != 0.0f ? 1 : 0; + uniformVal[i].u = uniformVal[i].f != 0.0f ? ~0 : 0; else - uniformVal[i].b = uniformVal[i].u ? 1 : 0; + uniformVal[i].u = uniformVal[i].u ? ~0 : 0; if (!ctx-Const.NativeIntegers) - uniformVal[i].f = uniformVal[i].b ? 1.0f : 0.0f; + uniformVal[i].f = uniformVal[i].u ? 1.0f : 0.0f; } } } diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp index 2266083..c8f790a 100644 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp @@ -385,6 +385,8 @@ public: void emit_scalar(ir_instruction *ir, unsigned op, st_dst_reg dst, st_src_reg src0, st_src_reg src1); + void try_emit_float_set(ir_instruction *ir, unsigned op, st_dst_reg dst); + void emit_arl(ir_instruction *ir, st_dst_reg dst, st_src_reg src0); void emit_scs(ir_instruction *ir, unsigned op, @@ -562,7 +564,10 @@ glsl_to_tgsi_visitor::emit(ir_instruction *ir, unsigned op, } this-instructions.push_tail(inst); - + + if (native_integers) + try_emit_float_set(ir, op, dst); + return inst; } @@ -588,6 +593,25 @@ glsl_to_tgsi_visitor::emit(ir_instruction *ir, unsigned op) return emit(ir, op, undef_dst, undef_src, undef_src, undef_src); } + /** + * Emits the code to convert the result of float SET instructions to integers. + */ +void +glsl_to_tgsi_visitor::try_emit_float_set(ir_instruction *ir, unsigned op, +st_dst_reg dst) +{ + if ((op == TGSI_OPCODE_SEQ || +op == TGSI_OPCODE_SNE || +op == TGSI_OPCODE_SGE || +op == TGSI_OPCODE_SLT)) + { + st_src_reg src = st_src_reg(dst); + src.negate = ~src.negate; + dst.type = GLSL_TYPE_FLOAT; + emit(ir, TGSI_OPCODE_F2I, dst, src); + } +} + /** * Determines whether to use an integer, unsigned integer, or float opcode * based on the operands and input opcode, then emits the result. @@ -604,7 +628,7 @@ glsl_to_tgsi_visitor::get_opcode(ir_instruction *ir, unsigned op, if (src0.type == GLSL_TYPE_FLOAT || src1.type == GLSL_TYPE_FLOAT) type = GLSL_TYPE_FLOAT; else if (native_integers) - type = src0.type; + type = src0.type == GLSL_TYPE_BOOL ? GLSL_TYPE_INT : src0.type; #define case4(c, f, i, u) \ case TGSI_OPCODE_##c: \ @@ -630,12 +654,7 @@ glsl_to_tgsi_visitor::get_opcode(ir_instruction *ir, unsigned op, case3(SGE, ISGE, USGE); case3(SLT, ISLT, USLT); - case2iu(SHL, SHL); case2iu(ISHR, USHR); - case2iu(NOT, NOT); - case2iu(AND, AND); - case2iu(OR, OR); - case2iu(XOR, XOR); default: break; } @@ -1389,7 +1408,7 @@ glsl_to_tgsi_visitor::visit(ir_expression *ir) switch (ir-operation) { case ir_unop_logic_not: if (result_dst.type != GLSL_TYPE_FLOAT) - emit(ir, TGSI_OPCODE_SEQ, result_dst, op[0], st_src_reg_for_type(result_dst.type, 0)); + emit(ir, TGSI_OPCODE_NOT, result_dst, op[0]); else { /* Previously 'SEQ dst, src, 0.0' was used for this. However, many * older GPUs implement SEQ using multiple instructions (i915 uses two @@ -1489,10 +1508,10 @@ glsl_to_tgsi_visitor::visit(ir_expression *ir) emit(ir, TGSI_OPCODE_SLT, result_dst, op[0], op[1]); break; case ir_binop_greater: - emit(ir, TGSI_OPCODE_SGT, result_dst, op[0], op[1]); + emit(ir, TGSI_OPCODE_SLT, result_dst, op[1], op[0]); break; case ir_binop_lequal: - emit(ir, TGSI_OPCODE_SLE, result_dst, op[0], op[1]); + emit(ir, TGSI_OPCODE_SGE, result_dst, op[1], op[0]); break; case ir_binop_gequal: emit(ir, TGSI_OPCODE_SGE, result_dst, op[0], op[1]); @@ -1605,41 +1624,52 @@ glsl_to_tgsi_visitor::visit(ir_expression *ir) } case ir_binop_logic_xor: - emit(ir, TGSI_OPCODE_SNE, result_dst, op[0], op[1]); + if (native_integers) + emit(ir, TGSI_OPCODE_XOR, result_dst, op[0], op[1]); + else
[Mesa-dev] [PATCH] mesa: Replace the EmitNoIfs compiler flag with a MaxIfLevel flag.
This is a better, more fine-grained way of lowering if statements. Fixes the game And Yet It Moves on nv50. --- src/mesa/drivers/dri/i915/i915_context.c |2 +- src/mesa/main/mtypes.h |6 +- src/mesa/program/ir_to_mesa.cpp|8 src/mesa/state_tracker/st_extensions.c |2 +- src/mesa/state_tracker/st_glsl_to_tgsi.cpp |6 +++--- 5 files changed, 10 insertions(+), 14 deletions(-) diff --git a/src/mesa/drivers/dri/i915/i915_context.c b/src/mesa/drivers/dri/i915/i915_context.c index 11bee14..7a40ba1 100644 --- a/src/mesa/drivers/dri/i915/i915_context.c +++ b/src/mesa/drivers/dri/i915/i915_context.c @@ -189,7 +189,7 @@ i915CreateContext(int api, struct gl_shader_compiler_options *const fs_options = ctx-ShaderCompilerOptions[MESA_SHADER_FRAGMENT]; - fs_options-EmitNoIfs = GL_TRUE; + fs_options-MaxIfLevel = 0; fs_options-EmitNoNoise = GL_TRUE; fs_options-EmitNoPow = GL_TRUE; fs_options-EmitNoMainReturn = GL_TRUE; diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h index f2eb889..9f95bcd 100644 --- a/src/mesa/main/mtypes.h +++ b/src/mesa/main/mtypes.h @@ -2266,11 +2266,6 @@ struct gl_shader_compiler_options /** Driver-selectable options: */ GLboolean EmitCondCodes; /** Use condition codes? */ GLboolean EmitNVTempInitialization; /** 0-fill NV temp registers */ - /** -* Attempts to flatten all ir_if (OPCODE_IF) for GPUs that can't -* support control flow. -*/ - GLboolean EmitNoIfs; GLboolean EmitNoLoops; GLboolean EmitNoFunctions; GLboolean EmitNoCont; /** Emit CONT opcode? */ @@ -2288,6 +2283,7 @@ struct gl_shader_compiler_options GLboolean EmitNoIndirectUniform; /** No indirect addressing of constants */ /*@}*/ + GLuint MaxIfLevel; /** Maximum nested IF blocks */ GLuint MaxUnrollIterations; struct gl_sl_pragmas DefaultPragmas; /** Default #pragma settings */ diff --git a/src/mesa/program/ir_to_mesa.cpp b/src/mesa/program/ir_to_mesa.cpp index dd154db..7fb286e 100644 --- a/src/mesa/program/ir_to_mesa.cpp +++ b/src/mesa/program/ir_to_mesa.cpp @@ -3119,7 +3119,7 @@ get_mesa_program(struct gl_context *ctx, switch (mesa_inst-Opcode) { case OPCODE_IF: -if (options-EmitNoIfs) { +if (options-MaxIfLevel == 0) { linker_warning(shader_program, Couldn't flatten if-statement. This will likely result in software @@ -3241,10 +3241,10 @@ _mesa_ir_link_shader(struct gl_context *ctx, struct gl_shader_program *prog) progress = lower_quadop_vector(ir, true) || progress; -if (options-EmitNoIfs) { +if (options-MaxIfLevel == 0) progress = lower_discard(ir) || progress; - progress = lower_if_to_cond_assign(ir) || progress; -} + +progress = lower_if_to_cond_assign(ir, options-MaxIfLevel) || progress; if (options-EmitNoNoise) progress = lower_noise(ir) || progress; diff --git a/src/mesa/state_tracker/st_extensions.c b/src/mesa/state_tracker/st_extensions.c index 8e90093..9f429d9 100644 --- a/src/mesa/state_tracker/st_extensions.c +++ b/src/mesa/state_tracker/st_extensions.c @@ -173,7 +173,7 @@ void st_init_limits(struct st_context *st) options-EmitNoNoise = TRUE; /* TODO: make these more fine-grained if anyone needs it */ - options-EmitNoIfs = !screen-get_shader_param(screen, sh, PIPE_SHADER_CAP_MAX_CONTROL_FLOW_DEPTH); + options-MaxIfLevel = screen-get_shader_param(screen, sh, PIPE_SHADER_CAP_MAX_CONTROL_FLOW_DEPTH); options-EmitNoLoops = !screen-get_shader_param(screen, sh, PIPE_SHADER_CAP_MAX_CONTROL_FLOW_DEPTH); options-EmitNoFunctions = !screen-get_shader_param(screen, sh, PIPE_SHADER_CAP_SUBROUTINES); options-EmitNoMainReturn = !screen-get_shader_param(screen, sh, PIPE_SHADER_CAP_SUBROUTINES); diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp index 98bea69..f5232af 100644 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp @@ -4991,10 +4991,10 @@ st_link_shader(struct gl_context *ctx, struct gl_shader_program *prog) progress = lower_quadop_vector(ir, false) || progress; - if (options-EmitNoIfs) { + if (options-MaxIfLevel == 0) progress = lower_discard(ir) || progress; -progress = lower_if_to_cond_assign(ir) || progress; - } + + progress = lower_if_to_cond_assign(ir, options-MaxIfLevel) || progress; if (options-EmitNoNoise) progress = lower_noise(ir) || progress; -- 1.7.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] Missing integer set opcodes in Gallium
I somehow didn't notice this until now, but why are there no Gallium set opcodes for integers that are equivalent to the SGT or SLE opcodes for floats? Does it have to do with available hardware features? Bryan ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Missing integer set opcodes in Gallium
On 08/29/2011 03:41 PM, Zack Rusin wrote: On Monday, August 29, 2011 04:02:08 PM Zack Rusin wrote: Either way though if GL needs those ops then, like Brian mentioned, it'd be a good idea to add them. Actually I think it seems easier to just flip the ops rather than add new instructions, i.e. change stuff like a b, to b a, and a = b to b = a. Just an opinion though, I don't have strong preferences here. You're right; that's an easier solution. That way, the only opcode I'll need to add is UARL. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] mesa, glsl_to_tgsi: fixes for native integers
On 08/26/2011 09:58 PM, Bryan Cain wrote: This fixes all of the piglit regressions in softpipe when native integers are enabled. --- src/mesa/main/uniforms.c |8 + src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 45 ++-- 2 files changed, 43 insertions(+), 10 deletions(-) diff --git a/src/mesa/main/uniforms.c b/src/mesa/main/uniforms.c index cda840f..0801476 100644 --- a/src/mesa/main/uniforms.c +++ b/src/mesa/main/uniforms.c @@ -776,13 +776,7 @@ set_program_uniform(struct gl_context *ctx, struct gl_program *program, /* if the uniform is bool-valued, convert to 1 or 0 */ if (isUniformBool) { for (i = 0; i elems; i++) { - if (basicType == GL_FLOAT) - uniformVal[i].b = uniformVal[i].f != 0.0f ? 1 : 0; - else - uniformVal[i].b = uniformVal[i].u ? 1 : 0; - - if (!ctx-Const.NativeIntegers) - uniformVal[i].f = uniformVal[i].b ? 1.0f : 0.0f; + uniformVal[i].f = uniformVal[i].f != 0.0f ? 1.0f : 0.0f; } } } I'd like to push this to master (with one typo fix in the glsl_to_tgsi part of the patch) since no more objections have been raised, but could someone verify that this core Mesa change is okay? Bryan ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glsl: Use a separate div_to_mul_rcp lowering flag for integers.
On 08/28/2011 07:38 PM, Eric Anholt wrote: On Sat, 27 Aug 2011 20:18:55 -0700, Kenneth Graunke kenn...@whitecape.org wrote: From: Bryan Cain bryanca...@gmail.com Using multiply and reciprocal for integer division involves potentially lossy floating point conversions. This is okay for older GPUs that represent integers as floating point, but undesirable for GPUs with native integer division instructions. TGSI, for example, has UDIV/IDIV instructions for integer division, so it makes sense to handle this directly. Likewise for i965. Signed-off-by: Bryan Cain bryanca...@gmail.com Signed-off-by: Kenneth Graunke kenn...@whitecape.org --- case ir_binop_div: - if (lowering(DIV_TO_MUL_RCP)) + if (lowering(INT_DIV_TO_MUL_RCP) ir-operands[1]-type-is_integer()) + int_div_to_mul_rcp(ir); + else if (lowering(DIV_TO_MUL_RCP)) div_to_mul_rcp(ir); break; Sure looks odd to me for one of these to be checking the type and ther other not. It works, though. If it's not an integer type, it's going to be a float type. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] mesa, glsl_to_tgsi: fixes for native integers
On 08/27/2011 05:39 AM, Christoph Bumiller wrote: On 27.08.2011 04:58, Bryan Cain wrote: This fixes all of the piglit regressions in softpipe when native integers are enabled. --- src/mesa/main/uniforms.c |8 + src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 45 ++-- 2 files changed, 43 insertions(+), 10 deletions(-) diff --git a/src/mesa/main/uniforms.c b/src/mesa/main/uniforms.c index cda840f..0801476 100644 --- a/src/mesa/main/uniforms.c +++ b/src/mesa/main/uniforms.c @@ -776,13 +776,7 @@ set_program_uniform(struct gl_context *ctx, struct gl_program *program, /* if the uniform is bool-valued, convert to 1 or 0 */ if (isUniformBool) { for (i = 0; i elems; i++) { - if (basicType == GL_FLOAT) - uniformVal[i].b = uniformVal[i].f != 0.0f ? 1 : 0; - else - uniformVal[i].b = uniformVal[i].u ? 1 : 0; - - if (!ctx-Const.NativeIntegers) - uniformVal[i].f = uniformVal[i].b ? 1.0f : 0.0f; + uniformVal[i].f = uniformVal[i].f != 0.0f ? 1.0f : 0.0f; } } } diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp index 9cac309..d1674eb 100644 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp @@ -385,6 +385,8 @@ public: void emit_scalar(ir_instruction *ir, unsigned op, st_dst_reg dst, st_src_reg src0, st_src_reg src1); + void try_emit_integer_set(ir_instruction *ir, unsigned op, st_dst_reg dst); + void emit_arl(ir_instruction *ir, st_dst_reg dst, st_src_reg src0); void emit_scs(ir_instruction *ir, unsigned op, @@ -563,6 +565,8 @@ glsl_to_tgsi_visitor::emit(ir_instruction *ir, unsigned op, this-instructions.push_tail(inst); + try_emit_integer_set(ir, op, dst); + return inst; } @@ -589,6 +593,32 @@ glsl_to_tgsi_visitor::emit(ir_instruction *ir, unsigned op) } /** + * Emits the code to convert the result of integer SET instructions to floats. + */ +void +glsl_to_tgsi_visitor::try_emit_integer_set(ir_instruction *ir, unsigned op, + st_dst_reg dst) +{ + st_src_reg src; + + if (!(op == TGSI_OPCODE_USEQ || + op == TGSI_OPCODE_USNE || + op == TGSI_OPCODE_ISGE || + op == TGSI_OPCODE_USGE || + op == TGSI_OPCODE_ISLT || + op == TGSI_OPCODE_USLT)) + return; + + src = st_src_reg(dst); + src.type = GLSL_TYPE_INT; + + dst.type = GLSL_TYPE_FLOAT; + emit(ir, TGSI_OPCODE_I2F, dst, src); + src.type = GLSL_TYPE_FLOAT; + emit(ir, TGSI_OPCODE_SNE, dst, src, st_src_reg_for_float(0.0)); +} + emit(ir, TGSI_OPCODE_AND, dst, src, st_src_reg_for_float(1.0f)); I still don't quite like booleans as floats, but I guess I can just easily track back to the source SET to emit my set-predicate-register op without having all the fuss in between. For now, I'm trying to get things working. Once integers are working on softpipe and r600g with no piglit regressions, I'll look into optimizations like using AND. +/** * Determines whether to use an integer, unsigned integer, or float opcode * based on the operands and input opcode, then emits the result. * @@ -1662,7 +1692,6 @@ glsl_to_tgsi_visitor::visit(ir_expression *ir) emit_scalar(ir, TGSI_OPCODE_RSQ, result_dst, op[0]); break; case ir_unop_i2f: - case ir_unop_b2f: if (native_integers) { emit(ir, TGSI_OPCODE_I2F, result_dst, op[0]); break; @@ -1670,10 +1699,14 @@ glsl_to_tgsi_visitor::visit(ir_expression *ir) case ir_unop_i2u: case ir_unop_u2i: /* Converting between signed and unsigned integers is a no-op. */ - case ir_unop_b2i: - /* Booleans are stored as integers (or floats in GLSL 1.20 and lower). */ + case ir_unop_b2f: + /* Booleans are stored as floats. */ result_src = op[0]; break; + case ir_unop_b2i: + if (native_integers) + emit(ir, TGSI_OPCODE_F2I, result_dst, op[0]); + break; case ir_unop_f2i: if (native_integers) emit(ir, TGSI_OPCODE_F2I, result_dst, op[0]); @@ -1681,6 +1714,11 @@ glsl_to_tgsi_visitor::visit(ir_expression *ir) emit(ir, TGSI_OPCODE_TRUNC, result_dst, op[0]); break; case ir_unop_f2b: + if (native_integers) { + emit(ir, TGSI_OPCODE_SNE, result_dst, op[0], st_src_reg_for_float(0.0)); + emit(ir, TGSI_OPCODE_F2I, result_dst, result_src); This is inconsistent, why do you convert to integer here if you store all booleans as floats ? + } + break; case ir_unop_i2b: emit(ir, TGSI_OPCODE_SNE, result_dst, op[0], st_src_reg_for_type(result_dst.type, 0)); This can go wrong
[Mesa-dev] [PATCH] glsl: use a separate div_to_mul_rcp lowering flag for integers
TGSI, at the very least, has UDIV/IDIV instructions for integer division. --- src/glsl/ir_optimization.h | 13 +++-- src/glsl/lower_instructions.cpp|4 +++- src/mesa/drivers/dri/i965/brw_shader.cpp |1 + src/mesa/program/ir_to_mesa.cpp|2 +- src/mesa/state_tracker/st_glsl_to_tgsi.cpp |2 +- 5 files changed, 13 insertions(+), 9 deletions(-) diff --git a/src/glsl/ir_optimization.h b/src/glsl/ir_optimization.h index f7808bd..48448d4 100644 --- a/src/glsl/ir_optimization.h +++ b/src/glsl/ir_optimization.h @@ -29,12 +29,13 @@ */ /* Operations for lower_instructions() */ -#define SUB_TO_ADD_NEG 0x01 -#define DIV_TO_MUL_RCP 0x02 -#define EXP_TO_EXP20x04 -#define POW_TO_EXP20x08 -#define LOG_TO_LOG20x10 -#define MOD_TO_FRACT 0x20 +#define SUB_TO_ADD_NEG 0x01 +#define DIV_TO_MUL_RCP 0x02 +#define EXP_TO_EXP20x04 +#define POW_TO_EXP20x08 +#define LOG_TO_LOG20x10 +#define MOD_TO_FRACT 0x20 +#define INT_DIV_TO_MUL_RCP 0x40 bool do_common_optimization(exec_list *ir, bool linked, unsigned max_unroll_iterations); diff --git a/src/glsl/lower_instructions.cpp b/src/glsl/lower_instructions.cpp index 23aa19b..fd0c760 100644 --- a/src/glsl/lower_instructions.cpp +++ b/src/glsl/lower_instructions.cpp @@ -265,7 +265,9 @@ lower_instructions_visitor::visit_leave(ir_expression *ir) break; case ir_binop_div: - if (lowering(DIV_TO_MUL_RCP)) + if (lowering(INT_DIV_TO_MUL_RCP) ir-operands[1]-type-is_integer()) +div_to_mul_rcp(ir); + else if (lowering(DIV_TO_MUL_RCP)) div_to_mul_rcp(ir); break; diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp b/src/mesa/drivers/dri/i965/brw_shader.cpp index 3ff6bba..7e53097 100644 --- a/src/mesa/drivers/dri/i965/brw_shader.cpp +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp @@ -100,6 +100,7 @@ brw_link_shader(struct gl_context *ctx, struct gl_shader_program *prog) lower_instructions(shader-ir, MOD_TO_FRACT | DIV_TO_MUL_RCP | +INT_DIV_TO_MUL_RCP | SUB_TO_ADD_NEG | EXP_TO_EXP2 | LOG_TO_LOG2); diff --git a/src/mesa/program/ir_to_mesa.cpp b/src/mesa/program/ir_to_mesa.cpp index 6820e4c..dd154db 100644 --- a/src/mesa/program/ir_to_mesa.cpp +++ b/src/mesa/program/ir_to_mesa.cpp @@ -3232,7 +3232,7 @@ _mesa_ir_link_shader(struct gl_context *ctx, struct gl_shader_program *prog) /* Lowering */ do_mat_op_to_vec(ir); lower_instructions(ir, (MOD_TO_FRACT | DIV_TO_MUL_RCP | EXP_TO_EXP2 -| LOG_TO_LOG2 +| LOG_TO_LOG2 | INT_DIV_TO_MUL_RCP | ((options-EmitNoPow) ? POW_TO_EXP2 : 0))); progress = do_lower_jumps(ir, true, true, options-EmitNoMainReturn, options-EmitNoCont, options-EmitNoLoops) || progress; diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp index 9cac309..7aef4aa 100644 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp @@ -5012,7 +5012,7 @@ st_link_shader(struct gl_context *ctx, struct gl_shader_program *prog) /* Lowering */ do_mat_op_to_vec(ir); lower_instructions(ir, (MOD_TO_FRACT | DIV_TO_MUL_RCP | EXP_TO_EXP2 -| LOG_TO_LOG2 +| LOG_TO_LOG2 | INT_DIV_TO_MUL_RCP | ((options-EmitNoPow) ? POW_TO_EXP2 : 0))); progress = do_lower_jumps(ir, true, true, options-EmitNoMainReturn, options-EmitNoCont, options-EmitNoLoops) || progress; -- 1.7.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] mesa, glsl_to_tgsi: fixes for native integers
This fixes all of the piglit regressions in softpipe when native integers are enabled. --- src/mesa/main/uniforms.c |8 + src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 45 ++-- 2 files changed, 43 insertions(+), 10 deletions(-) diff --git a/src/mesa/main/uniforms.c b/src/mesa/main/uniforms.c index cda840f..0801476 100644 --- a/src/mesa/main/uniforms.c +++ b/src/mesa/main/uniforms.c @@ -776,13 +776,7 @@ set_program_uniform(struct gl_context *ctx, struct gl_program *program, /* if the uniform is bool-valued, convert to 1 or 0 */ if (isUniformBool) { for (i = 0; i elems; i++) { - if (basicType == GL_FLOAT) - uniformVal[i].b = uniformVal[i].f != 0.0f ? 1 : 0; - else - uniformVal[i].b = uniformVal[i].u ? 1 : 0; - - if (!ctx-Const.NativeIntegers) - uniformVal[i].f = uniformVal[i].b ? 1.0f : 0.0f; + uniformVal[i].f = uniformVal[i].f != 0.0f ? 1.0f : 0.0f; } } } diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp index 9cac309..d1674eb 100644 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp @@ -385,6 +385,8 @@ public: void emit_scalar(ir_instruction *ir, unsigned op, st_dst_reg dst, st_src_reg src0, st_src_reg src1); + void try_emit_integer_set(ir_instruction *ir, unsigned op, st_dst_reg dst); + void emit_arl(ir_instruction *ir, st_dst_reg dst, st_src_reg src0); void emit_scs(ir_instruction *ir, unsigned op, @@ -563,6 +565,8 @@ glsl_to_tgsi_visitor::emit(ir_instruction *ir, unsigned op, this-instructions.push_tail(inst); + try_emit_integer_set(ir, op, dst); + return inst; } @@ -589,6 +593,32 @@ glsl_to_tgsi_visitor::emit(ir_instruction *ir, unsigned op) } /** + * Emits the code to convert the result of integer SET instructions to floats. + */ +void +glsl_to_tgsi_visitor::try_emit_integer_set(ir_instruction *ir, unsigned op, +st_dst_reg dst) +{ + st_src_reg src; + + if (!(op == TGSI_OPCODE_USEQ || + op == TGSI_OPCODE_USNE || + op == TGSI_OPCODE_ISGE || + op == TGSI_OPCODE_USGE || + op == TGSI_OPCODE_ISLT || + op == TGSI_OPCODE_USLT)) + return; + + src = st_src_reg(dst); + src.type = GLSL_TYPE_INT; + + dst.type = GLSL_TYPE_FLOAT; + emit(ir, TGSI_OPCODE_I2F, dst, src); + src.type = GLSL_TYPE_FLOAT; + emit(ir, TGSI_OPCODE_SNE, dst, src, st_src_reg_for_float(0.0)); +} + +/** * Determines whether to use an integer, unsigned integer, or float opcode * based on the operands and input opcode, then emits the result. * @@ -1662,7 +1692,6 @@ glsl_to_tgsi_visitor::visit(ir_expression *ir) emit_scalar(ir, TGSI_OPCODE_RSQ, result_dst, op[0]); break; case ir_unop_i2f: - case ir_unop_b2f: if (native_integers) { emit(ir, TGSI_OPCODE_I2F, result_dst, op[0]); break; @@ -1670,10 +1699,14 @@ glsl_to_tgsi_visitor::visit(ir_expression *ir) case ir_unop_i2u: case ir_unop_u2i: /* Converting between signed and unsigned integers is a no-op. */ - case ir_unop_b2i: - /* Booleans are stored as integers (or floats in GLSL 1.20 and lower). */ + case ir_unop_b2f: + /* Booleans are stored as floats. */ result_src = op[0]; break; + case ir_unop_b2i: + if (native_integers) + emit(ir, TGSI_OPCODE_F2I, result_dst, op[0]); + break; case ir_unop_f2i: if (native_integers) emit(ir, TGSI_OPCODE_F2I, result_dst, op[0]); @@ -1681,6 +1714,11 @@ glsl_to_tgsi_visitor::visit(ir_expression *ir) emit(ir, TGSI_OPCODE_TRUNC, result_dst, op[0]); break; case ir_unop_f2b: + if (native_integers) { + emit(ir, TGSI_OPCODE_SNE, result_dst, op[0], st_src_reg_for_float(0.0)); + emit(ir, TGSI_OPCODE_F2I, result_dst, result_src); + } + break; case ir_unop_i2b: emit(ir, TGSI_OPCODE_SNE, result_dst, op[0], st_src_reg_for_type(result_dst.type, 0)); @@ -2154,6 +2192,7 @@ glsl_to_tgsi_visitor::visit(ir_assignment *ir) inst = (glsl_to_tgsi_instruction *)this-instructions.get_tail(); new_inst = emit(ir, inst-op, l, inst-src[0], inst-src[1], inst-src[2]); new_inst-saturate = inst-saturate; + inst-dead_mask = inst-dst.writemask; } else { for (i = 0; i type_size(ir-lhs-type); i++) { emit(ir, TGSI_OPCODE_MOV, l, r); -- 1.7.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/3] st_glsl_to_tgsi: implement TXS/TXQ. (v2)
The usual commit prefix for the GLSL-TGSI translator is glsl_to_tgsi. Other than that, Reviewed-by: Bryan Cain bryanca...@gmail.com On 08/25/2011 09:46 AM, Dave Airlie wrote: From: Dave Airlie airl...@redhat.com GLSL uses TXS, call the gallium TXQ opcode. v2: fix indent from 4-3. Signed-off-by: Dave Airlie airl...@redhat.com --- src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 27 ++- 1 files changed, 18 insertions(+), 9 deletions(-) diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp index 6f0d9fa..fb5060c 100644 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp @@ -2426,16 +2426,18 @@ glsl_to_tgsi_visitor::visit(ir_texture *ir) glsl_to_tgsi_instruction *inst = NULL; unsigned opcode = TGSI_OPCODE_NOP; - ir-coordinate-accept(this); + if (ir-coordinate) { + ir-coordinate-accept(this); - /* Put our coords in a temp. We'll need to modify them for shadow, -* projection, or LOD, so the only case we'd use it as is is if -* we're doing plain old texturing. The optimization passes on -* glsl_to_tgsi_visitor should handle cleaning up our mess in that case. -*/ - coord = get_temp(glsl_type::vec4_type); - coord_dst = st_dst_reg(coord); - emit(ir, TGSI_OPCODE_MOV, coord_dst, this-result); + /* Put our coords in a temp. We'll need to modify them for shadow, + * projection, or LOD, so the only case we'd use it as is is if + * we're doing plain old texturing. The optimization passes on + * glsl_to_tgsi_visitor should handle cleaning up our mess in that case. + */ + coord = get_temp(glsl_type::vec4_type); + coord_dst = st_dst_reg(coord); + emit(ir, TGSI_OPCODE_MOV, coord_dst, this-result); + } if (ir-projector) { ir-projector-accept(this); @@ -2470,6 +2472,10 @@ glsl_to_tgsi_visitor::visit(ir_texture *ir) dy = this-result; break; case ir_txs: + opcode = TGSI_OPCODE_TXQ; + ir-lod_info.lod-accept(this); + lod_info = this-result; + break; case ir_txf: /* TODO: use TGSI_OPCODE_TXF here */ assert(!GLSL 1.30 features unsupported); break; @@ -2544,6 +2550,8 @@ glsl_to_tgsi_visitor::visit(ir_texture *ir) if (opcode == TGSI_OPCODE_TXD) inst = emit(ir, opcode, result_dst, coord, dx, dy); + else if (opcode == TGSI_OPCODE_TXQ) + inst = emit(ir, opcode, result_dst, lod_info); else inst = emit(ir, opcode, result_dst, coord); @@ -4276,6 +4284,7 @@ compile_tgsi_instruction(struct st_translate *t, case TGSI_OPCODE_TXD: case TGSI_OPCODE_TXL: case TGSI_OPCODE_TXP: + case TGSI_OPCODE_TXQ: src[num_src++] = t-samplers[inst-sampler]; ureg_tex_insn(ureg, inst-op, ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/3] glsl-tgsi: add TXF support.
Like the other patch, the commit prefix should probably be glsl_to_tgsi. On 08/25/2011 09:51 AM, Dave Airlie wrote: From: Dave Airlie airl...@redhat.com This adds texelFetch support to translate from GLSL to TGSI TXF opcode. I've tested this works with an r600g and softpipe backend. Signed-off-by: Dave Airlie airl...@redhat.com --- src/mesa/state_tracker/st_glsl_to_tgsi.cpp |8 ++-- 1 files changed, 6 insertions(+), 2 deletions(-) diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp index fb5060c..8f9b843 100644 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp @@ -2477,7 +2477,9 @@ glsl_to_tgsi_visitor::visit(ir_texture *ir) lod_info = this-result; break; case ir_txf: /* TODO: use TGSI_OPCODE_TXF here */ You can remove the comment on the above line now that TXF is actually being used there. - assert(!GLSL 1.30 features unsupported); + opcode = TGSI_OPCODE_TXF; + ir-lod_info.lod-accept(this); + lod_info = this-result; break; } @@ -2541,7 +2543,8 @@ glsl_to_tgsi_visitor::visit(ir_texture *ir) coord_dst.writemask = WRITEMASK_XYZW; } - if (opcode == TGSI_OPCODE_TXL || opcode == TGSI_OPCODE_TXB) { + if (opcode == TGSI_OPCODE_TXL || opcode == TGSI_OPCODE_TXB || + opcode == TGSI_OPCODE_TXF) { /* TGSI stores LOD or LOD bias in the last channel of the coords. */ coord_dst.writemask = WRITEMASK_W; emit(ir, TGSI_OPCODE_MOV, coord_dst, lod_info); @@ -4285,6 +4288,7 @@ compile_tgsi_instruction(struct st_translate *t, case TGSI_OPCODE_TXL: case TGSI_OPCODE_TXP: case TGSI_OPCODE_TXQ: + case TGSI_OPCODE_TXF: src[num_src++] = t-samplers[inst-sampler]; ureg_tex_insn(ureg, inst-op, Aside from those things, Reviewed-by: Bryan Cain bryanca...@gmail.com ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] is_tex bit for TGSI sampling instructions
On 08/24/2011 07:26 AM, Michal Krol wrote: - Original Message - Any reasons { 1, 2, 0, 0, 0, 0, LOAD,TGSI_OPCODE_LOAD }, { 1, 2, 0, 0, 0, 0, LOAD_MS, TGSI_OPCODE_LOAD_MS }, { 1, 3, 0, 0, 0, 0, SAMPLE, TGSI_OPCODE_SAMPLE }, { 1, 4, 0, 0, 0, 0, SAMPLE_B,TGSI_OPCODE_SAMPLE_B }, { 1, 4, 0, 0, 0, 0, SAMPLE_C,TGSI_OPCODE_SAMPLE_C }, { 1, 4, 0, 0, 0, 0, SAMPLE_C_LZ, TGSI_OPCODE_SAMPLE_C_LZ }, { 1, 5, 0, 0, 0, 0, SAMPLE_D,TGSI_OPCODE_SAMPLE_D }, { 1, 3, 0, 0, 0, 0, SAMPLE_L,TGSI_OPCODE_SAMPLE_L }, { 1, 3, 0, 0, 0, 0, GATHER4, TGSI_OPCODE_GATHER4 }, { 1, 2, 0, 0, 0, 0, RESINFO, TGSI_OPCODE_RESINFO }, { 1, 2, 0, 0, 0, 0, SAMPLE_POS, TGSI_OPCODE_SAMPLE_POS }, { 1, 2, 0, 0, 0, 0, SAMPLE_INFO, TGSI_OPCODE_SAMPLE_INFO }, these don't have the is_tex bit set? This bit marks legacy texture sample instructions that have texture target encoded directly into the instruction token. In contrast, the instructions above take an extra argument of type RESOURCE, for which the texture target is encoded in register declaration. Like Dave said, the GLSL-TGSI translator needs to account for this. I'm not sure on how to initialize a resource in TGSI, though. Is there a state tracker that uses it that could provide an example? I also have an unrelated TGSI question: has it been determined exactly what the source regs for TXF should be and what order they should be in? Bryan ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] is_tex bit for TGSI sampling instructions
On 08/24/2011 11:42 PM, Zack Rusin wrote: On Wednesday, August 24, 2011 10:14:48 PM Bryan Cain wrote: Like Dave said, the GLSL-TGSI translator needs to account for this. Probably not, at least yet. All of those instructions are DX10.1 level instructions which support splitting of samplers and the underlaying resource, we don't support that yet. I'm not sure on how to initialize a resource in TGSI, though. Is there a state tracker that uses it that could provide an example? I don't think any of the open drivers actually implement this scheme quite yet. The resources are declared using ureg_DECL_resource, the resulting code would look something like: DCL RES[0], 2D, FLOAT LOAD DST[0], SRC[0], RES[0] Okay, thanks for your answer. I also have an unrelated TGSI question: has it been determined exactly what the source regs for TXF should be and what order they should be in? Two four component integer vectors, first is the coordinate the other is the offset. I think gpu_program4 specifies that instruction. Ultimately I wanted to replace all the texture sampling/fetching/loading instructions with the DX10.1 variants. There's less of them and they're cleaner. Drivers which don't support that split could just ignore it. For various reasons I never finished it so this texture sampling code is quite frankly half-assed, I'd just use the documented/driver implemented opcodes and leave others untouched. z The problem is that we need these for GLSL 1.30. Since all the pieces are falling into place on the Mesa side, we need to have support for these operations soon on the Gallium side. That's why I want someone to actually implement these operations in at least one driver. Especially TXF; how hard would that really be for someone who understands the code to implement in tgsi_exec? Bryan ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/5] mesa: Make the gl_constant_value's bool occupy the same space as float/int.
On 08/20/2011 03:05 PM, Dan McCabe wrote: What are the implications for other architectures that support doubles? I don't see what you mean. gl_constant_value doesn't support doubles yet. Bryan On 08/19/2011 05:56 PM, Eric Anholt wrote: At least for Intel, all our uniform components are of uint32_t size, either float or signed or unsigned int. For uploading uniform data in the driver, it's much easier to upload a full dword per uniform element instead of trying to pick out the bool byte and then fill in the top 3 bytes of pad with 0. --- src/mesa/program/prog_parameter.h |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/src/mesa/program/prog_parameter.h b/src/mesa/program/prog_parameter.h index 1a5ed34..4c2773a 100644 --- a/src/mesa/program/prog_parameter.h +++ b/src/mesa/program/prog_parameter.h @@ -53,7 +53,7 @@ typedef union gl_constant_value { GLfloat f; - GLboolean b; + GLint b; GLint i; GLuint u; } gl_constant_value; ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/5] mesa: Make the gl_constant_value's bool occupy the same space as float/int.
On 08/20/2011 05:10 PM, Dan McCabe wrote: On 08/20/2011 01:30 PM, Bryan Cain wrote: On 08/20/2011 03:05 PM, Dan McCabe wrote: What are the implications for other architectures that support doubles? I don't see what you mean. gl_constant_value doesn't support doubles yet. Yet - that is the operative word. You can buy GPUs that support doubles today. Therefore, double support should be on our radar (it appears to be on Eric's radar, based on his commit comment). And we should understand what the implications are for our code. Clearly, I don't understand the implications; if I did, I wouldn't have asked. But perhaps Eric might. There are a couple of possible answers. I don't know - OK, but at least let's ask the question and start thinking about it's answer. No impact - I like that answer. It affects this, and this, and this - While not ideal, at least we then know what to do in the future and prepare ourselves for that future. cheers, danm This particular commit has no impact at all. A core assumption of gl_constant_value is that 4 of each type in the union will fit into a vec4, so the gl_constant_value code will need to be reworked when double support is added. But this commit doesn't afffect that in any way. Bryan Bryan On 08/19/2011 05:56 PM, Eric Anholt wrote: At least for Intel, all our uniform components are of uint32_t size, either float or signed or unsigned int. For uploading uniform data in the driver, it's much easier to upload a full dword per uniform element instead of trying to pick out the bool byte and then fill in the top 3 bytes of pad with 0. --- src/mesa/program/prog_parameter.h |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/src/mesa/program/prog_parameter.h b/src/mesa/program/prog_parameter.h index 1a5ed34..4c2773a 100644 --- a/src/mesa/program/prog_parameter.h +++ b/src/mesa/program/prog_parameter.h @@ -53,7 +53,7 @@ typedef union gl_constant_value { GLfloat f; - GLboolean b; + GLint b; GLint i; GLuint u; } gl_constant_value; ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] st/mesa: fix incorrect loop over instruction src regs
The usual commit message prefix for changes to glsl_to_tgsi is glsl_to_tgsi, not st/mesa. On 08/16/2011 05:33 PM, Brian Paul wrote: The array of src regs is of size 3, not 4. --- src/mesa/state_tracker/st_glsl_to_tgsi.cpp |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp index aef23e7..7b90c81 100644 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp @@ -3443,7 +3443,7 @@ glsl_to_tgsi_visitor::eliminate_dead_code_advanced(void) /* Continuing the block, clear any channels from the write array that * are read by this instruction. */ - for (int i = 0; i 4; i++) { + for (unsigned i = 0; i Elements(inst-src); i++) { Why not just use 3 here? if (inst-src[i].file == PROGRAM_TEMPORARY inst-src[i].reladdr){ /* Any temporary might be read, so no dead code elimination * across this instruction. You're right that the change itself is needed, though. Bryan ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] st/mesa: fix incorrect loop over instruction src regs
On 08/17/2011 09:47 AM, Keith Whitwell wrote: On Wed, 2011-08-17 at 09:36 -0500, Bryan Cain wrote: The usual commit message prefix for changes to glsl_to_tgsi is glsl_to_tgsi, not st/mesa. On 08/16/2011 05:33 PM, Brian Paul wrote: The array of src regs is of size 3, not 4. --- src/mesa/state_tracker/st_glsl_to_tgsi.cpp |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp index aef23e7..7b90c81 100644 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp @@ -3443,7 +3443,7 @@ glsl_to_tgsi_visitor::eliminate_dead_code_advanced(void) /* Continuing the block, clear any channels from the write array that * are read by this instruction. */ - for (int i = 0; i 4; i++) { + for (unsigned i = 0; i Elements(inst-src); i++) { Why not just use 3 here? Elements(inst-src) is self-documenting. 3 is just a number and to figure out if it was the correct number you'd have to go and look at the header file to see if it matched the value there. Both should generate the same compiled code. Keith Fair enough. The patch was pushed to master shortly before I sent that mail, so there's not much point in discussing it further. Bryan ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] st/mesa: inline st_prepare_fragment_program in st_translate_fragment_program
This reverts an unnecessary part of commit 4683529048ee and fixes misrendering and an assertion failure in Cogs. Fixes freedesktop.org bug 39888. --- src/mesa/state_tracker/st_program.c | 326 +-- src/mesa/state_tracker/st_program.h | 15 -- 2 files changed, 162 insertions(+), 179 deletions(-) diff --git a/src/mesa/state_tracker/st_program.c b/src/mesa/state_tracker/st_program.c index ca01d2e..a4f47ed 100644 --- a/src/mesa/state_tracker/st_program.c +++ b/src/mesa/state_tracker/st_program.c @@ -416,151 +416,6 @@ st_get_vp_variant(struct st_context *st, return vpv; } -/** - * Translate Mesa fragment shader attributes to TGSI attributes. - * \return GL_TRUE if color output should be written to all render targets, - * GL_FALSE if not - */ -GLboolean -st_prepare_fragment_program(struct gl_context *ctx, -struct st_fragment_program *stfp) -{ - GLuint attr; - const GLbitfield inputsRead = stfp-Base.Base.InputsRead; - GLboolean write_all = GL_FALSE; - - /* -* Convert Mesa program inputs to TGSI input register semantics. -*/ - for (attr = 0; attr FRAG_ATTRIB_MAX; attr++) { - if (inputsRead (1 attr)) { - const GLuint slot = stfp-num_inputs++; - - stfp-input_to_index[attr] = slot; - - switch (attr) { - case FRAG_ATTRIB_WPOS: -stfp-input_semantic_name[slot] = TGSI_SEMANTIC_POSITION; -stfp-input_semantic_index[slot] = 0; -stfp-interp_mode[slot] = TGSI_INTERPOLATE_LINEAR; -break; - case FRAG_ATTRIB_COL0: -stfp-input_semantic_name[slot] = TGSI_SEMANTIC_COLOR; -stfp-input_semantic_index[slot] = 0; -stfp-interp_mode[slot] = TGSI_INTERPOLATE_LINEAR; -break; - case FRAG_ATTRIB_COL1: -stfp-input_semantic_name[slot] = TGSI_SEMANTIC_COLOR; -stfp-input_semantic_index[slot] = 1; -stfp-interp_mode[slot] = TGSI_INTERPOLATE_LINEAR; -break; - case FRAG_ATTRIB_FOGC: -stfp-input_semantic_name[slot] = TGSI_SEMANTIC_FOG; -stfp-input_semantic_index[slot] = 0; -stfp-interp_mode[slot] = TGSI_INTERPOLATE_PERSPECTIVE; -break; - case FRAG_ATTRIB_FACE: -stfp-input_semantic_name[slot] = TGSI_SEMANTIC_FACE; -stfp-input_semantic_index[slot] = 0; -stfp-interp_mode[slot] = TGSI_INTERPOLATE_CONSTANT; -break; -/* In most cases, there is nothing special about these - * inputs, so adopt a convention to use the generic - * semantic name and the mesa FRAG_ATTRIB_ number as the - * index. - * - * All that is required is that the vertex shader labels - * its own outputs similarly, and that the vertex shader - * generates at least every output required by the - * fragment shader plus fixed-function hardware (such as - * BFC). - * - * There is no requirement that semantic indexes start at - * zero or be restricted to a particular range -- nobody - * should be building tables based on semantic index. - */ - case FRAG_ATTRIB_PNTC: - case FRAG_ATTRIB_TEX0: - case FRAG_ATTRIB_TEX1: - case FRAG_ATTRIB_TEX2: - case FRAG_ATTRIB_TEX3: - case FRAG_ATTRIB_TEX4: - case FRAG_ATTRIB_TEX5: - case FRAG_ATTRIB_TEX6: - case FRAG_ATTRIB_TEX7: - case FRAG_ATTRIB_VAR0: - default: -/* Actually, let's try and zero-base this just for - * readability of the generated TGSI. - */ -assert(attr = FRAG_ATTRIB_TEX0); -stfp-input_semantic_index[slot] = (attr - FRAG_ATTRIB_TEX0); -stfp-input_semantic_name[slot] = TGSI_SEMANTIC_GENERIC; -if (attr == FRAG_ATTRIB_PNTC) - stfp-interp_mode[slot] = TGSI_INTERPOLATE_LINEAR; -else - stfp-interp_mode[slot] = TGSI_INTERPOLATE_PERSPECTIVE; -break; - } - } - else { - stfp-input_to_index[attr] = -1; - } - } - - /* -* Semantics and mapping for outputs -*/ - { - uint numColors = 0; - GLbitfield64 outputsWritten = stfp-Base.Base.OutputsWritten; - - /* if z is written, emit that first */ - if (outputsWritten BITFIELD64_BIT(FRAG_RESULT_DEPTH)) { - stfp-output_semantic_name[stfp-num_outputs] = TGSI_SEMANTIC_POSITION; - stfp-output_semantic_index[stfp-num_outputs] = 0; - stfp-result_to_output[FRAG_RESULT_DEPTH] = stfp-num_outputs; - stfp-num_outputs++; - outputsWritten = ~(1 FRAG_RESULT_DEPTH); - } - - if (outputsWritten BITFIELD64_BIT(FRAG_RESULT_STENCIL)) { - stfp-output_semantic_name[stfp-num_outputs] = TGSI_SEMANTIC_STENCIL; -
Re: [Mesa-dev] Status of the GLSL-TGSI translator, part 2
On 08/04/2011 08:24 AM, Brian Paul wrote: On Tue, Aug 2, 2011 at 3:50 PM, Bryan Cain bryanca...@gmail.com wrote: On 08/02/2011 11:27 AM, Brian Paul wrote: On 08/01/2011 12:38 PM, Bryan Cain wrote: Since Mesa 7.11 is released now, I figure it's time to discuss merging the glsl-to-tgsi branch to master again. The translator is more mature than last time. There are no regressions that I know of on any driver. The code generation has improved so that it's the same as or better than ir_to_mesa in almost every case. It also will still emit TGSI integer opcodes if you change ctx-GLSLVersion from 120 to 130 in st_extensions.c. Driver developers can use this to implement these opcodes in their drivers, since they will be needed for GLSL 1.30 eventually. Are there any objections to merging it to master now? If there aren't, I'll revise some of the commit messages for correctness and wrap long lines since cgit doesn't do that automatically, then merge it after getting approval. Sounds OK to me. I was just skimming over the new stuff - it looks like create_color_map_texture() is duplicated in two files and could be shared. -Brian Yes, where should the function go and what header should it be declared in? It appears to me that none of the functions in st_atom_*.c are declared directly in header files. How about st_texture.[ch]? That's where st_texture_create() lives. Okay, done. Is there anything else that should be done, or is it ready to merge to master? Bryan ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] S2TC - yet another attempt to solve the S3TC issue
On 08/03/2011 01:58 PM, Ian Romanick wrote: On 08/03/2011 08:04 AM, Brian Paul wrote: On Mon, Aug 1, 2011 at 11:44 AM, Rudolf Polzer divver...@xonotic.org wrote: Hi, I developed, together with Maik Merten, a replacement for S3TC with the following properties: - does not use any interesting algorithms (no color ramps, each 4x4 block is just a 2 colors palette - basically, this is Color Cell Compression from August 1986) - is perfectly decodable by any S3TC decoder - can somewhat (at a noticeable quality loss) attempt to decode S3TC - due to the simpler nature of the algorithm, it outperforms the S3TC libtxc_dxtn - due to the compression being simpler, it was easier to write a good compression algorithm for it, and it sometimes yields better quality than NVidia Texture Tools encoding to full S3TC It is called S2TC because each block only has 2 colors, and expands to Super Simple Texture Compression. It comes in form of a libtxc_dxtn replacement library, so it can be used with Mesa right now. You can find more info on https://github.com/divVerent/s2tc/wiki including quality comparison pictures that compare it to full S3TC. When used in conjunction with an OpenGL driver, this would mean: - when a precompressed texture is uploaded, nothing changes - it is uploaded as full S3TC - when an uncompressed texture is uploaded as a compressed format, it is converted using the S2TC encoder, which typically yields less quality than a S3TC encoder, but still renders correctly and usually good enough (see screenshots on my wiki) - in case a software fallback hits (or llvmpipe is used), S2TC is used for decoding - due to some bit values only being defined in the full S3TC format spec, this will be somewhat wrong and yields reduced quality when precompressed S3TC textures are used - I believe this suffices to claim support for GL_EXT_texture_compression_s3tc without having an actual S3TC compressor, as compressing to a sub-format of S3TC is the same as just having a less clever S3TC compressor. Decompressing being incorrect (but still good enough for most uses) in case of software fallbacks however may be a problem, but then one could still instead upload it to the GPU, let it decode it, and download the decoded texture instead The question now is: - does Mesa have any interest in integrating this method (doesn't have to be my reference implementation, which implements quite many selectable variations of the encoding methods), and still using full S3TC if a libtxc_dxtn is found? - or would Mesa be interested in linking to this implementation as an alternative to full libtxc_dxtn especially for the USA, as it's likely that less licenses are required to use this method? One possible implementation BTW could be Linux distros shipping S2TC libtxc_dxtn by default, and providing a simple way for users to upgrade to a full S3TC library, if they are aware of the issues with it and see themselves not affected by them? Hi, I saw the story about this on Phronix a couple weeks ago. I like it. As far as I'm concerned, I think it would be OK to integrate this into Mesa to use as a fallback when libtxc_dxtn isn't available. Does anyone else have an opinion? I think this solves the issue for the compressor and for the software decompressor. I don't think this solves the problem for the decompressor for hardware drivers, so that may still pose a problem for Linux distros. Pardon my ignorance, but why do hardware drivers need a decompressor? Bryan Of course, all of this may actually be irrelevant soon: http://www.litigatingapple.com/blog/2011/7/24/why-htcs-courtship-of-s3-might-be-too-clever-by-half.html I think the ideal solution would be to have something in core Mesa that could eventually evolve into full S3TC support. When this happens, I don't think libtxc_dxtn is the best answer. I think libsquish (http://code.google.com/p/libsquish/) is probably a better answer. This library is used in other projects, and it has quite a bit more optimized implementation. What we really want is to just do the compression on the GPU, but that's a bit larger project. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] Status of the GLSL-TGSI translator, part 2
Since Mesa 7.11 is released now, I figure it's time to discuss merging the glsl-to-tgsi branch to master again. The translator is more mature than last time. There are no regressions that I know of on any driver. The code generation has improved so that it's the same as or better than ir_to_mesa in almost every case. It also will still emit TGSI integer opcodes if you change ctx-GLSLVersion from 120 to 130 in st_extensions.c. Driver developers can use this to implement these opcodes in their drivers, since they will be needed for GLSL 1.30 eventually. Are there any objections to merging it to master now? If there aren't, I'll revise some of the commit messages for correctness and wrap long lines since cgit doesn't do that automatically, then merge it after getting approval. Bryan ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 0/2] Make Gallium drivers respect the force_s3tc_enable environment variable
On Mon, Jul 25, 2011 at 1:50 PM, Jose Fonseca jfons...@vmware.com wrote: I have no objection FWIW. Jose - Original Message - This is a revised version of my previous patches. Patch 1 incorporates Brian's feedback, and patch 2 is unchanged from before. I did test patch 2 without patch 1, and found that both patches are neeeded for force_s3tc_enable to work. Both of these are candidates for 7.10 and 7.11, since games such as 0 A.D. expect drivers to support this. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev Can I go ahead and push these patches this afternoon if there are no objections? Bryan ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Mesa 7.11 release candidate 3
On Mon, Jul 25, 2011 at 9:54 PM, Ian Romanick i...@freedesktop.org wrote: The only changes in the 7.11 branch (until the final release) should be low-risk fixes for significant bugs. Other than updates to the release notes and changing the version, RC3 should represent 7.11 final. Would my patches for making Gallium drivers respect force_s3tc_enable qualify? They are very low-risk, and fix https://bugs.freedesktop.org/show_bug.cgi?id=29012. I don't know what qualifies as a significant bug, but the fix makes at least one game (fs2_open) work out of the box with Gallium drivers that didn't before. Bryan ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 0/2] Make Gallium drivers respect the force_s3tc_enable environment variable
This is a revised version of my previous patches. Patch 1 incorporates Brian's feedback, and patch 2 is unchanged from before. I did test patch 2 without patch 1, and found that both patches are neeeded for force_s3tc_enable to work. Both of these are candidates for 7.10 and 7.11, since games such as 0 A.D. expect drivers to support this. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/2] st/mesa: respect force_s3tc_enable environment variable
NOTE: This is a candidate for the 7.10 and 7.11 branches. --- src/mesa/state_tracker/st_extensions.c | 11 ++- 1 files changed, 10 insertions(+), 1 deletions(-) diff --git a/src/mesa/state_tracker/st_extensions.c b/src/mesa/state_tracker/st_extensions.c index 99b231d..b5f6d35 100644 --- a/src/mesa/state_tracker/st_extensions.c +++ b/src/mesa/state_tracker/st_extensions.c @@ -208,6 +208,15 @@ void st_init_limits(struct st_context *st) } +static GLboolean st_get_s3tc_override(void) +{ + const char *override = _mesa_getenv(force_s3tc_enable); + if (override !strcmp(override, true)) + return GL_TRUE; + return GL_FALSE; +} + + /** * Use pipe_screen::get_param() to query PIPE_CAP_ values to determine * which GL extensions are supported. @@ -426,7 +435,7 @@ void st_init_extensions(struct st_context *st) if (screen-is_format_supported(screen, PIPE_FORMAT_DXT5_RGBA, PIPE_TEXTURE_2D, 0, PIPE_BIND_SAMPLER_VIEW) - ctx-Mesa_DXTn) { + (ctx-Mesa_DXTn || st_get_s3tc_override())) { ctx-Extensions.EXT_texture_compression_s3tc = GL_TRUE; ctx-Extensions.S3_s3tc = GL_TRUE; } -- 1.7.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] util: enable S3TC support when the force_s3tc_enable env var is set to true
NOTE: This is a candidate for the 7.10 and 7.11 branches. --- src/gallium/auxiliary/util/u_format_s3tc.c | 11 +-- 1 files changed, 9 insertions(+), 2 deletions(-) diff --git a/src/gallium/auxiliary/util/u_format_s3tc.c b/src/gallium/auxiliary/util/u_format_s3tc.c index bb989c2..d8a7c0d 100644 --- a/src/gallium/auxiliary/util/u_format_s3tc.c +++ b/src/gallium/auxiliary/util/u_format_s3tc.c @@ -119,8 +119,15 @@ util_format_s3tc_init(void) library = util_dl_open(DXTN_LIBNAME); if (!library) { - debug_printf(couldn't open DXTN_LIBNAME , software DXTn - compression/decompression unavailable\n); + if (getenv(force_s3tc_enable) + !strcmp(getenv(force_s3tc_enable), true)) { + debug_printf(couldn't open DXTN_LIBNAME , enabling DXTn due to +force_s3tc_enable=true environment variable\n); + util_format_s3tc_enabled = TRUE; + } else { + debug_printf(couldn't open DXTN_LIBNAME , software DXTn +compression/decompression unavailable\n); + } return; } -- 1.7.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] st/mesa: respect force_s3tc_enable environment variable
On 07/21/2011 08:48 AM, Brian Paul wrote: On 07/20/2011 04:53 PM, Bryan Cain wrote: --- src/mesa/state_tracker/st_extensions.c | 12 +++- 1 files changed, 11 insertions(+), 1 deletions(-) diff --git a/src/mesa/state_tracker/st_extensions.c b/src/mesa/state_tracker/st_extensions.c index 99b231d..073e72c 100644 --- a/src/mesa/state_tracker/st_extensions.c +++ b/src/mesa/state_tracker/st_extensions.c @@ -208,6 +208,16 @@ void st_init_limits(struct st_context *st) } +static int st_get_s3tc_override(void) +{ + const char *override = _mesa_getenv(force_s3tc_enable); + fprintf(stderr, force_s3tc_enable=%s\n, override); This fprintf() looks like left-over debug code. Oops, you're right. + if (override !strcmp(override, true)) + return GL_TRUE; + return GL_FALSE; If you're going to return GL_TRUE/GL_FALSE, the function type should be GLboolean. Okay, I'll resubmit the patch with those changes. Bryan ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] Patches to make Gallium drivers respect the force_s3tc_enable environment variable
The purpose of the following two patches is to make st/mesa expose the S3TC extensions when the force_s3tc_enable environment variable is used. This is to match the behavior of the DRI drivers, in which force_s3tc_enable is an option in driconf. Although st/mesa can't use the driconf functions, it can at least respect the environment variable for this option. These patches have been tested with fs2_open 3.6.12 and the Freespace 2 MediaVPs. Setting force_s3tc_enable=true makes the game run flawlessly without libtxc_dxtn, whereas without it the game writes several error messages to its log file and then crashes. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/2] st/mesa: respect force_s3tc_enable environment variable
--- src/mesa/state_tracker/st_extensions.c | 12 +++- 1 files changed, 11 insertions(+), 1 deletions(-) diff --git a/src/mesa/state_tracker/st_extensions.c b/src/mesa/state_tracker/st_extensions.c index 99b231d..073e72c 100644 --- a/src/mesa/state_tracker/st_extensions.c +++ b/src/mesa/state_tracker/st_extensions.c @@ -208,6 +208,16 @@ void st_init_limits(struct st_context *st) } +static int st_get_s3tc_override(void) +{ + const char *override = _mesa_getenv(force_s3tc_enable); + fprintf(stderr, force_s3tc_enable=%s\n, override); + if (override !strcmp(override, true)) + return GL_TRUE; + return GL_FALSE; +} + + /** * Use pipe_screen::get_param() to query PIPE_CAP_ values to determine * which GL extensions are supported. @@ -426,7 +436,7 @@ void st_init_extensions(struct st_context *st) if (screen-is_format_supported(screen, PIPE_FORMAT_DXT5_RGBA, PIPE_TEXTURE_2D, 0, PIPE_BIND_SAMPLER_VIEW) - ctx-Mesa_DXTn) { + (ctx-Mesa_DXTn || st_get_s3tc_override())) { ctx-Extensions.EXT_texture_compression_s3tc = GL_TRUE; ctx-Extensions.S3_s3tc = GL_TRUE; } -- 1.7.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] util: enable S3TC support when the force_s3tc_enable env var is set to true
--- src/gallium/auxiliary/util/u_format_s3tc.c | 11 +-- 1 files changed, 9 insertions(+), 2 deletions(-) diff --git a/src/gallium/auxiliary/util/u_format_s3tc.c b/src/gallium/auxiliary/util/u_format_s3tc.c index bb989c2..d8a7c0d 100644 --- a/src/gallium/auxiliary/util/u_format_s3tc.c +++ b/src/gallium/auxiliary/util/u_format_s3tc.c @@ -119,8 +119,15 @@ util_format_s3tc_init(void) library = util_dl_open(DXTN_LIBNAME); if (!library) { - debug_printf(couldn't open DXTN_LIBNAME , software DXTn - compression/decompression unavailable\n); + if (getenv(force_s3tc_enable) + !strcmp(getenv(force_s3tc_enable), true)) { + debug_printf(couldn't open DXTN_LIBNAME , enabling DXTn due to +force_s3tc_enable=true environment variable\n); + util_format_s3tc_enabled = TRUE; + } else { + debug_printf(couldn't open DXTN_LIBNAME , software DXTn +compression/decompression unavailable\n); + } return; } -- 1.7.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Common Subexpression Elimination
On 07/16/2011 10:28 AM, Vincent wrote: Hi, I wrote an optimisation pass that perform CSE (that is, it spots expressions that appears twice or more, and factor them in a temporary to avoid recalculation). This is my first patch to Mesa, I would like to receive feedback : case where the algorithm does not work, crashes, ... I tried to follow coding style, using exec_list*, ralloced objects... The algorithm currently only spot binary expressions (that is x + y, x * y, ...) but unary expression (exp(x), ln(x)...) should follow soon. Regards, Vincent. Hi Vincent, For future reference, the diff you sent is currently reversed - the lines you added are marked as - and the lines you removed are marked as +. It's best if you send patches using git-send-email to avoid this. Bryan ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Merging glsl-to-tgsi to master
On Tue, Jul 12, 2011 at 12:00 PM, Ian Romanick i...@freedesktop.org wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 07/11/2011 09:53 AM, Egon Ashrafinia wrote: Hello guys. 1 month ago, we talked about merging glsl-to-tgsi to master but it still not happend. What about now? I could compile and test it a bit. It works. Anyone who could do it? What does Bryan Cain say about it? One thing to consider is whether or not this will make it more difficult to cherry pick fixes to the 7.11 release branch. Bryan has been doing some really cool work, and I'd like to see it get merged. However, I also want 7.11 to ship on time and with as few bugs as possible. Anything that will make that more difficult should wait until August. I don't think merging the glsl-to-tgsi branch would make it much more difficult to cherry pick fixes to 7.11. The main change in the branch is the addition of a new file to st/mesa (st_glsl_to_tgsi.cpp) containing the GLSL-TGSI translator. That in and of itself doesn't make it any harder. The only existing files with non-trivial changes that might affect cherry-picking are uniforms.c and prog_parameter.c in core Mesa. Several other files in st/mesa and core Mesa have minor changes to make them work with either the prog_parameter changes or to work without Mesa IR, but I don't think those changes are enough to make cherry-picking difficult. They're at least not worse than most changes in master in that regard. Bryan ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Merging glsl-to-tgsi to master
On 07/11/2011 11:53 AM, Egon Ashrafinia wrote: Hello guys. 1 month ago, we talked about merging glsl-to-tgsi to master but it still not happend. What about now? I could compile and test it a bit. It works. Anyone who could do it? What does Bryan Cain say about it? Hi Egon, Last month, I was trying to get it merged before the 7.11 merge window closed. When that didn't happen, I decided to stop being in a hurry and instead make some more improvements before trying again to get it merged to master. Also, Brian Paul pointed out an issue that I hadn't noticed before where st/mesa needed the current fragment shader to be in Mesa IR form for glBitmap, glDrawPixels, and glCopyPixels to work correctly. All of that is done now, as I finished the glBitmap path on Sunday and the DrawPixels/CopyPixels path on Saturday. So now I need to fix a few minor things like commit messages and one case where the old path generates better code than glsl_to_tgsi. The merge might also be delayed a bit further if it's decided that it should wait until after 7.11 is released, as Ian's reply suggested. Bryan ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev