from:"Bryan Cain"

Re: [Mesa-dev] RFC: mesa/st dynamic sampler support in tgsi

2014-08-05 Thread Bryan Cain

On Mon, Aug 4, 2014 at 11:54 PM, Ilia Mirkin imir...@alum.mit.edu wrote:

 Another apporach I've tried is to just use a TEMP register as the
 indirect offset here. Unfortunately this gets destroyed by
 st_glsl_to_tgsi's various optimizations which assume that temp
 registers can't be reladdr's and so messes up the values. I started
 adding support for that, but then quickly realized that was probably
 not the right thing to do.

 So... should I increase the number of address registers to 1? Or is
 there some other simple approach that I'm missing?

 Thanks,

   -ilia


The problem you are having with the optimizations in glsl_to_tgsi is
because, as you say, it doesn't check the reladdr fields when calculating
live intervals.  I'm currently in the middle of reworking the live interval
calculation - if you don't mind waiting a day or two for me to finish the
patch set and send it to the list, I can add that change to the reworked
function in the patch set.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] RFC: mesa/st dynamic sampler support in tgsi

2014-08-05 Thread Bryan Cain

On Tue, Aug 5, 2014 at 3:23 PM, Ilia Mirkin imir...@alum.mit.edu wrote:

 On Tue, Aug 5, 2014 at 4:14 PM, Bryan Cain bryanca...@gmail.com wrote:
  On Mon, Aug 4, 2014 at 11:54 PM, Ilia Mirkin imir...@alum.mit.edu
 wrote:
 
  Another apporach I've tried is to just use a TEMP register as the
  indirect offset here. Unfortunately this gets destroyed by
  st_glsl_to_tgsi's various optimizations which assume that temp
  registers can't be reladdr's and so messes up the values. I started
  adding support for that, but then quickly realized that was probably
  not the right thing to do.
 
  So... should I increase the number of address registers to 1? Or is
  there some other simple approach that I'm missing?
 
  Thanks,
 
-ilia
 
 
  The problem you are having with the optimizations in glsl_to_tgsi is
  because, as you say, it doesn't check the reladdr fields when calculating
  live intervals.  I'm currently in the middle of reworking the live
 interval
  calculation - if you don't mind waiting a day or two for me to finish the
  patch set and send it to the list, I can add that change to the reworked
  function in the patch set.

 I can certainly wait a little while. However from everything I can
 tell, all current reladdr's are in the ADDR register file, so perhaps
 I should just stick with that?


Either way is workable from glsl_to_tgsi's standpoint.  I assume the
gallium interface people have a better idea of what the preferred solution
is.

[Resending because I forgot to CC the list before.]
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] glsl_to_tgsi: remove unnecessary dead code elimination pass

2014-05-05 Thread Bryan Cain

With the more advanced dead code elimination pass already being run,
eliminate_dead_code was making no difference in instruction count, and had
an undesirable O(n^2) runtime. So remove it and rename
eliminate_dead_code_advanced to eliminate_dead_code.
---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp |   50 +++-
 1 file changed, 5 insertions(+), 45 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 6eb6c8a..b0e0782 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -460,8 +460,7 @@ public:
int get_last_temp_write(int index);
 
void copy_propagate(void);
-   void eliminate_dead_code(void);
-   int eliminate_dead_code_advanced(void);
+   int eliminate_dead_code(void);
void merge_registers(void);
void renumber_registers(void);
 
@@ -3663,7 +3662,8 @@ glsl_to_tgsi_visitor::copy_propagate(void)
 }
 
 /*
- * Tracks available PROGRAM_TEMPORARY registers for dead code elimination.
+ * On a basic block basis, tracks available PROGRAM_TEMPORARY registers for 
dead
+ * code elimination.
  *
  * The glsl_to_tgsi_visitor lazily produces code assuming that this pass
  * will occur.  As an example, a TXP production after copy propagation but 
@@ -3676,48 +3676,9 @@ glsl_to_tgsi_visitor::copy_propagate(void)
  * and after this pass:
  *
  * 0: TXP TEMP[2], INPUT[4].xyyw, texture[0], 2D;
- * 
- * FIXME: assumes that all functions are inlined (no support for BGNSUB/ENDSUB)
- * FIXME: doesn't eliminate all dead code inside of loops; it steps around them
- */
-void
-glsl_to_tgsi_visitor::eliminate_dead_code(void)
-{
-   int i;
-   
-   for (i=0; i  this-next_temp; i++) {
-  int last_read = get_last_temp_read(i);
-  int j = 0;
-  
-  foreach_list_safe(node, this-instructions) {
- glsl_to_tgsi_instruction *inst = (glsl_to_tgsi_instruction *) node;
-
- if (inst-dst.file == PROGRAM_TEMPORARY  inst-dst.index == i 
- j  last_read)
- {
-inst-remove();
-delete inst;
- }
- 
- j++;
-  }
-   }
-}
-
-/*
- * On a basic block basis, tracks available PROGRAM_TEMPORARY registers for 
dead
- * code elimination.  This is less primitive than eliminate_dead_code(), as it
- * is per-channel and can detect consecutive writes without a read between them
- * as dead code.  However, there is some dead code that can be eliminated by 
- * eliminate_dead_code() but not this function - for example, this function 
- * cannot eliminate an instruction writing to a register that is never read and
- * is the only instruction writing to that register.
- *
- * The glsl_to_tgsi_visitor lazily produces code assuming that this pass
- * will occur.
  */
 int
-glsl_to_tgsi_visitor::eliminate_dead_code_advanced(void)
+glsl_to_tgsi_visitor::eliminate_dead_code(void)
 {
glsl_to_tgsi_instruction **writes = rzalloc_array(mem_ctx,
  glsl_to_tgsi_instruction 
*,
@@ -5245,9 +5206,8 @@ get_mesa_program(struct gl_context *ctx,
/* Perform optimizations on the instructions in the glsl_to_tgsi_visitor. */
v-simplify_cmp();
v-copy_propagate();
-   while (v-eliminate_dead_code_advanced());
+   while (v-eliminate_dead_code());
 
-   v-eliminate_dead_code();
v-merge_registers();
v-renumber_registers();

-- 
1.7.9.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nv50: implement multisample textures

2013-10-25 Thread Bryan Cain

On 10/25/2013 01:35 PM, Emil Velikov wrote:
 On 21/10/13 23:23, Bryan Cain wrote:
 This is a port of 4da54c91d24da (nvc0: implement multisample textures) to
 nv50.

 When coupled with the patch to only report 16 texture samplers (to fix
 crashes), all of the Piglit tests in spec/arb_texture_multisample pass.

 Hello Bryan,

 Big thanks for your work. As promised here is a quick piglit summary on
 my nv96

 pass/fail/crash
 69/32/27

 * dmesg does not spit anything nouveau related during the tests
 * any geometry shader related tests were skipped
 (piglit: info: Failed to create GL 3.2 core context)
 * all the crashes are due to the following assert
 codegen/nv50_ir_emit_nv50.cpp:1393:emitTEX: Assertion `argc = 4' failed.

 PASSarb_texture_multisample-*
 PASSfb-completeness/*
 FAILsample-position/*
 FAILtexelFetch fs sampler2DMS 4*
 CRASH   texelFetch fs sampler2DMSArray 4*
 FAILtexelFetch/*-*s-isampler2DMS
 CRASH   texelFetch/*-*s-isampler2DMSArray
 PASStextureSize/*


 Hope you find this useful :)
 No real world apps that use multisample textures were tested, yet.

 Cheers
 Emil

Hi Emil,

Thanks for testing on nv96.  It seems, though, that I messed up my
piglit-run command and didn't include all of the relevant tests as a
result.  Now that I've fixed that, I'm seeing the same failures and
crashes on my nva5.

It seems that multisampling is broken with texelFetch (both the
texelFetch and sample-position tests use it) but works otherwise, unless
it turns out not to produce the right results in real world applications
for pre-nva3 cards.

I'm going to take some time this weekend to see what's going on with
multisampling and texelFetch.

Thanks again,
Bryan

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nv50: implement multisample textures

2013-10-25 Thread Bryan Cain

On 10/25/2013 04:11 PM, Christoph Bumiller wrote:
 On 25.10.2013 20:35, Emil Velikov wrote:
 On 21/10/13 23:23, Bryan Cain wrote:
 This is a port of 4da54c91d24da (nvc0: implement multisample textures) to
 nv50.

 When coupled with the patch to only report 16 texture samplers (to fix
 crashes), all of the Piglit tests in spec/arb_texture_multisample pass.

 Hello Bryan,

 Big thanks for your work. As promised here is a quick piglit summary on
 my nv96

 pass/fail/crash
 69/32/27

 * dmesg does not spit anything nouveau related during the tests
 * any geometry shader related tests were skipped
 (piglit: info: Failed to create GL 3.2 core context)
 * all the crashes are due to the following assert
 codegen/nv50_ir_emit_nv50.cpp:1393:emitTEX: Assertion `argc = 4' failed.
 I'm not sure how you'd get  4 arguments there (x y layer sample ?).
 There's no mip maps for multisample textures.

 But either way you're probably going to have to do things by hand:
 E.g. MS8 textures contain contiguous 4x2 rectangles of samples for each
 pixel, so you multiply x by 4 and y by 2 to arrive at the sub-rectangle
 and then add the correct offsets for the sample id as seen in
 get_sample_position (store the info in a constant buffer, that has to be
 updated when texture changes).

 You might want to use a lookup table like in nve4 compute (look for MS
 sample coordinate offsets) to map sample id to coordinate offset, that
 one works for any sample count as long as you don't use the ALT modes
 (nve4 doesn't need to for textures, but for images/surfaces/UAVs/RATs
 where the whole VM address calculation is done by hand).

You're probably right.  I don't know why MSAA appears to work for me,
but there's probably something wrong with the output that I haven't
noticed.  I'll work on implementing it properly this weekend.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nv50: implement multisample textures

2013-10-25 Thread Bryan Cain

On 10/25/2013 05:05 PM, Christoph Bumiller wrote:
 On 25.10.2013 23:51, Bryan Cain wrote:
 On 10/25/2013 04:11 PM, Christoph Bumiller wrote:
 On 25.10.2013 20:35, Emil Velikov wrote:
 On 21/10/13 23:23, Bryan Cain wrote:
 This is a port of 4da54c91d24da (nvc0: implement multisample textures) 
 to
 nv50.

 When coupled with the patch to only report 16 texture samplers (to fix
 crashes), all of the Piglit tests in spec/arb_texture_multisample pass.

 Hello Bryan,

 Big thanks for your work. As promised here is a quick piglit summary on
 my nv96

 pass/fail/crash
 69/32/27

 * dmesg does not spit anything nouveau related during the tests
 * any geometry shader related tests were skipped
 (piglit: info: Failed to create GL 3.2 core context)
 * all the crashes are due to the following assert
 codegen/nv50_ir_emit_nv50.cpp:1393:emitTEX: Assertion `argc = 4' failed.
 I'm not sure how you'd get  4 arguments there (x y layer sample ?).
 There's no mip maps for multisample textures.

 But either way you're probably going to have to do things by hand:
 E.g. MS8 textures contain contiguous 4x2 rectangles of samples for each
 pixel, so you multiply x by 4 and y by 2 to arrive at the sub-rectangle
 and then add the correct offsets for the sample id as seen in
 get_sample_position (store the info in a constant buffer, that has to be
 updated when texture changes).

 You might want to use a lookup table like in nve4 compute (look for MS
 sample coordinate offsets) to map sample id to coordinate offset, that
 one works for any sample count as long as you don't use the ALT modes
 (nve4 doesn't need to for textures, but for images/surfaces/UAVs/RATs
 where the whole VM address calculation is done by hand).
 You're probably right.  I don't know why MSAA appears to work for me,
 but there's probably something wrong with the output that I haven't
 noticed.  I'll work on implementing it properly this weekend.
 MSAA itself (rendering and resolving) has been working before, the only
 thing that ARB_texture_multisample adds is texelFetch from MS resources.

I really should read an extension's spec carefully before trying to
implement it so that I don't waste other people's time.  Sorry.

Bryan
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] nv50: implement multisample textures

2013-10-21 Thread Bryan Cain

This is a port of 4da54c91d24da (nvc0: implement multisample textures) to
nv50.

When coupled with the patch to only report 16 texture samplers (to fix
crashes), all of the Piglit tests in spec/arb_texture_multisample pass.
---
 .../nouveau/codegen/nv50_ir_lowering_nv50.cpp  |5 ++-
 src/gallium/drivers/nouveau/nv50/nv50_context.c|   46 
 src/gallium/drivers/nouveau/nv50/nv50_miptree.c|2 +
 src/gallium/drivers/nouveau/nv50/nv50_screen.c |3 +-
 src/gallium/drivers/nouveau/nv50/nv50_tex.c|   20 +++--
 5 files changed, 70 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp
index caaf09f..d5d1f1e 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp
@@ -569,6 +569,7 @@ NV50LoweringPreSSA::handleTEX(TexInstruction *i)
const int arg = i-tex.target.getArgCount();
const int dref = arg;
const int lod = i-tex.target.isShadow() ? (arg + 1) : arg;
+   const int lyr = arg - (i-tex.target.isMS() ? 2 : 1);
 
// dref comes before bias/lod
if (i-tex.target.isShadow())
@@ -577,11 +578,11 @@ NV50LoweringPreSSA::handleTEX(TexInstruction *i)
 
// array index must be converted to u32
if (i-tex.target.isArray()) {
-  Value *layer = i-getSrc(arg - 1);
+  Value *layer = i-getSrc(lyr);
   LValue *src = new_LValue(func, FILE_GPR);
   bld.mkCvt(OP_CVT, TYPE_U32, src, TYPE_F32, layer);
   bld.mkOp2(OP_MIN, TYPE_U32, src, src, bld.loadImm(NULL, 511));
-  i-setSrc(arg - 1, src);
+  i-setSrc(lyr, src);
 
   if (i-tex.target.isCube()) {
  std::vectorValue * acube, a2d;
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_context.c 
b/src/gallium/drivers/nouveau/nv50/nv50_context.c
index b6bdf79..45e3f5d 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_context.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_context.c
@@ -194,6 +194,10 @@ nv50_invalidate_resource_storage(struct nouveau_context 
*ctx,
return ref;
 }
 
+static void
+nv50_context_get_sample_position(struct pipe_context *, unsigned, unsigned,
+ float *);
+
 struct pipe_context *
 nv50_create(struct pipe_screen *pscreen, void *priv)
 {
@@ -237,6 +241,7 @@ nv50_create(struct pipe_screen *pscreen, void *priv)
 
pipe-flush = nv50_flush;
pipe-texture_barrier = nv50_texture_barrier;
+   pipe-get_sample_position = nv50_context_get_sample_position;
 
if (!screen-cur_ctx) {
   screen-cur_ctx = nv50;
@@ -315,3 +320,44 @@ nv50_bufctx_fence(struct nouveau_bufctx *bufctx, boolean 
on_flush)
  nv50_resource_validate(res, (unsigned)ref-priv_data);
}
 }
+
+static void
+nv50_context_get_sample_position(struct pipe_context *pipe,
+ unsigned sample_count, unsigned sample_index,
+ float *xy)
+{
+   static const uint8_t ms1[1][2] = { { 0x8, 0x8 } };
+   static const uint8_t ms2[2][2] = {
+  { 0x4, 0x4 }, { 0xc, 0xc } }; /* surface coords (0,0), (1,0) */
+   static const uint8_t ms4[4][2] = {
+  { 0x6, 0x2 }, { 0xe, 0x6 },   /* (0,0), (1,0) */
+  { 0x2, 0xa }, { 0xa, 0xe } }; /* (0,1), (1,1) */
+   static const uint8_t ms8[8][2] = {
+  { 0x1, 0x7 }, { 0x5, 0x3 },   /* (0,0), (1,0) */
+  { 0x3, 0xd }, { 0x7, 0xb },   /* (0,1), (1,1) */
+  { 0x9, 0x5 }, { 0xf, 0x1 },   /* (2,0), (3,0) */
+  { 0xb, 0xf }, { 0xd, 0x9 } }; /* (2,1), (3,1) */
+#if 0
+   /* NOTE: NVA3+ has alternative modes for MS2 and MS8, currently not used */
+   static const uint8_t ms8_alt[8][2] = {
+  { 0x9, 0x5 }, { 0x7, 0xb },   /* (2,0), (1,1) */
+  { 0xd, 0x9 }, { 0x5, 0x3 },   /* (3,1), (1,0) */
+  { 0x3, 0xd }, { 0x1, 0x7 },   /* (0,1), (0,0) */
+  { 0xb, 0xf }, { 0xf, 0x1 } }; /* (2,1), (3,0) */
+#endif
+
+   const uint8_t (*ptr)[2];
+
+   switch (sample_count) {
+   case 0:
+   case 1: ptr = ms1; break;
+   case 2: ptr = ms2; break;
+   case 4: ptr = ms4; break;
+   case 8: ptr = ms8; break;
+   default:
+  assert(0);
+  return; /* bad sample count - undefined locations */
+   }
+   xy[0] = ptr[sample_index][0] * 0.0625f;
+   xy[1] = ptr[sample_index][1] * 0.0625f;
+}
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_miptree.c 
b/src/gallium/drivers/nouveau/nv50/nv50_miptree.c
index 513d8f9..1963a4a 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_miptree.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_miptree.c
@@ -277,6 +277,8 @@ nv50_miptree_init_layout_tiled(struct nv50_miptree *mt)
 */
d = mt-layout_3d ? pt-depth0 : 1;
 
+   assert(!mt-ms_mode || !pt-last_level);
+
for (l = 0; l = pt-last_level; ++l) {
   struct nv50_miptree_level *lvl = mt-level[l];
   unsigned tsx, tsy, tsz;
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c 
b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
index

Re: [Mesa-dev] [PATCH 06/34] draw/gs: fix allocation of buffer for GS output vertices

2013-07-30 Thread Bryan Cain

On Tue, Jul 30, 2013 at 8:46 PM, Paul Berry stereotype...@gmail.com wrote:
 On 29 July 2013 11:09, Zack Rusin za...@vmware.com wrote:

 That looks wrong to me. We already account for the other fields in the
 vertex_size.


 This patch came from Bryan Cain's original geometry shader patch series--I
 admit I'm not familiar enough with Gallium code to know how to fix it.
 Anyone want to step in and address Zack's comment?  Or the Gallium-related
 comments on patches 08 and 24?

 If no one has time to work on Gallium geometry shaders right now, that's ok.
 I can pull the Gallium stuff out of this series and archive it in a branch
 until someone has time to pick it up.

This patch is a year old, and I don't remember what it was supposed to
fix.  The Gallium geometry shader code has changed significantly in
the last year, and it should be safe to leave this patch unmerged.  If
any problems come up as a result, they can be addressed then.




 - Original Message -
  From: Bryan Cain bryanca...@gmail.com
 
  Before, it accounted for the size of the vertices but not the other
  fields
  in the vertex_header struct, which caused memory corruption.
  ---
   src/gallium/auxiliary/draw/draw_gs.c | 3 ++-
   1 file changed, 2 insertions(+), 1 deletion(-)
 
  diff --git a/src/gallium/auxiliary/draw/draw_gs.c
  b/src/gallium/auxiliary/draw/draw_gs.c
  index cd63e2b..78727c6 100644
  --- a/src/gallium/auxiliary/draw/draw_gs.c
  +++ b/src/gallium/auxiliary/draw/draw_gs.c
  @@ -560,7 +560,8 @@ int draw_geometry_shader_run(struct
  draw_geometry_shader
  *shader,
  /* we allocate exactly one extra vertex per primitive to allow the
  GS to
  emit
   * overflown vertices into some area where they won't harm anyone */
  output_verts-verts =
  -  (struct vertex_header *)MALLOC(output_verts-vertex_size *
  +  (struct vertex_header *)MALLOC(sizeof(struct vertex_header) +
  + output_verts-vertex_size *
max_out_prims *
shader-primitive_boundary);
 
  --
  1.8.3.4
 
  ___
  mesa-dev mailing list
  mesa-dev@lists.freedesktop.org
  http://lists.freedesktop.org/mailman/listinfo/mesa-dev
 


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] gallium/tgsi: clarify (possibly change) TGSI_OPCODE_UCMP definition

2013-05-08 Thread Bryan Cain

On 05/08/2013 03:26 PM, Roland Scheidegger wrote:
 There's some other
 somewhat bogus code with comments saying UCMP should be used instead of
 some poor emulation with i2f instructions.

Yeah, sorry about that.  I wrote that code (and comment) a few days
before the UCMP opcode was added to TGSI, and didn't notice until today
that it was never updated to actually use UCMP.  There are a few changes
and clean-ups I've been wanting to make to glsl_to_tgsi anyway - I'll
include an update to emit UCMP if someone hasn't already done so by the
time I finish that and send out a patch set.

Bryan

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] nouveau: emit and flush fence in fence_signalled if needed

2013-05-07 Thread Bryan Cain

The Mesa state tracker expects us to emit the fence even if it doesn't call
fence_finish.  Notably, this occurs when glClientWaitSync is called with
timeout 0.

Fixes Portal and Left 4 Dead 2, which were both stalling on startup by
repeatedly calling glClientWaitSync with timeout 0 while waiting for commands
to complete.
---
 src/gallium/drivers/nouveau/nouveau_fence.c |   36 ++-
 src/gallium/drivers/nouveau/nouveau_fence.h |1 +
 2 files changed, 25 insertions(+), 12 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nouveau_fence.c 
b/src/gallium/drivers/nouveau/nouveau_fence.c
index dea146c..722be01 100644
--- a/src/gallium/drivers/nouveau/nouveau_fence.c
+++ b/src/gallium/drivers/nouveau/nouveau_fence.c
@@ -167,6 +167,25 @@ nouveau_fence_update(struct nouveau_screen *screen, 
boolean flushed)
}
 }
 
+boolean
+nouveau_fence_ensure_flushed(struct nouveau_fence *fence)
+{
+   struct nouveau_screen *screen = fence-screen;
+
+   if (fence-state  NOUVEAU_FENCE_STATE_EMITTED) {
+  nouveau_fence_emit(fence);
+
+  if (fence == screen-fence.current)
+ nouveau_fence_new(screen, screen-fence.current, FALSE);
+   }
+   if (fence-state  NOUVEAU_FENCE_STATE_FLUSHED) {
+  if (nouveau_pushbuf_kick(screen-pushbuf, screen-pushbuf-channel))
+ return FALSE;
+   }
+
+   return TRUE;
+}
+
 #define NOUVEAU_FENCE_MAX_SPINS (1  31)
 
 boolean
@@ -174,8 +193,9 @@ nouveau_fence_signalled(struct nouveau_fence *fence)
 {
struct nouveau_screen *screen = fence-screen;
 
-   if (fence-state = NOUVEAU_FENCE_STATE_EMITTED)
-  nouveau_fence_update(screen, FALSE);
+   if (!nouveau_fence_ensure_flushed(fence))
+  return FALSE;
+   nouveau_fence_update(screen, FALSE);
 
return fence-state == NOUVEAU_FENCE_STATE_SIGNALLED;
 }
@@ -189,16 +209,8 @@ nouveau_fence_wait(struct nouveau_fence *fence)
/* wtf, someone is waiting on a fence in flush_notify handler? */
assert(fence-state != NOUVEAU_FENCE_STATE_EMITTING);
 
-   if (fence-state  NOUVEAU_FENCE_STATE_EMITTED) {
-  nouveau_fence_emit(fence);
-
-  if (fence == screen-fence.current)
- nouveau_fence_new(screen, screen-fence.current, FALSE);
-   }
-   if (fence-state  NOUVEAU_FENCE_STATE_FLUSHED) {
-  if (nouveau_pushbuf_kick(screen-pushbuf, screen-pushbuf-channel))
- return FALSE;
-   }
+   if (!nouveau_fence_ensure_flushed(fence))
+  return FALSE;
 
do {
   nouveau_fence_update(screen, FALSE);
diff --git a/src/gallium/drivers/nouveau/nouveau_fence.h 
b/src/gallium/drivers/nouveau/nouveau_fence.h
index 3984a9a..d497c7f 100644
--- a/src/gallium/drivers/nouveau/nouveau_fence.h
+++ b/src/gallium/drivers/nouveau/nouveau_fence.h
@@ -34,6 +34,7 @@ boolean nouveau_fence_new(struct nouveau_screen *, struct 
nouveau_fence **,
 boolean nouveau_fence_work(struct nouveau_fence *, void (*)(void *), void *);
 voidnouveau_fence_update(struct nouveau_screen *, boolean flushed);
 voidnouveau_fence_next(struct nouveau_screen *);
+boolean nouveau_fence_ensure_flushed(struct nouveau_fence *);
 boolean nouveau_fence_wait(struct nouveau_fence *);
 boolean nouveau_fence_signalled(struct nouveau_fence *);
 
-- 
1.7.9.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] mesa/st: Don't copy propagate from swizzles.

2013-04-24 Thread Bryan Cain

On 04/20/2013 12:40 PM, Fabian Bieler wrote:
 Do not propagate a copy if source and destination are identical.

 Otherwise code like

 MOV TEMP[0].xyzw, TEMP[0].wzyx
 mov TEMP[1].xyzw, TEMP[0].xyzw

 is changed to

 MOV TEMP[0].xyzw, TEMP[0].wzyx
 mov TEMP[1].xyzw, TEMP[0].wzyx
 ---
  src/mesa/state_tracker/st_glsl_to_tgsi.cpp |2 ++
  1 file changed, 2 insertions(+)

 diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
 b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 index f2eb3e7..b5d0534 100644
 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 @@ -3544,6 +3544,8 @@ glsl_to_tgsi_visitor::copy_propagate(void)
/* If this is a copy, add it to the ACP. */
if (inst-op == TGSI_OPCODE_MOV 
inst-dst.file == PROGRAM_TEMPORARY 
 +  !(inst-dst.file == inst-src[0].file 
 + inst-dst.index == inst-src[0].index) 
!inst-dst.reladdr 
!inst-saturate 
!inst-src[0].reladdr 

Nice catch.  FYI, it seems that the almost identical copy_progagate
function in ir_to_mesa also needs this fix.

Bryan
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] Geometry shader support for nv50

2013-04-17 Thread Bryan Cain

The following patch set makes the necessary changes to support geometry shaders
in the nv50 driver.  There are no piglit tests yet for geometry shader corner
cases yet, so these changes were tested with all of the GS demos in mesa/demos
and several corner case tests, using Paul Berry's gs branch with support for
GL_ARB_geometry_shader4.   I have also confirmed that the set does not regress
any of the piglit tests in tests/shaders.

Although this set updates the nvc0 driver to handle the indirect addressing
changes in nv50_ir_from_tgsi, it hasn't been tested yet since I don't have
the hardware.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/3] nv50/ir: fix PFETCH and add RDSV to get VSTRIDE for GPs

2013-04-17 Thread Bryan Cain

From: Christoph Bumiller e0425...@student.tuwien.ac.at

v2 (Bryan Cain bryanca...@gmail.com): Fix emission of PFETCH instructions
using the SHL form.
---
 src/gallium/drivers/nv50/codegen/nv50_ir.h |1 +
 .../drivers/nv50/codegen/nv50_ir_emit_nv50.cpp |   62 ++--
 src/gallium/drivers/nv50/codegen/nv50_ir_print.cpp |1 +
 3 files changed, 59 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/nv50/codegen/nv50_ir.h 
b/src/gallium/drivers/nv50/codegen/nv50_ir.h
index ae365af..9d29b34 100644
--- a/src/gallium/drivers/nv50/codegen/nv50_ir.h
+++ b/src/gallium/drivers/nv50/codegen/nv50_ir.h
@@ -366,6 +366,7 @@ enum SVSemantic
SV_CLOCK,
SV_LBASE,
SV_SBASE,
+   SV_VERTEX_STRIDE,
SV_UNDEFINED,
SV_LAST
 };
diff --git a/src/gallium/drivers/nv50/codegen/nv50_ir_emit_nv50.cpp 
b/src/gallium/drivers/nv50/codegen/nv50_ir_emit_nv50.cpp
index bc5a833..67b6298 100644
--- a/src/gallium/drivers/nv50/codegen/nv50_ir_emit_nv50.cpp
+++ b/src/gallium/drivers/nv50/codegen/nv50_ir_emit_nv50.cpp
@@ -87,6 +87,7 @@ private:
void emitLOAD(const Instruction *);
void emitSTORE(const Instruction *);
void emitMOV(const Instruction *);
+   void emitRDSV(const Instruction *);
void emitNOP();
void emitINTERP(const Instruction *);
void emitPFETCH(const Instruction *);
@@ -772,6 +773,29 @@ CodeEmitterNV50::emitMOV(const Instruction *i)
}
 }
 
+static inline uint8_t getSRegEncoding(const ValueRef ref)
+{
+   switch (SDATA(ref).sv.sv) {
+   case SV_PHYSID:return 0;
+   case SV_CLOCK: return 1;
+   case SV_VERTEX_STRIDE: return 3;
+// case SV_PM_COUNTER:return 4 + SDATA(ref).sv.index;
+   case SV_SAMPLE_INDEX:  return 8;
+   default:
+  assert(!no sreg for system value);
+  return 0;
+   }
+}
+
+void
+CodeEmitterNV50::emitRDSV(const Instruction *i)
+{
+   code[0] = 0x0001;
+   code[1] = 0x6000 | (getSRegEncoding(i-src(0))  14);
+   defId(i-def(0), 2);
+   emitFlagsRd(i);
+}
+
 void
 CodeEmitterNV50::emitNOP()
 {
@@ -794,15 +818,40 @@ CodeEmitterNV50::emitQUADOP(const Instruction *i, uint8_t 
lane, uint8_t quOp)
   srcId(i-src(0), 32 + 14);
 }
 
+/* NOTE: This returns the base address of a vertex inside the primitive.
+ * src0 is an immediate, the index (not offset) of the vertex
+ * inside the primitive. XXX: signed or unsigned ?
+ * src1 (may be NULL) should use whatever units the hardware requires
+ * (on nv50 this is bytes, so, relative index * 4; signed 16 bit value).
+ */
 void
 CodeEmitterNV50::emitPFETCH(const Instruction *i)
 {
-   code[0] = 0x1181;
-   code[1] = 0x0420 | (0xf  14);
+   const uint32_t prim = i-src(0).get()-reg.data.u32;
+   assert(prim = 127);
 
-   defId(i-def(0), 2);
-   srcAddr8(i-src(0), 9);
-   setAReg16(i, 0);
+   if (i-def(0).getFile() == FILE_ADDRESS) {
+  // shl $aX a[] 0
+  code[0] = 0x0001 | ((DDATA(i-def(0)).id + 1)  2);
+  code[1] = 0xc020;
+  code[0] |= prim  7;
+  assert(!i-srcExists(1));
+   } else
+   if (i-srcExists(1)) {
+  // ld b32 $rX a[$aX+base]
+  code[0] = 0x0001;
+  code[1] = 0x0420 | (0xf  14);
+  defId(i-def(0), 2);
+  code[0] |= prim  9;
+  setARegBits(SDATA(i-src(1)).id + 1);
+   } else {
+  // mov b32 $rX a[]
+  code[0] = 0x1001;
+  code[1] = 0x0420 | (0xf  14);
+  defId(i-def(0), 2);
+  code[0] |= prim  9;
+   }
+   emitFlagsRd(i);
 }
 
 void
@@ -1620,6 +1669,9 @@ CodeEmitterNV50::emitInstruction(Instruction *insn)
case OP_PFETCH:
   emitPFETCH(insn);
   break;
+   case OP_RDSV:
+  emitRDSV(insn);
+  break;
case OP_LINTERP:
case OP_PINTERP:
   emitINTERP(insn);
diff --git a/src/gallium/drivers/nv50/codegen/nv50_ir_print.cpp 
b/src/gallium/drivers/nv50/codegen/nv50_ir_print.cpp
index c7121bf..743e566 100644
--- a/src/gallium/drivers/nv50/codegen/nv50_ir_print.cpp
+++ b/src/gallium/drivers/nv50/codegen/nv50_ir_print.cpp
@@ -265,6 +265,7 @@ static const char *SemanticStr[SV_LAST + 1] =
CLOCK,
LBASE,
SBASE,
+   VERTEX_STRIDE,
?,
(INVALID)
 };
-- 
1.7.9.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/3] nv50/ir: delay calculation of indirect addresses

2013-04-17 Thread Bryan Cain

Instead of emitting an SHL 4 io an address register on the TGSI ARL and UARL
instructions, emit the shift when the loaded address is actually used.  This
is necessary because input vertex and attribute indices in geometry shaders on
nv50 need to be shifted left by 2 instead of 4.
---
 .../drivers/nv50/codegen/nv50_ir_from_tgsi.cpp |   34 +--
 .../drivers/nv50/codegen/nv50_ir_lowering_nv50.cpp |   97 +++-
 .../drivers/nvc0/codegen/nv50_ir_lowering_nvc0.cpp |   13 +++
 3 files changed, 131 insertions(+), 13 deletions(-)

diff --git a/src/gallium/drivers/nv50/codegen/nv50_ir_from_tgsi.cpp 
b/src/gallium/drivers/nv50/codegen/nv50_ir_from_tgsi.cpp
index d8abccd..786328a 100644
--- a/src/gallium/drivers/nv50/codegen/nv50_ir_from_tgsi.cpp
+++ b/src/gallium/drivers/nv50/codegen/nv50_ir_from_tgsi.cpp
@@ -1114,6 +1114,7 @@ private:
};
 
Value *getVertexBase(int s);
+   Value *getIndirectAddress(Value *ptr);
DataArray *getArrayForFile(unsigned file, int idx);
Value *fetchSrc(int s, int c);
Value *acquireDst(int d, int c);
@@ -1331,7 +1332,8 @@ Converter::getVertexBase(int s)
   if (tgsi.getSrc(s).isIndirect(1))
  rel = fetchSrc(tgsi.getSrc(s).getIndirect(1), 0, NULL);
   vtxBaseValid |= 1  s;
-  vtxBase[s] = mkOp2v(OP_PFETCH, TYPE_U32, getSSA(), mkImm(index), rel);
+  vtxBase[s] = mkOp2v(OP_PFETCH, TYPE_U32, getSSA(2, FILE_ADDRESS),
+  mkImm(index), rel);
}
return vtxBase[s];
 }
@@ -1390,6 +1392,14 @@ Converter::getArrayForFile(unsigned file, int idx)
 }
 
 Value *
+Converter::getIndirectAddress(Value *ptr)
+{
+   if(!ptr)
+  return NULL;
+   return mkOp2v(OP_SHL, TYPE_U32, getSSA(2, FILE_ADDRESS), ptr, mkImm(4));
+}
+
+Value *
 Converter::fetchSrc(tgsi::Instruction::SrcRegister src, int c, Value *ptr)
 {
const int idx2d = src.is2D() ? src.getIndex(1) : 0;
@@ -1401,7 +1411,7 @@ Converter::fetchSrc(tgsi::Instruction::SrcRegister src, 
int c, Value *ptr)
   assert(!ptr);
   return loadImm(NULL, info-immd.data[idx * 4 + swz]);
case TGSI_FILE_CONSTANT:
-  return mkLoadv(TYPE_U32, srcToSym(src, c), ptr);
+  return mkLoadv(TYPE_U32, srcToSym(src, c), getIndirectAddress(ptr));
case TGSI_FILE_INPUT:
   if (prog-getType() == Program::TYPE_FRAGMENT) {
  // don't load masked inputs, won't be assigned a slot
@@ -1409,9 +1419,13 @@ Converter::fetchSrc(tgsi::Instruction::SrcRegister src, 
int c, Value *ptr)
 return loadImm(NULL, swz == TGSI_SWIZZLE_W ? 1.0f : 0.0f);
 if (!ptr  info-in[idx].sn == TGSI_SEMANTIC_FACE)
 return mkOp1v(OP_RDSV, TYPE_F32, getSSA(), mkSysVal(SV_FACE, 0));
- return interpolate(src, c, ptr);
+ return interpolate(src, c, getIndirectAddress(ptr));
   }
-  return mkLoadv(TYPE_U32, srcToSym(src, c), ptr);
+  else if (ptr  prog-getType() == Program::TYPE_GEOMETRY)
+ // nv50 and nvc0 need different things here, so let the lowering
+ // passes decide what to do with the address
+ return mkLoadv(TYPE_U32, srcToSym(src, c), ptr);
+  return mkLoadv(TYPE_U32, srcToSym(src, c), getIndirectAddress(ptr));
case TGSI_FILE_OUTPUT:
   assert(!load from output file);
   return NULL;
@@ -1420,7 +1434,7 @@ Converter::fetchSrc(tgsi::Instruction::SrcRegister src, 
int c, Value *ptr)
   return mkOp1v(OP_RDSV, TYPE_U32, getSSA(), srcToSym(src, c));
default:
   return getArrayForFile(src.getFile(), idx2d)-load(
- sub.cur-values, idx, swz, ptr);
+ sub.cur-values, idx, swz, getIndirectAddress(ptr));
}
 }
 
@@ -1463,8 +1477,9 @@ Converter::storeDst(int d, int c, Value *val)
   break;
}
 
-   Value *ptr = dst.isIndirect(0) ?
-  fetchSrc(dst.getIndirect(0), 0, NULL) : NULL;
+   Value *ptr = NULL;
+   if (dst.isIndirect(0))
+  ptr = getIndirectAddress(fetchSrc(dst.getIndirect(0), 0, NULL));
 
if (info-io.genUserClip  0 
dst.getFile() == TGSI_FILE_OUTPUT 
@@ -2166,12 +2181,11 @@ Converter::handleInstruction(const struct 
tgsi_full_instruction *insn)
   FOR_EACH_DST_ENABLED_CHANNEL(0, c, tgsi) {
  src0 = fetchSrc(0, c);
  mkCvt(OP_CVT, TYPE_S32, dst0[c], TYPE_F32, src0)-rnd = ROUND_M;
- mkOp2(OP_SHL, TYPE_U32, dst0[c], dst0[c], mkImm(4));
   }
   break;
case TGSI_OPCODE_UARL:
   FOR_EACH_DST_ENABLED_CHANNEL(0, c, tgsi)
- mkOp2(OP_SHL, TYPE_U32, dst0[c], fetchSrc(0, c), mkImm(4));
+ mkOp1(OP_MOV, TYPE_U32, dst0[c], fetchSrc(0, c));
   break;
case TGSI_OPCODE_EX2:
case TGSI_OPCODE_LG2:
@@ -2704,7 +2718,7 @@ Converter::Converter(Program *ir, const tgsi::Source 
*code) : BuildUtil(ir),
 
tData.setup(TGSI_FILE_TEMPORARY, 0, 0, tSize, 4, 4, tFile, 0);
pData.setup(TGSI_FILE_PREDICATE, 0, 0, pSize, 4, 4, FILE_PREDICATE, 0);
-   aData.setup(TGSI_FILE_ADDRESS, 0, 0, aSize, 4, 4, FILE_ADDRESS, 0);
+   aData.setup(TGSI_FILE_ADDRESS, 0, 0, aSize, 4, 4, FILE_GPR, 0);

[Mesa-dev] [PATCH 3/3] nv50: add support for geometry shaders

2013-04-17 Thread Bryan Cain

Layer output probably doesn't work yet, but other than that everything seems
to be working.
---
 .../drivers/nv50/codegen/nv50_ir_emit_nv50.cpp |   25 +++-
 src/gallium/drivers/nv50/nv50_program.c|   17 +
 src/gallium/drivers/nv50/nv50_shader_state.c   |2 ++
 src/gallium/drivers/nv50/nv50_tex.c|2 ++
 4 files changed, 40 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/nv50/codegen/nv50_ir_emit_nv50.cpp 
b/src/gallium/drivers/nv50/codegen/nv50_ir_emit_nv50.cpp
index 67b6298..d1d8b52 100644
--- a/src/gallium/drivers/nv50/codegen/nv50_ir_emit_nv50.cpp
+++ b/src/gallium/drivers/nv50/codegen/nv50_ir_emit_nv50.cpp
@@ -493,7 +493,12 @@ CodeEmitterNV50::emitForm_MAD(const Instruction *i)
setSrc(i, 1, 1);
setSrc(i, 2, 2);
 
-   setAReg16(i, 1);
+   if (i-getIndirect(0, 0)) {
+  assert(!i-getIndirect(1, 0));
+  setAReg16(i, 0);
+   }
+   else
+  setAReg16(i, 1);
 }
 
 // like default form, but 2nd source in slot 2, and no 3rd source
@@ -512,7 +517,12 @@ CodeEmitterNV50::emitForm_ADD(const Instruction *i)
setSrc(i, 0, 0);
setSrc(i, 1, 2);
 
-   setAReg16(i, 1);
+   if (i-getIndirect(0, 0)) {
+  assert(!i-getIndirect(1, 0));
+  setAReg16(i, 0);
+   }
+   else
+  setAReg16(i, 1);
 }
 
 // default short form (rr, ar, rc, gr)
@@ -602,8 +612,11 @@ CodeEmitterNV50::emitLOAD(const Instruction *i)
 
switch (sf) {
case FILE_SHADER_INPUT:
-  // use 'mov' where we can
-  code[0] = i-src(0).isIndirect(0) ? 0x0001 : 0x1001;
+  if(progType == Program::TYPE_GEOMETRY)
+ code[0] = 0x1181;
+  else
+ // use 'mov' where we can
+ code[0] = i-src(0).isIndirect(0) ? 0x0001 : 0x1001;
   code[1] = 0x0020 | (i-lanes  14);
   if (typeSizeof(i-dType) == 4)
  code[1] |= 0x0400;
@@ -1399,8 +1412,8 @@ CodeEmitterNV50::emitShift(const Instruction *i)
 void
 CodeEmitterNV50::emitOUT(const Instruction *i)
 {
-   code[0] = (i-op == OP_EMIT) ? 0xf200 : 0xf400;
-   code[1] = 0xc001;
+   code[0] = (i-op == OP_EMIT) ? 0xf201 : 0xf401;
+   code[1] = 0xc000;
 
emitFlagsRd(i);
 }
diff --git a/src/gallium/drivers/nv50/nv50_program.c 
b/src/gallium/drivers/nv50/nv50_program.c
index c17ffdc..1229002 100644
--- a/src/gallium/drivers/nv50/nv50_program.c
+++ b/src/gallium/drivers/nv50/nv50_program.c
@@ -359,6 +359,23 @@ nv50_program_translate(struct nv50_program *prog, uint16_t 
chipset)
   if (info-prop.fp.usesDiscard)
  prog-fp.flags[0] |= NV50_3D_FP_CONTROL_USES_KIL;
}
+   else if (prog-type == PIPE_SHADER_GEOMETRY) {
+  switch(info-prop.gp.outputPrim) {
+  case PIPE_PRIM_POINTS:
+ prog-gp.prim_type = NV50_3D_GP_OUTPUT_PRIMITIVE_TYPE_POINTS;
+ break;
+  case PIPE_PRIM_LINE_STRIP:
+ prog-gp.prim_type = NV50_3D_GP_OUTPUT_PRIMITIVE_TYPE_LINE_STRIP;
+ break;
+  case PIPE_PRIM_TRIANGLE_STRIP:
+ prog-gp.prim_type = NV50_3D_GP_OUTPUT_PRIMITIVE_TYPE_TRIANGLE_STRIP;
+ break;
+  default:
+ assert(0);
+ break;
+  }
+  prog-gp.vert_count = info-prop.gp.maxVertices;
+   }
 
if (prog-pipe.stream_output.num_outputs)
   prog-so = nv50_program_create_strmout_state(info,
diff --git a/src/gallium/drivers/nv50/nv50_shader_state.c 
b/src/gallium/drivers/nv50/nv50_shader_state.c
index 7f05243..613d257 100644
--- a/src/gallium/drivers/nv50/nv50_shader_state.c
+++ b/src/gallium/drivers/nv50/nv50_shader_state.c
@@ -193,6 +193,8 @@ nv50_gmtyprog_validate(struct nv50_context *nv50)
struct nv50_program *gp = nv50-gmtyprog;
 
if (gp) {
+  if(!nv50_program_validate(nv50, gp))
+ return;
   BEGIN_NV04(push, NV50_3D(GP_REG_ALLOC_TEMP), 1);
   PUSH_DATA (push, gp-max_gpr);
   BEGIN_NV04(push, NV50_3D(GP_REG_ALLOC_RESULT), 1);
diff --git a/src/gallium/drivers/nv50/nv50_tex.c 
b/src/gallium/drivers/nv50/nv50_tex.c
index 40b264d..be70fda 100644
--- a/src/gallium/drivers/nv50/nv50_tex.c
+++ b/src/gallium/drivers/nv50/nv50_tex.c
@@ -293,6 +293,7 @@ void nv50_validate_textures(struct nv50_context *nv50)
boolean need_flush;
 
need_flush  = nv50_validate_tic(nv50, 0);
+   need_flush  = nv50_validate_tic(nv50, 1);
need_flush |= nv50_validate_tic(nv50, 2);
 
if (need_flush) {
@@ -343,6 +344,7 @@ void nv50_validate_samplers(struct nv50_context *nv50)
boolean need_flush;
 
need_flush  = nv50_validate_tsc(nv50, 0);
+   need_flush  = nv50_validate_tsc(nv50, 1);
need_flush |= nv50_validate_tsc(nv50, 2);
 
if (need_flush) {
-- 
1.7.9.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] GS plans?

2013-01-30 Thread Bryan Cain

On 01/29/2013 12:44 PM, Paul Berry wrote:
 On 29 January 2013 09:33, Ian Romanick i...@freedesktop.org
 mailto:i...@freedesktop.org wrote:

 On 01/29/2013 09:02 AM, Brian Paul wrote:


 Hi Bryan,

 Back in July you announced your work on geometry shaders:
 http://lists.freedesktop.org/archives/mesa-dev/2012-July/024792.html

 But it looks like it hasn't been touched since October.  I was
 wondering
 what plans you might have for that.

 I believe the Intel guys also were planning on tackling GS
 support in
 Mesa this year.  Ian, Eric, do you have some idea of when
 you'll be
 working on that?  Were you going to use Bryan's work?


 I believe Paul is going to chip away at a bit of it.  I think he's
 planning to use some of Bryan's work.  I believe they've been in
 contact, but I'm not 100% sure.


 Yes, we have.  It looks like Bryan made a lot of headway before he set
 the branch aside in October, so I'm planning to use that as a starting
 point.  My impression from talking to Bryan is that he doesn't have
 too much time/interest to put into geometry shaders these days, so for
 the moment I'm planning to take over responsibility for the branch myself.

 My first order of business is to write some piglit tests for geometry
 shaders, so that I don't have to push anything to Mesa master that
 isn't well-tested.  I have an nVidia system that I can use as a
 reference platform to make sure my tests are correct.  I also want to
 spend a little bit of time prototyping some i965 back-end code just to
 increase my familiarity with the problem domain and to find out if
 i965 hardware has any sharp corners I need to watch out for.  Once
 I've done those two things, my plan is to start rebasing Bryan's patch
 series onto master and sending it out for review piece by piece.

It's true that I don't have too much time/interest to put into geometry
shaders anymore, but before you get to the rebasing part of your plan, I
do hope to have reorganized my the commits from my
geometry-shaders-rebase branch into a more modular form which would be
suitable for review on the mailing list as a patch set.  I've already
started the process of figuring out what this should look like, so if
all goes well it will be ready by the time it's needed.  I'll notify you
(and the mailing list) when it's ready.

Bryan
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] glsl_to_tgsi: set correct register type for array and structure elements

2012-10-29 Thread Bryan Cain

On 10/29/2012 06:22 AM, Andreas Boll wrote:
 2012/10/24 Andreas Boll andreas.boll@gmail.com:
 2012/10/24 Marek Olšák mar...@gmail.com:
 On Tue, Oct 23, 2012 at 11:52 PM, Bryan Cain bryanca...@gmail.com wrote:
 On 10/23/2012 04:50 PM, Brian Paul wrote:
 On 10/23/2012 10:58 AM, Bryan Cain wrote:
 This fixes an issue where glsl_to_tgsi_visior::get_opcode() would
 emit the
 wrong opcode because the register type was GLSL_TYPE_ARRAY/STRUCT
 instead of
 GLSL_TYPE_FLOAT/INT/UINT/BOOL, so the function would use the float
 opcodes for
 operations on integer or boolean values dereferenced from an array or
 structure.  Assertions have been added to get_opcode() to prevent
 this bug
 from reappearing in the future.
 ---
   src/mesa/state_tracker/st_glsl_to_tgsi.cpp |   21
 +++--
   1 file changed, 19 insertions(+), 2 deletions(-)

 diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 index 9146f24..cefc568 100644
 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 @@ -633,6 +633,11 @@ glsl_to_tgsi_visitor::get_opcode(ir_instruction
 *ir, unsigned op,
   {
  int type = GLSL_TYPE_FLOAT;

 +   assert(src0.type != GLSL_TYPE_ARRAY);
 +   assert(src0.type != GLSL_TYPE_STRUCT);
 +   assert(src1.type != GLSL_TYPE_ARRAY);
 +   assert(src1.type != GLSL_TYPE_STRUCT);
 +
  if (src0.type == GLSL_TYPE_FLOAT || src1.type == GLSL_TYPE_FLOAT)
 type = GLSL_TYPE_FLOAT;
  else if (native_integers)
 @@ -1074,8 +1079,12 @@ glsl_to_tgsi_visitor::visit(ir_variable *ir)
  assert(index == storage-index + (int)i);
   }
} else {
 -st_src_reg src(PROGRAM_STATE_VAR, index,
 -  native_integers ? ir-type-base_type :
 GLSL_TYPE_FLOAT);
 + /* We use GLSL_TYPE_FLOAT here regardless of the actual
 type of
 +  * the data being moved since MOV does not care about
 the type of
 +  * data it is moving, and we don't want to declare
 registers with
 +  * array or struct types.
 +  */
 +st_src_reg src(PROGRAM_STATE_VAR, index, GLSL_TYPE_FLOAT);
   src.swizzle = slots[i].swizzle;
   emit(ir, TGSI_OPCODE_MOV, dst, src);
   /* even a float takes up a whole vec4 reg in a
 struct/array. */
 @@ -2042,6 +2051,9 @@
 glsl_to_tgsi_visitor::visit(ir_dereference_array *ir)
  else
 src.swizzle = SWIZZLE_NOOP;

 +   /* Change the register type to the element type of the array. */
 +   src.type = ir-type-base_type;
 +
  this-result = src;
   }

 @@ -2067,6 +2079,7 @@
 glsl_to_tgsi_visitor::visit(ir_dereference_record *ir)
 this-result.swizzle = SWIZZLE_NOOP;

  this-result.index += offset;
 +   this-result.type = ir-type-base_type;
   }

   /**
 @@ -2286,6 +2299,10 @@ glsl_to_tgsi_visitor::visit(ir_assignment *ir)
 inst-dead_mask = inst-dst.writemask;
  } else {
 for (i = 0; i  type_size(ir-lhs-type); i++) {
 + if (ir-rhs-type-is_array())
 + r.type = ir-rhs-type-element_type()-base_type;
 + else if (ir-rhs-type-is_record())
 + r.type =
 ir-rhs-type-fields.structure[i].type-base_type;
emit(ir, TGSI_OPCODE_MOV, l, r);
l.index++;
r.index++;
 Thanks.  That seems to fix the test program that Jose posted to the
 piglit list (vs-all-equal-bool-array).

 Did you do a full piglit regression test?

 If so:

 Reviewed-by: Brian Paul bri...@vmware.com

 But we should probably note this as a candidate for the stable branches.

 -Brian
 I did a piglit regression test with -t shaders, which I think should
 cover everything.  And yes, this should be a candidate for the stable
 branches.  I just forgot to mention that in the commit message.
 actually is
 -t shaders actually covers a very small subset of all GLSL tests.
 Most of them are in the spec group.

 Marek
 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev
 Tested-by: Andreas Boll andreas.boll@gmail.com

 I found no regression in piglit (quick-driver) on r600g (rv770),
 llvmpipe and softpipe.

 This patch fixes the following piglit tests:
 {fs,vs}-bool-array on r600g and softpipe
 {fs,vs}-int-array on r600g and softpipe
 vs-all-equal-bool-array on llvmpipe

 actually isThose tests pass for me on mesa 8.0.4.
 So this patch actually fixes regressions introduced somewhere on the road.

 Nice work Bryan!
 Sorry I've pushed this commit accidentally.
 I've reverted it for now.
 But I think this patch is ready to land on master.
 (with the stable note, the reviewed-by and tested-by)

 Andreas.

Yes, it is.  I just haven't gotten around to pushing it yet.  You can do
it if you'd like.

Bryan
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http

[Mesa-dev] [PATCH] glsl_to_tgsi: set correct register type for array and structure elements

2012-10-23 Thread Bryan Cain

This fixes an issue where glsl_to_tgsi_visior::get_opcode() would emit the
wrong opcode because the register type was GLSL_TYPE_ARRAY/STRUCT instead of
GLSL_TYPE_FLOAT/INT/UINT/BOOL, so the function would use the float opcodes for
operations on integer or boolean values dereferenced from an array or
structure.  Assertions have been added to get_opcode() to prevent this bug
from reappearing in the future.
---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp |   21 +++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 9146f24..cefc568 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -633,6 +633,11 @@ glsl_to_tgsi_visitor::get_opcode(ir_instruction *ir, 
unsigned op,
 {
int type = GLSL_TYPE_FLOAT;

+   assert(src0.type != GLSL_TYPE_ARRAY);
+   assert(src0.type != GLSL_TYPE_STRUCT);
+   assert(src1.type != GLSL_TYPE_ARRAY);
+   assert(src1.type != GLSL_TYPE_STRUCT);
+
if (src0.type == GLSL_TYPE_FLOAT || src1.type == GLSL_TYPE_FLOAT)
   type = GLSL_TYPE_FLOAT;
else if (native_integers)
@@ -1074,8 +1079,12 @@ glsl_to_tgsi_visitor::visit(ir_variable *ir)
assert(index == storage-index + (int)i);
 }
  } else {
-st_src_reg src(PROGRAM_STATE_VAR, index,
-  native_integers ? ir-type-base_type : GLSL_TYPE_FLOAT);
+   /* We use GLSL_TYPE_FLOAT here regardless of the actual type of
+* the data being moved since MOV does not care about the type 
of
+* data it is moving, and we don't want to declare registers 
with
+* array or struct types.
+*/
+st_src_reg src(PROGRAM_STATE_VAR, index, GLSL_TYPE_FLOAT);
 src.swizzle = slots[i].swizzle;
 emit(ir, TGSI_OPCODE_MOV, dst, src);
 /* even a float takes up a whole vec4 reg in a struct/array. */
@@ -2042,6 +2051,9 @@ glsl_to_tgsi_visitor::visit(ir_dereference_array *ir)
else
   src.swizzle = SWIZZLE_NOOP;
 
+   /* Change the register type to the element type of the array. */
+   src.type = ir-type-base_type;
+
this-result = src;
 }
 
@@ -2067,6 +2079,7 @@ glsl_to_tgsi_visitor::visit(ir_dereference_record *ir)
   this-result.swizzle = SWIZZLE_NOOP;
 
this-result.index += offset;
+   this-result.type = ir-type-base_type;
 }
 
 /**
@@ -2286,6 +2299,10 @@ glsl_to_tgsi_visitor::visit(ir_assignment *ir)
   inst-dead_mask = inst-dst.writemask;
} else {
   for (i = 0; i  type_size(ir-lhs-type); i++) {
+ if (ir-rhs-type-is_array())
+   r.type = ir-rhs-type-element_type()-base_type;
+ else if (ir-rhs-type-is_record())
+   r.type = ir-rhs-type-fields.structure[i].type-base_type;
  emit(ir, TGSI_OPCODE_MOV, l, r);
  l.index++;
  r.index++;
-- 
1.7.9.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] mishandling of bool comparisons in GLSL-TGSI translator

2012-10-15 Thread Bryan Cain

Hi.  I'm still around to some extent, although I still need to submit
the changes to the ML for my geometry shader work from a couple of
months ago.  I don't have much free time anymore, but I'll try to take a
look at this soon.

-Bryan

On 10/15/2012 03:35 PM, Brian Paul wrote:
 Jose found a problem in the GLSL-TGSI translater with respect to
 boolean array comparisions when PIPE_SHADER_CAP_INTEGERS=1.  I dug
 into it a bit.

 The attached shader_runner case shows the issue.

 Basically, the comparison of boolean values (which are unsigned
 integers) winds up being done with TGSI_OPCODE_SEQ instead of USEQ:

 VERT
 DCL IN[0]
 DCL OUT[0], POSITION
 DCL OUT[1], GENERIC[12]
 DCL CONST[0..1]
 DCL TEMP[0], LOCAL
 DCL TEMP[1], LOCAL
 IMM UINT32 {4294967295, 0, 0, 0}
 IMM FLT32 {1., 0., 0., 0.}
   0: SEQ TEMP[0].x, CONST[0]., IMM[0].UINT operands!
   1: F2I TEMP[0].x, -TEMP[0]
   2: SEQ TEMP[1].x, CONST[1]., IMM[0].UINT operands!
   3: F2I TEMP[1].x, -TEMP[1]
   4: AND TEMP[0].x, TEMP[0]., TEMP[1].
   5: IF TEMP[0]. :0
   6:   MOV TEMP[0], IMM[1].xyyx
   7: ELSE :0
   8:   MOV TEMP[0], IMM[1].yxyx
   9: ENDIF
  10: MOV OUT[1], TEMP[0]
  11: MOV OUT[0], IN[0]
  12: END

 The correct code sequence should be:

   0: USEQ TEMP[0].x, CONST[0]., IMM[0].
   1: USEQ TEMP[1].x, CONST[1]., IMM[0].
   2: AND TEMP[0].x, TEMP[0]., TEMP[1].
   3: IF TEMP[0]. :0
   4:   MOV TEMP[0], IMM[1].xyyx
   5: ELSE :0
   6:   MOV TEMP[0], IMM[1].yxyx
   7: ENDIF
   8: MOV OUT[1], TEMP[0]
   9: MOV OUT[0], IN[0]
  10: END

 The attached patch for glsl_to_tgsi_visitor::get_opcode() fixes the
 issue but the first assert which I added fails if it's enabled.
 AFAICT, by time we get into glsl_to_tgsi_visitor::get_opcode() we
 should have scalar operands, not arrays.  It's not clear to me how to
 fix that.

 I'm hoping someone who's spent more time on glsl_to_tgsi.cpp (is Bryan
 Cain still around?) can take a look at this.  Thanks.

 -Brian


 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] glsl_to_tgsi: set correct register type for dereferenced array elements

2012-10-15 Thread Bryan Cain

---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp |8 
 1 file changed, 8 insertions(+)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 9146f24..caa7e94 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -633,6 +633,11 @@ glsl_to_tgsi_visitor::get_opcode(ir_instruction *ir, 
unsigned op,
 {
int type = GLSL_TYPE_FLOAT;

+   assert(src0.type != GLSL_TYPE_ARRAY);
+   assert(src0.type != GLSL_TYPE_STRUCT);
+   assert(src1.type != GLSL_TYPE_ARRAY);
+   assert(src1.type != GLSL_TYPE_STRUCT);
+
if (src0.type == GLSL_TYPE_FLOAT || src1.type == GLSL_TYPE_FLOAT)
   type = GLSL_TYPE_FLOAT;
else if (native_integers)
@@ -2042,6 +2047,9 @@ glsl_to_tgsi_visitor::visit(ir_dereference_array *ir)
else
   src.swizzle = SWIZZLE_NOOP;
 
+   /* Change the register type to the element type of the array. */
+   src.type = ir-array-type-element_type()-base_type;
+
this-result = src;
 }
 
-- 
1.7.9.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] glsl_to_tgsi: set correct register type for dereferenced array elements

2012-10-15 Thread Bryan Cain

On 10/15/2012 06:47 PM, Brian Paul wrote:
 On 10/15/2012 04:06 PM, Bryan Cain wrote:
 ---
   src/mesa/state_tracker/st_glsl_to_tgsi.cpp |8 
   1 file changed, 8 insertions(+)

 diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 index 9146f24..caa7e94 100644
 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 @@ -633,6 +633,11 @@ glsl_to_tgsi_visitor::get_opcode(ir_instruction
 *ir, unsigned op,
   {
  int type = GLSL_TYPE_FLOAT;

 +   assert(src0.type != GLSL_TYPE_ARRAY);
 +   assert(src0.type != GLSL_TYPE_STRUCT);
 +   assert(src1.type != GLSL_TYPE_ARRAY);
 +   assert(src1.type != GLSL_TYPE_STRUCT);
 +
  if (src0.type == GLSL_TYPE_FLOAT || src1.type == GLSL_TYPE_FLOAT)
 type = GLSL_TYPE_FLOAT;
  else if (native_integers)
 @@ -2042,6 +2047,9 @@
 glsl_to_tgsi_visitor::visit(ir_dereference_array *ir)
  else
 src.swizzle = SWIZZLE_NOOP;

 +   /* Change the register type to the element type of the array. */
 +   src.type = ir-array-type-element_type()-base_type;
 +
  this-result = src;
   }


 Thanks, but this causes a ton of piglit regressions (many crashes or
 failures on the new assertions).

 -Brian


That's not surprising, since it was a quick and untested hack.  I will
look at the rest of the assertion failures the next time I have free time.

--Bryan
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 0/16] Assorted gallium shader, sampler, sampler_view changes

2012-08-10 Thread Bryan Cain

On 08/09/2012 10:11 PM, Brian Paul wrote:

 The following patches are steps toward some gallium API clean-ups.

 1. Eventually, replace
 pipe_context::bind_fragment/vertex/geometry/compute_sampler_states()
 with a single bind_sampler_states() entrypoint which takes a
 PIPE_SHADER_x to identify the shader stage and a 'start_slot' value
 like bind_compute_sampler_states() has.  The later seemed to be agreed
 upon a few weeks ago.

 2. Similarly for pipe_context::set_{shader}_sampler_views().

 3. Where possible, replace parallel arrays for
 vertex/fragment/geometry objects with a 2D array indexed by shader
 type.  For example:

 replace:
struct pipe_sampler_state *vert_samplers[MAX_SAMPLERS];
struct pipe_sampler_state *geom_samplers[MAX_SAMPLERS];
struct pipe_sampler_state *frag_samplers[MAX_SAMPLERS];
 with:
struct pipe_sampler_state *samplers[PIPE_SHADER_TYPES][MAX_SAMPLERS];

 4. Add support for geometry shader stuff in a few places like the
 state tracker.

 I've touched about half the drivers so far.  There's a fair bit of
 work to be done before actually changing p_context.h

 -Brian

Hi, I just wanted to say thank you for working on this.  Getting
sampling to work in my geometry shaders branch was something I was
really not looking forward to, and once this work lands in master, it
should allow me to finish my geometry shader work at least a couple of
weeks before I could otherwise.

-Bryan
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] Support for EXT/ARB_geometry_shader4

2012-07-27 Thread Bryan Cain

Some of you already know about this from IRC, but recently I've been
working on adding support for GL_EXT_geometry_shader4 and
GL_ARB_geometry_shader4 (which are essentially identical) to Mesa.  A
significant amount of the required code was already there from Zack
Rusin's work on geometry shaders about 2 years ago (before the new GLSL
compiler was merged), so the bulk of my changes are adding support to
the GLSL compiler and glsl_to_tgsi, which is still not a small task.

Anyway, as of yesterday, all of the GLSL demos in mesa/demos that use
EXT/ARB_geometry_shader4 are working fully in my branch with softpipe. 
There are still some things that need doing and/or fixing before it
should be considered for merging into master, but for now, my geometry
shader work (including future updates) can be found in a branch of my
Mesa repository on GitHub at:
https://github.com/Plombo/mesa/tree/geometry-shaders .

It's worth noting that geometry shaders in GL 3.2 (GLSL 1.50) core are
slightly different than the geometry shaders in the EXT and ARB
extensions.  However, the main change in the core version is a change in
the way inputs are accessed which is rather TGSI-unfriendly, so when
GLSL 1.50 is implemented, we will have to lower GS inputs to the
extension way of doing things.  So all of this code is in fact necessary
for GL 3.2 core even though it only implements the extensions.

-Bryan
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] nv50/ir: set position before i instead of i-next in NV50LoweringPreSSA::visit

2012-07-17 Thread Bryan Cain

Fixes rendering glitches in Psychonauts such as Raz's eyes flickering white.
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=51962.
---
 .../drivers/nv50/codegen/nv50_ir_lowering_nv50.cpp |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nv50/codegen/nv50_ir_lowering_nv50.cpp 
b/src/gallium/drivers/nv50/codegen/nv50_ir_lowering_nv50.cpp
index 39e0cfa..b688172 100644
--- a/src/gallium/drivers/nv50/codegen/nv50_ir_lowering_nv50.cpp
+++ b/src/gallium/drivers/nv50/codegen/nv50_ir_lowering_nv50.cpp
@@ -1034,7 +1034,7 @@ NV50LoweringPreSSA::visit(Instruction *i)
   bld.setPosition(i-prev, true);
else
if (i-next)
-  bld.setPosition(i-next, false);
+  bld.setPosition(i, false);
else
   bld.setPosition(i-bb, true);
 
-- 
1.7.9.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] Mesa (master): docs: Update GL3.txt.

2012-07-12 Thread Bryan Cain

On 07/11/2012 12:24 AM, Eric Anholt wrote:
 Kenneth Graunke k...@kemper.freedesktop.org writes:
 inverse() has been done for a while.
 Does the inverse() builtin constant expression handling work for
 you?  It doesn't here.

 None of us know what highp change means;
 GLSL 1.40 spec: Make the default precision qualification for fragment
 shader be high. -- it was also on our task list.

Like the commit message said, precision qualifiers are entirely ignored
by the GLSL compiler - they don't even make it to the IR stage.  So
there's no such thing as a default here since it doesn't have a value
at all for any variable in the IR.


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] st/mesa: fix transform feedback of unsubscripted gl_ClipDistance array

2012-06-17 Thread Bryan Cain


On 6/16/2012 5:43 PM, Marcin Slusarz wrote:

gl_ClipDistance needs special treatment in form of lowering pass
which transforms gl_ClipDistance representation from float[] to
vec4[]. There are 2 implementations - at glsl linker level (enabled
by LowerClipDistance option) and at glsl_to_tgsi level (enabled
unconditionally for gallium drivers). Second implementation is
incomplete - it does not take into account transform feedback (see
commit 642e5b413e0890b2070ba78fde42db381eaf02e5 mesa: Fix transform
feedback of unsubscripted gl_ClipDistance array for details).

There are 2 possible fixes:
- adding transform feedback support into glsl_to_tgsi version
- ripping gl_ClipDistance support from glsl_to_tgsi and enabling
   gl_ClipDistance lowering on glsl linker side

This patch implements 2nd option. All it does is:
- reverts most of the commit 59be691638200797583bce39a83f641d30d97492
   st/mesa: add support for gl_ClipDistance
- changes LowerClipDistance to true

Fixes Piglit tests EXT_transform_feedback/builtin-varyings
gl_ClipDistance[{2,3,4,5,6,7,8}]-no-subscript on nv50.
---


I can't say I know much about how transform feedback works, but is there 
a reason that the first fix would be difficult?  It seems like a waste 
to not take advantage of hardware support for clip distances because the 
current implementation isn't complete.


-Bryan
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] gallium: add an IABS opcode to TGSI

2012-01-07 Thread Bryan Cain

This is a necessary operation that is missing from TGSI.
---
 src/gallium/auxiliary/tgsi/tgsi_exec.c |4 
 src/gallium/auxiliary/tgsi/tgsi_info.c |1 +
 src/gallium/docs/source/tgsi.rst   |   13 +
 src/gallium/include/pipe/p_shader_tokens.h |3 ++-
 4 files changed, 20 insertions(+), 1 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c 
b/src/gallium/auxiliary/tgsi/tgsi_exec.c
index 7ea8511..3e2b899 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_exec.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_exec.c
@@ -4193,6 +4193,10 @@ exec_instruction(
   exec_vector_trinary(mach, inst, micro_ucmp, TGSI_EXEC_DATA_UINT, 
TGSI_EXEC_DATA_UINT);
   break;
 
+   case TGSI_OPCODE_IABS:
+  exec_vector_unary(mach, inst, micro_iabs, TGSI_EXEC_DATA_INT, 
TGSI_EXEC_DATA_INT);
+  break;
+
default:
   assert( 0 );
}
diff --git a/src/gallium/auxiliary/tgsi/tgsi_info.c 
b/src/gallium/auxiliary/tgsi/tgsi_info.c
index 6cd580a..c9acdb9 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_info.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_info.c
@@ -192,6 +192,7 @@ static const struct tgsi_opcode_info 
opcode_info[TGSI_OPCODE_LAST] =

{ 1, 1, 0, 0, 0, 0, UARL, TGSI_OPCODE_UARL },
{ 1, 3, 0, 0, 0, 0, UCMP, TGSI_OPCODE_UCMP },
+   { 1, 1, 0, 0, 0, 0, IABS, TGSI_OPCODE_IABS },
 };
 
 const struct tgsi_opcode_info *
diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst
index 45af528..7e7010f 100644
--- a/src/gallium/docs/source/tgsi.rst
+++ b/src/gallium/docs/source/tgsi.rst
@@ -1043,6 +1043,19 @@ XXX so let's discuss it, yeah?
   destination register, which is assumed to be an address (ADDR) register.
 
 
+.. opcode:: IABS - Integer Absolute Value
+
+.. math::
+
+  dst.x = |src.x|
+
+  dst.y = |src.y|
+
+  dst.z = |src.z|
+
+  dst.w = |src.w|
+
+
 .. opcode:: SAD - Sum Of Absolute Differences
 
 .. math::
diff --git a/src/gallium/include/pipe/p_shader_tokens.h 
b/src/gallium/include/pipe/p_shader_tokens.h
index b24b64c..75e17a1 100644
--- a/src/gallium/include/pipe/p_shader_tokens.h
+++ b/src/gallium/include/pipe/p_shader_tokens.h
@@ -376,8 +376,9 @@ struct tgsi_property_data {
 
 #define TGSI_OPCODE_UARL157
 #define TGSI_OPCODE_UCMP158
+#define TGSI_OPCODE_IABS159
 
-#define TGSI_OPCODE_LAST159
+#define TGSI_OPCODE_LAST160
 
 #define TGSI_SAT_NONE0  /* do not saturate */
 #define TGSI_SAT_ZERO_ONE1  /* clamp to [0,1] */
-- 
1.7.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] softpipe GL3 status

2012-01-07 Thread Bryan Cain

On 01/07/2012 02:44 AM, Dave Airlie wrote:
 Let's add new opcodes for things like this.
 I'll add IABS.

I sent a patch to the mailing list that adds IABS about an hour ago,
before reading this message.

Bryan

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/3] glsl_to_tgsi: Create a new variable_store class replacing variables field in glsl_to_tgsi_visitor

2012-01-07 Thread Bryan Cain

This is good work.  I just have a few suggested changes.

On 01/07/2012 12:26 PM, Vincent Lejeune wrote:
 ---
  src/mesa/state_tracker/st_glsl_to_tgsi.cpp |  414 
 +---
  1 files changed, 309 insertions(+), 105 deletions(-)

 diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
 b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 index cecceca..17ae525 100644
 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 @@ -230,14 +230,16 @@ public:
  class variable_storage : public exec_node {
  public:
 variable_storage(ir_variable *var, gl_register_file file, int index)
 -  : file(file), index(index), var(var)
 +  : file(file), index(index), type(var-type), 
 is_array(var-type-is_array() || var-type-is_record() || 
 var-type-is_matrix()), is_reladdressed(false)
 {
/* empty */
 }
  
 gl_register_file file;
 int index;
 -   ir_variable *var; /* variable that maps to this, if any */
 +   const glsl_type *type; /* variable that maps to this, if any */

The comment there doesn't make sense anymore.

 +   bool is_array;
 +   bool is_reladdressed;
  };
  
  class immediate_storage : public exec_node {
 @@ -286,6 +288,242 @@ public:
 st_src_reg return_reg;
  };
  
 +static int type_size(const glsl_type *type);
 +static int swizzle_for_size(int size);
 +
 +
 +/**
 + * Single place to store all temporary values (either explicit - 
 ir_variable* -
 + * or implicit - returned by get_temp -).

I don't think the last  - is supposed to be there.

 + *
 + * Explicit temps are stored in variables hash_table.
 + * Implicit temps are stored in rvalue_regs array.
 + */
 +class variable_store {
 +   friend class glsl_to_tgsi_variable_allocator;
 +protected:
 +   void *mem_ctx;
 +   hash_table* variables;
 +   unsigned next_temp;
 +   unsigned next_temp_array;
 +   static void reindex_reladdress(const void *, void *, void *);
 +   static void reindex_non_reladdress(const void *, void *, void *);
 +   void reindex_rvalue();
 +   void reindex_rvalue_reladdressed();
 +   variable_storage* rvalue_regs;
 +   unsigned rvalue_regs_count;
 +
 +public:
 +   bool native_integers;
 +   unsigned temp_amount() const;
 +   unsigned temp_array_amount() const;
 +   variable_store();
 +   ~variable_store();
 +   variable_storage *find_variable_storage(class ir_variable *var) const;
 +   variable_storage *push(class ir_variable *, gl_register_file, int);
 +   variable_storage *push(class ir_variable *);
 +   variable_storage *retrieve_anonymous_temp(unsigned);
 +   st_src_reg get_temp(const glsl_type *type);
 +   void optimise_access(void);
 +   unsigned *reindex_table;
 +};
 +
 +unsigned
 +variable_store::temp_amount() const
 +{
 +   return next_temp;
 +}
 +
 +unsigned
 +variable_store::temp_array_amount() const
 +{
 +   return next_temp_array;
 +}
 +
 +variable_store::variable_store():mem_ctx(ralloc_context(NULL)),next_temp(1),next_temp_array(1),rvalue_regs(NULL),rvalue_regs_count(0)
 +{
 +   variables = 
 hash_table_ctor(0,hash_table_pointer_hash,hash_table_pointer_compare);
 +}
 +
 +variable_store::~variable_store()
 +{
 +   hash_table_dtor(variables);
 +   ralloc_free(mem_ctx);
 +}
 +
 +variable_storage *
 +variable_store::find_variable_storage(ir_variable *var) const
 +{
 +   return (class variable_storage *) hash_table_find(variables,var);

The convention in the rest of glsl_to_tgsi (and Mesa) is to put a space
after the comma separating arguments.  I'd prefer it if this patch
followed the same convention.

 +}
 +
 +variable_storage*
 +variable_store::push(class ir_variable *var, gl_register_file file, int 
 index)
 +{
 +   variable_storage *storage = new (mem_ctx) 
 variable_storage(var,file,index);
 +   hash_table_insert(variables,storage,var);
 +   return storage;
 +}
 +
 +variable_storage*
 +variable_store::push(ir_variable *ir)
 +{
 +   variable_storage* retval = push(ir, PROGRAM_TEMPORARY, next_temp);
 +   next_temp += type_size(ir-type);
 +   if (ir-type-is_array() || ir-type-is_record() || 
 ir-type-is_matrix()) {
 +  retval-is_array = true;
 +   }
 +   return retval;
 +}
 +
 +variable_storage*
 +variable_store::retrieve_anonymous_temp(unsigned reg)
 +{
 +   for (unsigned i = 0; i  rvalue_regs_count; i++) {
 +  unsigned range_start = rvalue_regs[i].index;
 +  unsigned range_end = range_start + type_size(rvalue_regs[i].type) - 1;
 +  if (reg = range_start  reg = range_end) {
 + return rvalue_regs + i;
 + }
 +   }
 +   printf (Failed to get storage);
 +   exit(1);

I don't think writing to stdout and exiting cleanly like this is the
right way to handle a fatal error.  It should probably write the error
message to stderr (making clear that it's a fatal error in Mesa, not an
error in the application) and assert (so that a debugger can catch it),
then use call exit(1).

 +}
 +
 +/**
 + * In the initial pass of codegen, we assign temporary numbers to
 + * intermediate results.  (not SSA -- variable

Re: [Mesa-dev] softpipe GL3 status

2012-01-06 Thread Bryan Cain

On 01/06/2012 01:26 PM, Ian Romanick wrote:
 On 01/06/2012 09:04 AM, Dave Airlie wrote:
 Hi guys,

 Just a quick note, I've just spent a week or so trying to see where
 gallium and softpipe were w.r.t GL3.0 support.

 I've pushed a branch to my repo called softpipe-gl3. It contains
 patches in various state of usefulness but it brings the piglit
 results to 220 failures in 7623 tests, which isn't bad.

 Outstanding known problems (stuff I've dug into).

 smooth interpolation is broken in softpipe, worth about 70-100 fixes
 at a quick guess.

 integer abs - we have no TGSI representation for this, should we lower
 it to something?

 Or just generate some TGSI instructions to implement it.  You should
 be able to fake it with a CMP-like instruction.  I think that's how
 i915 does it in hardware.

Depends on whether there's any hardware with a native integer abs
instruciton.  If there is, we should just add a new IABS instruction to
TGSI and let drivers implement it how they want.  Otherwise, your
suggestion should work.


 integer SSG (set sign) - no TGSI for this, lower it?

 Where is SSG being generated?  I thought ir_to_mesa was the only thing
 that generated it, and Gallium shouldn't hit that path.

glsl_to_tgsi still generates the TGSI equivalent; that part hasn't been
changed from ir_to_mesa.

Bryan
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 0/2 v4] Add support for clip distances in Gallium

2012-01-04 Thread Bryan Cain

On 01/02/2012 02:58 PM, Bryan Cain wrote:
 This is the fourth revision of my changes to add support for gl_ClipDistance
 with Gallium.  The first three revisions were sent in closer succession about
 two weeks ago.  This revision is identical to v3 except for the inclusion of
 the changes suggested by Brian Paul in reply to the v3 patches.

Does anyone mind if I push this?  It's been on the list for two full
days with no objections.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 0/2 v4] Add support for clip distances in Gallium

2012-01-02 Thread Bryan Cain

This is the fourth revision of my changes to add support for gl_ClipDistance
with Gallium.  The first three revisions were sent in closer succession about
two weeks ago.  This revision is identical to v3 except for the inclusion of
the changes suggested by Brian Paul in reply to the v3 patches.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/2] gallium: add support for clip distances

2012-01-02 Thread Bryan Cain

---
 src/gallium/auxiliary/tgsi/tgsi_dump.c |3 +-
 src/gallium/auxiliary/tgsi/tgsi_text.c |3 +-
 src/gallium/auxiliary/tgsi/tgsi_ureg.c |   38 +--
 src/gallium/auxiliary/tgsi/tgsi_ureg.h |6 
 src/gallium/include/pipe/p_shader_tokens.h |3 +-
 5 files changed, 41 insertions(+), 12 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_dump.c 
b/src/gallium/auxiliary/tgsi/tgsi_dump.c
index e830aa5..bd299b0 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_dump.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_dump.c
@@ -129,7 +129,8 @@ static const char *semantic_names[] =
PRIM_ID,
INSTANCEID,
VERTEXID,
-   STENCIL
+   STENCIL,
+   CLIPDIST
 };
 
 static const char *immediate_type_names[] =
diff --git a/src/gallium/auxiliary/tgsi/tgsi_text.c 
b/src/gallium/auxiliary/tgsi/tgsi_text.c
index eb9190c..f46ba19 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_text.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_text.c
@@ -1024,7 +1024,8 @@ static const char *semantic_names[TGSI_SEMANTIC_COUNT] =
PRIM_ID,
INSTANCEID,
VERTEXID,
-   STENCIL
+   STENCIL,
+   CLIPDIST
 };
 
 static const char *interpolate_names[TGSI_INTERPOLATE_COUNT] =
diff --git a/src/gallium/auxiliary/tgsi/tgsi_ureg.c 
b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
index 17f9ce2..0f9aa3a 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_ureg.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
@@ -122,6 +122,7 @@ struct ureg_program
struct {
   unsigned semantic_name;
   unsigned semantic_index;
+  unsigned usage_mask; /* = TGSI_WRITEMASK_* */
} output[UREG_MAX_OUTPUT];
unsigned nr_outputs;
 
@@ -396,21 +397,27 @@ ureg_DECL_system_value(struct ureg_program *ureg,
 
 
 struct ureg_dst 
-ureg_DECL_output( struct ureg_program *ureg,
-  unsigned name,
-  unsigned index )
+ureg_DECL_output_masked( struct ureg_program *ureg,
+ unsigned name,
+ unsigned index,
+ unsigned usage_mask )
 {
unsigned i;
 
+   assert(usage_mask != 0);
+
for (i = 0; i  ureg-nr_outputs; i++) {
   if (ureg-output[i].semantic_name == name 
-  ureg-output[i].semantic_index == index) 
+  ureg-output[i].semantic_index == index) { 
+ ureg-output[i].usage_mask |= usage_mask;
  goto out;
+  }
}
 
if (ureg-nr_outputs  UREG_MAX_OUTPUT) {
   ureg-output[i].semantic_name = name;
   ureg-output[i].semantic_index = index;
+  ureg-output[i].usage_mask = usage_mask;
   ureg-nr_outputs++;
}
else {
@@ -422,6 +429,15 @@ out:
 }
 
 
+struct ureg_dst 
+ureg_DECL_output( struct ureg_program *ureg,
+  unsigned name,
+  unsigned index )
+{
+   return ureg_DECL_output_masked(ureg, name, index, TGSI_WRITEMASK_XYZW);
+}
+
+
 /* Returns a new constant register.  Keep track of which have been
  * referred to so that we can emit decls later.
  *
@@ -1181,7 +1197,8 @@ emit_decl_semantic(struct ureg_program *ureg,
unsigned file,
unsigned index,
unsigned semantic_name,
-   unsigned semantic_index)
+   unsigned semantic_index,
+   unsigned usage_mask)
 {
union tgsi_any_token *out = get_tokens(ureg, DOMAIN_DECL, 3);
 
@@ -1189,7 +1206,7 @@ emit_decl_semantic(struct ureg_program *ureg,
out[0].decl.Type = TGSI_TOKEN_TYPE_DECLARATION;
out[0].decl.NrTokens = 3;
out[0].decl.File = file;
-   out[0].decl.UsageMask = TGSI_WRITEMASK_XYZW; /* FIXME! */
+   out[0].decl.UsageMask = usage_mask;
out[0].decl.Semantic = 1;
 
out[1].value = 0;
@@ -1427,7 +1444,8 @@ static void emit_decls( struct ureg_program *ureg )
 TGSI_FILE_INPUT,
 ureg-gs_input[i].index,
 ureg-gs_input[i].semantic_name,
-ureg-gs_input[i].semantic_index);
+ureg-gs_input[i].semantic_index,
+TGSI_WRITEMASK_XYZW);
   }
}
 
@@ -1436,7 +1454,8 @@ static void emit_decls( struct ureg_program *ureg )
  TGSI_FILE_SYSTEM_VALUE,
  ureg-system_value[i].index,
  ureg-system_value[i].semantic_name,
- ureg-system_value[i].semantic_index);
+ ureg-system_value[i].semantic_index,
+ TGSI_WRITEMASK_XYZW);
}
 
for (i = 0; i  ureg-nr_outputs; i++) {
@@ -1444,7 +1463,8 @@ static void emit_decls( struct ureg_program *ureg )
  TGSI_FILE_OUTPUT,
  i,
  ureg-output[i].semantic_name,
- ureg-output[i].semantic_index);
+ ureg-output[i].semantic_index,
+ ureg-output[i].usage_mask);
}
 
for (i = 0; i

[Mesa-dev] [PATCH 2/2] st/mesa: add support for gl_ClipDistance

2012-01-02 Thread Bryan Cain

---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp |   48 +---
 src/mesa/state_tracker/st_program.c|   18 ++
 2 files changed, 61 insertions(+), 5 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 1515fc1..bc3005e 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -304,6 +304,7 @@ public:
int samplers_used;
bool indirect_addr_temps;
bool indirect_addr_consts;
+   int num_clip_distances;

int glsl_version;
bool native_integers;
@@ -4627,9 +4628,17 @@ st_translate_program(
   }
 
   for (i = 0; i  numOutputs; i++) {
- t-outputs[i] = ureg_DECL_output(ureg,
-  outputSemanticName[i],
-  outputSemanticIndex[i]);
+ if (outputSemanticName[i] == TGSI_SEMANTIC_CLIPDIST) {
+int mask = ((1  (program-num_clip_distances - 
4*outputSemanticIndex[i])) - 1)  TGSI_WRITEMASK_XYZW;
+t-outputs[i] = ureg_DECL_output_masked(ureg,
+outputSemanticName[i],
+outputSemanticIndex[i],
+mask);
+ } else {
+t-outputs[i] = ureg_DECL_output(ureg,
+ outputSemanticName[i],
+ outputSemanticIndex[i]);
+ }
  if ((outputSemanticName[i] == TGSI_SEMANTIC_PSIZE)  proginfo-Id) {
 /* Writing to the point size result register requires special
  * handling to implement clamping.
@@ -4806,7 +4815,8 @@ out:
 static struct gl_program *
 get_mesa_program(struct gl_context *ctx,
  struct gl_shader_program *shader_program,
-struct gl_shader *shader)
+ struct gl_shader *shader,
+ int num_clip_distances)
 {
glsl_to_tgsi_visitor* v = new glsl_to_tgsi_visitor();
struct gl_program *prog;
@@ -4851,6 +4861,7 @@ get_mesa_program(struct gl_context *ctx,
v-options = options;
v-glsl_version = ctx-Const.GLSLVersion;
v-native_integers = ctx-Const.NativeIntegers;
+   v-num_clip_distances = num_clip_distances;
 
_mesa_generate_parameters_list_for_uniforms(shader_program, shader,
   prog-Parameters);
@@ -4980,6 +4991,25 @@ get_mesa_program(struct gl_context *ctx,
return prog;
 }
 
+/**
+ * Searches through the IR for a declaration of gl_ClipDistance and returns the
+ * declared size of the gl_ClipDistance array.  Returns 0 if gl_ClipDistance is
+ * not declared in the IR.
+ */
+int get_clip_distance_size(exec_list *ir)
+{
+   foreach_iter (exec_list_iterator, iter, *ir) {
+  ir_instruction *inst = (ir_instruction *)iter.get();
+  ir_variable *var = inst-as_variable();
+  if (var == NULL) continue;
+  if (!strcmp(var-name, gl_ClipDistance)) {
+ return var-type-length;
+  }
+   }
+   
+   return 0;
+}
+
 extern C {
 
 struct gl_shader *
@@ -5018,6 +5048,7 @@ st_new_shader_program(struct gl_context *ctx, GLuint name)
 GLboolean
 st_link_shader(struct gl_context *ctx, struct gl_shader_program *prog)
 {
+   int num_clip_distances[MESA_SHADER_TYPES];
assert(prog-LinkStatus);
 
for (unsigned i = 0; i  MESA_SHADER_TYPES; i++) {
@@ -5029,6 +5060,11 @@ st_link_shader(struct gl_context *ctx, struct 
gl_shader_program *prog)
   const struct gl_shader_compiler_options *options =
 
ctx-ShaderCompilerOptions[_mesa_shader_type_to_index(prog-_LinkedShaders[i]-Type)];
 
+  /* We have to determine the length of the gl_ClipDistance array before
+   * the array is lowered to two vec4s by lower_clip_distance().
+   */
+  num_clip_distances[i] = get_clip_distance_size(ir);
+
   do {
  progress = false;
 
@@ -5045,6 +5081,7 @@ st_link_shader(struct gl_context *ctx, struct 
gl_shader_program *prog)
   || progress;
 
  progress = lower_quadop_vector(ir, false) || progress;
+ progress = lower_clip_distance(ir) || progress;
 
  if (options-MaxIfDepth == 0)
 progress = lower_discard(ir) || progress;
@@ -5079,7 +5116,8 @@ st_link_shader(struct gl_context *ctx, struct 
gl_shader_program *prog)
   if (prog-_LinkedShaders[i] == NULL)
  continue;
 
-  linked_prog = get_mesa_program(ctx, prog, prog-_LinkedShaders[i]);
+  linked_prog = get_mesa_program(ctx, prog, prog-_LinkedShaders[i],
+ num_clip_distances[i]);
 
   if (linked_prog) {
 static const GLenum targets[] = {
diff --git a/src/mesa/state_tracker/st_program.c 
b/src/mesa/state_tracker/st_program.c
index d62bfcd..aceaaf8 100644
--- a/src/mesa/state_tracker/st_program.c
+++ b/src/mesa/state_tracker/st_program.c
@@ -244,6

[Mesa-dev] [PATCH 1/2] gallium: add support for clip distances

2011-12-17 Thread Bryan Cain

---
 src/gallium/auxiliary/tgsi/tgsi_dump.c |3 +-
 src/gallium/auxiliary/tgsi/tgsi_text.c |3 +-
 src/gallium/auxiliary/tgsi/tgsi_ureg.c |   36 +---
 src/gallium/auxiliary/tgsi/tgsi_ureg.h |6 
 src/gallium/include/pipe/p_shader_tokens.h |3 +-
 5 files changed, 39 insertions(+), 12 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_dump.c 
b/src/gallium/auxiliary/tgsi/tgsi_dump.c
index e830aa5..bd299b0 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_dump.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_dump.c
@@ -129,7 +129,8 @@ static const char *semantic_names[] =
PRIM_ID,
INSTANCEID,
VERTEXID,
-   STENCIL
+   STENCIL,
+   CLIPDIST
 };
 
 static const char *immediate_type_names[] =
diff --git a/src/gallium/auxiliary/tgsi/tgsi_text.c 
b/src/gallium/auxiliary/tgsi/tgsi_text.c
index eb9190c..f46ba19 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_text.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_text.c
@@ -1024,7 +1024,8 @@ static const char *semantic_names[TGSI_SEMANTIC_COUNT] =
PRIM_ID,
INSTANCEID,
VERTEXID,
-   STENCIL
+   STENCIL,
+   CLIPDIST
 };
 
 static const char *interpolate_names[TGSI_INTERPOLATE_COUNT] =
diff --git a/src/gallium/auxiliary/tgsi/tgsi_ureg.c 
b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
index 17f9ce2..56c4492 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_ureg.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
@@ -122,6 +122,7 @@ struct ureg_program
struct {
   unsigned semantic_name;
   unsigned semantic_index;
+  unsigned usage_mask;
} output[UREG_MAX_OUTPUT];
unsigned nr_outputs;
 
@@ -396,21 +397,25 @@ ureg_DECL_system_value(struct ureg_program *ureg,
 
 
 struct ureg_dst 
-ureg_DECL_output( struct ureg_program *ureg,
-  unsigned name,
-  unsigned index )
+ureg_DECL_output_masked( struct ureg_program *ureg,
+ unsigned name,
+ unsigned index,
+ unsigned usage_mask )
 {
unsigned i;
 
for (i = 0; i  ureg-nr_outputs; i++) {
   if (ureg-output[i].semantic_name == name 
-  ureg-output[i].semantic_index == index) 
+  ureg-output[i].semantic_index == index) { 
+ ureg-output[i].usage_mask |= usage_mask;
  goto out;
+  }
}
 
if (ureg-nr_outputs  UREG_MAX_OUTPUT) {
   ureg-output[i].semantic_name = name;
   ureg-output[i].semantic_index = index;
+  ureg-output[i].usage_mask = usage_mask;
   ureg-nr_outputs++;
}
else {
@@ -422,6 +427,15 @@ out:
 }
 
 
+struct ureg_dst 
+ureg_DECL_output( struct ureg_program *ureg,
+  unsigned name,
+  unsigned index )
+{
+   return ureg_DECL_output_masked(ureg, name, index, TGSI_WRITEMASK_XYZW);
+}
+
+
 /* Returns a new constant register.  Keep track of which have been
  * referred to so that we can emit decls later.
  *
@@ -1181,7 +1195,8 @@ emit_decl_semantic(struct ureg_program *ureg,
unsigned file,
unsigned index,
unsigned semantic_name,
-   unsigned semantic_index)
+   unsigned semantic_index,
+   unsigned usage_mask)
 {
union tgsi_any_token *out = get_tokens(ureg, DOMAIN_DECL, 3);
 
@@ -1189,7 +1204,7 @@ emit_decl_semantic(struct ureg_program *ureg,
out[0].decl.Type = TGSI_TOKEN_TYPE_DECLARATION;
out[0].decl.NrTokens = 3;
out[0].decl.File = file;
-   out[0].decl.UsageMask = TGSI_WRITEMASK_XYZW; /* FIXME! */
+   out[0].decl.UsageMask = usage_mask;
out[0].decl.Semantic = 1;
 
out[1].value = 0;
@@ -1427,7 +1442,8 @@ static void emit_decls( struct ureg_program *ureg )
 TGSI_FILE_INPUT,
 ureg-gs_input[i].index,
 ureg-gs_input[i].semantic_name,
-ureg-gs_input[i].semantic_index);
+ureg-gs_input[i].semantic_index,
+TGSI_WRITEMASK_XYZW);
   }
}
 
@@ -1436,7 +1452,8 @@ static void emit_decls( struct ureg_program *ureg )
  TGSI_FILE_SYSTEM_VALUE,
  ureg-system_value[i].index,
  ureg-system_value[i].semantic_name,
- ureg-system_value[i].semantic_index);
+ ureg-system_value[i].semantic_index,
+ TGSI_WRITEMASK_XYZW);
}
 
for (i = 0; i  ureg-nr_outputs; i++) {
@@ -1444,7 +1461,8 @@ static void emit_decls( struct ureg_program *ureg )
  TGSI_FILE_OUTPUT,
  i,
  ureg-output[i].semantic_name,
- ureg-output[i].semantic_index);
+ ureg-output[i].semantic_index,
+ ureg-output[i].usage_mask);
}
 
for (i = 0; i  ureg-nr_samplers; i++) {
diff --git

[Mesa-dev] [PATCH 2/2] st/mesa: add support for gl_ClipDistance

2011-12-17 Thread Bryan Cain

---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp |   49 +---
 src/mesa/state_tracker/st_program.c|   18 ++
 2 files changed, 62 insertions(+), 5 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index b929806..3e8df78 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -304,6 +304,7 @@ public:
int samplers_used;
bool indirect_addr_temps;
bool indirect_addr_consts;
+   int num_clip_distances;

int glsl_version;
bool native_integers;
@@ -4618,9 +4619,16 @@ st_translate_program(
   }
 
   for (i = 0; i  numOutputs; i++) {
- t-outputs[i] = ureg_DECL_output(ureg,
-  outputSemanticName[i],
-  outputSemanticIndex[i]);
+ if (outputSemanticName[i] == TGSI_SEMANTIC_CLIPDIST) {
+int mask = ((1  (program-num_clip_distances - 
4*outputSemanticIndex[i])) - 1)  0xf;
+t-outputs[i] = ureg_DECL_output_masked(ureg,
+outputSemanticName[i],
+outputSemanticIndex[i],
+mask);
+ } else
+t-outputs[i] = ureg_DECL_output(ureg,
+ outputSemanticName[i],
+ outputSemanticIndex[i]);
  if ((outputSemanticName[i] == TGSI_SEMANTIC_PSIZE)  proginfo-Id) {
 /* Writing to the point size result register requires special
  * handling to implement clamping.
@@ -4797,7 +4805,8 @@ out:
 static struct gl_program *
 get_mesa_program(struct gl_context *ctx,
  struct gl_shader_program *shader_program,
-struct gl_shader *shader)
+ struct gl_shader *shader,
+ int num_clip_distances)
 {
glsl_to_tgsi_visitor* v = new glsl_to_tgsi_visitor();
struct gl_program *prog;
@@ -4842,6 +4851,7 @@ get_mesa_program(struct gl_context *ctx,
v-options = options;
v-glsl_version = ctx-Const.GLSLVersion;
v-native_integers = ctx-Const.NativeIntegers;
+   v-num_clip_distances = num_clip_distances;
 
_mesa_generate_parameters_list_for_uniforms(shader_program, shader,
   prog-Parameters);
@@ -4971,6 +4981,27 @@ get_mesa_program(struct gl_context *ctx,
return prog;
 }
 
+/**
+ * Searches through the IR for a declaration of gl_ClipDistance and returns the
+ * declared size of the gl_ClipDistance array.  Returns 0 if gl_ClipDistance is
+ * not declared in the IR.
+ */
+int get_clip_distance_size(exec_list *ir)
+{
+   foreach_iter (exec_list_iterator, iter, *ir) {
+  ir_instruction *inst = (ir_instruction *)iter.get();
+  ir_variable *var = inst-as_variable();
+  if (var == NULL) continue;
+  if (!strcmp(var-name, gl_ClipDistance))
+  {
+ fprintf(stderr, gl_ClipDistance found with size %i\n, 
var-type-length);
+ return var-type-length;
+  }
+   }
+   
+   return 0;
+}
+
 extern C {
 
 struct gl_shader *
@@ -5009,6 +5040,7 @@ st_new_shader_program(struct gl_context *ctx, GLuint name)
 GLboolean
 st_link_shader(struct gl_context *ctx, struct gl_shader_program *prog)
 {
+   int num_clip_distances[MESA_SHADER_TYPES];
assert(prog-LinkStatus);
 
for (unsigned i = 0; i  MESA_SHADER_TYPES; i++) {
@@ -5020,6 +5052,11 @@ st_link_shader(struct gl_context *ctx, struct 
gl_shader_program *prog)
   const struct gl_shader_compiler_options *options =
 
ctx-ShaderCompilerOptions[_mesa_shader_type_to_index(prog-_LinkedShaders[i]-Type)];
 
+  /* We have to determine the length of the gl_ClipDistance array before
+   * the array is lowered to two vec4s by lower_clip_distance().
+   */
+  num_clip_distances[i] = get_clip_distance_size(ir);
+
   do {
  progress = false;
 
@@ -5036,6 +5073,7 @@ st_link_shader(struct gl_context *ctx, struct 
gl_shader_program *prog)
   || progress;
 
  progress = lower_quadop_vector(ir, false) || progress;
+ progress = lower_clip_distance(ir) || progress;
 
  if (options-MaxIfDepth == 0)
 progress = lower_discard(ir) || progress;
@@ -5070,7 +5108,8 @@ st_link_shader(struct gl_context *ctx, struct 
gl_shader_program *prog)
   if (prog-_LinkedShaders[i] == NULL)
  continue;
 
-  linked_prog = get_mesa_program(ctx, prog, prog-_LinkedShaders[i]);
+  linked_prog = get_mesa_program(ctx, prog, prog-_LinkedShaders[i],
+ num_clip_distances[i]);
 
   if (linked_prog) {
 static const GLenum targets[] = {
diff --git a/src/mesa/state_tracker/st_program.c 
b/src/mesa/state_tracker/st_program.c
index b83c561..b404503 100644
---

[Mesa-dev] [PATCH 1/2] gallium: add TGSI_SEMANTIC_CLIPDIST for clip distance

2011-12-13 Thread Bryan Cain

---
 src/gallium/auxiliary/tgsi/tgsi_dump.c |3 ++-
 src/gallium/include/pipe/p_shader_tokens.h |3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_dump.c 
b/src/gallium/auxiliary/tgsi/tgsi_dump.c
index e830aa5..bd299b0 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_dump.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_dump.c
@@ -129,7 +129,8 @@ static const char *semantic_names[] =
PRIM_ID,
INSTANCEID,
VERTEXID,
-   STENCIL
+   STENCIL,
+   CLIPDIST
 };
 
 static const char *immediate_type_names[] =
diff --git a/src/gallium/include/pipe/p_shader_tokens.h 
b/src/gallium/include/pipe/p_shader_tokens.h
index 10cfaf6..330e0ba 100644
--- a/src/gallium/include/pipe/p_shader_tokens.h
+++ b/src/gallium/include/pipe/p_shader_tokens.h
@@ -146,7 +146,8 @@ struct tgsi_declaration_dimension
 #define TGSI_SEMANTIC_INSTANCEID 10
 #define TGSI_SEMANTIC_VERTEXID   11
 #define TGSI_SEMANTIC_STENCIL12
-#define TGSI_SEMANTIC_COUNT  13 /** number of semantic values */
+#define TGSI_SEMANTIC_CLIPDIST   13
+#define TGSI_SEMANTIC_COUNT  14 /** number of semantic values */
 
 struct tgsi_declaration_semantic
 {
-- 
1.7.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/2] st/mesa: add support for gl_ClipDistance

2011-12-13 Thread Bryan Cain

---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp |1 +
 src/mesa/state_tracker/st_program.c|   18 ++
 2 files changed, 19 insertions(+), 0 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 9ef65c8..d50176d 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -5036,6 +5036,7 @@ st_link_shader(struct gl_context *ctx, struct 
gl_shader_program *prog)
   || progress;
 
  progress = lower_quadop_vector(ir, false) || progress;
+ progress = lower_clip_distance(ir) || progress;
 
  if (options-MaxIfDepth == 0)
 progress = lower_discard(ir) || progress;
diff --git a/src/mesa/state_tracker/st_program.c 
b/src/mesa/state_tracker/st_program.c
index 04d3ef6..73581dd 100644
--- a/src/mesa/state_tracker/st_program.c
+++ b/src/mesa/state_tracker/st_program.c
@@ -244,6 +244,14 @@ st_prepare_vertex_program(struct gl_context *ctx,
 stvp-output_semantic_name[slot] = TGSI_SEMANTIC_PSIZE;
 stvp-output_semantic_index[slot] = 0;
 break;
+ case VERT_RESULT_CLIP_DIST0:
+stvp-output_semantic_name[slot] = TGSI_SEMANTIC_CLIPDIST;
+stvp-output_semantic_index[slot] = 0;
+break;
+ case VERT_RESULT_CLIP_DIST1:
+stvp-output_semantic_name[slot] = TGSI_SEMANTIC_CLIPDIST;
+stvp-output_semantic_index[slot] = 1;
+break;
  case VERT_RESULT_EDGE:
 assert(0);
 break;
@@ -541,6 +549,16 @@ st_translate_fragment_program(struct st_context *st,
input_semantic_index[slot] = 0;
interpMode[slot] = TGSI_INTERPOLATE_CONSTANT;
break;
+case FRAG_ATTRIB_CLIP_DIST0:
+   input_semantic_name[slot] = TGSI_SEMANTIC_CLIPDIST;
+   input_semantic_index[slot] = 0;
+   interpMode[slot] = TGSI_INTERPOLATE_LINEAR;
+   break;
+case FRAG_ATTRIB_CLIP_DIST1:
+   input_semantic_name[slot] = TGSI_SEMANTIC_CLIPDIST;
+   input_semantic_index[slot] = 1;
+   interpMode[slot] = TGSI_INTERPOLATE_LINEAR;
+   break;
/* In most cases, there is nothing special about these
 * inputs, so adopt a convention to use the generic
 * semantic name and the mesa FRAG_ATTRIB_ number as the
-- 
1.7.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 0/2 v2] Add support for clip distances in Gallium

2011-12-13 Thread Bryan Cain

This is an updated version of the patch set I sent to the list a few hours
ago.  There is now a TGSI property called TGSI_PROPERTY_NUM_CLIP_DISTANCES
that drivers can use to determine how many of the 8 available clip distances
are actually used by a shader.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/2] st/mesa: add support for gl_ClipDistance

2011-12-13 Thread Bryan Cain

---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp |   39 ++-
 src/mesa/state_tracker/st_program.c|   18 +
 2 files changed, 55 insertions(+), 2 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 9ef65c8..5e54465 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -304,6 +304,7 @@ public:
int samplers_used;
bool indirect_addr_temps;
bool indirect_addr_consts;
+   int num_clip_distances;

int glsl_version;
bool native_integers;
@@ -4738,6 +4739,9 @@ st_translate_program(
  t-samplers[i] = ureg_DECL_sampler(ureg, i);
   }
}
+   
+   /* Set the number of clip distances used in the shader. */
+   ureg_property_num_clip_distances(ureg, program-num_clip_distances);
 
/* Emit each instruction in turn:
 */
@@ -4797,7 +4801,8 @@ out:
 static struct gl_program *
 get_mesa_program(struct gl_context *ctx,
  struct gl_shader_program *shader_program,
-struct gl_shader *shader)
+ struct gl_shader *shader,
+ int num_clip_distances)
 {
glsl_to_tgsi_visitor* v = new glsl_to_tgsi_visitor();
struct gl_program *prog;
@@ -4842,6 +4847,7 @@ get_mesa_program(struct gl_context *ctx,
v-options = options;
v-glsl_version = ctx-Const.GLSLVersion;
v-native_integers = ctx-Const.NativeIntegers;
+   v-num_clip_distances = num_clip_distances;
 
_mesa_generate_parameters_list_for_uniforms(shader_program, shader,
   prog-Parameters);
@@ -4971,6 +4977,27 @@ get_mesa_program(struct gl_context *ctx,
return prog;
 }
 
+/**
+ * Searches through the IR for a declaration of gl_ClipDistance and returns the
+ * declared size of the gl_ClipDistance array.  Returns 0 if gl_ClipDistance is
+ * not declared in the IR.
+ */
+int get_clip_distance_size(exec_list *ir)
+{
+   foreach_iter (exec_list_iterator, iter, *ir) {
+  ir_instruction *inst = (ir_instruction *)iter.get();
+  ir_variable *var = inst-as_variable();
+  if (var == NULL) continue;
+  if (!strcmp(var-name, gl_ClipDistance))
+  {
+ fprintf(stderr, gl_ClipDistance found with size %i\n, 
var-type-length);
+ return var-type-length;
+  }
+   }
+   
+   return 0;
+}
+
 extern C {
 
 struct gl_shader *
@@ -5009,6 +5036,7 @@ st_new_shader_program(struct gl_context *ctx, GLuint name)
 GLboolean
 st_link_shader(struct gl_context *ctx, struct gl_shader_program *prog)
 {
+   int num_clip_distances[MESA_SHADER_TYPES];
assert(prog-LinkStatus);
 
for (unsigned i = 0; i  MESA_SHADER_TYPES; i++) {
@@ -5020,6 +5048,11 @@ st_link_shader(struct gl_context *ctx, struct 
gl_shader_program *prog)
   const struct gl_shader_compiler_options *options =
 
ctx-ShaderCompilerOptions[_mesa_shader_type_to_index(prog-_LinkedShaders[i]-Type)];
 
+  /* We have to determine the length of the gl_ClipDistance array before 
the
+   * array is lowered to two vec4s by lower_clip_distance().
+   */
+  num_clip_distances[i] = get_clip_distance_size(ir);
+
   do {
  progress = false;
 
@@ -5036,6 +5069,7 @@ st_link_shader(struct gl_context *ctx, struct 
gl_shader_program *prog)
   || progress;
 
  progress = lower_quadop_vector(ir, false) || progress;
+ progress = lower_clip_distance(ir) || progress;
 
  if (options-MaxIfDepth == 0)
 progress = lower_discard(ir) || progress;
@@ -5070,7 +5104,8 @@ st_link_shader(struct gl_context *ctx, struct 
gl_shader_program *prog)
   if (prog-_LinkedShaders[i] == NULL)
  continue;
 
-  linked_prog = get_mesa_program(ctx, prog, prog-_LinkedShaders[i]);
+  linked_prog = get_mesa_program(ctx, prog, prog-_LinkedShaders[i],
+ num_clip_distances[i]);
 
   if (linked_prog) {
 static const GLenum targets[] = {
diff --git a/src/mesa/state_tracker/st_program.c 
b/src/mesa/state_tracker/st_program.c
index 04d3ef6..73581dd 100644
--- a/src/mesa/state_tracker/st_program.c
+++ b/src/mesa/state_tracker/st_program.c
@@ -244,6 +244,14 @@ st_prepare_vertex_program(struct gl_context *ctx,
 stvp-output_semantic_name[slot] = TGSI_SEMANTIC_PSIZE;
 stvp-output_semantic_index[slot] = 0;
 break;
+ case VERT_RESULT_CLIP_DIST0:
+stvp-output_semantic_name[slot] = TGSI_SEMANTIC_CLIPDIST;
+stvp-output_semantic_index[slot] = 0;
+break;
+ case VERT_RESULT_CLIP_DIST1:
+stvp-output_semantic_name[slot] = TGSI_SEMANTIC_CLIPDIST;
+stvp-output_semantic_index[slot] = 1;
+break;
  case VERT_RESULT_EDGE:
 assert(0);
 break;
@@ -541,6 +549,16 @@ st_translate_fragment_program(struct st_context *st,
input_semantic_index[slot]

Re: [Mesa-dev] [PATCH 0/2 v2] Add support for clip distances in Gallium

2011-12-13 Thread Bryan Cain

On 12/13/2011 02:11 PM, Jose Fonseca wrote:
 - Original Message -
 This is an updated version of the patch set I sent to the list a few
 hours
 ago.  
 There is now a TGSI property called
 TGSI_PROPERTY_NUM_CLIP_DISTANCES
 that drivers can use to determine how many of the 8 available clip
 distances
 are actually used by a shader.
 Can't the info in TGSI_PROPERTY_NUM_CLIP_DISTANCES be easily derived from the 
 shader, and queried through src/gallium/auxiliary/tgsi/tgsi_scan.h ?

No.  The clip distances can be indirectly addressed (there are up to 2
of them in vec4 form for a total of 8 floats), which makes it impossible
to determine which ones are used by analyzing the shader.

 Could you also elaborate on why TGSI_SEMANTIC_CLIPDIST is useful for the 
 drivers? I personally don't have nothing against it, but just like to 
 understand why it makes a difference.

 Jose

Unless my understanding of clip distances is wrong, the GPU uses clip
distances to decide whether a vertex should be clipped, and thus needs
to know which of the vertex shader outputs are clip distances.

Bryan


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 0/2 v2] Add support for clip distances in Gallium

2011-12-13 Thread Bryan Cain

On 12/13/2011 03:09 PM, Jose Fonseca wrote:

 - Original Message -
 On 12/13/2011 12:26 PM, Bryan Cain wrote:
 On 12/13/2011 02:11 PM, Jose Fonseca wrote:
 - Original Message -
 This is an updated version of the patch set I sent to the list a
 few
 hours
 ago.
 There is now a TGSI property called
 TGSI_PROPERTY_NUM_CLIP_DISTANCES
 that drivers can use to determine how many of the 8 available
 clip
 distances
 are actually used by a shader.
 Can't the info in TGSI_PROPERTY_NUM_CLIP_DISTANCES be easily
 derived from the shader, and queried through
 src/gallium/auxiliary/tgsi/tgsi_scan.h ?
 No.  The clip distances can be indirectly addressed (there are up
 to 2
 of them in vec4 form for a total of 8 floats), which makes it
 impossible
 to determine which ones are used by analyzing the shader.
 The description is almost complete. :)  The issue is that the shader
 may
 declare

 out float gl_ClipDistance[4];

 the use non-constant addressing of the array.  The compiler knows
 that
 gl_ClipDistance has at most 4 elements, but post-hoc analysis would
 not
 be able to determine that.  Often the fixed-function hardware (see
 below) needs to know which clip distance values are actually written.
 But don't all the clip distances written by the shader need to be declared?

 E.g.:

 DCL OUT[0], CLIPDIST[0]
 DCL OUT[1], CLIPDIST[1]
 DCL OUT[2], CLIPDIST[2]
 DCL OUT[3], CLIPDIST[3]

 therefore a trivial analysis of the declarations convey that?

No.  Clip distance is an array of up to 8 floats in GLSL, but it's
represented in the hardware as 2 vec4s.  You can tell by analyzing the
declarations whether there are more than 4 clip distances in use, but
not which components the shader writes to. 
TGSI_PROPERTY_NUM_CLIP_DISTANCES is the number of components in use, not
the number of full vectors.

Bryan

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/2] gallium: add TGSI_SEMANTIC_CLIPDIST for clip distance

2011-12-13 Thread Bryan Cain

On 12/13/2011 03:07 PM, Brian Paul wrote:
 On Tue, Dec 13, 2011 at 9:59 AM, Bryan Cain bryanca...@gmail.com wrote:
 ---
  src/gallium/auxiliary/tgsi/tgsi_dump.c |3 ++-
  src/gallium/include/pipe/p_shader_tokens.h |3 ++-
  2 files changed, 4 insertions(+), 2 deletions(-)

 diff --git a/src/gallium/auxiliary/tgsi/tgsi_dump.c 
 b/src/gallium/auxiliary/tgsi/tgsi_dump.c
 index e830aa5..bd299b0 100644
 --- a/src/gallium/auxiliary/tgsi/tgsi_dump.c
 +++ b/src/gallium/auxiliary/tgsi/tgsi_dump.c
 @@ -129,7 +129,8 @@ static const char *semantic_names[] =
PRIM_ID,
INSTANCEID,
VERTEXID,
 -   STENCIL
 +   STENCIL,
 +   CLIPDIST
  };

  static const char *immediate_type_names[] =
 diff --git a/src/gallium/include/pipe/p_shader_tokens.h 
 b/src/gallium/include/pipe/p_shader_tokens.h
 index 10cfaf6..330e0ba 100644
 --- a/src/gallium/include/pipe/p_shader_tokens.h
 +++ b/src/gallium/include/pipe/p_shader_tokens.h
 @@ -146,7 +146,8 @@ struct tgsi_declaration_dimension
  #define TGSI_SEMANTIC_INSTANCEID 10
  #define TGSI_SEMANTIC_VERTEXID   11
  #define TGSI_SEMANTIC_STENCIL12
 -#define TGSI_SEMANTIC_COUNT  13 /** number of semantic values */
 +#define TGSI_SEMANTIC_CLIPDIST   13
 +#define TGSI_SEMANTIC_COUNT  14 /** number of semantic values */

  struct tgsi_declaration_semantic
  {
 Reviewed-by: Brian Paul bri...@vmware.com

Thanks for reviewing, but this is the first version of this patch. 
There's an updated one that was sent a few hours after this one.

Bryan
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 0/2 v2] Add support for clip distances in Gallium

2011-12-13 Thread Bryan Cain

On 12/13/2011 03:25 PM, Jose Fonseca wrote:

 - Original Message -
 On 12/13/2011 03:09 PM, Jose Fonseca wrote:
 - Original Message -
 On 12/13/2011 12:26 PM, Bryan Cain wrote:
 On 12/13/2011 02:11 PM, Jose Fonseca wrote:
 - Original Message -
 This is an updated version of the patch set I sent to the list
 a
 few
 hours
 ago.
 There is now a TGSI property called
 TGSI_PROPERTY_NUM_CLIP_DISTANCES
 that drivers can use to determine how many of the 8 available
 clip
 distances
 are actually used by a shader.
 Can't the info in TGSI_PROPERTY_NUM_CLIP_DISTANCES be easily
 derived from the shader, and queried through
 src/gallium/auxiliary/tgsi/tgsi_scan.h ?
 No.  The clip distances can be indirectly addressed (there are up
 to 2
 of them in vec4 form for a total of 8 floats), which makes it
 impossible
 to determine which ones are used by analyzing the shader.
 The description is almost complete. :)  The issue is that the
 shader
 may
 declare

 out float gl_ClipDistance[4];

 the use non-constant addressing of the array.  The compiler knows
 that
 gl_ClipDistance has at most 4 elements, but post-hoc analysis
 would
 not
 be able to determine that.  Often the fixed-function hardware (see
 below) needs to know which clip distance values are actually
 written.
 But don't all the clip distances written by the shader need to be
 declared?

 E.g.:

 DCL OUT[0], CLIPDIST[0]
 DCL OUT[1], CLIPDIST[1]
 DCL OUT[2], CLIPDIST[2]
 DCL OUT[3], CLIPDIST[3]

 therefore a trivial analysis of the declarations convey that?
 No.  Clip distance is an array of up to 8 floats in GLSL, but it's
 represented in the hardware as 2 vec4s.  You can tell by analyzing
 the
 declarations whether there are more than 4 clip distances in use, but
 not which components the shader writes to.
 TGSI_PROPERTY_NUM_CLIP_DISTANCES is the number of components in use,
 not
 the number of full vectors.
 Lets imagine 

   out float gl_ClipDistance[6];

 Each a clip distance is a scalar float.

 Either all hardware represents the 8 clip distances as two 4 vectors, and we 
 do:

   DCL OUT[0].xywz, CLIPDIST[0]
   DCL OUT[1].xy, CLIPDIST[1]

 using the full range of struct tgsi_declaration::UsageMask [1] or we 
 represent them as as scalars:

   DCL OUT[0].x, CLIPDIST[0]
   DCL OUT[1].x, CLIPDIST[1]
   DCL OUT[2].x, CLIPDIST[2]
   DCL OUT[3].x, CLIPDIST[3]
   DCL OUT[4].x, CLIPDIST[4]
   DCL OUT[5].x, CLIPDIST[5]

 If indirect addressing is allowed as I read bore, then maybe the later is 
 better.

 I confess my ignorance about clipping and maybe I'm being dense, but I still 
 don't see the need for this TGSI_PROPERTY_NUM_CLIP_DISTANCES.  Could you 
 please draft an example TGSI shader showing this property (or just paste one 
 generated with your change)?  I think that would help a lot.


 Jose


 [1] I don't know if tgsi_dump pays much attention to  
 tgsi_declaration::UsageMask, but it does exist.

UsageMask might work, but before that can be considered a viable
solution, someone will need to make it possible to actually declare it
from ureg.  As it is, ureg is hardcoded to set UsageMask to xyzw no
matter what on all declared inputs and outputs.

Bryan

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 0/2 v2] Add support for clip distances in Gallium

2011-12-13 Thread Bryan Cain

On 12/13/2011 03:48 PM, Jose Fonseca wrote:
 - Original Message -
 On 12/13/2011 03:25 PM, Jose Fonseca wrote:
 - Original Message -
 On 12/13/2011 03:09 PM, Jose Fonseca wrote:
 - Original Message -
 On 12/13/2011 12:26 PM, Bryan Cain wrote:
 On 12/13/2011 02:11 PM, Jose Fonseca wrote:
 - Original Message -
 This is an updated version of the patch set I sent to the
 list
 a
 few
 hours
 ago.
 There is now a TGSI property called
 TGSI_PROPERTY_NUM_CLIP_DISTANCES
 that drivers can use to determine how many of the 8 available
 clip
 distances
 are actually used by a shader.
 Can't the info in TGSI_PROPERTY_NUM_CLIP_DISTANCES be easily
 derived from the shader, and queried through
 src/gallium/auxiliary/tgsi/tgsi_scan.h ?
 No.  The clip distances can be indirectly addressed (there are
 up
 to 2
 of them in vec4 form for a total of 8 floats), which makes it
 impossible
 to determine which ones are used by analyzing the shader.
 The description is almost complete. :)  The issue is that the
 shader
 may
 declare

 out float gl_ClipDistance[4];

 the use non-constant addressing of the array.  The compiler
 knows
 that
 gl_ClipDistance has at most 4 elements, but post-hoc analysis
 would
 not
 be able to determine that.  Often the fixed-function hardware
 (see
 below) needs to know which clip distance values are actually
 written.
 But don't all the clip distances written by the shader need to be
 declared?

 E.g.:

 DCL OUT[0], CLIPDIST[0]
 DCL OUT[1], CLIPDIST[1]
 DCL OUT[2], CLIPDIST[2]
 DCL OUT[3], CLIPDIST[3]

 therefore a trivial analysis of the declarations convey that?
 No.  Clip distance is an array of up to 8 floats in GLSL, but it's
 represented in the hardware as 2 vec4s.  You can tell by analyzing
 the
 declarations whether there are more than 4 clip distances in use,
 but
 not which components the shader writes to.
 TGSI_PROPERTY_NUM_CLIP_DISTANCES is the number of components in
 use,
 not
 the number of full vectors.
 Lets imagine

   out float gl_ClipDistance[6];

 Each a clip distance is a scalar float.

 Either all hardware represents the 8 clip distances as two 4
 vectors, and we do:

   DCL OUT[0].xywz, CLIPDIST[0]
   DCL OUT[1].xy, CLIPDIST[1]

 using the full range of struct tgsi_declaration::UsageMask [1] or
 we represent them as as scalars:

   DCL OUT[0].x, CLIPDIST[0]
   DCL OUT[1].x, CLIPDIST[1]
   DCL OUT[2].x, CLIPDIST[2]
   DCL OUT[3].x, CLIPDIST[3]
   DCL OUT[4].x, CLIPDIST[4]
   DCL OUT[5].x, CLIPDIST[5]

 If indirect addressing is allowed as I read bore, then maybe the
 later is better.

 I confess my ignorance about clipping and maybe I'm being dense,
 but I still don't see the need for this
 TGSI_PROPERTY_NUM_CLIP_DISTANCES.  Could you please draft an
 example TGSI shader showing this property (or just paste one
 generated with your change)?  I think that would help a lot.


 Jose


 [1] I don't know if tgsi_dump pays much attention to
  tgsi_declaration::UsageMask, but it does exist.
 UsageMask might work, but before that can be considered a viable
 solution, someone will need to make it possible to actually declare
 it
 from ureg.  As it is, ureg is hardcoded to set UsageMask to xyzw no
 matter what on all declared inputs and outputs.
 ureg automatically fills the UsageMask from the destionation register masks, 
 since it easy to determine from the opcodes.

 Which leads me to my second point, if indirect addressing of CLIPDIST is 
 allowed, then we can't really pack the clip distance as 4-elem vectors in 
 TGSI: not only the syntax would be very weird, but it would create havoc on 
 all tgsi-translating code that makes decisions based on indirect addressing 
 of registers.

 That is, 

   float gl_ClipDistance[6];

   gl_ClipDistance[i] = foo;

 would become

DCL OUT[0].x, CLIPDIST[0]
DCL OUT[1].x, CLIPDIST[1]
DCL OUT[2].x, CLIPDIST[2]
DCL OUT[3].x, CLIPDIST[3]
DCL OUT[4].x, CLIPDIST[4]
DCL OUT[5].x, CLIPDIST[5]
MOV OUT[ADDR[0].x].x, foo

 and the info from TGSI_PROPERTY_NUM_CLIP_DISTANCES can be obtained by walking 
 the declaration (which can/should be done only once in tgsi_scan).

 But this just doesn't look like it would ever work:

DCL OUT[0].xyzw, CLIPDIST[0]
DCL OUT[1].xy  , CLIPDIST[1]
MOV OUT[ADDR[0].x]., foo

 Jose

If ureg automatically fills the UsageMask from the accessed components,
it's probably a better solution than the property.

About the indirect addressing of components: the GLSL compiler lowers
indirect addressing of the gl_ClipDistance array to indirect addressing
of the 2 vec4s, combined with conditional moves to the different
components.  Which is okay, because although indirect addressing of
gl_ClipDistance is allowed by the GLSL specification, meaning have to
support it, it's not something that's actually useful in practical
situations.

Bryan
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org

Re: [Mesa-dev] [PATCH 0/2 v2] Add support for clip distances in Gallium

2011-12-13 Thread Bryan Cain

On 12/13/2011 03:48 PM, Jose Fonseca wrote:
 - Original Message -
 On 12/13/2011 03:25 PM, Jose Fonseca wrote:
 - Original Message -
 On 12/13/2011 03:09 PM, Jose Fonseca wrote:
 - Original Message -
 On 12/13/2011 12:26 PM, Bryan Cain wrote:
 On 12/13/2011 02:11 PM, Jose Fonseca wrote:
 - Original Message -
 This is an updated version of the patch set I sent to the
 list
 a
 few
 hours
 ago.
 There is now a TGSI property called
 TGSI_PROPERTY_NUM_CLIP_DISTANCES
 that drivers can use to determine how many of the 8 available
 clip
 distances
 are actually used by a shader.
 Can't the info in TGSI_PROPERTY_NUM_CLIP_DISTANCES be easily
 derived from the shader, and queried through
 src/gallium/auxiliary/tgsi/tgsi_scan.h ?
 No.  The clip distances can be indirectly addressed (there are
 up
 to 2
 of them in vec4 form for a total of 8 floats), which makes it
 impossible
 to determine which ones are used by analyzing the shader.
 The description is almost complete. :)  The issue is that the
 shader
 may
 declare

 out float gl_ClipDistance[4];

 the use non-constant addressing of the array.  The compiler
 knows
 that
 gl_ClipDistance has at most 4 elements, but post-hoc analysis
 would
 not
 be able to determine that.  Often the fixed-function hardware
 (see
 below) needs to know which clip distance values are actually
 written.
 But don't all the clip distances written by the shader need to be
 declared?

 E.g.:

 DCL OUT[0], CLIPDIST[0]
 DCL OUT[1], CLIPDIST[1]
 DCL OUT[2], CLIPDIST[2]
 DCL OUT[3], CLIPDIST[3]

 therefore a trivial analysis of the declarations convey that?
 No.  Clip distance is an array of up to 8 floats in GLSL, but it's
 represented in the hardware as 2 vec4s.  You can tell by analyzing
 the
 declarations whether there are more than 4 clip distances in use,
 but
 not which components the shader writes to.
 TGSI_PROPERTY_NUM_CLIP_DISTANCES is the number of components in
 use,
 not
 the number of full vectors.
 Lets imagine

   out float gl_ClipDistance[6];

 Each a clip distance is a scalar float.

 Either all hardware represents the 8 clip distances as two 4
 vectors, and we do:

   DCL OUT[0].xywz, CLIPDIST[0]
   DCL OUT[1].xy, CLIPDIST[1]

 using the full range of struct tgsi_declaration::UsageMask [1] or
 we represent them as as scalars:

   DCL OUT[0].x, CLIPDIST[0]
   DCL OUT[1].x, CLIPDIST[1]
   DCL OUT[2].x, CLIPDIST[2]
   DCL OUT[3].x, CLIPDIST[3]
   DCL OUT[4].x, CLIPDIST[4]
   DCL OUT[5].x, CLIPDIST[5]

 If indirect addressing is allowed as I read bore, then maybe the
 later is better.

 I confess my ignorance about clipping and maybe I'm being dense,
 but I still don't see the need for this
 TGSI_PROPERTY_NUM_CLIP_DISTANCES.  Could you please draft an
 example TGSI shader showing this property (or just paste one
 generated with your change)?  I think that would help a lot.


 Jose


 [1] I don't know if tgsi_dump pays much attention to
  tgsi_declaration::UsageMask, but it does exist.
 UsageMask might work, but before that can be considered a viable
 solution, someone will need to make it possible to actually declare
 it
 from ureg.  As it is, ureg is hardcoded to set UsageMask to xyzw no
 matter what on all declared inputs and outputs.
 ureg automatically fills the UsageMask from the destionation register masks, 
 since it easy to determine from the opcodes.

Wait, where does it do that?  When I search through tgsi_ureg.c for
UsageMask, all it shows are assignments of TGSI_WRITEMASK_XYZW to the
UsageMask property.

Bryan

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 0/2 v2] Add support for clip distances in Gallium

2011-12-13 Thread Bryan Cain

On 12/13/2011 04:22 PM, Jose Fonseca wrote:
 - Original Message -

 - Original Message -
 On 12/13/2011 03:48 PM, Jose Fonseca wrote:
 - Original Message -
 On 12/13/2011 03:25 PM, Jose Fonseca wrote:
 - Original Message -
 On 12/13/2011 03:09 PM, Jose Fonseca wrote:
 - Original Message -
 On 12/13/2011 12:26 PM, Bryan Cain wrote:
 On 12/13/2011 02:11 PM, Jose Fonseca wrote:
 - Original Message -
 This is an updated version of the patch set I sent to the
 list
 a
 few
 hours
 ago.
 There is now a TGSI property called
 TGSI_PROPERTY_NUM_CLIP_DISTANCES
 that drivers can use to determine how many of the 8
 available
 clip
 distances
 are actually used by a shader.
 Can't the info in TGSI_PROPERTY_NUM_CLIP_DISTANCES be
 easily
 derived from the shader, and queried through
 src/gallium/auxiliary/tgsi/tgsi_scan.h ?
 No.  The clip distances can be indirectly addressed (there
 are
 up
 to 2
 of them in vec4 form for a total of 8 floats), which makes
 it
 impossible
 to determine which ones are used by analyzing the shader.
 The description is almost complete. :)  The issue is that
 the
 shader
 may
 declare

 out float gl_ClipDistance[4];

 the use non-constant addressing of the array.  The compiler
 knows
 that
 gl_ClipDistance has at most 4 elements, but post-hoc
 analysis
 would
 not
 be able to determine that.  Often the fixed-function
 hardware
 (see
 below) needs to know which clip distance values are actually
 written.
 But don't all the clip distances written by the shader need
 to
 be
 declared?

 E.g.:

 DCL OUT[0], CLIPDIST[0]
 DCL OUT[1], CLIPDIST[1]
 DCL OUT[2], CLIPDIST[2]
 DCL OUT[3], CLIPDIST[3]

 therefore a trivial analysis of the declarations convey that?
 No.  Clip distance is an array of up to 8 floats in GLSL, but
 it's
 represented in the hardware as 2 vec4s.  You can tell by
 analyzing
 the
 declarations whether there are more than 4 clip distances in
 use,
 but
 not which components the shader writes to.
 TGSI_PROPERTY_NUM_CLIP_DISTANCES is the number of components
 in
 use,
 not
 the number of full vectors.
 Lets imagine

   out float gl_ClipDistance[6];

 Each a clip distance is a scalar float.

 Either all hardware represents the 8 clip distances as two 4
 vectors, and we do:

   DCL OUT[0].xywz, CLIPDIST[0]
   DCL OUT[1].xy, CLIPDIST[1]

 using the full range of struct tgsi_declaration::UsageMask [1]
 or
 we represent them as as scalars:

   DCL OUT[0].x, CLIPDIST[0]
   DCL OUT[1].x, CLIPDIST[1]
   DCL OUT[2].x, CLIPDIST[2]
   DCL OUT[3].x, CLIPDIST[3]
   DCL OUT[4].x, CLIPDIST[4]
   DCL OUT[5].x, CLIPDIST[5]

 If indirect addressing is allowed as I read bore, then maybe
 the
 later is better.

 I confess my ignorance about clipping and maybe I'm being
 dense,
 but I still don't see the need for this
 TGSI_PROPERTY_NUM_CLIP_DISTANCES.  Could you please draft an
 example TGSI shader showing this property (or just paste one
 generated with your change)?  I think that would help a lot.


 Jose


 [1] I don't know if tgsi_dump pays much attention to
  tgsi_declaration::UsageMask, but it does exist.
 UsageMask might work, but before that can be considered a viable
 solution, someone will need to make it possible to actually
 declare
 it
 from ureg.  As it is, ureg is hardcoded to set UsageMask to xyzw
 no
 matter what on all declared inputs and outputs.
 ureg automatically fills the UsageMask from the destionation
 register masks, since it easy to determine from the opcodes.
 Wait, where does it do that?  When I search through tgsi_ureg.c for
 UsageMask, all it shows are assignments of TGSI_WRITEMASK_XYZW to
 the
 UsageMask property.
 ah. I may be lying. But I'm pretty sure I wrote such code somewhere,
 sometime. Let me dig it.
 I was lying.  I wrote tgsi_util_get_inst_usage_mask() in 
 src/gallium/auxiliary/tgsi/tgsi_util.c , but it only analyses which registers 
 are _read_, and never got hooked into ureg anyway.

 I don't want you to go over hoops just to pass a scalar quantity. So may be 
 just add ability to ureg to specify declaration's output mask?

One problem with this is that the output mask would have to be a
parameter to a new declaration function in ureg, like
ureg_DECL_output_with_mask() instead of just ureg_DECL_output().  In
this case, ureg_DECL_output_with_mask() would be the only DECL function
with a usage mask specified.  If that asymmetry is okay with you, I
think I could go the UsageMask route.

 Another approach would be just to add the property, and kill output mask. Two 
 ways of doing the same is what I'd like to avoid. I'll need a day (it's late 
 here) to think about this and see how output mask is being actually used in 
 the code.

 Jose


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] glsl_to_tgsi: Invalidate and revalidate uniform backing storage

2011-11-16 Thread Bryan Cain

It's unfortunately been a while since I've done anything with
glsl_to_tgsi.  What are the various functions that modify the TGSI IR
and try to propagate changes about that up to the gl_program?  If I can
see where it is in the code, I can probably remember if there's a reason
it was done that way.

Bryan

On 11/16/2011 01:13 PM, Ian Romanick wrote:
 From: Ian Romanick ian.d.roman...@intel.com

 If glUniform1i and friends are going to dump data directly in
 driver-allocated, the pointers have to be updated when the storage
 moves.  This should fix the regressions seen with commit 7199096.

 I'm not sure if this is the only place that needs this treatment.  I'm
 a little uncertain about the various functions in st_glsl_to_tgsi that
 modify the TGSI IR and try to propagate changes about that up to the
 gl_program.  That seems sketchy to me.

 Signed-off-by: Ian Romanick ian.d.roman...@intel.com
 Cc: Vadim Girlin vadimgir...@gmail.com
 Cc: Bryan Cain bryanca...@gmail.com
 ---
  src/mesa/state_tracker/st_glsl_to_tgsi.cpp |   14 ++
  1 files changed, 14 insertions(+), 0 deletions(-)

 diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
 b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 index 0bf6766..c8385bb 100644
 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 @@ -4571,6 +4571,13 @@ st_translate_program(
 t-pointSizeOutIndex = -1;
 t-prevInstWrotePointSize = GL_FALSE;
  
 +   for (i = 0; i  program-shader_program-NumUserUniformStorage; i++) {
 +  struct gl_uniform_storage *const storage =
 +  program-shader_program-UniformStorage[i];
 +
 +  _mesa_uniform_detach_all_driver_storage(storage);
 +   }
 +
 /*
  * Declare input attributes.
  */
 @@ -4797,6 +4804,13 @@ st_translate_program(
 t-insn[t-labels[i].branch_target]);
 }
  
 +   /* This has to be done last.  Any operation the can cause
 +* prog-ParameterValues to get reallocated (e.g., anything that adds a
 +* program constant) has to happen before creating this linkage.
 +*/
 +   _mesa_associate_uniform_storage(ctx, program-shader_program,
 +proginfo-Parameters);
 +
  out:
 if (t) {
FREE(t-insn);

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] TGSI declarations missing type info

2011-11-13 Thread Bryan Cain

On 11/13/2011 09:06 AM, Dave Airlie wrote:
 Hi guys,

 Just been looking at llvmpipe integer support and it seems like we
 lose some information about the type of data stored into temporaries,

 after st_glsl_to_cpp we no longer know what type the temporaries are,
 and llvm would really like to know and I can't see any reason that
 TGSI doesn't contain the info. Having untyped temp decls means we'd
 have to allocate some sort of union via aliases I guess in llvmpipe
 for all temps so we can store int/float in them.

 I've attached a run of glsl-vs-loop from llvmpipe with integer opcodes
 forced on. (llvmpipe-int-test branch of my repo).

 Dave.

If you do add types to TGSI registers, it's worth noting that the
internal IR used by glsl_to_tgsi (glsl_to_tgsi_instruction) already the
types of all src and dst registers, and it's only lost when converting
that to TGSI.  However, it was only intended to be good enough to
determine whether to emit an integer or float instruction, so there
might be some mistakes remaining somewhere that would need to be corrected.

Bryan
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/2] glsl: Add uniform_locations_assigned parameter to do_dead_code opt pass

2011-10-21 Thread Bryan Cain

If it's worth anything, the glsl_to_tgsi part is

Reviewed-by: Bryan Cain bryanca...@gmail.com

On 10/21/2011 01:49 PM, Ian Romanick wrote:
 From: Ian Romanick ian.d.roman...@intel.com

 Setting this flag prevents declarations of uniforms from being removed
 from the IR.  Since the IR is directly used by several API functions
 that query uniforms in shaders, uniform declarations cannot be removed
 after the locations have been set.  However, it should still be safe
 to reorder the declarations (this is not tested).

 Signed-off-by: Ian Romanick ian.d.roman...@intel.com
 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=41980
 Cc: Brian Paul bri...@vmware.com
 Cc: Bryan Cain bryanca...@gmail.com
 Cc: Vinson Lee v...@vmware.com
 Cc: José Fonseca jfons...@vmware.com
 Cc: Kenneth Graunke kenn...@whitecape.org
 ---
  src/glsl/glsl_parser_extras.cpp|   23 +--
  src/glsl/ir_optimization.h |6 --
  src/glsl/linker.cpp|2 +-
  src/glsl/main.cpp  |2 +-
  src/glsl/opt_dead_code.cpp |   14 ++
  src/glsl/test_optpass.cpp  |4 ++--
  src/mesa/drivers/dri/i965/brw_shader.cpp   |3 ++-
  src/mesa/main/ff_fragment_shader.cpp   |2 +-
  src/mesa/program/ir_to_mesa.cpp|6 --
  src/mesa/state_tracker/st_glsl_to_tgsi.cpp |4 +++-
  10 files changed, 49 insertions(+), 17 deletions(-)
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/2] glsl: Short-circuit lower_if_to_cond_assign when MaxIfDepth is UINT_MAX.

2011-10-19 Thread Bryan Cain

Looks good to me.

Bryan

On 10/18/2011 05:20 PM, Kenneth Graunke wrote:
 Setting MaxIfDepth to UINT_MAX effectively means don't lower anything.

 Explicitly checking for this common case allows us to avoid walking the
 IR, computing nesting levels, and so on.

 Signed-off-by: Kenneth Graunke kenn...@whitecape.org
 Cc: Bryan Cain bryanca...@gmail.com
 Cc: Ian Romanick i...@freedesktop.org
 ---
  src/glsl/lower_if_to_cond_assign.cpp |3 +++
  1 files changed, 3 insertions(+), 0 deletions(-)

 diff --git a/src/glsl/lower_if_to_cond_assign.cpp 
 b/src/glsl/lower_if_to_cond_assign.cpp
 index 7b89a15..2c5d561 100644
 --- a/src/glsl/lower_if_to_cond_assign.cpp
 +++ b/src/glsl/lower_if_to_cond_assign.cpp
 @@ -79,6 +79,9 @@ public:
  bool
  lower_if_to_cond_assign(exec_list *instructions, unsigned max_depth)
  {
 +   if (max_depth == UINT_MAX)
 +  return false;
 +
 ir_if_to_cond_assign_visitor v(max_depth);
  
 visit_list_elements(v, instructions);
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] st/mesa: kill instruction if writemask=0 in eliminate_dead_code_advanced()

2011-10-09 Thread Bryan Cain

I don't think there's any reason we can't eliminate a dead instruction
when the writemask is zero.  I do wonder, though, why this patch
actually makes a difference.  There's an if (!inst-dead_mask ||
!inst-dst.writemask) three lines before the code visible in the patch
that makes it not kill the instruction if the writemask is zero.  I
don't remember why that's there, but if it weren't there, and the
writemask is zero, the dead_mask should also be zero, so it should be
handled by the else if block.

In short, I think that entire if/else if/else statement could use a look.

Bryan

On 10/07/2011 10:40 AM, Brian Paul wrote:
 From: Brian Paul bri...@vmware.com

 This fixes a bug where we'd wind up emitting an invalid instruction like
 MOVE R[0]., R[1];  - note the empty/zero writemask.  If we don't write to
 any dest register channels, cull the instruction.
 ---
  src/mesa/state_tracker/st_glsl_to_tgsi.cpp |8 +++-
  1 files changed, 7 insertions(+), 1 deletions(-)

 diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
 b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 index d8ef8a3..44b1149 100644
 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 @@ -3776,8 +3776,14 @@ 
 glsl_to_tgsi_visitor::eliminate_dead_code_advanced(void)
   iter.remove();
   delete inst;
   removed++;
 -  } else
 +  } else {
   inst-dst.writemask = ~(inst-dead_mask);
 + if (inst-dst.writemask == 0) {
 +iter.remove();
 +delete inst;
 +removed++;
 + }
 +  }
 }
  
 ralloc_free(write_level);
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] st/mesa: kill instruction if writemask=0 in eliminate_dead_code_advanced()

2011-10-09 Thread Bryan Cain

What does it do if there's no destination register?  In any case, I
don't think glsl_to_tgsi emits any ARLs of that form, so it shouldn't be
a problem.

Bryan

On 10/07/2011 01:06 PM, Marek Olšák wrote:
 I think ARL is allowed to have no destination register, right? In that
 case, there should be a special case not to eliminate ARLs.

 Marek

 On Fri, Oct 7, 2011 at 5:40 PM, Brian Paul brian.e.p...@gmail.com wrote:
 From: Brian Paul bri...@vmware.com

 This fixes a bug where we'd wind up emitting an invalid instruction like
 MOVE R[0]., R[1];  - note the empty/zero writemask.  If we don't write to
 any dest register channels, cull the instruction.
 ---
  src/mesa/state_tracker/st_glsl_to_tgsi.cpp |8 +++-
  1 files changed, 7 insertions(+), 1 deletions(-)

 diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
 b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 index d8ef8a3..44b1149 100644
 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 @@ -3776,8 +3776,14 @@ 
 glsl_to_tgsi_visitor::eliminate_dead_code_advanced(void)
  iter.remove();
  delete inst;
  removed++;
 -  } else
 +  } else {
  inst-dst.writemask = ~(inst-dead_mask);
 + if (inst-dst.writemask == 0) {
 +iter.remove();
 +delete inst;
 +removed++;
 + }
 +  }
}

ralloc_free(write_level);
 --
 1.7.3.4

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 6/6] glsl_to_tgsi: Use _mesa_generate_parameters_list_for_uniforms

2011-10-07 Thread Bryan Cain

On 10/07/2011 07:06 PM, Ian Romanick wrote:
 From: Ian Romanick ian.d.roman...@intel.com

 Signed-off-by: Ian Romanick ian.d.roman...@intel.com
 Cc: Bryan Cain bryanca...@gmail.com
 ---
  src/mesa/state_tracker/st_glsl_to_tgsi.cpp |  119 
 +---
  1 files changed, 2 insertions(+), 117 deletions(-)


Reviewed-by: Bryan Cain bryanca...@gmail.com
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] glsl_to_tgsi: implement ir_binop_all_equal and ir_binop_any_nequal for native integers

2011-09-19 Thread Bryan Cain

---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp |  119 
 1 files changed, 85 insertions(+), 34 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 8921698..f68270d 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -1528,15 +1528,45 @@ glsl_to_tgsi_visitor::visit(ir_expression *ir)
  st_src_reg temp = get_temp(native_integers ?
glsl_type::get_instance(ir-operands[0]-type-base_type, 4, 1) 
:
glsl_type::vec4_type);
- assert(ir-operands[0]-type-base_type == GLSL_TYPE_FLOAT);
- emit(ir, TGSI_OPCODE_SNE, st_dst_reg(temp), op[0], op[1]);
- 
- /* After the dot-product, the value will be an integer on the
-  * range [0,4].  Zero becomes 1.0, and positive values become zero.
-  */
- emit_dp(ir, result_dst, temp, temp, vector_elements);
  
- if (result_dst.type == GLSL_TYPE_FLOAT) {
+ if (native_integers) {
+st_dst_reg temp_dst = st_dst_reg(temp);
+st_src_reg temp1 = st_src_reg(temp), temp2 = st_src_reg(temp);
+
+emit(ir, TGSI_OPCODE_SEQ, st_dst_reg(temp), op[0], op[1]);
+
+/* Emit 1-3 AND operations to combine the SEQ results. */
+switch (ir-operands[0]-type-vector_elements) {
+case 2:
+   break;
+case 3:
+   temp_dst.writemask = WRITEMASK_Y;
+   temp1.swizzle = SWIZZLE_;
+   temp2.swizzle = SWIZZLE_;
+   emit(ir, TGSI_OPCODE_AND, temp_dst, temp1, temp2);
+   break;
+case 4:
+   temp_dst.writemask = WRITEMASK_X;
+   temp1.swizzle = SWIZZLE_;
+   temp2.swizzle = SWIZZLE_;
+   emit(ir, TGSI_OPCODE_AND, temp_dst, temp1, temp2);
+   temp_dst.writemask = WRITEMASK_Y;
+   temp1.swizzle = SWIZZLE_;
+   temp2.swizzle = SWIZZLE_;
+   emit(ir, TGSI_OPCODE_AND, temp_dst, temp1, temp2);
+}
+
+temp1.swizzle = SWIZZLE_;
+temp2.swizzle = SWIZZLE_;
+emit(ir, TGSI_OPCODE_AND, result_dst, temp1, temp2);
+ } else {
+emit(ir, TGSI_OPCODE_SNE, st_dst_reg(temp), op[0], op[1]);
+
+/* After the dot-product, the value will be an integer on the
+ * range [0,4].  Zero becomes 1.0, and positive values become zero.
+ */
+emit_dp(ir, result_dst, temp, temp, vector_elements);
+
 /* Negating the result of the dot-product gives values on the range
  * [-4, 0].  Zero becomes 1.0, and negative values become zero.
  * This is achieved using SGE.
@@ -1544,11 +1574,6 @@ glsl_to_tgsi_visitor::visit(ir_expression *ir)
 st_src_reg sge_src = result_src;
 sge_src.negate = ~sge_src.negate;
 emit(ir, TGSI_OPCODE_SGE, result_dst, sge_src, 
st_src_reg_for_float(0.0));
- } else {
-/* The TGSI negate flag doesn't work for integers, so use SEQ 0
- * instead.
- */
-emit(ir, TGSI_OPCODE_SEQ, result_dst, result_src, 
st_src_reg_for_int(0));
  }
   } else {
  emit(ir, TGSI_OPCODE_SEQ, result_dst, op[0], op[1]);
@@ -1561,30 +1586,56 @@ glsl_to_tgsi_visitor::visit(ir_expression *ir)
  st_src_reg temp = get_temp(native_integers ?
glsl_type::get_instance(ir-operands[0]-type-base_type, 4, 1) 
:
glsl_type::vec4_type);
- assert(ir-operands[0]-type-base_type == GLSL_TYPE_FLOAT);
  emit(ir, TGSI_OPCODE_SNE, st_dst_reg(temp), op[0], op[1]);
 
- /* After the dot-product, the value will be an integer on the
-  * range [0,4].  Zero stays zero, and positive values become 1.0.
-  */
- glsl_to_tgsi_instruction *const dp =
-   emit_dp(ir, result_dst, temp, temp, vector_elements);
- if (this-prog-Target == GL_FRAGMENT_PROGRAM_ARB 
- result_dst.type == GLSL_TYPE_FLOAT) {
-/* The clamping to [0,1] can be done for free in the fragment
- * shader with a saturate.
- */
-dp-saturate = true;
- } else if (result_dst.type == GLSL_TYPE_FLOAT) {
-/* Negating the result of the dot-product gives values on the range
- * [-4, 0].  Zero stays zero, and negative values become 1.0.  This
- * achieved using SLT.
- */
-st_src_reg slt_src = result_src;
-slt_src.negate = ~slt_src.negate;
-emit(ir, TGSI_OPCODE_SLT, result_dst, slt_src, 
st_src_reg_for_float(0.0));
+ if (native_integers) {
+st_dst_reg temp_dst = st_dst_reg(temp);
+st_src_reg temp1 =

Re: [Mesa-dev] [PATCH v2 2/2] glsl_to_tgsi: use UARL instead of I2F and ARL

2011-09-14 Thread Bryan Cain

I don't see any reason why this patch would make a difference, but since
it apparently does, I'll take a look at it and fix it.  What program is
that?

Bryan

On 09/14/2011 07:01 AM, Marek Olšák wrote:
 Bryan,

 This commit causes hardlocks on r600g (integers disabled).

 Maybe you can see better than me what's wrong.

 This is a side-by-side diff of the failing TGSI shader. The right-hand
 one is from Mesa master. The left-hand one is with the commit
 reverted.
 http://pastebin.com/raw.php?i=QXB3SZAE

 Thanks,
 Marek

 On Sat, Sep 10, 2011 at 7:36 PM, Bryan Cain bryanca...@gmail.com wrote:
 Since TGSI now has a UARL opcode that takes an integer as the source, it is
 no longer necessary to hack around the lack of an integer ARL opcode using 
 I2F.
 UARL is only emitted when native integers are enabled; ARL is still used
 otherwise.

 Reviewed-by: Brian Paul bri...@vmware.com
 ---
  src/mesa/state_tracker/st_glsl_to_tgsi.cpp |   18 +++---
  1 files changed, 7 insertions(+), 11 deletions(-)

 diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
 b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 index e2857ed..05d4d33 100644
 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 @@ -512,7 +512,7 @@ glsl_to_tgsi_visitor::emit(ir_instruction *ir, unsigned 
 op,

inst-function = NULL;

 -   if (op == TGSI_OPCODE_ARL)
 +   if (op == TGSI_OPCODE_ARL || op == TGSI_OPCODE_UARL)
   this-num_address_regs = 1;

/* Update indirect addressing status used by TGSI */
 @@ -724,16 +724,12 @@ void
  glsl_to_tgsi_visitor::emit_arl(ir_instruction *ir,
st_dst_reg dst, st_src_reg src0)
  {
 -   st_src_reg tmp = get_temp(glsl_type::float_type);
 +   int op = TGSI_OPCODE_ARL;

 -   if (src0.type == GLSL_TYPE_INT)
 -  emit(NULL, TGSI_OPCODE_I2F, st_dst_reg(tmp), src0);
 -   else if (src0.type == GLSL_TYPE_UINT)
 -  emit(NULL, TGSI_OPCODE_U2F, st_dst_reg(tmp), src0);
 -   else
 -  tmp = src0;
 -
 -   emit(NULL, TGSI_OPCODE_ARL, dst, tmp);
 +   if (src0.type == GLSL_TYPE_INT || src0.type == GLSL_TYPE_UINT)
 +  op = TGSI_OPCODE_UARL;
 +
 +   emit(NULL, op, dst, src0);
  }

  /**
 --
 1.7.1

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/2] glsl_to_tgsi: use UARL instead of I2F and ARL

2011-09-10 Thread Bryan Cain

Can one of the Gallium interface maintainers please review this patch so
I can push it?

Bryan

On 09/02/2011 11:09 AM, Bryan Cain wrote:
 ---
  src/mesa/state_tracker/st_glsl_to_tgsi.cpp |   18 +++---
  1 files changed, 7 insertions(+), 11 deletions(-)

 diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
 b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 index e2857ed..05d4d33 100644
 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 @@ -512,7 +512,7 @@ glsl_to_tgsi_visitor::emit(ir_instruction *ir, unsigned 
 op,
  
 inst-function = NULL;
 
 -   if (op == TGSI_OPCODE_ARL)
 +   if (op == TGSI_OPCODE_ARL || op == TGSI_OPCODE_UARL)
this-num_address_regs = 1;
 
 /* Update indirect addressing status used by TGSI */
 @@ -724,16 +724,12 @@ void
  glsl_to_tgsi_visitor::emit_arl(ir_instruction *ir,
   st_dst_reg dst, st_src_reg src0)
  {
 -   st_src_reg tmp = get_temp(glsl_type::float_type);
 +   int op = TGSI_OPCODE_ARL;
  
 -   if (src0.type == GLSL_TYPE_INT)
 -  emit(NULL, TGSI_OPCODE_I2F, st_dst_reg(tmp), src0);
 -   else if (src0.type == GLSL_TYPE_UINT)
 -  emit(NULL, TGSI_OPCODE_U2F, st_dst_reg(tmp), src0);
 -   else
 -  tmp = src0;
 -   
 -   emit(NULL, TGSI_OPCODE_ARL, dst, tmp);
 +   if (src0.type == GLSL_TYPE_INT || src0.type == GLSL_TYPE_UINT)
 +  op = TGSI_OPCODE_UARL;
 +
 +   emit(NULL, op, dst, src0);
  }
  
  /**
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/2] gallium: add TGSI opcodes UARL and UCMP

2011-09-10 Thread Bryan Cain

Can one of the Gallium interface maintainers please review this patch so
I can push it?

Bryan

On 09/02/2011 11:09 AM, Bryan Cain wrote:
 They are needed by glsl_to_tgsi for an efficient implementation using native
 integers.
 ---
  src/gallium/auxiliary/tgsi/tgsi_exec.c |   30 
 
  src/gallium/auxiliary/tgsi/tgsi_info.c |3 ++
  src/gallium/include/pipe/p_shader_tokens.h |5 +++-
  3 files changed, 37 insertions(+), 1 deletions(-)

 diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c 
 b/src/gallium/auxiliary/tgsi/tgsi_exec.c
 index 38dc1ef..896d871 100644
 --- a/src/gallium/auxiliary/tgsi/tgsi_exec.c
 +++ b/src/gallium/auxiliary/tgsi/tgsi_exec.c
 @@ -3312,6 +3312,28 @@ micro_usne(union tgsi_exec_channel *dst,
  }
  
  static void
 +micro_uarl(union tgsi_exec_channel *dst,
 +   const union tgsi_exec_channel *src)
 +{
 +   dst-i[0] = src-u[0];
 +   dst-i[1] = src-u[1];
 +   dst-i[2] = src-u[2];
 +   dst-i[3] = src-u[3];
 +}
 +
 +static void
 +micro_ucmp(union tgsi_exec_channel *dst,
 +   const union tgsi_exec_channel *src0,
 +   const union tgsi_exec_channel *src1,
 +   const union tgsi_exec_channel *src2)
 +{
 +   dst-f[0] = src0-u[0] ? src1-f[0] : src2-f[0];
 +   dst-f[1] = src0-u[1] ? src1-f[1] : src2-f[1];
 +   dst-f[2] = src0-u[2] ? src1-f[2] : src2-f[2];
 +   dst-f[3] = src0-u[3] ? src1-f[3] : src2-f[3];
 +}
 +
 +static void
  exec_instruction(
 struct tgsi_exec_machine *mach,
 const struct tgsi_full_instruction *inst,
 @@ -4071,6 +4093,14 @@ exec_instruction(
assert(0);
break;
  
 +   case TGSI_OPCODE_UARL:
 +  exec_vector_unary(mach, inst, micro_uarl, TGSI_EXEC_DATA_INT, 
 TGSI_EXEC_DATA_UINT);
 +  break;
 +
 +   case TGSI_OPCODE_UCMP:
 +  exec_vector_trinary(mach, inst, micro_ucmp, TGSI_EXEC_DATA_FLOAT, 
 TGSI_EXEC_DATA_UINT);
 +  break;
 +
 default:
assert( 0 );
 }
 diff --git a/src/gallium/auxiliary/tgsi/tgsi_info.c 
 b/src/gallium/auxiliary/tgsi/tgsi_info.c
 index 14ed56a..6cd580a 100644
 --- a/src/gallium/auxiliary/tgsi/tgsi_info.c
 +++ b/src/gallium/auxiliary/tgsi/tgsi_info.c
 @@ -189,6 +189,9 @@ static const struct tgsi_opcode_info 
 opcode_info[TGSI_OPCODE_LAST] =
 { 1, 2, 0, 0, 0, 0, RESINFO, TGSI_OPCODE_RESINFO },
 { 1, 2, 0, 0, 0, 0, SAMPLE_POS,  TGSI_OPCODE_SAMPLE_POS },
 { 1, 2, 0, 0, 0, 0, SAMPLE_INFO, TGSI_OPCODE_SAMPLE_INFO },
 +   
 +   { 1, 1, 0, 0, 0, 0, UARL, TGSI_OPCODE_UARL },
 +   { 1, 3, 0, 0, 0, 0, UCMP, TGSI_OPCODE_UCMP },
  };
  
  const struct tgsi_opcode_info *
 diff --git a/src/gallium/include/pipe/p_shader_tokens.h 
 b/src/gallium/include/pipe/p_shader_tokens.h
 index d3a3632..0a26e39 100644
 --- a/src/gallium/include/pipe/p_shader_tokens.h
 +++ b/src/gallium/include/pipe/p_shader_tokens.h
 @@ -363,7 +363,10 @@ struct tgsi_property_data {
  #define TGSI_OPCODE_SAMPLE_POS  155
  #define TGSI_OPCODE_SAMPLE_INFO 156
  
 -#define TGSI_OPCODE_LAST157
 +#define TGSI_OPCODE_UARL157
 +#define TGSI_OPCODE_UCMP158
 +
 +#define TGSI_OPCODE_LAST159
  
  #define TGSI_SAT_NONE0  /* do not saturate */
  #define TGSI_SAT_ZERO_ONE1  /* clamp to [0,1] */

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/2] glsl_to_tgsi: use UARL instead of I2F and ARL

2011-09-10 Thread Bryan Cain

Disregard this, I meant to send this for patch 1/2 and not 2/2.  This
patch doesn't contain any Gallium interface changes.

Bryan

On 09/10/2011 11:43 AM, Bryan Cain wrote:
 Can one of the Gallium interface maintainers please review this patch so
 I can push it?

 Bryan

 On 09/02/2011 11:09 AM, Bryan Cain wrote:
 ---
  src/mesa/state_tracker/st_glsl_to_tgsi.cpp |   18 +++---
  1 files changed, 7 insertions(+), 11 deletions(-)

 diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
 b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 index e2857ed..05d4d33 100644
 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 @@ -512,7 +512,7 @@ glsl_to_tgsi_visitor::emit(ir_instruction *ir, unsigned 
 op,
  
 inst-function = NULL;
 
 -   if (op == TGSI_OPCODE_ARL)
 +   if (op == TGSI_OPCODE_ARL || op == TGSI_OPCODE_UARL)
this-num_address_regs = 1;
 
 /* Update indirect addressing status used by TGSI */
 @@ -724,16 +724,12 @@ void
  glsl_to_tgsi_visitor::emit_arl(ir_instruction *ir,
  st_dst_reg dst, st_src_reg src0)
  {
 -   st_src_reg tmp = get_temp(glsl_type::float_type);
 +   int op = TGSI_OPCODE_ARL;
  
 -   if (src0.type == GLSL_TYPE_INT)
 -  emit(NULL, TGSI_OPCODE_I2F, st_dst_reg(tmp), src0);
 -   else if (src0.type == GLSL_TYPE_UINT)
 -  emit(NULL, TGSI_OPCODE_U2F, st_dst_reg(tmp), src0);
 -   else
 -  tmp = src0;
 -   
 -   emit(NULL, TGSI_OPCODE_ARL, dst, tmp);
 +   if (src0.type == GLSL_TYPE_INT || src0.type == GLSL_TYPE_UINT)
 +  op = TGSI_OPCODE_UARL;
 +
 +   emit(NULL, op, dst, src0);
  }
  
  /**
 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/2] glsl_to_tgsi: use UARL instead of I2F and ARL

2011-09-10 Thread Bryan Cain

It will only be emitted when the driver supports integer operations. 
The I2F+ARL combination is currently what is emitted when integer
support is enabled (float targets only need ARL) but I can make that
more clear in the commit message.

On 09/10/2011 12:07 PM, Brian Paul wrote:
 I guess my question is: do the drivers need to be updated for
 TGSI_OPCODE_UARL?  Or will this only be emitted when the driver
 supports integer operations?  If that's the case, please say so in the
 commit message or code.

 Otherwise: Reviewed-by: Brian Paul bri...@vmware.com


 On 09/10/2011 10:48 AM, Bryan Cain wrote:
 Disregard this, I meant to send this for patch 1/2 and not 2/2.  This
 patch doesn't contain any Gallium interface changes.

 Bryan

 On 09/10/2011 11:43 AM, Bryan Cain wrote:
 Can one of the Gallium interface maintainers please review this
 patch so
 I can push it?

 Bryan

 On 09/02/2011 11:09 AM, Bryan Cain wrote:
 ---
   src/mesa/state_tracker/st_glsl_to_tgsi.cpp |   18 +++---
   1 files changed, 7 insertions(+), 11 deletions(-)

 diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 index e2857ed..05d4d33 100644
 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 @@ -512,7 +512,7 @@ glsl_to_tgsi_visitor::emit(ir_instruction *ir,
 unsigned op,

  inst-function = NULL;

 -   if (op == TGSI_OPCODE_ARL)
 +   if (op == TGSI_OPCODE_ARL || op == TGSI_OPCODE_UARL)
 this-num_address_regs = 1;

  /* Update indirect addressing status used by TGSI */
 @@ -724,16 +724,12 @@ void
   glsl_to_tgsi_visitor::emit_arl(ir_instruction *ir,
   st_dst_reg dst, st_src_reg src0)
   {
 -   st_src_reg tmp = get_temp(glsl_type::float_type);
 +   int op = TGSI_OPCODE_ARL;

 -   if (src0.type == GLSL_TYPE_INT)
 -  emit(NULL, TGSI_OPCODE_I2F, st_dst_reg(tmp), src0);
 -   else if (src0.type == GLSL_TYPE_UINT)
 -  emit(NULL, TGSI_OPCODE_U2F, st_dst_reg(tmp), src0);
 -   else
 -  tmp = src0;
 -
 -   emit(NULL, TGSI_OPCODE_ARL, dst, tmp);
 +   if (src0.type == GLSL_TYPE_INT || src0.type == GLSL_TYPE_UINT)
 +  op = TGSI_OPCODE_UARL;
 +
 +   emit(NULL, op, dst, src0);
   }

   /**
 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev


 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 1/2] gallium: add TGSI opcodes UARL and UCMP

2011-09-10 Thread Bryan Cain

They are needed by glsl_to_tgsi for an efficient implementation using native
integers.
---
 src/gallium/auxiliary/tgsi/tgsi_exec.c |   30 
 src/gallium/auxiliary/tgsi/tgsi_info.c |3 ++
 src/gallium/docs/source/tgsi.rst   |   19 +
 src/gallium/include/pipe/p_shader_tokens.h |5 +++-
 4 files changed, 56 insertions(+), 1 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c 
b/src/gallium/auxiliary/tgsi/tgsi_exec.c
index d9de41b..ce6399c 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_exec.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_exec.c
@@ -3367,6 +3367,28 @@ micro_usne(union tgsi_exec_channel *dst,
 }
 
 static void
+micro_uarl(union tgsi_exec_channel *dst,
+   const union tgsi_exec_channel *src)
+{
+   dst-i[0] = src-u[0];
+   dst-i[1] = src-u[1];
+   dst-i[2] = src-u[2];
+   dst-i[3] = src-u[3];
+}
+
+static void
+micro_ucmp(union tgsi_exec_channel *dst,
+   const union tgsi_exec_channel *src0,
+   const union tgsi_exec_channel *src1,
+   const union tgsi_exec_channel *src2)
+{
+   dst-u[0] = src0-u[0] ? src1-u[0] : src2-u[0];
+   dst-u[1] = src0-u[1] ? src1-u[1] : src2-u[1];
+   dst-u[2] = src0-u[2] ? src1-u[2] : src2-u[2];
+   dst-u[3] = src0-u[3] ? src1-u[3] : src2-u[3];
+}
+
+static void
 exec_instruction(
struct tgsi_exec_machine *mach,
const struct tgsi_full_instruction *inst,
@@ -4126,6 +4148,14 @@ exec_instruction(
   assert(0);
   break;
 
+   case TGSI_OPCODE_UARL:
+  exec_vector_unary(mach, inst, micro_uarl, TGSI_EXEC_DATA_INT, 
TGSI_EXEC_DATA_UINT);
+  break;
+
+   case TGSI_OPCODE_UCMP:
+  exec_vector_trinary(mach, inst, micro_ucmp, TGSI_EXEC_DATA_FLOAT, 
TGSI_EXEC_DATA_UINT);
+  break;
+
default:
   assert( 0 );
}
diff --git a/src/gallium/auxiliary/tgsi/tgsi_info.c 
b/src/gallium/auxiliary/tgsi/tgsi_info.c
index 14ed56a..6cd580a 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_info.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_info.c
@@ -189,6 +189,9 @@ static const struct tgsi_opcode_info 
opcode_info[TGSI_OPCODE_LAST] =
{ 1, 2, 0, 0, 0, 0, RESINFO, TGSI_OPCODE_RESINFO },
{ 1, 2, 0, 0, 0, 0, SAMPLE_POS,  TGSI_OPCODE_SAMPLE_POS },
{ 1, 2, 0, 0, 0, 0, SAMPLE_INFO, TGSI_OPCODE_SAMPLE_INFO },
+   
+   { 1, 1, 0, 0, 0, 0, UARL, TGSI_OPCODE_UARL },
+   { 1, 3, 0, 0, 0, 0, UCMP, TGSI_OPCODE_UCMP },
 };
 
 const struct tgsi_opcode_info *
diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst
index 5cf0875..d7f50b1 100644
--- a/src/gallium/docs/source/tgsi.rst
+++ b/src/gallium/docs/source/tgsi.rst
@@ -1013,6 +1013,25 @@ XXX so let's discuss it, yeah?
   dst.w = src0.w \oplus src1.w
 
 
+.. opcode:: UCMP - Integer Conditional Move
+
+.. math::
+
+  dst.x = src0.x ? src1.x : src2.x
+
+  dst.y = src0.y ? src1.y : src2.y
+
+  dst.z = src0.z ? src1.z : src2.z
+
+  dst.w = src0.w ? src1.w : src2.w
+
+
+.. opcode:: UARL - Integer Address Register Load
+
+  Moves the contents of the source register, assumed to be an integer, into the
+  destination register, which is assumed to be an address (ADDR) register.
+
+
 .. opcode:: SAD - Sum Of Absolute Differences
 
 .. math::
diff --git a/src/gallium/include/pipe/p_shader_tokens.h 
b/src/gallium/include/pipe/p_shader_tokens.h
index b9e3dcf..7236c92 100644
--- a/src/gallium/include/pipe/p_shader_tokens.h
+++ b/src/gallium/include/pipe/p_shader_tokens.h
@@ -363,7 +363,10 @@ struct tgsi_property_data {
 #define TGSI_OPCODE_SAMPLE_POS  155
 #define TGSI_OPCODE_SAMPLE_INFO 156
 
-#define TGSI_OPCODE_LAST157
+#define TGSI_OPCODE_UARL157
+#define TGSI_OPCODE_UCMP158
+
+#define TGSI_OPCODE_LAST159
 
 #define TGSI_SAT_NONE0  /* do not saturate */
 #define TGSI_SAT_ZERO_ONE1  /* clamp to [0,1] */
-- 
1.7.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] mesa: add a UniformBooleanTrue option

2011-09-05 Thread Bryan Cain

Drivers supporting native integers set UniformBooleanTrue to the integer value
that should be used for true when uploading uniform booleans.  This is ~0 for
Gallium and 1 for i965.
---
 src/mesa/drivers/dri/i965/brw_context.c |4 +++-
 src/mesa/main/mtypes.h  |6 ++
 src/mesa/main/uniforms.c|5 -
 src/mesa/state_tracker/st_extensions.c  |1 +
 4 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index 6ef0fcb..5ea7385 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -253,8 +253,10 @@ GLboolean brwCreateContext( int api,
/* If we're using the new shader backend, we require integer uniforms
 * stored as actual integers.
 */
-   if (brw-new_vs_backend)
+   if (brw-new_vs_backend) {
   ctx-Const.NativeIntegers = true;
+  ctx-Const.UniformBooleanTrue = 1;
+   }
 
return GL_TRUE;
 }
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 44ebf0a..ad59797 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2722,6 +2722,12 @@ struct gl_constants
 */
GLboolean NativeIntegers;
 
+   /**
+* If the driver supports real 32-bit integers, what integer value should be
+* used for boolean true in uniform uploads?  (Usually 1 or ~0.)
+*/
+   GLuint UniformBooleanTrue;
+
/** Which texture units support GL_ATI_envmap_bumpmap as targets */
GLbitfield SupportedBumpUnits;
 
diff --git a/src/mesa/main/uniforms.c b/src/mesa/main/uniforms.c
index fe1ce6d..b0f9c33 100644
--- a/src/mesa/main/uniforms.c
+++ b/src/mesa/main/uniforms.c
@@ -802,7 +802,10 @@ set_program_uniform(struct gl_context *ctx, struct 
gl_program *program,
else
   uniformVal[i].b = uniformVal[i].u ? 1 : 0;

-   if (!ctx-Const.NativeIntegers)
+   if (ctx-Const.NativeIntegers)
+  uniformVal[i].u =
+uniformVal[i].b ? ctx-Const.UniformBooleanTrue : 0;
+   else
   uniformVal[i].f = uniformVal[i].b ? 1.0f : 0.0f;
 }
  }
diff --git a/src/mesa/state_tracker/st_extensions.c 
b/src/mesa/state_tracker/st_extensions.c
index aa7f3b5..76e84eb 100644
--- a/src/mesa/state_tracker/st_extensions.c
+++ b/src/mesa/state_tracker/st_extensions.c
@@ -207,6 +207,7 @@ void st_init_limits(struct st_context *st)
   c-MaxProgramTexelOffset = screen-get_param(screen, 
PIPE_CAP_MAX_TEXEL_OFFSET);
 
   c-GLSLVersion = 120;
+  c-UniformBooleanTrue = ~0;
}
 }
 
-- 
1.7.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2] glsl_to_tgsi, mesa: fixes for native integers and integer booleans

2011-09-03 Thread Bryan Cain

On 09/02/2011 06:13 PM, Eric Anholt wrote:
 On Wed, 31 Aug 2011 01:33:59 -0500, Bryan Cain bryanca...@gmail.com wrote:
 With this patch, there are no piglit regressions on softpipe with native
 integers enabled.  Unlike my previous patch, this uses integer values of
 ~0 and 0 for true and false, respectively, instead of the float values 1.0
 and 0.0.
 This will break b2* conversions on our driver, since we expect true to
 be 1, not ~0.  The minimal change would require emit(AND(temp, uniform,
 1)) when deferencing boolean components of uniform variables.

Your driver already emits an AND instruction on every compare, according
to this code from brw_fs_visitor.cpp:

   case ir_binop_less:
   case ir_binop_greater:
   case ir_binop_lequal:
   case ir_binop_gequal:
   case ir_binop_equal:
   case ir_binop_all_equal:
   case ir_binop_nequal:
   case ir_binop_any_nequal:
  temp = this-result;
  /* original gen4 does implicit conversion before comparison. */
  if (intel-gen  5)
 temp.type = op[0].type;

  inst = emit(BRW_OPCODE_CMP, temp, op[0], op[1]);
  inst-conditional_mod = brw_conditional_for_comparison(ir-operation);
  emit(BRW_OPCODE_AND, this-result, this-result, fs_reg(0x1));
  break;

If the hardware sets ~0 on compares anyway, why not use that in your driver?
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/2] gallium: add TGSI opcodes UARL and UCMP

2011-09-02 Thread Bryan Cain

They are needed by glsl_to_tgsi for an efficient implementation using native
integers.
---
 src/gallium/auxiliary/tgsi/tgsi_exec.c |   30 
 src/gallium/auxiliary/tgsi/tgsi_info.c |3 ++
 src/gallium/include/pipe/p_shader_tokens.h |5 +++-
 3 files changed, 37 insertions(+), 1 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c 
b/src/gallium/auxiliary/tgsi/tgsi_exec.c
index 38dc1ef..896d871 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_exec.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_exec.c
@@ -3312,6 +3312,28 @@ micro_usne(union tgsi_exec_channel *dst,
 }
 
 static void
+micro_uarl(union tgsi_exec_channel *dst,
+   const union tgsi_exec_channel *src)
+{
+   dst-i[0] = src-u[0];
+   dst-i[1] = src-u[1];
+   dst-i[2] = src-u[2];
+   dst-i[3] = src-u[3];
+}
+
+static void
+micro_ucmp(union tgsi_exec_channel *dst,
+   const union tgsi_exec_channel *src0,
+   const union tgsi_exec_channel *src1,
+   const union tgsi_exec_channel *src2)
+{
+   dst-f[0] = src0-u[0] ? src1-f[0] : src2-f[0];
+   dst-f[1] = src0-u[1] ? src1-f[1] : src2-f[1];
+   dst-f[2] = src0-u[2] ? src1-f[2] : src2-f[2];
+   dst-f[3] = src0-u[3] ? src1-f[3] : src2-f[3];
+}
+
+static void
 exec_instruction(
struct tgsi_exec_machine *mach,
const struct tgsi_full_instruction *inst,
@@ -4071,6 +4093,14 @@ exec_instruction(
   assert(0);
   break;
 
+   case TGSI_OPCODE_UARL:
+  exec_vector_unary(mach, inst, micro_uarl, TGSI_EXEC_DATA_INT, 
TGSI_EXEC_DATA_UINT);
+  break;
+
+   case TGSI_OPCODE_UCMP:
+  exec_vector_trinary(mach, inst, micro_ucmp, TGSI_EXEC_DATA_FLOAT, 
TGSI_EXEC_DATA_UINT);
+  break;
+
default:
   assert( 0 );
}
diff --git a/src/gallium/auxiliary/tgsi/tgsi_info.c 
b/src/gallium/auxiliary/tgsi/tgsi_info.c
index 14ed56a..6cd580a 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_info.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_info.c
@@ -189,6 +189,9 @@ static const struct tgsi_opcode_info 
opcode_info[TGSI_OPCODE_LAST] =
{ 1, 2, 0, 0, 0, 0, RESINFO, TGSI_OPCODE_RESINFO },
{ 1, 2, 0, 0, 0, 0, SAMPLE_POS,  TGSI_OPCODE_SAMPLE_POS },
{ 1, 2, 0, 0, 0, 0, SAMPLE_INFO, TGSI_OPCODE_SAMPLE_INFO },
+   
+   { 1, 1, 0, 0, 0, 0, UARL, TGSI_OPCODE_UARL },
+   { 1, 3, 0, 0, 0, 0, UCMP, TGSI_OPCODE_UCMP },
 };
 
 const struct tgsi_opcode_info *
diff --git a/src/gallium/include/pipe/p_shader_tokens.h 
b/src/gallium/include/pipe/p_shader_tokens.h
index d3a3632..0a26e39 100644
--- a/src/gallium/include/pipe/p_shader_tokens.h
+++ b/src/gallium/include/pipe/p_shader_tokens.h
@@ -363,7 +363,10 @@ struct tgsi_property_data {
 #define TGSI_OPCODE_SAMPLE_POS  155
 #define TGSI_OPCODE_SAMPLE_INFO 156
 
-#define TGSI_OPCODE_LAST157
+#define TGSI_OPCODE_UARL157
+#define TGSI_OPCODE_UCMP158
+
+#define TGSI_OPCODE_LAST159
 
 #define TGSI_SAT_NONE0  /* do not saturate */
 #define TGSI_SAT_ZERO_ONE1  /* clamp to [0,1] */
-- 
1.7.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/2] glsl_to_tgsi: use UARL instead of I2F and ARL

2011-09-02 Thread Bryan Cain

---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp |   18 +++---
 1 files changed, 7 insertions(+), 11 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index e2857ed..05d4d33 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -512,7 +512,7 @@ glsl_to_tgsi_visitor::emit(ir_instruction *ir, unsigned op,
 
inst-function = NULL;

-   if (op == TGSI_OPCODE_ARL)
+   if (op == TGSI_OPCODE_ARL || op == TGSI_OPCODE_UARL)
   this-num_address_regs = 1;

/* Update indirect addressing status used by TGSI */
@@ -724,16 +724,12 @@ void
 glsl_to_tgsi_visitor::emit_arl(ir_instruction *ir,
st_dst_reg dst, st_src_reg src0)
 {
-   st_src_reg tmp = get_temp(glsl_type::float_type);
+   int op = TGSI_OPCODE_ARL;
 
-   if (src0.type == GLSL_TYPE_INT)
-  emit(NULL, TGSI_OPCODE_I2F, st_dst_reg(tmp), src0);
-   else if (src0.type == GLSL_TYPE_UINT)
-  emit(NULL, TGSI_OPCODE_U2F, st_dst_reg(tmp), src0);
-   else
-  tmp = src0;
-   
-   emit(NULL, TGSI_OPCODE_ARL, dst, tmp);
+   if (src0.type == GLSL_TYPE_INT || src0.type == GLSL_TYPE_UINT)
+  op = TGSI_OPCODE_UARL;
+
+   emit(NULL, op, dst, src0);
 }
 
 /**
-- 
1.7.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2] glsl_to_tgsi, mesa: fixes for native integers and integer booleans

2011-09-02 Thread Bryan Cain

Are there any objections to pushing this?

Bryan

On 08/31/2011 01:33 AM, Bryan Cain wrote:
 With this patch, there are no piglit regressions on softpipe with native
 integers enabled.  Unlike my previous patch, this uses integer values of
 ~0 and 0 for true and false, respectively, instead of the float values 1.0
 and 0.0.
 ---
  src/mesa/main/uniforms.c   |6 +-
  src/mesa/state_tracker/st_glsl_to_tgsi.cpp |  160 
 
  2 files changed, 116 insertions(+), 50 deletions(-)

 diff --git a/src/mesa/main/uniforms.c b/src/mesa/main/uniforms.c
 index cda840f..fa96fd3 100644
 --- a/src/mesa/main/uniforms.c
 +++ b/src/mesa/main/uniforms.c
 @@ -777,12 +777,12 @@ set_program_uniform(struct gl_context *ctx, struct 
 gl_program *program,
   if (isUniformBool) {
  for (i = 0; i  elems; i++) {
 if (basicType == GL_FLOAT)
 -  uniformVal[i].b = uniformVal[i].f != 0.0f ? 1 : 0;
 +  uniformVal[i].u = uniformVal[i].f != 0.0f ? ~0 : 0;
 else
 -  uniformVal[i].b = uniformVal[i].u ? 1 : 0;
 +  uniformVal[i].u = uniformVal[i].u ? ~0 : 0;
 
 if (!ctx-Const.NativeIntegers)
 -  uniformVal[i].f = uniformVal[i].b ? 1.0f : 0.0f;
 +  uniformVal[i].f = uniformVal[i].u ? 1.0f : 0.0f;
  }
   }
}
 diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
 b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 index 2266083..c8f790a 100644
 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 @@ -385,6 +385,8 @@ public:
 void emit_scalar(ir_instruction *ir, unsigned op,
   st_dst_reg dst, st_src_reg src0, st_src_reg src1);
  
 +   void try_emit_float_set(ir_instruction *ir, unsigned op, st_dst_reg dst);
 +
 void emit_arl(ir_instruction *ir, st_dst_reg dst, st_src_reg src0);
  
 void emit_scs(ir_instruction *ir, unsigned op,
 @@ -562,7 +564,10 @@ glsl_to_tgsi_visitor::emit(ir_instruction *ir, unsigned 
 op,
 }
  
 this-instructions.push_tail(inst);
 -   
 +
 +   if (native_integers)
 +  try_emit_float_set(ir, op, dst);
 +
 return inst;
  }
  
 @@ -588,6 +593,25 @@ glsl_to_tgsi_visitor::emit(ir_instruction *ir, unsigned 
 op)
 return emit(ir, op, undef_dst, undef_src, undef_src, undef_src);
  }
  
 + /**
 + * Emits the code to convert the result of float SET instructions to 
 integers.
 + */
 +void
 +glsl_to_tgsi_visitor::try_emit_float_set(ir_instruction *ir, unsigned op,
 +  st_dst_reg dst)
 +{
 +   if ((op == TGSI_OPCODE_SEQ ||
 +op == TGSI_OPCODE_SNE ||
 +op == TGSI_OPCODE_SGE ||
 +op == TGSI_OPCODE_SLT))
 +   {
 +  st_src_reg src = st_src_reg(dst);
 +  src.negate = ~src.negate;
 +  dst.type = GLSL_TYPE_FLOAT;
 +  emit(ir, TGSI_OPCODE_F2I, dst, src);
 +   }
 +}
 +
  /**
   * Determines whether to use an integer, unsigned integer, or float opcode 
   * based on the operands and input opcode, then emits the result.
 @@ -604,7 +628,7 @@ glsl_to_tgsi_visitor::get_opcode(ir_instruction *ir, 
 unsigned op,
 if (src0.type == GLSL_TYPE_FLOAT || src1.type == GLSL_TYPE_FLOAT)
type = GLSL_TYPE_FLOAT;
 else if (native_integers)
 -  type = src0.type;
 +  type = src0.type == GLSL_TYPE_BOOL ? GLSL_TYPE_INT : src0.type;
  
  #define case4(c, f, i, u) \
 case TGSI_OPCODE_##c: \
 @@ -630,12 +654,7 @@ glsl_to_tgsi_visitor::get_opcode(ir_instruction *ir, 
 unsigned op,
case3(SGE, ISGE, USGE);
case3(SLT, ISLT, USLT);

 -  case2iu(SHL, SHL);
case2iu(ISHR, USHR);
 -  case2iu(NOT, NOT);
 -  case2iu(AND, AND);
 -  case2iu(OR, OR);
 -  case2iu(XOR, XOR);

default: break;
 }
 @@ -1389,7 +1408,7 @@ glsl_to_tgsi_visitor::visit(ir_expression *ir)
 switch (ir-operation) {
 case ir_unop_logic_not:
if (result_dst.type != GLSL_TYPE_FLOAT)
 - emit(ir, TGSI_OPCODE_SEQ, result_dst, op[0], 
 st_src_reg_for_type(result_dst.type, 0));
 + emit(ir, TGSI_OPCODE_NOT, result_dst, op[0]);
else {
   /* Previously 'SEQ dst, src, 0.0' was used for this.  However, many
* older GPUs implement SEQ using multiple instructions (i915 uses 
 two
 @@ -1489,10 +1508,10 @@ glsl_to_tgsi_visitor::visit(ir_expression *ir)
emit(ir, TGSI_OPCODE_SLT, result_dst, op[0], op[1]);
break;
 case ir_binop_greater:
 -  emit(ir, TGSI_OPCODE_SGT, result_dst, op[0], op[1]);
 +  emit(ir, TGSI_OPCODE_SLT, result_dst, op[1], op[0]);
break;
 case ir_binop_lequal:
 -  emit(ir, TGSI_OPCODE_SLE, result_dst, op[0], op[1]);
 +  emit(ir, TGSI_OPCODE_SGE, result_dst, op[1], op[0]);
break;
 case ir_binop_gequal:
emit(ir, TGSI_OPCODE_SGE, result_dst, op[0], op[1]);
 @@ -1605,41 +1624,52 @@ glsl_to_tgsi_visitor

Re: [Mesa-dev] [PATCH v2] glsl_to_tgsi, mesa: fixes for native integers and integer booleans

2011-09-02 Thread Bryan Cain

On 09/02/2011 06:13 PM, Eric Anholt wrote:
 On Wed, 31 Aug 2011 01:33:59 -0500, Bryan Cain bryanca...@gmail.com wrote:
 With this patch, there are no piglit regressions on softpipe with native
 integers enabled.  Unlike my previous patch, this uses integer values of
 ~0 and 0 for true and false, respectively, instead of the float values 1.0
 and 0.0.
 This will break b2* conversions on our driver, since we expect true to
 be 1, not ~0.  The minimal change would require emit(AND(temp, uniform,
 1)) when deferencing boolean components of uniform variables.

Can you please change your driver if the hardware is not able to support
it, then?  If that's not possible, we need to come up with a way to
upload booleans as ~0 for Gallium drivers and 1 for i965.

Bryan
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2] glsl_to_tgsi, mesa: fixes for native integers and integer booleans

2011-08-31 Thread Bryan Cain

With this patch, there are no piglit regressions on softpipe with native
integers enabled.  Unlike my previous patch, this uses integer values of
~0 and 0 for true and false, respectively, instead of the float values 1.0
and 0.0.
---
 src/mesa/main/uniforms.c   |6 +-
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp |  160 
 2 files changed, 116 insertions(+), 50 deletions(-)

diff --git a/src/mesa/main/uniforms.c b/src/mesa/main/uniforms.c
index cda840f..fa96fd3 100644
--- a/src/mesa/main/uniforms.c
+++ b/src/mesa/main/uniforms.c
@@ -777,12 +777,12 @@ set_program_uniform(struct gl_context *ctx, struct 
gl_program *program,
  if (isUniformBool) {
 for (i = 0; i  elems; i++) {
if (basicType == GL_FLOAT)
-  uniformVal[i].b = uniformVal[i].f != 0.0f ? 1 : 0;
+  uniformVal[i].u = uniformVal[i].f != 0.0f ? ~0 : 0;
else
-  uniformVal[i].b = uniformVal[i].u ? 1 : 0;
+  uniformVal[i].u = uniformVal[i].u ? ~0 : 0;

if (!ctx-Const.NativeIntegers)
-  uniformVal[i].f = uniformVal[i].b ? 1.0f : 0.0f;
+  uniformVal[i].f = uniformVal[i].u ? 1.0f : 0.0f;
 }
  }
   }
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 2266083..c8f790a 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -385,6 +385,8 @@ public:
void emit_scalar(ir_instruction *ir, unsigned op,
st_dst_reg dst, st_src_reg src0, st_src_reg src1);
 
+   void try_emit_float_set(ir_instruction *ir, unsigned op, st_dst_reg dst);
+
void emit_arl(ir_instruction *ir, st_dst_reg dst, st_src_reg src0);
 
void emit_scs(ir_instruction *ir, unsigned op,
@@ -562,7 +564,10 @@ glsl_to_tgsi_visitor::emit(ir_instruction *ir, unsigned op,
}
 
this-instructions.push_tail(inst);
-   
+
+   if (native_integers)
+  try_emit_float_set(ir, op, dst);
+
return inst;
 }
 
@@ -588,6 +593,25 @@ glsl_to_tgsi_visitor::emit(ir_instruction *ir, unsigned op)
return emit(ir, op, undef_dst, undef_src, undef_src, undef_src);
 }
 
+ /**
+ * Emits the code to convert the result of float SET instructions to integers.
+ */
+void
+glsl_to_tgsi_visitor::try_emit_float_set(ir_instruction *ir, unsigned op,
+st_dst_reg dst)
+{
+   if ((op == TGSI_OPCODE_SEQ ||
+op == TGSI_OPCODE_SNE ||
+op == TGSI_OPCODE_SGE ||
+op == TGSI_OPCODE_SLT))
+   {
+  st_src_reg src = st_src_reg(dst);
+  src.negate = ~src.negate;
+  dst.type = GLSL_TYPE_FLOAT;
+  emit(ir, TGSI_OPCODE_F2I, dst, src);
+   }
+}
+
 /**
  * Determines whether to use an integer, unsigned integer, or float opcode 
  * based on the operands and input opcode, then emits the result.
@@ -604,7 +628,7 @@ glsl_to_tgsi_visitor::get_opcode(ir_instruction *ir, 
unsigned op,
if (src0.type == GLSL_TYPE_FLOAT || src1.type == GLSL_TYPE_FLOAT)
   type = GLSL_TYPE_FLOAT;
else if (native_integers)
-  type = src0.type;
+  type = src0.type == GLSL_TYPE_BOOL ? GLSL_TYPE_INT : src0.type;
 
 #define case4(c, f, i, u) \
case TGSI_OPCODE_##c: \
@@ -630,12 +654,7 @@ glsl_to_tgsi_visitor::get_opcode(ir_instruction *ir, 
unsigned op,
   case3(SGE, ISGE, USGE);
   case3(SLT, ISLT, USLT);
   
-  case2iu(SHL, SHL);
   case2iu(ISHR, USHR);
-  case2iu(NOT, NOT);
-  case2iu(AND, AND);
-  case2iu(OR, OR);
-  case2iu(XOR, XOR);
   
   default: break;
}
@@ -1389,7 +1408,7 @@ glsl_to_tgsi_visitor::visit(ir_expression *ir)
switch (ir-operation) {
case ir_unop_logic_not:
   if (result_dst.type != GLSL_TYPE_FLOAT)
- emit(ir, TGSI_OPCODE_SEQ, result_dst, op[0], 
st_src_reg_for_type(result_dst.type, 0));
+ emit(ir, TGSI_OPCODE_NOT, result_dst, op[0]);
   else {
  /* Previously 'SEQ dst, src, 0.0' was used for this.  However, many
   * older GPUs implement SEQ using multiple instructions (i915 uses two
@@ -1489,10 +1508,10 @@ glsl_to_tgsi_visitor::visit(ir_expression *ir)
   emit(ir, TGSI_OPCODE_SLT, result_dst, op[0], op[1]);
   break;
case ir_binop_greater:
-  emit(ir, TGSI_OPCODE_SGT, result_dst, op[0], op[1]);
+  emit(ir, TGSI_OPCODE_SLT, result_dst, op[1], op[0]);
   break;
case ir_binop_lequal:
-  emit(ir, TGSI_OPCODE_SLE, result_dst, op[0], op[1]);
+  emit(ir, TGSI_OPCODE_SGE, result_dst, op[1], op[0]);
   break;
case ir_binop_gequal:
   emit(ir, TGSI_OPCODE_SGE, result_dst, op[0], op[1]);
@@ -1605,41 +1624,52 @@ glsl_to_tgsi_visitor::visit(ir_expression *ir)
}
 
case ir_binop_logic_xor:
-  emit(ir, TGSI_OPCODE_SNE, result_dst, op[0], op[1]);
+  if (native_integers)
+ emit(ir, TGSI_OPCODE_XOR, result_dst, op[0], op[1]);
+  else

[Mesa-dev] [PATCH] mesa: Replace the EmitNoIfs compiler flag with a MaxIfLevel flag.

2011-08-31 Thread Bryan Cain

This is a better, more fine-grained way of lowering if statements.  Fixes the
game And Yet It Moves on nv50.
---
 src/mesa/drivers/dri/i915/i915_context.c   |2 +-
 src/mesa/main/mtypes.h |6 +-
 src/mesa/program/ir_to_mesa.cpp|8 
 src/mesa/state_tracker/st_extensions.c |2 +-
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp |6 +++---
 5 files changed, 10 insertions(+), 14 deletions(-)

diff --git a/src/mesa/drivers/dri/i915/i915_context.c 
b/src/mesa/drivers/dri/i915/i915_context.c
index 11bee14..7a40ba1 100644
--- a/src/mesa/drivers/dri/i915/i915_context.c
+++ b/src/mesa/drivers/dri/i915/i915_context.c
@@ -189,7 +189,7 @@ i915CreateContext(int api,
 
struct gl_shader_compiler_options *const fs_options =
ctx-ShaderCompilerOptions[MESA_SHADER_FRAGMENT];
-   fs_options-EmitNoIfs = GL_TRUE;
+   fs_options-MaxIfLevel = 0;
fs_options-EmitNoNoise = GL_TRUE;
fs_options-EmitNoPow = GL_TRUE;
fs_options-EmitNoMainReturn = GL_TRUE;
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index f2eb889..9f95bcd 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2266,11 +2266,6 @@ struct gl_shader_compiler_options
/** Driver-selectable options: */
GLboolean EmitCondCodes; /** Use condition codes? */
GLboolean EmitNVTempInitialization;  /** 0-fill NV temp registers */
-   /**
-* Attempts to flatten all ir_if (OPCODE_IF) for GPUs that can't
-* support control flow.
-*/
-   GLboolean EmitNoIfs;
GLboolean EmitNoLoops;
GLboolean EmitNoFunctions;
GLboolean EmitNoCont;  /** Emit CONT opcode? */
@@ -2288,6 +2283,7 @@ struct gl_shader_compiler_options
GLboolean EmitNoIndirectUniform; /** No indirect addressing of constants */
/*@}*/
 
+   GLuint MaxIfLevel;   /** Maximum nested IF blocks */
GLuint MaxUnrollIterations;
 
struct gl_sl_pragmas DefaultPragmas; /** Default #pragma settings */
diff --git a/src/mesa/program/ir_to_mesa.cpp b/src/mesa/program/ir_to_mesa.cpp
index dd154db..7fb286e 100644
--- a/src/mesa/program/ir_to_mesa.cpp
+++ b/src/mesa/program/ir_to_mesa.cpp
@@ -3119,7 +3119,7 @@ get_mesa_program(struct gl_context *ctx,
 
   switch (mesa_inst-Opcode) {
   case OPCODE_IF:
-if (options-EmitNoIfs) {
+if (options-MaxIfLevel == 0) {
linker_warning(shader_program,
   Couldn't flatten if-statement.  
   This will likely result in software 
@@ -3241,10 +3241,10 @@ _mesa_ir_link_shader(struct gl_context *ctx, struct 
gl_shader_program *prog)
 
 progress = lower_quadop_vector(ir, true) || progress;
 
-if (options-EmitNoIfs) {
+if (options-MaxIfLevel == 0)
progress = lower_discard(ir) || progress;
-   progress = lower_if_to_cond_assign(ir) || progress;
-}
+
+progress = lower_if_to_cond_assign(ir, options-MaxIfLevel) || 
progress;
 
 if (options-EmitNoNoise)
progress = lower_noise(ir) || progress;
diff --git a/src/mesa/state_tracker/st_extensions.c 
b/src/mesa/state_tracker/st_extensions.c
index 8e90093..9f429d9 100644
--- a/src/mesa/state_tracker/st_extensions.c
+++ b/src/mesa/state_tracker/st_extensions.c
@@ -173,7 +173,7 @@ void st_init_limits(struct st_context *st)
   options-EmitNoNoise = TRUE;
 
   /* TODO: make these more fine-grained if anyone needs it */
-  options-EmitNoIfs = !screen-get_shader_param(screen, sh, 
PIPE_SHADER_CAP_MAX_CONTROL_FLOW_DEPTH);
+  options-MaxIfLevel = screen-get_shader_param(screen, sh, 
PIPE_SHADER_CAP_MAX_CONTROL_FLOW_DEPTH);
   options-EmitNoLoops = !screen-get_shader_param(screen, sh, 
PIPE_SHADER_CAP_MAX_CONTROL_FLOW_DEPTH);
   options-EmitNoFunctions = !screen-get_shader_param(screen, sh, 
PIPE_SHADER_CAP_SUBROUTINES);
   options-EmitNoMainReturn = !screen-get_shader_param(screen, sh, 
PIPE_SHADER_CAP_SUBROUTINES);
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 98bea69..f5232af 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -4991,10 +4991,10 @@ st_link_shader(struct gl_context *ctx, struct 
gl_shader_program *prog)
 
  progress = lower_quadop_vector(ir, false) || progress;
 
- if (options-EmitNoIfs) {
+ if (options-MaxIfLevel == 0)
 progress = lower_discard(ir) || progress;
-progress = lower_if_to_cond_assign(ir) || progress;
- }
+
+ progress = lower_if_to_cond_assign(ir, options-MaxIfLevel) || 
progress;
 
  if (options-EmitNoNoise)
 progress = lower_noise(ir) || progress;
-- 
1.7.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] Missing integer set opcodes in Gallium

2011-08-29 Thread Bryan Cain

I somehow didn't notice this until now, but why are there no Gallium
set opcodes for integers that are equivalent to the SGT or SLE opcodes
for floats?  Does it have to do with available hardware features?

Bryan
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] Missing integer set opcodes in Gallium

2011-08-29 Thread Bryan Cain

On 08/29/2011 03:41 PM, Zack Rusin wrote:
 On Monday, August 29, 2011 04:02:08 PM Zack Rusin wrote:
 Either way though if GL needs those ops then, like Brian mentioned, it'd be
 a  good idea to add them.
 Actually I think it seems easier to just flip the ops rather than add new 
 instructions, i.e. change stuff like a  b, to b  a, and a = b to b = a. 
 Just an opinion though, I don't have strong preferences here.

You're right; that's an easier solution.  That way, the only opcode I'll
need to add is UARL.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] mesa, glsl_to_tgsi: fixes for native integers

2011-08-28 Thread Bryan Cain

On 08/26/2011 09:58 PM, Bryan Cain wrote:
 This fixes all of the piglit regressions in softpipe when native integers are
 enabled.
 ---
  src/mesa/main/uniforms.c   |8 +
  src/mesa/state_tracker/st_glsl_to_tgsi.cpp |   45 
 ++--
  2 files changed, 43 insertions(+), 10 deletions(-)

 diff --git a/src/mesa/main/uniforms.c b/src/mesa/main/uniforms.c
 index cda840f..0801476 100644
 --- a/src/mesa/main/uniforms.c
 +++ b/src/mesa/main/uniforms.c
 @@ -776,13 +776,7 @@ set_program_uniform(struct gl_context *ctx, struct 
 gl_program *program,
   /* if the uniform is bool-valued, convert to 1 or 0 */
   if (isUniformBool) {
  for (i = 0; i  elems; i++) {
 -   if (basicType == GL_FLOAT)
 -  uniformVal[i].b = uniformVal[i].f != 0.0f ? 1 : 0;
 -   else
 -  uniformVal[i].b = uniformVal[i].u ? 1 : 0;
 -   
 -   if (!ctx-Const.NativeIntegers)
 -  uniformVal[i].f = uniformVal[i].b ? 1.0f : 0.0f;
 +   uniformVal[i].f = uniformVal[i].f != 0.0f ? 1.0f : 0.0f;
  }
   }
}

I'd like to push this to master (with one typo fix in the glsl_to_tgsi
part of the patch) since no more objections have been raised, but could
someone verify that this core Mesa change is okay?

Bryan

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] glsl: Use a separate div_to_mul_rcp lowering flag for integers.

2011-08-28 Thread Bryan Cain

On 08/28/2011 07:38 PM, Eric Anholt wrote:
 On Sat, 27 Aug 2011 20:18:55 -0700, Kenneth Graunke kenn...@whitecape.org 
 wrote:
 From: Bryan Cain bryanca...@gmail.com

 Using multiply and reciprocal for integer division involves potentially
 lossy floating point conversions.  This is okay for older GPUs that
 represent integers as floating point, but undesirable for GPUs with
 native integer division instructions.

 TGSI, for example, has UDIV/IDIV instructions for integer division,
 so it makes sense to handle this directly.  Likewise for i965.

 Signed-off-by: Bryan Cain bryanca...@gmail.com
 Signed-off-by: Kenneth Graunke kenn...@whitecape.org

 ---
 case ir_binop_div:
 -  if (lowering(DIV_TO_MUL_RCP))
 +  if (lowering(INT_DIV_TO_MUL_RCP)  
 ir-operands[1]-type-is_integer())
 + int_div_to_mul_rcp(ir);
 +  else if (lowering(DIV_TO_MUL_RCP))
   div_to_mul_rcp(ir);
break;
  
 Sure looks odd to me for one of these to be checking the type and ther
 other not.

It works, though.  If it's not an integer type, it's going to be a float
type.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] mesa, glsl_to_tgsi: fixes for native integers

2011-08-27 Thread Bryan Cain

On 08/27/2011 05:39 AM, Christoph Bumiller wrote:
 On 27.08.2011 04:58, Bryan Cain wrote:
 This fixes all of the piglit regressions in softpipe when native integers are
 enabled.
 ---
  src/mesa/main/uniforms.c   |8 +
  src/mesa/state_tracker/st_glsl_to_tgsi.cpp |   45 
 ++--
  2 files changed, 43 insertions(+), 10 deletions(-)

 diff --git a/src/mesa/main/uniforms.c b/src/mesa/main/uniforms.c
 index cda840f..0801476 100644
 --- a/src/mesa/main/uniforms.c
 +++ b/src/mesa/main/uniforms.c
 @@ -776,13 +776,7 @@ set_program_uniform(struct gl_context *ctx, struct 
 gl_program *program,
   /* if the uniform is bool-valued, convert to 1 or 0 */
   if (isUniformBool) {
  for (i = 0; i  elems; i++) {
 -   if (basicType == GL_FLOAT)
 -  uniformVal[i].b = uniformVal[i].f != 0.0f ? 1 : 0;
 -   else
 -  uniformVal[i].b = uniformVal[i].u ? 1 : 0;
 -   
 -   if (!ctx-Const.NativeIntegers)
 -  uniformVal[i].f = uniformVal[i].b ? 1.0f : 0.0f;
 +   uniformVal[i].f = uniformVal[i].f != 0.0f ? 1.0f : 0.0f;
  }
   }
}
 diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
 b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 index 9cac309..d1674eb 100644
 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 @@ -385,6 +385,8 @@ public:
 void emit_scalar(ir_instruction *ir, unsigned op,
  st_dst_reg dst, st_src_reg src0, st_src_reg src1);
  
 +   void try_emit_integer_set(ir_instruction *ir, unsigned op, st_dst_reg 
 dst);
 +
 void emit_arl(ir_instruction *ir, st_dst_reg dst, st_src_reg src0);
  
 void emit_scs(ir_instruction *ir, unsigned op,
 @@ -563,6 +565,8 @@ glsl_to_tgsi_visitor::emit(ir_instruction *ir, unsigned 
 op,
  
 this-instructions.push_tail(inst);
 
 +   try_emit_integer_set(ir, op, dst);
 +   
 return inst;
  }
  
 @@ -589,6 +593,32 @@ glsl_to_tgsi_visitor::emit(ir_instruction *ir, unsigned 
 op)
  }
  
  /**
 + * Emits the code to convert the result of integer SET instructions to 
 floats.
 + */
 +void
 +glsl_to_tgsi_visitor::try_emit_integer_set(ir_instruction *ir, unsigned op,
 + st_dst_reg dst)
 +{
 +   st_src_reg src;
 +
 +   if (!(op == TGSI_OPCODE_USEQ ||
 + op == TGSI_OPCODE_USNE ||
 + op == TGSI_OPCODE_ISGE ||
 + op == TGSI_OPCODE_USGE ||
 + op == TGSI_OPCODE_ISLT ||
 + op == TGSI_OPCODE_USLT))
 +   return;
 +
 +   src = st_src_reg(dst);
 +   src.type = GLSL_TYPE_INT;
 +
 +   dst.type = GLSL_TYPE_FLOAT;
 +   emit(ir, TGSI_OPCODE_I2F, dst, src);
 +   src.type = GLSL_TYPE_FLOAT;
 +   emit(ir, TGSI_OPCODE_SNE, dst, src, st_src_reg_for_float(0.0));
 +}
 +
 emit(ir, TGSI_OPCODE_AND, dst, src, st_src_reg_for_float(1.0f));

 I still don't quite like booleans as floats, but I guess I can just
 easily track back to the source SET to emit my set-predicate-register op
 without having all the fuss in between.

For now, I'm trying to get things working.  Once integers are working on
softpipe and r600g with no piglit regressions, I'll look into
optimizations like using AND.

 +/**
   * Determines whether to use an integer, unsigned integer, or float opcode 
   * based on the operands and input opcode, then emits the result.
   * 
 @@ -1662,7 +1692,6 @@ glsl_to_tgsi_visitor::visit(ir_expression *ir)
emit_scalar(ir, TGSI_OPCODE_RSQ, result_dst, op[0]);
break;
 case ir_unop_i2f:
 -   case ir_unop_b2f:
if (native_integers) {
   emit(ir, TGSI_OPCODE_I2F, result_dst, op[0]);
   break;
 @@ -1670,10 +1699,14 @@ glsl_to_tgsi_visitor::visit(ir_expression *ir)
 case ir_unop_i2u:
 case ir_unop_u2i:
/* Converting between signed and unsigned integers is a no-op. */
 -   case ir_unop_b2i:
 -  /* Booleans are stored as integers (or floats in GLSL 1.20 and 
 lower). */
 +   case ir_unop_b2f:
 +  /* Booleans are stored as floats. */
result_src = op[0];
break;
 +   case ir_unop_b2i:
 +  if (native_integers)
 + emit(ir, TGSI_OPCODE_F2I, result_dst, op[0]);
 +  break;
 case ir_unop_f2i:
if (native_integers)
   emit(ir, TGSI_OPCODE_F2I, result_dst, op[0]);
 @@ -1681,6 +1714,11 @@ glsl_to_tgsi_visitor::visit(ir_expression *ir)
   emit(ir, TGSI_OPCODE_TRUNC, result_dst, op[0]);
break;
 case ir_unop_f2b:
 +  if (native_integers) {
 + emit(ir, TGSI_OPCODE_SNE, result_dst, op[0], 
 st_src_reg_for_float(0.0));
 + emit(ir, TGSI_OPCODE_F2I, result_dst, result_src);
 This is inconsistent, why do you convert to integer here if you store
 all booleans as floats ?

 +  }
 +  break;
 case ir_unop_i2b:
emit(ir, TGSI_OPCODE_SNE, result_dst, op[0], 
  st_src_reg_for_type(result_dst.type, 0));
 This can go wrong

[Mesa-dev] [PATCH] glsl: use a separate div_to_mul_rcp lowering flag for integers

2011-08-27 Thread Bryan Cain

TGSI, at the very least, has UDIV/IDIV instructions for integer division.
---
 src/glsl/ir_optimization.h |   13 +++--
 src/glsl/lower_instructions.cpp|4 +++-
 src/mesa/drivers/dri/i965/brw_shader.cpp   |1 +
 src/mesa/program/ir_to_mesa.cpp|2 +-
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp |2 +-
 5 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/src/glsl/ir_optimization.h b/src/glsl/ir_optimization.h
index f7808bd..48448d4 100644
--- a/src/glsl/ir_optimization.h
+++ b/src/glsl/ir_optimization.h
@@ -29,12 +29,13 @@
  */
 
 /* Operations for lower_instructions() */
-#define SUB_TO_ADD_NEG 0x01
-#define DIV_TO_MUL_RCP 0x02
-#define EXP_TO_EXP20x04
-#define POW_TO_EXP20x08
-#define LOG_TO_LOG20x10
-#define MOD_TO_FRACT   0x20
+#define SUB_TO_ADD_NEG 0x01
+#define DIV_TO_MUL_RCP 0x02
+#define EXP_TO_EXP20x04
+#define POW_TO_EXP20x08
+#define LOG_TO_LOG20x10
+#define MOD_TO_FRACT   0x20
+#define INT_DIV_TO_MUL_RCP 0x40
 
 bool do_common_optimization(exec_list *ir, bool linked, unsigned 
max_unroll_iterations);
 
diff --git a/src/glsl/lower_instructions.cpp b/src/glsl/lower_instructions.cpp
index 23aa19b..fd0c760 100644
--- a/src/glsl/lower_instructions.cpp
+++ b/src/glsl/lower_instructions.cpp
@@ -265,7 +265,9 @@ lower_instructions_visitor::visit_leave(ir_expression *ir)
   break;
 
case ir_binop_div:
-  if (lowering(DIV_TO_MUL_RCP))
+  if (lowering(INT_DIV_TO_MUL_RCP)  ir-operands[1]-type-is_integer())
+div_to_mul_rcp(ir);
+  else if (lowering(DIV_TO_MUL_RCP))
 div_to_mul_rcp(ir);
   break;
 
diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp 
b/src/mesa/drivers/dri/i965/brw_shader.cpp
index 3ff6bba..7e53097 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.cpp
+++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
@@ -100,6 +100,7 @@ brw_link_shader(struct gl_context *ctx, struct 
gl_shader_program *prog)
   lower_instructions(shader-ir,
 MOD_TO_FRACT |
 DIV_TO_MUL_RCP |
+INT_DIV_TO_MUL_RCP |
 SUB_TO_ADD_NEG |
 EXP_TO_EXP2 |
 LOG_TO_LOG2);
diff --git a/src/mesa/program/ir_to_mesa.cpp b/src/mesa/program/ir_to_mesa.cpp
index 6820e4c..dd154db 100644
--- a/src/mesa/program/ir_to_mesa.cpp
+++ b/src/mesa/program/ir_to_mesa.cpp
@@ -3232,7 +3232,7 @@ _mesa_ir_link_shader(struct gl_context *ctx, struct 
gl_shader_program *prog)
 /* Lowering */
 do_mat_op_to_vec(ir);
 lower_instructions(ir, (MOD_TO_FRACT | DIV_TO_MUL_RCP | EXP_TO_EXP2
-| LOG_TO_LOG2
+| LOG_TO_LOG2 | INT_DIV_TO_MUL_RCP
 | ((options-EmitNoPow) ? POW_TO_EXP2 : 0)));
 
 progress = do_lower_jumps(ir, true, true, options-EmitNoMainReturn, 
options-EmitNoCont, options-EmitNoLoops) || progress;
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 9cac309..7aef4aa 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -5012,7 +5012,7 @@ st_link_shader(struct gl_context *ctx, struct 
gl_shader_program *prog)
  /* Lowering */
  do_mat_op_to_vec(ir);
  lower_instructions(ir, (MOD_TO_FRACT | DIV_TO_MUL_RCP | EXP_TO_EXP2
-| LOG_TO_LOG2
+| LOG_TO_LOG2 | INT_DIV_TO_MUL_RCP
 | ((options-EmitNoPow) ? POW_TO_EXP2 : 0)));
 
  progress = do_lower_jumps(ir, true, true, options-EmitNoMainReturn, 
options-EmitNoCont, options-EmitNoLoops) || progress;
-- 
1.7.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] mesa, glsl_to_tgsi: fixes for native integers

2011-08-26 Thread Bryan Cain

This fixes all of the piglit regressions in softpipe when native integers are
enabled.
---
 src/mesa/main/uniforms.c   |8 +
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp |   45 ++--
 2 files changed, 43 insertions(+), 10 deletions(-)

diff --git a/src/mesa/main/uniforms.c b/src/mesa/main/uniforms.c
index cda840f..0801476 100644
--- a/src/mesa/main/uniforms.c
+++ b/src/mesa/main/uniforms.c
@@ -776,13 +776,7 @@ set_program_uniform(struct gl_context *ctx, struct 
gl_program *program,
  /* if the uniform is bool-valued, convert to 1 or 0 */
  if (isUniformBool) {
 for (i = 0; i  elems; i++) {
-   if (basicType == GL_FLOAT)
-  uniformVal[i].b = uniformVal[i].f != 0.0f ? 1 : 0;
-   else
-  uniformVal[i].b = uniformVal[i].u ? 1 : 0;
-   
-   if (!ctx-Const.NativeIntegers)
-  uniformVal[i].f = uniformVal[i].b ? 1.0f : 0.0f;
+   uniformVal[i].f = uniformVal[i].f != 0.0f ? 1.0f : 0.0f;
 }
  }
   }
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 9cac309..d1674eb 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -385,6 +385,8 @@ public:
void emit_scalar(ir_instruction *ir, unsigned op,
st_dst_reg dst, st_src_reg src0, st_src_reg src1);
 
+   void try_emit_integer_set(ir_instruction *ir, unsigned op, st_dst_reg dst);
+
void emit_arl(ir_instruction *ir, st_dst_reg dst, st_src_reg src0);
 
void emit_scs(ir_instruction *ir, unsigned op,
@@ -563,6 +565,8 @@ glsl_to_tgsi_visitor::emit(ir_instruction *ir, unsigned op,
 
this-instructions.push_tail(inst);

+   try_emit_integer_set(ir, op, dst);
+   
return inst;
 }
 
@@ -589,6 +593,32 @@ glsl_to_tgsi_visitor::emit(ir_instruction *ir, unsigned op)
 }
 
 /**
+ * Emits the code to convert the result of integer SET instructions to floats.
+ */
+void
+glsl_to_tgsi_visitor::try_emit_integer_set(ir_instruction *ir, unsigned op,
+st_dst_reg dst)
+{
+   st_src_reg src;
+
+   if (!(op == TGSI_OPCODE_USEQ ||
+ op == TGSI_OPCODE_USNE ||
+ op == TGSI_OPCODE_ISGE ||
+ op == TGSI_OPCODE_USGE ||
+ op == TGSI_OPCODE_ISLT ||
+ op == TGSI_OPCODE_USLT))
+  return;
+
+   src = st_src_reg(dst);
+   src.type = GLSL_TYPE_INT;
+
+   dst.type = GLSL_TYPE_FLOAT;
+   emit(ir, TGSI_OPCODE_I2F, dst, src);
+   src.type = GLSL_TYPE_FLOAT;
+   emit(ir, TGSI_OPCODE_SNE, dst, src, st_src_reg_for_float(0.0));
+}
+
+/**
  * Determines whether to use an integer, unsigned integer, or float opcode 
  * based on the operands and input opcode, then emits the result.
  * 
@@ -1662,7 +1692,6 @@ glsl_to_tgsi_visitor::visit(ir_expression *ir)
   emit_scalar(ir, TGSI_OPCODE_RSQ, result_dst, op[0]);
   break;
case ir_unop_i2f:
-   case ir_unop_b2f:
   if (native_integers) {
  emit(ir, TGSI_OPCODE_I2F, result_dst, op[0]);
  break;
@@ -1670,10 +1699,14 @@ glsl_to_tgsi_visitor::visit(ir_expression *ir)
case ir_unop_i2u:
case ir_unop_u2i:
   /* Converting between signed and unsigned integers is a no-op. */
-   case ir_unop_b2i:
-  /* Booleans are stored as integers (or floats in GLSL 1.20 and lower). */
+   case ir_unop_b2f:
+  /* Booleans are stored as floats. */
   result_src = op[0];
   break;
+   case ir_unop_b2i:
+  if (native_integers)
+ emit(ir, TGSI_OPCODE_F2I, result_dst, op[0]);
+  break;
case ir_unop_f2i:
   if (native_integers)
  emit(ir, TGSI_OPCODE_F2I, result_dst, op[0]);
@@ -1681,6 +1714,11 @@ glsl_to_tgsi_visitor::visit(ir_expression *ir)
  emit(ir, TGSI_OPCODE_TRUNC, result_dst, op[0]);
   break;
case ir_unop_f2b:
+  if (native_integers) {
+ emit(ir, TGSI_OPCODE_SNE, result_dst, op[0], 
st_src_reg_for_float(0.0));
+ emit(ir, TGSI_OPCODE_F2I, result_dst, result_src);
+  }
+  break;
case ir_unop_i2b:
   emit(ir, TGSI_OPCODE_SNE, result_dst, op[0], 
 st_src_reg_for_type(result_dst.type, 0));
@@ -2154,6 +2192,7 @@ glsl_to_tgsi_visitor::visit(ir_assignment *ir)
   inst = (glsl_to_tgsi_instruction *)this-instructions.get_tail();
   new_inst = emit(ir, inst-op, l, inst-src[0], inst-src[1], 
inst-src[2]);
   new_inst-saturate = inst-saturate;
+  inst-dead_mask = inst-dst.writemask;
} else {
   for (i = 0; i  type_size(ir-lhs-type); i++) {
  emit(ir, TGSI_OPCODE_MOV, l, r);
-- 
1.7.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 3/3] st_glsl_to_tgsi: implement TXS/TXQ. (v2)

2011-08-25 Thread Bryan Cain

The usual commit prefix for the GLSL-TGSI translator is glsl_to_tgsi.

Other than that,
Reviewed-by: Bryan Cain bryanca...@gmail.com

On 08/25/2011 09:46 AM, Dave Airlie wrote:
 From: Dave Airlie airl...@redhat.com

 GLSL uses TXS, call the gallium TXQ opcode.

 v2: fix indent from 4-3.

 Signed-off-by: Dave Airlie airl...@redhat.com
 ---
  src/mesa/state_tracker/st_glsl_to_tgsi.cpp |   27 ++-
  1 files changed, 18 insertions(+), 9 deletions(-)

 diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
 b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 index 6f0d9fa..fb5060c 100644
 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 @@ -2426,16 +2426,18 @@ glsl_to_tgsi_visitor::visit(ir_texture *ir)
 glsl_to_tgsi_instruction *inst = NULL;
 unsigned opcode = TGSI_OPCODE_NOP;
  
 -   ir-coordinate-accept(this);
 +   if (ir-coordinate) {
 +  ir-coordinate-accept(this);
  
 -   /* Put our coords in a temp.  We'll need to modify them for shadow,
 -* projection, or LOD, so the only case we'd use it as is is if
 -* we're doing plain old texturing.  The optimization passes on
 -* glsl_to_tgsi_visitor should handle cleaning up our mess in that case.
 -*/
 -   coord = get_temp(glsl_type::vec4_type);
 -   coord_dst = st_dst_reg(coord);
 -   emit(ir, TGSI_OPCODE_MOV, coord_dst, this-result);
 +  /* Put our coords in a temp.  We'll need to modify them for shadow,
 +   * projection, or LOD, so the only case we'd use it as is is if
 +   * we're doing plain old texturing.  The optimization passes on
 +   * glsl_to_tgsi_visitor should handle cleaning up our mess in that 
 case.
 +   */
 +  coord = get_temp(glsl_type::vec4_type);
 +  coord_dst = st_dst_reg(coord);
 +  emit(ir, TGSI_OPCODE_MOV, coord_dst, this-result);
 +   }
  
 if (ir-projector) {
ir-projector-accept(this);
 @@ -2470,6 +2472,10 @@ glsl_to_tgsi_visitor::visit(ir_texture *ir)
dy = this-result;
break;
 case ir_txs:
 +  opcode = TGSI_OPCODE_TXQ;
 +  ir-lod_info.lod-accept(this);
 +  lod_info = this-result;
 +  break;
 case ir_txf: /* TODO: use TGSI_OPCODE_TXF here */
assert(!GLSL 1.30 features unsupported);
break;
 @@ -2544,6 +2550,8 @@ glsl_to_tgsi_visitor::visit(ir_texture *ir)
  
 if (opcode == TGSI_OPCODE_TXD)
inst = emit(ir, opcode, result_dst, coord, dx, dy);
 +   else if (opcode == TGSI_OPCODE_TXQ)
 +  inst = emit(ir, opcode, result_dst, lod_info);
 else
inst = emit(ir, opcode, result_dst, coord);
  
 @@ -4276,6 +4284,7 @@ compile_tgsi_instruction(struct st_translate *t,
 case TGSI_OPCODE_TXD:
 case TGSI_OPCODE_TXL:
 case TGSI_OPCODE_TXP:
 +   case TGSI_OPCODE_TXQ:
src[num_src++] = t-samplers[inst-sampler];
ureg_tex_insn(ureg,
  inst-op,
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 3/3] glsl-tgsi: add TXF support.

2011-08-25 Thread Bryan Cain

Like the other patch, the commit prefix should probably be glsl_to_tgsi.

On 08/25/2011 09:51 AM, Dave Airlie wrote:
 From: Dave Airlie airl...@redhat.com

 This adds texelFetch support to translate from GLSL to TGSI TXF opcode.

 I've tested this works with an r600g and softpipe backend.

 Signed-off-by: Dave Airlie airl...@redhat.com
 ---
  src/mesa/state_tracker/st_glsl_to_tgsi.cpp |8 ++--
  1 files changed, 6 insertions(+), 2 deletions(-)

 diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
 b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 index fb5060c..8f9b843 100644
 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 @@ -2477,7 +2477,9 @@ glsl_to_tgsi_visitor::visit(ir_texture *ir)
lod_info = this-result;
break;
 case ir_txf: /* TODO: use TGSI_OPCODE_TXF here */

You can remove the comment on the above line now that TXF is actually
being used there.

 -  assert(!GLSL 1.30 features unsupported);
 +  opcode = TGSI_OPCODE_TXF;
 +  ir-lod_info.lod-accept(this);
 +  lod_info = this-result;
break;
 }
  
 @@ -2541,7 +2543,8 @@ glsl_to_tgsi_visitor::visit(ir_texture *ir)
coord_dst.writemask = WRITEMASK_XYZW;
 }
  
 -   if (opcode == TGSI_OPCODE_TXL || opcode == TGSI_OPCODE_TXB) {
 +   if (opcode == TGSI_OPCODE_TXL || opcode == TGSI_OPCODE_TXB ||
 +   opcode == TGSI_OPCODE_TXF) {
/* TGSI stores LOD or LOD bias in the last channel of the coords. */
coord_dst.writemask = WRITEMASK_W;
emit(ir, TGSI_OPCODE_MOV, coord_dst, lod_info);
 @@ -4285,6 +4288,7 @@ compile_tgsi_instruction(struct st_translate *t,
 case TGSI_OPCODE_TXL:
 case TGSI_OPCODE_TXP:
 case TGSI_OPCODE_TXQ:
 +   case TGSI_OPCODE_TXF:
src[num_src++] = t-samplers[inst-sampler];
ureg_tex_insn(ureg,
  inst-op,

Aside from those things,
Reviewed-by: Bryan Cain bryanca...@gmail.com
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] is_tex bit for TGSI sampling instructions

2011-08-24 Thread Bryan Cain

On 08/24/2011 07:26 AM, Michal Krol wrote:
 - Original Message -
 Any reasons

{ 1, 2, 0, 0, 0, 0, LOAD,TGSI_OPCODE_LOAD },
{ 1, 2, 0, 0, 0, 0, LOAD_MS, TGSI_OPCODE_LOAD_MS },
{ 1, 3, 0, 0, 0, 0, SAMPLE,  TGSI_OPCODE_SAMPLE },
{ 1, 4, 0, 0, 0, 0, SAMPLE_B,TGSI_OPCODE_SAMPLE_B },
{ 1, 4, 0, 0, 0, 0, SAMPLE_C,TGSI_OPCODE_SAMPLE_C },
{ 1, 4, 0, 0, 0, 0, SAMPLE_C_LZ, TGSI_OPCODE_SAMPLE_C_LZ },
{ 1, 5, 0, 0, 0, 0, SAMPLE_D,TGSI_OPCODE_SAMPLE_D },
{ 1, 3, 0, 0, 0, 0, SAMPLE_L,TGSI_OPCODE_SAMPLE_L },
{ 1, 3, 0, 0, 0, 0, GATHER4, TGSI_OPCODE_GATHER4 },
{ 1, 2, 0, 0, 0, 0, RESINFO, TGSI_OPCODE_RESINFO },
{ 1, 2, 0, 0, 0, 0, SAMPLE_POS,  TGSI_OPCODE_SAMPLE_POS },
{ 1, 2, 0, 0, 0, 0, SAMPLE_INFO, TGSI_OPCODE_SAMPLE_INFO },

 these don't have the is_tex bit set?

 This bit marks legacy texture sample instructions that have texture target 
 encoded directly into the instruction token. In contrast, the instructions 
 above take an extra argument of type RESOURCE, for which the texture target 
 is encoded in register declaration.

Like Dave said, the GLSL-TGSI translator needs to account for this. 
I'm not sure on how to initialize a resource in TGSI, though.  Is there
a state tracker that uses it that could provide an example?

I also have an unrelated TGSI question: has it been determined exactly
what the source regs for TXF should be and what order they should be in?

Bryan
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] is_tex bit for TGSI sampling instructions

2011-08-24 Thread Bryan Cain

On 08/24/2011 11:42 PM, Zack Rusin wrote:
 On Wednesday, August 24, 2011 10:14:48 PM Bryan Cain wrote:
 Like Dave said, the GLSL-TGSI translator needs to account for this.
 Probably not, at least yet. All of those instructions are DX10.1 level 
 instructions which support splitting of samplers and the underlaying 
 resource, 
 we don't support that yet.

 I'm not sure on how to initialize a resource in TGSI, though.  Is there
 a state tracker that uses it that could provide an example?
 I don't think any of the open drivers actually implement this scheme quite 
 yet. The resources are declared using ureg_DECL_resource, the resulting code 
 would look something like:
 DCL RES[0], 2D, FLOAT

 LOAD DST[0], SRC[0], RES[0]

Okay, thanks for your answer.

 I also have an unrelated TGSI question: has it been determined exactly
 what the source regs for TXF should be and what order they should be in?
 Two four component integer vectors, first is the coordinate the other is the 
 offset. I think gpu_program4 specifies that instruction.

 Ultimately I wanted to replace all the texture sampling/fetching/loading 
 instructions with the DX10.1 variants. There's less of them and they're 
 cleaner. Drivers which don't support that split could just ignore it. For 
 various reasons I never finished it so this texture sampling code is quite 
 frankly half-assed, I'd just use the documented/driver implemented opcodes 
 and 
 leave others untouched.

 z

The problem is that we need these for GLSL 1.30.  Since all the pieces
are falling into place on the Mesa side, we need to have support for
these operations soon on the Gallium side.  That's why I want someone to
actually implement these operations in at least one driver.  Especially
TXF; how hard would that really be for someone who understands the code
to implement in tgsi_exec?

Bryan
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 3/5] mesa: Make the gl_constant_value's bool occupy the same space as float/int.

2011-08-20 Thread Bryan Cain

On 08/20/2011 03:05 PM, Dan McCabe wrote:
 What are the implications for other architectures that support doubles?

I don't see what you mean.  gl_constant_value doesn't support doubles yet.

Bryan


 On 08/19/2011 05:56 PM, Eric Anholt wrote:
 At least for Intel, all our uniform components are of uint32_t size,
 either
 float or signed or unsigned int.  For uploading uniform data in the
 driver,
 it's much easier to upload a full dword per uniform element instead
 of trying
 to pick out the bool byte and then fill in the top 3 bytes of pad
 with 0.
 ---
   src/mesa/program/prog_parameter.h |2 +-
   1 files changed, 1 insertions(+), 1 deletions(-)

 diff --git a/src/mesa/program/prog_parameter.h
 b/src/mesa/program/prog_parameter.h
 index 1a5ed34..4c2773a 100644
 --- a/src/mesa/program/prog_parameter.h
 +++ b/src/mesa/program/prog_parameter.h
 @@ -53,7 +53,7 @@
   typedef union gl_constant_value
   {
  GLfloat f;
 -   GLboolean b;
 +   GLint b;
  GLint i;
  GLuint u;
   } gl_constant_value;
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 3/5] mesa: Make the gl_constant_value's bool occupy the same space as float/int.

2011-08-20 Thread Bryan Cain

On 08/20/2011 05:10 PM, Dan McCabe wrote:
 On 08/20/2011 01:30 PM, Bryan Cain wrote:
 On 08/20/2011 03:05 PM, Dan McCabe wrote:
 What are the implications for other architectures that support doubles?

 I don't see what you mean.  gl_constant_value doesn't support doubles
 yet.

 Yet - that is the operative word.

 You can buy GPUs that support doubles today. Therefore, double support
 should be on our radar (it appears to be on Eric's radar, based on his
 commit comment). And we should understand what the implications are
 for our code. Clearly, I don't understand the implications; if I did,
 I wouldn't have asked. But perhaps Eric might.

 There are a couple of possible answers.

 I don't know - OK, but at least let's ask the question and start
 thinking about it's answer.

 No impact - I like that answer.

 It affects this, and this, and this - While not ideal, at least we
 then know what to do in the future and prepare ourselves for that future.

 cheers, danm

This particular commit has no impact at all.  A core assumption of
gl_constant_value is that 4 of each type in the union will fit into a
vec4, so the gl_constant_value code will need to be reworked when double
support is added.  But this commit doesn't afffect that in any way.

Bryan



 Bryan


 On 08/19/2011 05:56 PM, Eric Anholt wrote:
 At least for Intel, all our uniform components are of uint32_t size,
 either
 float or signed or unsigned int.  For uploading uniform data in the
 driver,
 it's much easier to upload a full dword per uniform element instead
 of trying
 to pick out the bool byte and then fill in the top 3 bytes of pad
 with 0.
 ---
src/mesa/program/prog_parameter.h |2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

 diff --git a/src/mesa/program/prog_parameter.h
 b/src/mesa/program/prog_parameter.h
 index 1a5ed34..4c2773a 100644
 --- a/src/mesa/program/prog_parameter.h
 +++ b/src/mesa/program/prog_parameter.h
 @@ -53,7 +53,7 @@
typedef union gl_constant_value
{
   GLfloat f;
 -   GLboolean b;
 +   GLint b;
   GLint i;
   GLuint u;
} gl_constant_value;


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] st/mesa: fix incorrect loop over instruction src regs

2011-08-17 Thread Bryan Cain

The usual commit message prefix for changes to glsl_to_tgsi is
glsl_to_tgsi, not st/mesa.

On 08/16/2011 05:33 PM, Brian Paul wrote:
 The array of src regs is of size 3, not 4.
 ---
  src/mesa/state_tracker/st_glsl_to_tgsi.cpp |2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)

 diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
 b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 index aef23e7..7b90c81 100644
 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 @@ -3443,7 +3443,7 @@ glsl_to_tgsi_visitor::eliminate_dead_code_advanced(void)
   /* Continuing the block, clear any channels from the write array 
 that
* are read by this instruction.
*/
 - for (int i = 0; i  4; i++) {
 + for (unsigned i = 0; i  Elements(inst-src); i++) {

Why not just use 3 here?

  if (inst-src[i].file == PROGRAM_TEMPORARY  
 inst-src[i].reladdr){
 /* Any temporary might be read, so no dead code elimination 
  * across this instruction.

You're right that the change itself is needed, though.

Bryan

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] st/mesa: fix incorrect loop over instruction src regs

2011-08-17 Thread Bryan Cain

On 08/17/2011 09:47 AM, Keith Whitwell wrote:
 On Wed, 2011-08-17 at 09:36 -0500, Bryan Cain wrote:
 The usual commit message prefix for changes to glsl_to_tgsi is
 glsl_to_tgsi, not st/mesa.

 On 08/16/2011 05:33 PM, Brian Paul wrote:
 The array of src regs is of size 3, not 4.
 ---
  src/mesa/state_tracker/st_glsl_to_tgsi.cpp |2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)

 diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
 b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 index aef23e7..7b90c81 100644
 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 @@ -3443,7 +3443,7 @@ 
 glsl_to_tgsi_visitor::eliminate_dead_code_advanced(void)
   /* Continuing the block, clear any channels from the write array 
 that
* are read by this instruction.
*/
 - for (int i = 0; i  4; i++) {
 + for (unsigned i = 0; i  Elements(inst-src); i++) {
 Why not just use 3 here?
 Elements(inst-src) is self-documenting.  

 3 is just a number and to figure out if it was the correct number you'd
 have to go and look at the header file to see if it matched the value
 there.

 Both should generate the same compiled code.

 Keith

Fair enough.  The patch was pushed to master shortly before I sent that
mail, so there's not much point in discussing it further.

Bryan

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] st/mesa: inline st_prepare_fragment_program in st_translate_fragment_program

2011-08-07 Thread Bryan Cain

This reverts an unnecessary part of commit 4683529048ee and fixes misrendering
and an assertion failure in Cogs.

Fixes freedesktop.org bug 39888.
---
 src/mesa/state_tracker/st_program.c |  326 +--
 src/mesa/state_tracker/st_program.h |   15 --
 2 files changed, 162 insertions(+), 179 deletions(-)

diff --git a/src/mesa/state_tracker/st_program.c 
b/src/mesa/state_tracker/st_program.c
index ca01d2e..a4f47ed 100644
--- a/src/mesa/state_tracker/st_program.c
+++ b/src/mesa/state_tracker/st_program.c
@@ -416,151 +416,6 @@ st_get_vp_variant(struct st_context *st,
return vpv;
 }
 
-/**
- * Translate Mesa fragment shader attributes to TGSI attributes.
- * \return GL_TRUE if color output should be written to all render targets, 
- * GL_FALSE if not
- */
-GLboolean
-st_prepare_fragment_program(struct gl_context *ctx,
-struct st_fragment_program *stfp)
-{
-   GLuint attr;
-   const GLbitfield inputsRead = stfp-Base.Base.InputsRead;
-   GLboolean write_all = GL_FALSE;
-
-   /*
-* Convert Mesa program inputs to TGSI input register semantics.
-*/
-   for (attr = 0; attr  FRAG_ATTRIB_MAX; attr++) {
-  if (inputsRead  (1  attr)) {
- const GLuint slot = stfp-num_inputs++;
-
- stfp-input_to_index[attr] = slot;
-
- switch (attr) {
- case FRAG_ATTRIB_WPOS:
-stfp-input_semantic_name[slot] = TGSI_SEMANTIC_POSITION;
-stfp-input_semantic_index[slot] = 0;
-stfp-interp_mode[slot] = TGSI_INTERPOLATE_LINEAR;
-break;
- case FRAG_ATTRIB_COL0:
-stfp-input_semantic_name[slot] = TGSI_SEMANTIC_COLOR;
-stfp-input_semantic_index[slot] = 0;
-stfp-interp_mode[slot] = TGSI_INTERPOLATE_LINEAR;
-break;
- case FRAG_ATTRIB_COL1:
-stfp-input_semantic_name[slot] = TGSI_SEMANTIC_COLOR;
-stfp-input_semantic_index[slot] = 1;
-stfp-interp_mode[slot] = TGSI_INTERPOLATE_LINEAR;
-break;
- case FRAG_ATTRIB_FOGC:
-stfp-input_semantic_name[slot] = TGSI_SEMANTIC_FOG;
-stfp-input_semantic_index[slot] = 0;
-stfp-interp_mode[slot] = TGSI_INTERPOLATE_PERSPECTIVE;
-break;
- case FRAG_ATTRIB_FACE:
-stfp-input_semantic_name[slot] = TGSI_SEMANTIC_FACE;
-stfp-input_semantic_index[slot] = 0;
-stfp-interp_mode[slot] = TGSI_INTERPOLATE_CONSTANT;
-break;
-/* In most cases, there is nothing special about these
- * inputs, so adopt a convention to use the generic
- * semantic name and the mesa FRAG_ATTRIB_ number as the
- * index. 
- * 
- * All that is required is that the vertex shader labels
- * its own outputs similarly, and that the vertex shader
- * generates at least every output required by the
- * fragment shader plus fixed-function hardware (such as
- * BFC).
- * 
- * There is no requirement that semantic indexes start at
- * zero or be restricted to a particular range -- nobody
- * should be building tables based on semantic index.
- */
- case FRAG_ATTRIB_PNTC:
- case FRAG_ATTRIB_TEX0:
- case FRAG_ATTRIB_TEX1:
- case FRAG_ATTRIB_TEX2:
- case FRAG_ATTRIB_TEX3:
- case FRAG_ATTRIB_TEX4:
- case FRAG_ATTRIB_TEX5:
- case FRAG_ATTRIB_TEX6:
- case FRAG_ATTRIB_TEX7:
- case FRAG_ATTRIB_VAR0:
- default:
-/* Actually, let's try and zero-base this just for
- * readability of the generated TGSI.
- */
-assert(attr = FRAG_ATTRIB_TEX0);
-stfp-input_semantic_index[slot] = (attr - FRAG_ATTRIB_TEX0);
-stfp-input_semantic_name[slot] = TGSI_SEMANTIC_GENERIC;
-if (attr == FRAG_ATTRIB_PNTC)
-   stfp-interp_mode[slot] = TGSI_INTERPOLATE_LINEAR;
-else
-   stfp-interp_mode[slot] = TGSI_INTERPOLATE_PERSPECTIVE;
-break;
- }
-  }
-  else {
- stfp-input_to_index[attr] = -1;
-  }
-   }
-
-   /*
-* Semantics and mapping for outputs
-*/
-   {
-  uint numColors = 0;
-  GLbitfield64 outputsWritten = stfp-Base.Base.OutputsWritten;
-
-  /* if z is written, emit that first */
-  if (outputsWritten  BITFIELD64_BIT(FRAG_RESULT_DEPTH)) {
- stfp-output_semantic_name[stfp-num_outputs] = 
TGSI_SEMANTIC_POSITION;
- stfp-output_semantic_index[stfp-num_outputs] = 0;
- stfp-result_to_output[FRAG_RESULT_DEPTH] = stfp-num_outputs;
- stfp-num_outputs++;
- outputsWritten = ~(1  FRAG_RESULT_DEPTH);
-  }
-
-  if (outputsWritten  BITFIELD64_BIT(FRAG_RESULT_STENCIL)) {
- stfp-output_semantic_name[stfp-num_outputs] = TGSI_SEMANTIC_STENCIL;
-

Re: [Mesa-dev] Status of the GLSL-TGSI translator, part 2

2011-08-04 Thread Bryan Cain

On 08/04/2011 08:24 AM, Brian Paul wrote:
 On Tue, Aug 2, 2011 at 3:50 PM, Bryan Cain bryanca...@gmail.com wrote:
 On 08/02/2011 11:27 AM, Brian Paul wrote:
 On 08/01/2011 12:38 PM, Bryan Cain wrote:
 Since Mesa 7.11 is released now, I figure it's time to discuss merging
 the glsl-to-tgsi branch to master again.  The translator is more mature
 than last time.  There are no regressions that I know of on any driver.
 The code generation has improved so that it's the same as or better than
 ir_to_mesa in almost every case.

 It also will still emit TGSI integer opcodes if you change
 ctx-GLSLVersion from 120 to 130 in st_extensions.c.  Driver developers
 can use this to implement these opcodes in their drivers, since they
 will be needed for GLSL 1.30 eventually.

 Are there any objections to merging it to master now?  If there aren't,
 I'll revise some of the commit messages for correctness and wrap long
 lines since cgit doesn't do that automatically, then merge it after
 getting approval.
 Sounds OK to me.

 I was just skimming over the new stuff - it looks like
 create_color_map_texture() is duplicated in two files and could be
 shared.

 -Brian
 Yes, where should the function go and what header should it be declared
 in?  It appears to me that none of the functions in st_atom_*.c are
 declared directly in header files.
 How about st_texture.[ch]?  That's where st_texture_create() lives.

Okay, done.  Is there anything else that should be done, or is it ready
to merge to master?

Bryan
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] S2TC - yet another attempt to solve the S3TC issue

2011-08-03 Thread Bryan Cain

On 08/03/2011 01:58 PM, Ian Romanick wrote:
 On 08/03/2011 08:04 AM, Brian Paul wrote:
  On Mon, Aug 1, 2011 at 11:44 AM, Rudolf Polzer
 divver...@xonotic.org wrote:
  Hi,
 
  I developed, together with Maik Merten, a replacement for S3TC with the
  following properties:
 
  - does not use any interesting algorithms (no color ramps, each
 4x4 block is
   just a 2 colors palette - basically, this is Color Cell
 Compression from
   August 1986)
  - is perfectly decodable by any S3TC decoder
  - can somewhat (at a noticeable quality loss) attempt to decode S3TC
  - due to the simpler nature of the algorithm, it outperforms the S3TC
   libtxc_dxtn
  - due to the compression being simpler, it was easier to write a good
   compression algorithm for it, and it sometimes yields better
 quality than
   NVidia Texture Tools encoding to full S3TC
 
  It is called S2TC because each block only has 2 colors, and expands
 to Super
  Simple Texture Compression.
 
  It comes in form of a libtxc_dxtn replacement library, so it can be
 used with
  Mesa right now.
 
  You can find more info on
 
  https://github.com/divVerent/s2tc/wiki
 
  including quality comparison pictures that compare it to full S3TC.
 
  When used in conjunction with an OpenGL driver, this would mean:
 
  - when a precompressed texture is uploaded, nothing changes - it is
 uploaded as
   full S3TC
  - when an uncompressed texture is uploaded as a compressed format,
 it is converted
   using the S2TC encoder, which typically yields less quality than a
 S3TC encoder,
   but still renders correctly and usually good enough (see
 screenshots on my
   wiki)
  - in case a software fallback hits (or llvmpipe is used), S2TC is
 used for
   decoding - due to some bit values only being defined in the full
 S3TC format
   spec, this will be somewhat wrong and yields reduced quality when
 precompressed
   S3TC textures are used
  - I believe this suffices to claim support for
 GL_EXT_texture_compression_s3tc
   without having an actual S3TC compressor, as compressing to a
 sub-format of
   S3TC is the same as just having a less clever S3TC compressor.
 Decompressing being
   incorrect (but still good enough for most uses) in case of
 software fallbacks
   however may be a problem, but then one could still instead upload
 it to the
   GPU, let it decode it, and download the decoded texture instead
 
  The question now is:
 
  - does Mesa have any interest in integrating this method (doesn't
 have to be my
   reference implementation, which implements quite many selectable
 variations
   of the encoding methods), and still using full S3TC if a
 libtxc_dxtn is found?
  - or would Mesa be interested in linking to this implementation as
 an alternative
   to full libtxc_dxtn especially for the USA, as it's likely that less
   licenses are required to use this method? One possible
 implementation BTW could
   be Linux distros shipping S2TC libtxc_dxtn by default, and
 providing a simple
   way for users to upgrade to a full S3TC library, if they are aware
 of the issues
   with it and see themselves not affected by them?

  Hi,

  I saw the story about this on Phronix a couple weeks ago.  I like it.
  As far as I'm concerned, I think it would be OK to integrate this into
  Mesa to use as a fallback when libtxc_dxtn isn't available.

  Does anyone else have an opinion?

 I think this solves the issue for the compressor and for the software
 decompressor.  I don't think this solves the problem for the
 decompressor for hardware drivers, so that may still pose a problem for
 Linux distros.

Pardon my ignorance, but why do hardware drivers need a decompressor?

Bryan

 Of course, all of this may actually be irrelevant soon:

 http://www.litigatingapple.com/blog/2011/7/24/why-htcs-courtship-of-s3-might-be-too-clever-by-half.html

 I think the ideal solution would be to have something in core Mesa that
 could eventually evolve into full S3TC support.  When this happens, I
 don't think libtxc_dxtn is the best answer.  I think libsquish
 (http://code.google.com/p/libsquish/) is probably a better answer.  This
 library is used in other projects, and it has quite a bit more optimized
 implementation.

 What we really want is to just do the compression on the GPU, but that's
 a bit larger project.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] Status of the GLSL-TGSI translator, part 2

2011-08-01 Thread Bryan Cain

Since Mesa 7.11 is released now, I figure it's time to discuss merging
the glsl-to-tgsi branch to master again.  The translator is more mature
than last time.  There are no regressions that I know of on any driver. 
The code generation has improved so that it's the same as or better than
ir_to_mesa in almost every case.

It also will still emit TGSI integer opcodes if you change
ctx-GLSLVersion from 120 to 130 in st_extensions.c.  Driver developers
can use this to implement these opcodes in their drivers, since they
will be needed for GLSL 1.30 eventually.

Are there any objections to merging it to master now?  If there aren't,
I'll revise some of the commit messages for correctness and wrap long
lines since cgit doesn't do that automatically, then merge it after
getting approval.

Bryan
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 0/2] Make Gallium drivers respect the force_s3tc_enable environment variable

2011-07-26 Thread Bryan Cain

On Mon, Jul 25, 2011 at 1:50 PM, Jose Fonseca jfons...@vmware.com wrote:
 I have no objection FWIW.

 Jose

 - Original Message -
 This is a revised version of my previous patches.  Patch 1
 incorporates
 Brian's feedback, and patch 2 is unchanged from before.  I did test
 patch 2
 without patch 1, and found that both patches are neeeded for
 force_s3tc_enable
 to work.

 Both of these are candidates for 7.10 and 7.11, since games such as 0
 A.D.
 expect drivers to support this.

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev



Can I go ahead and push these patches this afternoon if there are no objections?

Bryan
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] Mesa 7.11 release candidate 3

2011-07-26 Thread Bryan Cain

On Mon, Jul 25, 2011 at 9:54 PM, Ian Romanick i...@freedesktop.org wrote:
 The only changes in the 7.11 branch (until the final release) should be
 low-risk fixes for significant bugs.  Other than updates to the release
 notes and changing the version, RC3 should represent 7.11 final.

Would my patches for making Gallium drivers respect force_s3tc_enable
qualify?  They are very low-risk, and fix
https://bugs.freedesktop.org/show_bug.cgi?id=29012.  I don't know what
qualifies as a significant bug, but the fix makes at least one game
(fs2_open) work out of the box with Gallium drivers that didn't
before.

Bryan
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 0/2] Make Gallium drivers respect the force_s3tc_enable environment variable

2011-07-25 Thread Bryan Cain

This is a revised version of my previous patches.  Patch 1 incorporates
Brian's feedback, and patch 2 is unchanged from before.  I did test patch 2
without patch 1, and found that both patches are neeeded for force_s3tc_enable
to work.

Both of these are candidates for 7.10 and 7.11, since games such as 0 A.D.
expect drivers to support this.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/2] st/mesa: respect force_s3tc_enable environment variable

2011-07-25 Thread Bryan Cain

NOTE: This is a candidate for the 7.10 and 7.11 branches.
---
 src/mesa/state_tracker/st_extensions.c |   11 ++-
 1 files changed, 10 insertions(+), 1 deletions(-)

diff --git a/src/mesa/state_tracker/st_extensions.c 
b/src/mesa/state_tracker/st_extensions.c
index 99b231d..b5f6d35 100644
--- a/src/mesa/state_tracker/st_extensions.c
+++ b/src/mesa/state_tracker/st_extensions.c
@@ -208,6 +208,15 @@ void st_init_limits(struct st_context *st)
 }
 
 
+static GLboolean st_get_s3tc_override(void)
+{
+   const char *override = _mesa_getenv(force_s3tc_enable);
+   if (override  !strcmp(override, true))
+  return GL_TRUE;
+   return GL_FALSE;
+}
+
+
 /**
  * Use pipe_screen::get_param() to query PIPE_CAP_ values to determine
  * which GL extensions are supported.
@@ -426,7 +435,7 @@ void st_init_extensions(struct st_context *st)
if (screen-is_format_supported(screen, PIPE_FORMAT_DXT5_RGBA,
PIPE_TEXTURE_2D, 0,
PIPE_BIND_SAMPLER_VIEW) 
-   ctx-Mesa_DXTn) {
+   (ctx-Mesa_DXTn || st_get_s3tc_override())) {
   ctx-Extensions.EXT_texture_compression_s3tc = GL_TRUE;
   ctx-Extensions.S3_s3tc = GL_TRUE;
}
-- 
1.7.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/2] util: enable S3TC support when the force_s3tc_enable env var is set to true

2011-07-25 Thread Bryan Cain

NOTE: This is a candidate for the 7.10 and 7.11 branches.
---
 src/gallium/auxiliary/util/u_format_s3tc.c |   11 +--
 1 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_format_s3tc.c 
b/src/gallium/auxiliary/util/u_format_s3tc.c
index bb989c2..d8a7c0d 100644
--- a/src/gallium/auxiliary/util/u_format_s3tc.c
+++ b/src/gallium/auxiliary/util/u_format_s3tc.c
@@ -119,8 +119,15 @@ util_format_s3tc_init(void)
 
library = util_dl_open(DXTN_LIBNAME);
if (!library) {
-  debug_printf(couldn't open  DXTN_LIBNAME , software DXTn 
- compression/decompression unavailable\n);
+  if (getenv(force_s3tc_enable) 
+  !strcmp(getenv(force_s3tc_enable), true)) {
+ debug_printf(couldn't open  DXTN_LIBNAME , enabling DXTn due to 
+force_s3tc_enable=true environment variable\n);
+ util_format_s3tc_enabled = TRUE;
+  } else {
+ debug_printf(couldn't open  DXTN_LIBNAME , software DXTn 
+compression/decompression unavailable\n);
+  }
   return;
}
 
-- 
1.7.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/2] st/mesa: respect force_s3tc_enable environment variable

2011-07-21 Thread Bryan Cain

On 07/21/2011 08:48 AM, Brian Paul wrote:
 On 07/20/2011 04:53 PM, Bryan Cain wrote:
 ---
   src/mesa/state_tracker/st_extensions.c |   12 +++-
   1 files changed, 11 insertions(+), 1 deletions(-)

 diff --git a/src/mesa/state_tracker/st_extensions.c
 b/src/mesa/state_tracker/st_extensions.c
 index 99b231d..073e72c 100644
 --- a/src/mesa/state_tracker/st_extensions.c
 +++ b/src/mesa/state_tracker/st_extensions.c
 @@ -208,6 +208,16 @@ void st_init_limits(struct st_context *st)
   }


 +static int st_get_s3tc_override(void)
 +{
 +   const char *override = _mesa_getenv(force_s3tc_enable);
 +   fprintf(stderr, force_s3tc_enable=%s\n, override);

 This fprintf() looks like left-over debug code.

Oops, you're right.


 +   if (override  !strcmp(override, true))
 +  return GL_TRUE;
 +   return GL_FALSE;

 If you're going to return GL_TRUE/GL_FALSE, the function type should
 be GLboolean.

Okay, I'll resubmit the patch with those changes.

Bryan
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] Patches to make Gallium drivers respect the force_s3tc_enable environment variable

2011-07-20 Thread Bryan Cain

The purpose of the following two patches is to make st/mesa expose the S3TC
extensions when the force_s3tc_enable environment variable is used.  This is
to match the behavior of the DRI drivers, in which force_s3tc_enable is an
option in driconf.  Although st/mesa can't use the driconf functions, it can
at least respect the environment variable for this option.

These patches have been tested with fs2_open 3.6.12 and the Freespace 2
MediaVPs.  Setting force_s3tc_enable=true makes the game run flawlessly
without libtxc_dxtn, whereas without it the game writes several error messages
to its log file and then crashes.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/2] st/mesa: respect force_s3tc_enable environment variable

2011-07-20 Thread Bryan Cain

---
 src/mesa/state_tracker/st_extensions.c |   12 +++-
 1 files changed, 11 insertions(+), 1 deletions(-)

diff --git a/src/mesa/state_tracker/st_extensions.c 
b/src/mesa/state_tracker/st_extensions.c
index 99b231d..073e72c 100644
--- a/src/mesa/state_tracker/st_extensions.c
+++ b/src/mesa/state_tracker/st_extensions.c
@@ -208,6 +208,16 @@ void st_init_limits(struct st_context *st)
 }
 
 
+static int st_get_s3tc_override(void)
+{
+   const char *override = _mesa_getenv(force_s3tc_enable);
+   fprintf(stderr, force_s3tc_enable=%s\n, override);
+   if (override  !strcmp(override, true))
+  return GL_TRUE;
+   return GL_FALSE;
+}
+
+
 /**
  * Use pipe_screen::get_param() to query PIPE_CAP_ values to determine
  * which GL extensions are supported.
@@ -426,7 +436,7 @@ void st_init_extensions(struct st_context *st)
if (screen-is_format_supported(screen, PIPE_FORMAT_DXT5_RGBA,
PIPE_TEXTURE_2D, 0,
PIPE_BIND_SAMPLER_VIEW) 
-   ctx-Mesa_DXTn) {
+   (ctx-Mesa_DXTn || st_get_s3tc_override())) {
   ctx-Extensions.EXT_texture_compression_s3tc = GL_TRUE;
   ctx-Extensions.S3_s3tc = GL_TRUE;
}
-- 
1.7.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/2] util: enable S3TC support when the force_s3tc_enable env var is set to true

2011-07-20 Thread Bryan Cain

---
 src/gallium/auxiliary/util/u_format_s3tc.c |   11 +--
 1 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_format_s3tc.c 
b/src/gallium/auxiliary/util/u_format_s3tc.c
index bb989c2..d8a7c0d 100644
--- a/src/gallium/auxiliary/util/u_format_s3tc.c
+++ b/src/gallium/auxiliary/util/u_format_s3tc.c
@@ -119,8 +119,15 @@ util_format_s3tc_init(void)
 
library = util_dl_open(DXTN_LIBNAME);
if (!library) {
-  debug_printf(couldn't open  DXTN_LIBNAME , software DXTn 
- compression/decompression unavailable\n);
+  if (getenv(force_s3tc_enable) 
+  !strcmp(getenv(force_s3tc_enable), true)) {
+ debug_printf(couldn't open  DXTN_LIBNAME , enabling DXTn due to 
+force_s3tc_enable=true environment variable\n);
+ util_format_s3tc_enabled = TRUE;
+  } else {
+ debug_printf(couldn't open  DXTN_LIBNAME , software DXTn 
+compression/decompression unavailable\n);
+  }
   return;
}
 
-- 
1.7.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] Common Subexpression Elimination

2011-07-16 Thread Bryan Cain

On 07/16/2011 10:28 AM, Vincent wrote:
 Hi,

 I wrote an optimisation pass that perform CSE (that is, it spots
 expressions that appears twice or more, and factor them in a temporary
 to avoid recalculation).
 This is my first patch to Mesa, I would like to receive feedback :
 case where the algorithm does not work, crashes, ...
 I tried to follow coding style, using exec_list*, ralloced objects...
 The algorithm currently only spot binary expressions (that is x + y, x
 * y, ...) but unary expression (exp(x), ln(x)...) should follow soon.

 Regards,
 Vincent.

Hi Vincent,

For future reference, the diff you sent is currently reversed - the
lines you added are marked as - and the lines you removed are marked
as +.  It's best if you send patches using git-send-email to avoid this.

Bryan
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] Merging glsl-to-tgsi to master

2011-07-13 Thread Bryan Cain

On Tue, Jul 12, 2011 at 12:00 PM, Ian Romanick i...@freedesktop.org wrote:

 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1

 On 07/11/2011 09:53 AM, Egon Ashrafinia wrote:
  Hello guys.
 
  1 month ago, we talked about merging glsl-to-tgsi to master but it still
  not happend. What about now? I could compile and test it a bit. It works.
  Anyone who could do it? What does Bryan Cain say about it?

 One thing to consider is whether or not this will make it more difficult
 to cherry pick fixes to the 7.11 release branch.  Bryan has been doing
 some really cool work, and I'd like to see it get merged.  However, I
 also want 7.11 to ship on time and with as few bugs as possible.
 Anything that will make that more difficult should wait until August.


I don't think merging the glsl-to-tgsi branch would make it much more
difficult to cherry pick fixes to 7.11.  The main change in the branch is
the addition of a new file to st/mesa (st_glsl_to_tgsi.cpp) containing the
GLSL-TGSI translator.  That in and of itself doesn't make it any harder.

The only existing files with non-trivial changes that might affect
cherry-picking are uniforms.c and prog_parameter.c in core Mesa.  Several
other files in st/mesa and core Mesa have minor changes to make them work
with either the prog_parameter changes or to work without Mesa IR, but I
don't think those changes are enough to make cherry-picking difficult.
They're at least not worse than most changes in master in that regard.

Bryan
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] Merging glsl-to-tgsi to master

2011-07-12 Thread Bryan Cain

On 07/11/2011 11:53 AM, Egon Ashrafinia wrote:
 Hello guys.

 1 month ago, we talked about merging glsl-to-tgsi to master but it
 still not happend. What about now? I could compile and test it a bit.
 It works.
 Anyone who could do it? What does Bryan Cain say about it?

Hi Egon,

Last month, I was trying to get it merged before the 7.11 merge window
closed.  When that didn't happen, I decided to stop being in a hurry and
instead make some more improvements before trying again to get it merged
to master.  Also, Brian Paul pointed out an issue that I hadn't noticed
before where st/mesa needed the current fragment shader to be in Mesa IR
form for glBitmap, glDrawPixels, and glCopyPixels to work correctly.

All of that is done now, as I finished the glBitmap path on Sunday and
the DrawPixels/CopyPixels path on Saturday.  So now I need to fix a few
minor things like commit messages and one case where the old path
generates better code than glsl_to_tgsi.  The merge might also be
delayed a bit further if it's decided that it should wait until after
7.11 is released, as Ian's reply suggested.

Bryan
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

1 2 >

1 - 100 of 129 matches

Mail list logo