Re: [Mesa-dev] [v2 PATCH 09/16] glsl: Optimize clamp(x, 0, 1) as saturate(x)

2014-07-08 Thread Erik Faye-Lund
On Mon, Jul 7, 2014 at 7:18 PM, Matt Turner matts...@gmail.com wrote:
 On Mon, Jul 7, 2014 at 6:57 AM, Abdiel Janulgue
 abdiel.janul...@linux.intel.com wrote:
 v2: Check that the base type is float (Ian Romanick)

 Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
 ---
  src/glsl/opt_algebraic.cpp | 34 ++
  1 file changed, 34 insertions(+)

 diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp
 index ac7514a..2ad561c 100644
 --- a/src/glsl/opt_algebraic.cpp
 +++ b/src/glsl/opt_algebraic.cpp
 @@ -614,6 +614,40 @@ ir_algebraic_visitor::handle_expression(ir_expression 
 *ir)

break;

 +   case ir_binop_min:
 +   case ir_binop_max:
 +  if (ir-type-base_type != GLSL_TYPE_FLOAT)
 + break;
 +  /* Replace min(max) operations and its commutative combinations with
 +   * a saturate operation
 +   */
 +  for (int op = 0; op  2; op++) {
 + ir_expression *minmax = op_expr[op];
 + ir_constant *outer_const = op_const[(op == 0) ? 1 : 0];

 We use (1 - op) in other places in opt_algebraic. (Same comment below)

 + ir_expression_operation op_cond = (ir-operation == ir_binop_max) ?
 +ir_binop_min : ir_binop_max;
 +
 + if (!minmax || !outer_const || (minmax-operation != op_cond))
 +continue;
 +
 + /* Found a min/max operation. Now try to see if its operands
 +  * meet our conditions that we can do just a single saturate 
 operation
 +  */
 + for (int minmax_op = 0; minmax_op  2; minmax_op++) {
 +ir_rvalue *inner_val_a = minmax-operands[minmax_op];
 +ir_rvalue *inner_val_b = minmax-operands[(minmax_op == 0) ? 1 
 : 0];
 +
 +if (!inner_val_a || !inner_val_b)
 +   continue;
 +
 +/* Found a min (max(x, 0), 1.0) */

 This comment tripped me up for a second. This really means that you've
 found either

   - min(max(x, 0.0), 1.0); or
   - max(min(x, 1.0), 0.0)

Hmm, but are optimizing both of these to saturate OK? Shouldn't
min(max(NaN, 0.0), 1.0) give 0.0, whereas max(min(NaN, 1.0), 0.0) give
1.0?
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Extend compute-to-mrf pass to understand blocks of MOVs

2014-07-08 Thread Kristian Høgsberg
On Mon, Jun 30, 2014 at 3:14 PM, Matt Turner matts...@gmail.com wrote:
 On Fri, Jun 27, 2014 at 12:00 PM, Kristian Høgsberg k...@bitplanet.net 
 wrote:
 From: Kristian Høgsberg krh@century-sparrow.local

 With your email address fixed,

Done, thanks for the review.  I realized that this also applies to MRT
shaders where we write the same value to two render targets.  From
piglits fbo-drawbuffers2-blend:

0x: pln(8)  g61F  g40,1,0F  g28,8,1F
 { align1 WE_normal 1Q compacted };
0x0008: pln(8)  g71F  g4.40,1,0Fg28,8,1F
 { align1 WE_normal 1Q compacted };
0x0010: pln(8)  g81F  g5.40,1,0Fg28,8,1F
 { align1 WE_normal 1Q compacted };
0x0018: math inv(8) g21F  g88,8,1F  null
 { align1 WE_normal 1Q compacted };
0x0020: mul(8)  g31F  g68,8,1F  g28,8,1F
 { align1 WE_normal 1Q compacted };
0x0028: mul(8)  g41F  g78,8,1F  g28,8,1F
 { align1 WE_normal 1Q compacted };
0x0030: send(8) g21UW g38,8,1F
sampler (1, 0, 0, 1) mlen 2 rlen 4
 { align1 WE_normal 1Q };
0x0040: mov(8)  g1131Fg28,8,1F
 { align1 WE_normal 1Q compacted };
0x0048: mov(8)  g1141Fg38,8,1F
 { align1 WE_normal 1Q compacted };
0x0050: mov(8)  g1151Fg48,8,1F
 { align1 WE_normal 1Q compacted };
0x0058: mov(8)  g1161Fg58,8,1F
 { align1 WE_normal 1Q compacted };
0x0060: sendc(8)nullg1138,8,1F
render RT write SIMD8 LastRT Surface = 0
mlen 4 rlen 0 { align1 WE_normal 1Q EOT };

becomes:

0x: pln(8)  g61F  g40,1,0F  g28,8,1F
 { align1 WE_normal 1Q compacted };
0x0008: pln(8)  g81F  g4.40,1,0Fg28,8,1F
 { align1 WE_normal 1Q compacted };
0x0010: pln(8)  g91F  g5.40,1,0Fg28,8,1F
 { align1 WE_normal 1Q compacted };
0x0018: math inv(8) g71F  g98,8,1F  null
 { align1 WE_normal 1Q compacted };
0x0020: mul(8)  g21F  g68,8,1F  g78,8,1F
 { align1 WE_normal 1Q compacted };
0x0028: mul(8)  g31F  g88,8,1F  g78,8,1F
 { align1 WE_normal 1Q compacted };
0x0030: send(8) g1131UW   g28,8,1F
sampler (1, 0, 0, 1) mlen 2 rlen 4
 { align1 WE_normal 1Q };
0x0040: sendc(8)nullg1138,8,1F
render RT write SIMD8 LastRT Surface = 0
mlen 4 rlen 0 { align1 WE_normal 1Q EOT };

which is lovely.

Kristian

 Reviewed-by: Matt Turner matts...@gmail.com
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Extend compute-to-mrf pass to understand blocks of MOVs

2014-07-08 Thread Kristian Høgsberg
On Mon, Jul 7, 2014 at 11:07 PM, Kristian Høgsberg k...@bitplanet.net wrote:
 On Mon, Jun 30, 2014 at 3:14 PM, Matt Turner matts...@gmail.com wrote:
 On Fri, Jun 27, 2014 at 12:00 PM, Kristian Høgsberg k...@bitplanet.net 
 wrote:
 From: Kristian Høgsberg krh@century-sparrow.local

 With your email address fixed,

 Done, thanks for the review.  I realized that this also applies to MRT
 shaders where we write the same value to two render targets.  From
 piglits fbo-drawbuffers2-blend:

 0x: pln(8)  g61F  g40,1,0F  g28,8,1F
  { align1 WE_normal 1Q compacted };
 0x0008: pln(8)  g71F  g4.40,1,0Fg28,8,1F
  { align1 WE_normal 1Q compacted };
 0x0010: pln(8)  g81F  g5.40,1,0Fg28,8,1F
  { align1 WE_normal 1Q compacted };
 0x0018: math inv(8) g21F  g88,8,1F  null
  { align1 WE_normal 1Q compacted };
 0x0020: mul(8)  g31F  g68,8,1F  g28,8,1F
  { align1 WE_normal 1Q compacted };
 0x0028: mul(8)  g41F  g78,8,1F  g28,8,1F
  { align1 WE_normal 1Q compacted };
 0x0030: send(8) g21UW g38,8,1F
 sampler (1, 0, 0, 1) mlen 2 rlen 4

Ugh, nevermind that, that's a sampler send there...

  { align1 WE_normal 1Q };
 0x0040: mov(8)  g1131Fg28,8,1F
  { align1 WE_normal 1Q compacted };
 0x0048: mov(8)  g1141Fg38,8,1F
  { align1 WE_normal 1Q compacted };
 0x0050: mov(8)  g1151Fg48,8,1F
  { align1 WE_normal 1Q compacted };
 0x0058: mov(8)  g1161Fg58,8,1F
  { align1 WE_normal 1Q compacted };
 0x0060: sendc(8)nullg1138,8,1F
 render RT write SIMD8 LastRT Surface = 0
 mlen 4 rlen 0 { align1 WE_normal 1Q EOT };

 becomes:

 0x: pln(8)  g61F  g40,1,0F  g28,8,1F
  { align1 WE_normal 1Q compacted };
 0x0008: pln(8)  g81F  g4.40,1,0Fg28,8,1F
  { align1 WE_normal 1Q compacted };
 0x0010: pln(8)  g91F  g5.40,1,0Fg28,8,1F
  { align1 WE_normal 1Q compacted };
 0x0018: math inv(8) g71F  g98,8,1F  null
  { align1 WE_normal 1Q compacted };
 0x0020: mul(8)  g21F  g68,8,1F  g78,8,1F
  { align1 WE_normal 1Q compacted };
 0x0028: mul(8)  g31F  g88,8,1F  g78,8,1F
  { align1 WE_normal 1Q compacted };
 0x0030: send(8) g1131UW   g28,8,1F
 sampler (1, 0, 0, 1) mlen 2 rlen 4
  { align1 WE_normal 1Q };
 0x0040: sendc(8)nullg1138,8,1F
 render RT write SIMD8 LastRT Surface = 0
 mlen 4 rlen 0 { align1 WE_normal 1Q EOT };

 which is lovely.

 Kristian

 Reviewed-by: Matt Turner matts...@gmail.com
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [v2 PATCH 09/16] glsl: Optimize clamp(x, 0, 1) as saturate(x)

2014-07-08 Thread Matt Turner
On Mon, Jul 7, 2014 at 11:00 PM, Erik Faye-Lund kusmab...@gmail.com wrote:
 On Mon, Jul 7, 2014 at 7:18 PM, Matt Turner matts...@gmail.com wrote:
 This comment tripped me up for a second. This really means that you've
 found either

   - min(max(x, 0.0), 1.0); or
   - max(min(x, 1.0), 0.0)

 Hmm, but are optimizing both of these to saturate OK? Shouldn't
 min(max(NaN, 0.0), 1.0) give 0.0, whereas max(min(NaN, 1.0), 0.0) give
 1.0?

Under standard IEEE rules, wouldn't the NaN propagate through all of
these expressions?

The GLSL 4.40 spec says

Operations and built-in functions that operate on a NaN are not required to
 return a NaN as the result.

So it seems like we have a lot of flexibility here. Is there some text
I'm missing?
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/3] i965: Avoid crashing while dumping vec4 insn operands

2014-07-08 Thread Matt Turner
On Sun, Jul 6, 2014 at 1:11 AM, Chris Forbes chr...@ijw.co.nz wrote:
 We'd otherwise go looking into virtual_grf_sizes for things that aren't
 in there at all.

 Signed-off-by: Chris Forbes chr...@ijw.co.nz
 ---
  src/mesa/drivers/dri/i965/brw_vec4.cpp | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
 b/src/mesa/drivers/dri/i965/brw_vec4.cpp
 index 5ba7d9c..4b7576a 100644
 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
 @@ -1337,7 +1337,9 @@ vec4_visitor::dump_instruction(backend_instruction 
 *be_inst, FILE *file)
   break;
}

 -  if (virtual_grf_sizes[inst-src[i].reg] != 1)
 +  if (inst-src[i].reg_offset 

I totally ignored what the actual code in the if block was doing, so I
was confused by the relation of reg_offset != 0 and looking things up
in virtual_grf_sizes[]. Probably add the != 0, and maybe a comment
that says /* Don't print .0 */ or something.

For the series

Reviewed-by: Matt Turner matts...@gmail.com
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [v2 PATCH 09/16] glsl: Optimize clamp(x, 0, 1) as saturate(x)

2014-07-08 Thread Erik Faye-Lund
On Tue, Jul 8, 2014 at 8:22 AM, Matt Turner matts...@gmail.com wrote:
 On Mon, Jul 7, 2014 at 11:00 PM, Erik Faye-Lund kusmab...@gmail.com wrote:
 On Mon, Jul 7, 2014 at 7:18 PM, Matt Turner matts...@gmail.com wrote:
 This comment tripped me up for a second. This really means that you've
 found either

   - min(max(x, 0.0), 1.0); or
   - max(min(x, 1.0), 0.0)

 Hmm, but are optimizing both of these to saturate OK? Shouldn't
 min(max(NaN, 0.0), 1.0) give 0.0, whereas max(min(NaN, 1.0), 0.0) give
 1.0?

 Under standard IEEE rules, wouldn't the NaN propagate through all of
 these expressions?

 The GLSL 4.40 spec says

 Operations and built-in functions that operate on a NaN are not required to
  return a NaN as the result.

 So it seems like we have a lot of flexibility here. Is there some text
 I'm missing?

I think the point about flexibility is a bit weak or even
misunderstood: surely this applies to the built-ins used, not what we
generate after optimizing. So if we chose to do that, we'd need to
prevent min and max from propagating NaN even when issued stand-alone,
which might negatively impact performance in some ISAs.

As to why I reacted, I was just remembering that I read somewhere (one
of Humus' low-level shader optimization paper, IIRC) that the HLSL
compiler refused to do some similar optimizations for NaN-reasons.

Checking the spec a bit closer, though:
- min(x, y) is defined as Returns y if y  x, otherwise it returns x
- max(x, y) is defined as Returns y if x  y, otherwise it returns x.

All comparisons with NaN should AFAIK fail, making both the first and
second comparison return NaN, if NaN were to be properly supported.
So my initial analysis was wrong. However, all of the following
permutations of the same logic would still be inconsistent.

min(max(0.0, x), 1.0)
max(min(1.0, x), 0.0)
min(1.0, max(0.0, x))
max(0.0, min(1.0, x))
min(1.0, max(x, 0.0))
max(0.0, min(x, 1.0))

I don't understand the code well enough to figure out if the patch
optimizes these, though.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [v2 PATCH 09/16] glsl: Optimize clamp(x, 0, 1) as saturate(x)

2014-07-08 Thread Abdiel Janulgue

On 08.07.2014 09:47, Erik Faye-Lund wrote:
 On Tue, Jul 8, 2014 at 8:22 AM, Matt Turner matts...@gmail.com wrote:
 On Mon, Jul 7, 2014 at 11:00 PM, Erik Faye-Lund kusmab...@gmail.com wrote:
 On Mon, Jul 7, 2014 at 7:18 PM, Matt Turner matts...@gmail.com wrote:
 This comment tripped me up for a second. This really means that you've
 found either

   - min(max(x, 0.0), 1.0); or
   - max(min(x, 1.0), 0.0)

 Hmm, but are optimizing both of these to saturate OK? Shouldn't
 min(max(NaN, 0.0), 1.0) give 0.0, whereas max(min(NaN, 1.0), 0.0) give
 1.0?

 Under standard IEEE rules, wouldn't the NaN propagate through all of
 these expressions?

 The GLSL 4.40 spec says

 Operations and built-in functions that operate on a NaN are not required to
  return a NaN as the result.

 So it seems like we have a lot of flexibility here. Is there some text
 I'm missing?
 
 I think the point about flexibility is a bit weak or even
 misunderstood: surely this applies to the built-ins used, not what we
 generate after optimizing. So if we chose to do that, we'd need to
 prevent min and max from propagating NaN even when issued stand-alone,
 which might negatively impact performance in some ISAs.
 
 As to why I reacted, I was just remembering that I read somewhere (one
 of Humus' low-level shader optimization paper, IIRC) that the HLSL
 compiler refused to do some similar optimizations for NaN-reasons.
 
 Checking the spec a bit closer, though:
 - min(x, y) is defined as Returns y if y  x, otherwise it returns x
 - max(x, y) is defined as Returns y if x  y, otherwise it returns x.
 
 All comparisons with NaN should AFAIK fail, making both the first and
 second comparison return NaN, if NaN were to be properly supported.
 So my initial analysis was wrong. However, all of the following
 permutations of the same logic would still be inconsistent.
 
 min(max(0.0, x), 1.0)
 max(min(1.0, x), 0.0)
 min(1.0, max(0.0, x))
 max(0.0, min(1.0, x))
 min(1.0, max(x, 0.0))
 max(0.0, min(x, 1.0))
 
 I don't understand the code well enough to figure out if the patch
 optimizes these, though.

The code actually goes through all 8 commutative variations and tries to
optimize if it spots one of these conditions.

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev
 
 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] i965: overwriting parts of a register and optimization passes

2014-07-08 Thread Iago Toral

Hi,

I have some code that first initializes a register and then overwrites a 
specific subregister. However, after the optimization passes in 
brw_vec4.cpp the initialization instruction goes away. I see that live 
intervals are computed for the destination register without considering 
if we are writing only to a specific subregister, so I guess that is the 
reason for this behaviour, but this strikes me as odd, as it defeats the 
purpose of overwriting only specific subregs, so I guess there is 
something else that I am missing.


This an example of what is happening:

mov(8)  g61.xUD   g04,4,1UD { 
align16 WE_all 1Q };
mov(1)  g6.21UD   0x0007UD{ align1 
 WE_all compacted };


The first MOV is removed from the instruction set when I put the second 
MOV, however the second MOV should only be writing to subreg 6.2 and 
keep the rest of register 6 intact, or at least that is what I am trying 
to do...


Can someone explain what is wrong with those two instructions? how 
should I overwrite only a specific subregister without causing this 
behavior?


Thanks,
Iago
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [v2 PATCH 10/16] glsl: Optimize clamp(x, 0.0, b), where b 1.0 as min(saturate(x), b)

2014-07-08 Thread Abdiel Janulgue

On 07.07.2014 20:25, Matt Turner wrote:
 On Mon, Jul 7, 2014 at 6:57 AM, Abdiel Janulgue
 abdiel.janul...@linux.intel.com wrote:
 v2: - Output min(saturate(x),b) instead of saturate(min(x,b)) suggested by 
 Ilia Mirkin
 - Make sure we do component-wise comparison for vectors (Ian Romanick)

 Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
 ---
  src/glsl/opt_algebraic.cpp | 10 ++
  1 file changed, 10 insertions(+)

 diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp
 index 2ad561c..bcda7a9 100644
 --- a/src/glsl/opt_algebraic.cpp
 +++ b/src/glsl/opt_algebraic.cpp
 @@ -643,6 +643,16 @@ ir_algebraic_visitor::handle_expression(ir_expression 
 *ir)
  /* Found a min (max(x, 0), 1.0) */
  if (outer_const-is_one()  inner_val_a-is_zero())
 return saturate(inner_val_b);
 +
 +unsigned component = 0;
 +for (int c = 0; c  outer_const-type-vector_elements; c++) {
 +   if (outer_const-get_float_component(c)  1.0f)
 +  component++;
 +}
 +/* Found a min (max(x, 0.0) b), where b  1.0 */
 
 Is this case the only thing it finds?
 
 Does it find max(min(x, b), 0.0)?

Unfortunately, you are right. The detection only works if the inner
constant is zero. I'll fix this in the next revision

 
 +if ((component == outer_const-type-vector_elements) 
 +inner_val_b-is_zero())
 +   return expr(ir_binop_min, saturate(inner_val_a), 
 outer_const);
 
 It seems like it would incorrectly recognize max(min(x, 0.0), b).
 

This one it would correctly recognize.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [v2 PATCH 10/16] glsl: Optimize clamp(x, 0.0, b), where b 1.0 as min(saturate(x), b)

2014-07-08 Thread Abdiel Janulgue

On 08.07.2014 12:37, Abdiel Janulgue wrote:
 
 On 07.07.2014 20:25, Matt Turner wrote:
 On Mon, Jul 7, 2014 at 6:57 AM, Abdiel Janulgue
 abdiel.janul...@linux.intel.com wrote:
 v2: - Output min(saturate(x),b) instead of saturate(min(x,b)) suggested by 
 Ilia Mirkin
 - Make sure we do component-wise comparison for vectors (Ian Romanick)

 Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
 ---
  src/glsl/opt_algebraic.cpp | 10 ++
  1 file changed, 10 insertions(+)

 diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp
 index 2ad561c..bcda7a9 100644
 --- a/src/glsl/opt_algebraic.cpp
 +++ b/src/glsl/opt_algebraic.cpp
 @@ -643,6 +643,16 @@ ir_algebraic_visitor::handle_expression(ir_expression 
 *ir)
  /* Found a min (max(x, 0), 1.0) */
  if (outer_const-is_one()  inner_val_a-is_zero())
 return saturate(inner_val_b);
 +
 +unsigned component = 0;
 +for (int c = 0; c  outer_const-type-vector_elements; c++) {
 +   if (outer_const-get_float_component(c)  1.0f)
 +  component++;
 +}
 +/* Found a min (max(x, 0.0) b), where b  1.0 */

 Is this case the only thing it finds?

 Does it find max(min(x, b), 0.0)?
 
 Unfortunately, you are right. The detection only works if the inner
 constant is zero. I'll fix this in the next revision
 

 +if ((component == outer_const-type-vector_elements) 
 +inner_val_b-is_zero())
 +   return expr(ir_binop_min, saturate(inner_val_a), 
 outer_const);

 It seems like it would incorrectly recognize max(min(x, 0.0), b).

 
 This one it would correctly recognize.

Actually no, this one actually does not work either. Fix coming up!
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [v2 PATCH 09/16] glsl: Optimize clamp(x, 0, 1) as saturate(x)

2014-07-08 Thread Roland Scheidegger
Am 08.07.2014 08:22, schrieb Matt Turner:
 On Mon, Jul 7, 2014 at 11:00 PM, Erik Faye-Lund kusmab...@gmail.com wrote:
 On Mon, Jul 7, 2014 at 7:18 PM, Matt Turner matts...@gmail.com wrote:
 This comment tripped me up for a second. This really means that you've
 found either

   - min(max(x, 0.0), 1.0); or
   - max(min(x, 1.0), 0.0)

 Hmm, but are optimizing both of these to saturate OK? Shouldn't
 min(max(NaN, 0.0), 1.0) give 0.0, whereas max(min(NaN, 1.0), 0.0) give
 1.0?
 
 Under standard IEEE rules, wouldn't the NaN propagate through all of
 these expressions?
not under ieee754-2008 rules, no. min/max return the other number if one
of them is a NaN. This is enforced by d3d10 so at least d3d10 capable
hardware is going to honor that.
http://msdn.microsoft.com/en-us/library/windows/desktop/cc308050%28v=vs.85%29.aspx
The order of min/max though indeed matters.

 
 The GLSL 4.40 spec says
 
 Operations and built-in functions that operate on a NaN are not required to
  return a NaN as the result.
 
 So it seems like we have a lot of flexibility here. Is there some text
 I'm missing?
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [v3 PATCH 09/16] glsl: Optimize clamp(x, 0, 1) as saturate(x)

2014-07-08 Thread Abdiel Janulgue
v2: - Check that the base type is float (Ian Romanick)
v3: - Make sure comments reflect that we are doing a commutative operation
- Add missing condition where the inner constant is 1.0 and outer constant 
is 0.0
- Make indexing of operands easier to read (Matt Turner)

Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/glsl/opt_algebraic.cpp | 36 
 1 file changed, 36 insertions(+)

diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp
index ac7514a..4b052933 100644
--- a/src/glsl/opt_algebraic.cpp
+++ b/src/glsl/opt_algebraic.cpp
@@ -614,6 +614,42 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir)
 
   break;
 
+   case ir_binop_min:
+   case ir_binop_max:
+  if (ir-type-base_type != GLSL_TYPE_FLOAT)
+ break;
+
+  /* Replace min(max) operations and its commutative combinations with
+   * a saturate operation
+   */
+  for (int op = 0; op  2; op++) {
+ ir_expression *minmax = op_expr[op];
+ ir_constant *outer_const = op_const[1 - op];
+ ir_expression_operation op_cond = (ir-operation == ir_binop_max) ?
+ir_binop_min : ir_binop_max;
+
+ if (!minmax || !outer_const || (minmax-operation != op_cond))
+continue;
+
+ /* Found a min(max) combination. Now try to see if its operands
+  * meet our conditions that we can do just a single saturate operation
+  */
+ for (int minmax_op = 0; minmax_op  2; minmax_op++) {
+ir_rvalue *inner_val_a = minmax-operands[minmax_op];
+ir_rvalue *inner_val_b = minmax-operands[1 - minmax_op];
+
+if (!inner_val_a || !inner_val_b)
+   continue;
+
+/* Found a {min|max} ({max|min} (x, 0.0), 1.0) operation and its 
variations */
+if ((outer_const-is_one()  inner_val_a-is_zero()) ||
+(inner_val_a-is_one()  outer_const-is_zero()))
+   return saturate(inner_val_b);
+ }
+  }
+
+  break;
+
case ir_unop_rcp:
   if (op_expr[0]  op_expr[0]-operation == ir_unop_rcp)
 return op_expr[0]-operands[0];
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [v3 PATCH 10/16] glsl: Optimize clamp(x, 0.0, b), where b 1.0 as min(saturate(x), b)

2014-07-08 Thread Abdiel Janulgue
v2: - Output min(saturate(x),b) instead of saturate(min(x,b)) suggested by Ilia 
Mirkin
- Make sure we do component-wise comparison for vectors (Ian Romanick)
v3: - Add missing condition where the outer constant value is zero and
  inner constant is  1
- Fix comments to reflect we are doing a commutative operation (Matt Turner)

Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/glsl/opt_algebraic.cpp | 39 +++
 1 file changed, 39 insertions(+)

diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp
index 4b052933..6dfb681 100644
--- a/src/glsl/opt_algebraic.cpp
+++ b/src/glsl/opt_algebraic.cpp
@@ -110,6 +110,33 @@ is_vec_basis(ir_constant *ir)
return (ir == NULL) ? false : ir-is_basis();
 }
 
+static inline bool
+is_valid_vec_const(ir_constant *ir)
+{
+   if (ir == NULL)
+  return false;
+
+   if (!ir-type-is_scalar()  !ir-type-is_vector())
+  return false;
+
+   return true;
+}
+
+static inline bool
+is_less_than_one(ir_constant *ir)
+{
+   if (!is_valid_vec_const(ir))
+  return false;
+
+   unsigned component = 0;
+   for (int c = 0; c  ir-type-vector_elements; c++) {
+  if (ir-get_float_component(c)  1.0f)
+ component++;
+   }
+
+   return (component == ir-type-vector_elements);
+}
+
 static void
 update_type(ir_expression *ir)
 {
@@ -645,6 +672,18 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir)
 if ((outer_const-is_one()  inner_val_a-is_zero()) ||
 (inner_val_a-is_one()  outer_const-is_zero()))
return saturate(inner_val_b);
+
+/* Found a {min|max} ({max|min} (x, 0.0), b) where b  1.0
+ * and its variations
+ */
+if (is_less_than_one(outer_const)  inner_val_b-is_zero())
+   return expr(ir_binop_min, saturate(inner_val_a), outer_const);
+
+if (!inner_val_b-as_constant())
+   continue;
+
+if (is_less_than_one(inner_val_b-as_constant())  
outer_const-is_zero())
+   return expr(ir_binop_min, saturate(inner_val_a), inner_val_b);
  }
   }
 
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [v3 PATCH 11/16] glsl: Optimize clamp(x, b, 1.0), where b 0.0 as max(saturate(x), b)

2014-07-08 Thread Abdiel Janulgue
v2: - Output max(saturate(x),b) instead of saturate(max(x,b))
- Make sure we do component-wise comparison for vectors (Ian Romanick)
v3: - Add missing condition where the outer constant value is  0.0 and
  inner constant is 1.0.
- Fix comments to show that the optimization is a commutative operation
  (Matt Turner)

Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/glsl/opt_algebraic.cpp | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp
index 6dfb681..447618f 100644
--- a/src/glsl/opt_algebraic.cpp
+++ b/src/glsl/opt_algebraic.cpp
@@ -137,6 +137,21 @@ is_less_than_one(ir_constant *ir)
return (component == ir-type-vector_elements);
 }
 
+static inline bool
+is_greater_than_zero(ir_constant *ir)
+{
+   if (!is_valid_vec_const(ir))
+  return false;
+
+   unsigned component = 0;
+   for (int c = 0; c  ir-type-vector_elements; c++) {
+  if (ir-get_float_component(c)  0.0f)
+ component++;
+   }
+
+   return (component == ir-type-vector_elements);
+}
+
 static void
 update_type(ir_expression *ir)
 {
@@ -684,6 +699,14 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir)
 
 if (is_less_than_one(inner_val_b-as_constant())  
outer_const-is_zero())
return expr(ir_binop_min, saturate(inner_val_a), inner_val_b);
+
+/* Found a {min|max} ({max|min} (x, b), 1.0), where b  0.0
+ * and its variations
+ */
+if (outer_const-is_one()  
is_greater_than_zero(inner_val_b-as_constant()))
+   return expr(ir_binop_max, saturate(inner_val_a), inner_val_b);
+if (inner_val_b-as_constant()-is_one()  
is_greater_than_zero(outer_const))
+   return expr(ir_binop_max, saturate(inner_val_a), outer_const);
  }
   }
 
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [v3 PATCH 12/16] i965/fs: Allow propagation of instructions with saturate flag to sel

2014-07-08 Thread Abdiel Janulgue
When sel conditon is bounded within 0 and 1.0. This allows code as:
mov.sat a b
sel.ge  dst a 0.25F

To be propagated as:
sel.ge.sat dst b 0.25F

v3: Syntax clarification in inst-saturate assignment (Matt Turner)

Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp | 18 +-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
index 28e59c6..9d1aeab 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
@@ -43,6 +43,7 @@ struct acp_entry : public exec_node {
fs_reg dst;
fs_reg src;
enum opcode opcode;
+   bool saturate;
 };
 
 struct block_data {
@@ -348,11 +349,26 @@ fs_visitor::try_copy_propagate(fs_inst *inst, int arg, 
acp_entry *entry)
   }
}
 
+   if (entry-saturate) {
+  switch(inst-opcode) {
+  case BRW_OPCODE_SEL:
+ if (inst-src[1].file != IMM ||
+ inst-src[1].fixed_hw_reg.dw1.f  0.0 ||
+ inst-src[1].fixed_hw_reg.dw1.f  1.0) {
+return false;
+ }
+ break;
+  default:
+ return false;
+  }
+   }
+
inst-src[arg].file = entry-src.file;
inst-src[arg].reg = entry-src.reg;
inst-src[arg].reg_offset = entry-src.reg_offset;
inst-src[arg].subreg_offset = entry-src.subreg_offset;
inst-src[arg].stride *= entry-src.stride;
+   inst-saturate = inst-saturate || entry-saturate;
 
if (!inst-src[arg].abs) {
   inst-src[arg].abs = entry-src.abs;
@@ -515,7 +531,6 @@ can_propagate_from(fs_inst *inst)
 inst-src[0].file == UNIFORM ||
 inst-src[0].file == IMM) 
inst-src[0].type == inst-dst.type 
-   !inst-saturate 
!inst-is_partial_write());
 }
 
@@ -570,6 +585,7 @@ fs_visitor::opt_copy_propagate_local(void *copy_prop_ctx, 
bblock_t *block,
 entry-dst = inst-dst;
 entry-src = inst-src[0];
  entry-opcode = inst-opcode;
+ entry-saturate = inst-saturate;
 acp[entry-dst.reg % ACP_HASH_SIZE].push_tail(entry);
   } else if (inst-opcode == SHADER_OPCODE_LOAD_PAYLOAD 
  inst-dst.file == GRF) {
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [v3 PATCH 13/16] i965/vec4: Allow propagation of instructions with saturate flag to sel

2014-07-08 Thread Abdiel Janulgue
When sel conditon is bounded within 0 and 1.0. This allows code as:
mov.sat a b
sel.ge  dst a 0.25F

To be propagated as:
sel.ge.sat dst b 0.25F

v3: - Syntax clarification in inst-saturate assignment
- Remove extra parenthesis when assigning src_reg value
  from copy_entry (Matt Turner)

Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 .../drivers/dri/i965/brw_vec4_copy_propagation.cpp | 75 ++
 1 file changed, 48 insertions(+), 27 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
index 2c41d02..3251551 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
@@ -36,13 +36,17 @@ extern C {
 
 namespace brw {
 
+struct copy_entry {
+   src_reg *value[4];
+   bool saturate;
+};
+
 static bool
 is_direct_copy(vec4_instruction *inst)
 {
return (inst-opcode == BRW_OPCODE_MOV 
   !inst-predicate 
   inst-dst.file == GRF 
-  !inst-saturate 
   !inst-dst.reladdr 
   !inst-src[0].reladdr 
   inst-dst.type == inst-src[0].type);
@@ -74,16 +78,16 @@ is_channel_updated(vec4_instruction *inst, src_reg 
*values[4], int ch)
 
 static bool
 try_constant_propagate(struct brw_context *brw, vec4_instruction *inst,
-   int arg, src_reg *values[4])
+   int arg, struct copy_entry *entry)
 {
/* For constant propagation, we only handle the same constant
 * across all 4 channels.  Some day, we should handle the 8-bit
 * float vector format, which would let us constant propagate
 * vectors better.
 */
-   src_reg value = *values[0];
+   src_reg value = *entry-value[0];
for (int i = 1; i  4; i++) {
-  if (!value.equals(*values[i]))
+  if (!value.equals(*entry-value[i]))
 return false;
}
 
@@ -213,22 +217,22 @@ is_logic_op(enum opcode opcode)
 
 static bool
 try_copy_propagate(struct brw_context *brw, vec4_instruction *inst,
-   int arg, src_reg *values[4])
+   int arg, struct copy_entry *entry)
 {
/* For constant propagation, we only handle the same constant
 * across all 4 channels.  Some day, we should handle the 8-bit
 * float vector format, which would let us constant propagate
 * vectors better.
 */
-   src_reg value = *values[0];
+   src_reg value = *entry-value[0];
for (int i = 1; i  4; i++) {
   /* This is equals() except we don't care about the swizzle. */
-  if (value.file != values[i]-file ||
- value.reg != values[i]-reg ||
- value.reg_offset != values[i]-reg_offset ||
- value.type != values[i]-type ||
- value.negate != values[i]-negate ||
- value.abs != values[i]-abs) {
+  if (value.file != entry-value[i]-file ||
+ value.reg != entry-value[i]-reg ||
+ value.reg_offset != entry-value[i]-reg_offset ||
+ value.type != entry-value[i]-type ||
+ value.negate != entry-value[i]-negate ||
+ value.abs != entry-value[i]-abs) {
 return false;
   }
}
@@ -239,7 +243,7 @@ try_copy_propagate(struct brw_context *brw, 
vec4_instruction *inst,
 */
int s[4];
for (int i = 0; i  4; i++) {
-  s[i] = BRW_GET_SWZ(values[i]-swizzle,
+  s[i] = BRW_GET_SWZ(entry-value[i]-swizzle,
 BRW_GET_SWZ(inst-src[arg].swizzle, i));
}
value.swizzle = BRW_SWIZZLE4(s[0], s[1], s[2], s[3]);
@@ -299,8 +303,23 @@ try_copy_propagate(struct brw_context *brw, 
vec4_instruction *inst,
if (value.equals(inst-src[arg]))
   return false;
 
+   if (entry-saturate) {
+  switch(inst-opcode) {
+  case BRW_OPCODE_SEL:
+ if (inst-src[1].file != IMM ||
+ inst-src[1].fixed_hw_reg.dw1.f  0.0 ||
+ inst-src[1].fixed_hw_reg.dw1.f  1.0) {
+return false;
+ }
+ break;
+  default:
+ return false;
+  }
+   }
+
value.type = inst-src[arg].type;
inst-src[arg] = value;
+   inst-saturate = inst-saturate || entry-saturate;
return true;
 }
 
@@ -308,9 +327,9 @@ bool
 vec4_visitor::opt_copy_propagation()
 {
bool progress = false;
-   src_reg *cur_value[virtual_grf_reg_count][4];
+   struct copy_entry entries[virtual_grf_reg_count];
 
-   memset(cur_value, 0, sizeof(cur_value));
+   memset(entries, 0, sizeof(entries));
 
foreach_in_list(vec4_instruction, inst, instructions) {
   /* This pass only works on basic blocks.  If there's flow
@@ -321,7 +340,7 @@ vec4_visitor::opt_copy_propagation()
* src/glsl/opt_copy_propagation.cpp to track available copies.
*/
   if (!is_dominated_by_previous_instruction(inst)) {
-memset(cur_value, 0, sizeof(cur_value));
+memset(entries, 0, sizeof(entries));
 continue;
   }
 
@@ -342,33 +361,34 @@ 

Re: [Mesa-dev] [v2 PATCH 09/16] glsl: Optimize clamp(x, 0, 1) as saturate(x)

2014-07-08 Thread Erik Faye-Lund
On Tue, Jul 8, 2014 at 5:44 PM, Roland Scheidegger srol...@vmware.com wrote:
 Am 08.07.2014 08:22, schrieb Matt Turner:
 On Mon, Jul 7, 2014 at 11:00 PM, Erik Faye-Lund kusmab...@gmail.com wrote:
 On Mon, Jul 7, 2014 at 7:18 PM, Matt Turner matts...@gmail.com wrote:
 This comment tripped me up for a second. This really means that you've
 found either

   - min(max(x, 0.0), 1.0); or
   - max(min(x, 1.0), 0.0)

 Hmm, but are optimizing both of these to saturate OK? Shouldn't
 min(max(NaN, 0.0), 1.0) give 0.0, whereas max(min(NaN, 1.0), 0.0) give
 1.0?

 Under standard IEEE rules, wouldn't the NaN propagate through all of
 these expressions?
 not under ieee754-2008 rules, no. min/max return the other number if one
 of them is a NaN.

I'm not entirely sure what IEEE754-2008 has to say about min/max, as
GLSL already defines those operations (and states that IEEE754
compliance isn't required). I was referring to the definition of the
less-than comparison-operator, which GLSL seems to lean on. And that
seems to always evaluate to 'false'.

 This is enforced by d3d10 so at least d3d10 capable
 hardware is going to honor that.
 http://msdn.microsoft.com/en-us/library/windows/desktop/cc308050%28v=vs.85%29.aspx
 The order of min/max though indeed matters.

Right, this is very useful information, thanks. This makes me think
that the perhaps the dependence on the less-than definition could be
considered a spec-bug. A quick test reveals that NVIDIA's Windows
driver implement both min(0.5, NaN) and min(NaN, 0.5) as returning
0.5. So I'd say: raise an issue with Khronos, and implement DX10
rules.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 10/16] gallium: add is_percentage and is_float fields to pipe_driver_query_info

2014-07-08 Thread Marek Olšák
It would be nicer to define all the types as enums, e.g. enum
pipe_driver_query_type { PIPE_DRIVER_QUERY_TYPE_BYTES,
PIPE_DRIVER_QUERY_TYPE_PERCENTAGE, PIPE_DRIVER_QUERY_TYPE_FLOAT,
PIPE_DRIVER_QUERY_TYPE_UINT };

That would be nicer than a bunch of booleans.

If that's not possible, please document how the booleans interact with
each other.

Marek

On Mon, Jul 7, 2014 at 5:47 PM, Samuel Pitoiset
samuel.pitoi...@gmail.com wrote:
 This will be used by GL_AMD_performance_monitor.

 Signed-off-by: Samuel Pitoiset samuel.pitoi...@gmail.com
 ---
  src/gallium/include/pipe/p_defines.h | 2 ++
  1 file changed, 2 insertions(+)

 diff --git a/src/gallium/include/pipe/p_defines.h 
 b/src/gallium/include/pipe/p_defines.h
 index 501c1e2..db5c9bf 100644
 --- a/src/gallium/include/pipe/p_defines.h
 +++ b/src/gallium/include/pipe/p_defines.h
 @@ -740,6 +740,8 @@ struct pipe_driver_query_info
 unsigned query_type; /* PIPE_QUERY_DRIVER_SPECIFIC + i */
 union pipe_numeric_type_union max_value; /* max value that can be 
 returned */
 boolean uses_byte_units; /* whether the result is in bytes */
 +   boolean is_percentage;
 +   boolean is_float;
 unsigned group_id;
  };

 --
 2.0.0

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] glsl: add a mechanism to allow #extension directives in the middle of shaders

2014-07-08 Thread Marek Olšák
From: Marek Olšák marek.ol...@amd.com

This is needed to make Unigine Heaven 4.0 and Unigine Valley 1.0 work
with sample shading.

Also, if this is disabled, the error message at least makes sense now.
---
 src/glsl/glsl_parser.yy | 8 
 src/glsl/glsl_parser_extras.cpp | 2 ++
 src/glsl/glsl_parser_extras.h   | 2 ++
 src/mesa/main/mtypes.h  | 5 +
 4 files changed, 17 insertions(+)

diff --git a/src/glsl/glsl_parser.yy b/src/glsl/glsl_parser.yy
index b989749..4c87163 100644
--- a/src/glsl/glsl_parser.yy
+++ b/src/glsl/glsl_parser.yy
@@ -376,6 +376,14 @@ external_declaration_list:
   if ($2 != NULL)
  state-translation_unit.push_tail( $2-link);
}
+   | external_declaration_list extension_statement {
+  if (!state-allow_extension_directive_midshader) {
+ _mesa_glsl_error( @2, state,
+  #extension directive is not allowed 
+  in the middle of a shader);
+ YYERROR;
+  }
+   }
;
 
 variable_identifier:
diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp
index b327c2b..890123a 100644
--- a/src/glsl/glsl_parser_extras.cpp
+++ b/src/glsl/glsl_parser_extras.cpp
@@ -210,6 +210,8 @@ _mesa_glsl_parse_state::_mesa_glsl_parse_state(struct 
gl_context *_ctx,
this-early_fragment_tests = false;
memset(this-atomic_counter_offsets, 0,
   sizeof(this-atomic_counter_offsets));
+   this-allow_extension_directive_midshader =
+  ctx-Const.AllowGLSLExtensionDirectiveMidShader;
 }
 
 /**
diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h
index 1791816..f720d6a 100644
--- a/src/glsl/glsl_parser_extras.h
+++ b/src/glsl/glsl_parser_extras.h
@@ -490,6 +490,8 @@ struct _mesa_glsl_parse_state {
 
/** Atomic counter offsets by binding */
unsigned atomic_counter_offsets[MAX_COMBINED_ATOMIC_BUFFERS];
+
+   bool allow_extension_directive_midshader;
 };
 
 # define YYLLOC_DEFAULT(Current, Rhs, N)   \
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index a7126fd..b42a175 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -3349,6 +3349,11 @@ struct gl_constants
GLuint ForceGLSLVersion;
 
/**
+* Allow GLSL #extension directives in the middle of shaders.
+*/
+   GLboolean AllowGLSLExtensionDirectiveMidShader;
+
+   /**
 * Does the driver support real 32-bit integers?  (Otherwise, integers are
 * simulated via floats.)
 */
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] st/mesa, gallium: add a workaround for Unigine Heaven 4.0 and Valley 1.0

2014-07-08 Thread Marek Olšák
From: Marek Olšák marek.ol...@amd.com

Most (all?) Unigine shaders fail to compile without this if sample shading
is advertised. This is, of course, Unigine developers' fault.
---
 src/gallium/include/state_tracker/st_api.h  |  1 +
 src/gallium/state_trackers/dri/common/dri_context.c |  2 ++
 src/gallium/state_trackers/dri/common/dri_screen.c  |  1 +
 src/mesa/drivers/dri/common/drirc   | 20 +---
 src/mesa/drivers/dri/common/xmlpool/t_options.h |  5 +
 src/mesa/state_tracker/st_extensions.c  |  3 +++
 6 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/src/gallium/include/state_tracker/st_api.h 
b/src/gallium/include/state_tracker/st_api.h
index 9dcb76f..46f0436 100644
--- a/src/gallium/include/state_tracker/st_api.h
+++ b/src/gallium/include/state_tracker/st_api.h
@@ -245,6 +245,7 @@ struct st_config_options
boolean force_glsl_extensions_warn;
unsigned force_glsl_version;
boolean force_s3tc_enable;
+   boolean allow_glsl_extension_directive_midshader;
 };
 
 /**
diff --git a/src/gallium/state_trackers/dri/common/dri_context.c 
b/src/gallium/state_trackers/dri/common/dri_context.c
index f6979a7..827f847 100644
--- a/src/gallium/state_trackers/dri/common/dri_context.c
+++ b/src/gallium/state_trackers/dri/common/dri_context.c
@@ -55,6 +55,8 @@ dri_fill_st_options(struct st_config_options *options,
   driQueryOptioni(optionCache, force_glsl_version);
options-force_s3tc_enable =
   driQueryOptionb(optionCache, force_s3tc_enable);
+   options-allow_glsl_extension_directive_midshader =
+  driQueryOptionb(optionCache, allow_glsl_extension_directive_midshader);
 }
 
 GLboolean
diff --git a/src/gallium/state_trackers/dri/common/dri_screen.c 
b/src/gallium/state_trackers/dri/common/dri_screen.c
index dceb628..f894e5d 100644
--- a/src/gallium/state_trackers/dri/common/dri_screen.c
+++ b/src/gallium/state_trackers/dri/common/dri_screen.c
@@ -69,6 +69,7 @@ const __DRIconfigOptionsExtension gallium_config_options = {
  DRI_CONF_DISABLE_BLEND_FUNC_EXTENDED(false)
  DRI_CONF_DISABLE_SHADER_BIT_ENCODING(false)
  DRI_CONF_FORCE_GLSL_VERSION(0)
+ DRI_CONF_ALLOW_GLSL_EXTENSION_DIRECTIVE_MIDSHADER(false)
   DRI_CONF_SECTION_END
 
   DRI_CONF_SECTION_MISCELLANEOUS
diff --git a/src/mesa/drivers/dri/common/drirc 
b/src/mesa/drivers/dri/common/drirc
index ebc04cd..4b9841b 100644
--- a/src/mesa/drivers/dri/common/drirc
+++ b/src/mesa/drivers/dri/common/drirc
@@ -11,17 +11,21 @@ Application bugs worked around in this file:
   is still 1.10.
 
 * Unigine Heaven 3.0 with ARB_texture_multisample uses a ivec4 * vec4
-  expression, which fails to compile with GLSL 1.10.
+  expression, which is illegal in GLSL 1.10.
   Adding #version 130 fixes this.
 
 * Unigine Heaven 3.0 with ARB_shader_bit_encoding uses the uint keyword, which
-  fails to compile with GLSL 1.10.
+  is illegal in GLSL 1.10.
   Adding #version 130 fixes this.
 
 * Unigine Heaven 3.0 with ARB_shader_bit_encoding uses a uint  int
-  expression, which fails (and should fail) to compile with any GLSL version.
+  expression, which is illegal in any GLSL version.
   Disabling ARB_shader_bit_encoding fixes this.
 
+* If ARB_sample_shading is supported, Unigine Heaven 4.0 and Valley 1.0 uses
+  an #extension directive in the middle of its shaders, which is illegal
+  in GLSL.
+
 TODO: document the other workarounds.
 
 --
@@ -45,6 +49,7 @@ TODO: document the other workarounds.
 option name=disable_blend_func_extended value=true /
 option name=force_glsl_version value=130 /
 option name=disable_shader_bit_encoding value=true /
+option name=allow_glsl_extension_directive_midshader 
value=true /
/application
 
 application name=Unigine Heaven (64-bit) executable=heaven_x64
@@ -52,6 +57,15 @@ TODO: document the other workarounds.
 option name=disable_blend_func_extended value=true /
 option name=force_glsl_version value=130 /
 option name=disable_shader_bit_encoding value=true /
+option name=allow_glsl_extension_directive_midshader 
value=true /
+   /application
+
+application name=Unigine Valley (32-bit) executable=valley_x86
+option name=allow_glsl_extension_directive_midshader 
value=true /
+   /application
+
+application name=Unigine Valley (64-bit) executable=valley_x64
+option name=allow_glsl_extension_directive_midshader 
value=true /
/application
 
 application name=Unigine OilRush (32-bit) executable=OilRush_x86
diff --git a/src/mesa/drivers/dri/common/xmlpool/t_options.h 
b/src/mesa/drivers/dri/common/xmlpool/t_options.h
index fc9e104..b73a662 100644
--- a/src/mesa/drivers/dri/common/xmlpool/t_options.h
+++ b/src/mesa/drivers/dri/common/xmlpool/t_options.h
@@ -105,6 +105,11 @@ DRI_CONF_OPT_BEGIN_V(force_glsl_version, int, def, 
0:999) \
 

Re: [Mesa-dev] [v2 PATCH 09/16] glsl: Optimize clamp(x, 0, 1) as saturate(x)

2014-07-08 Thread Roland Scheidegger
Am 08.07.2014 18:03, schrieb Erik Faye-Lund:
 On Tue, Jul 8, 2014 at 5:44 PM, Roland Scheidegger srol...@vmware.com wrote:
 Am 08.07.2014 08:22, schrieb Matt Turner:
 On Mon, Jul 7, 2014 at 11:00 PM, Erik Faye-Lund kusmab...@gmail.com wrote:
 On Mon, Jul 7, 2014 at 7:18 PM, Matt Turner matts...@gmail.com wrote:
 This comment tripped me up for a second. This really means that you've
 found either

   - min(max(x, 0.0), 1.0); or
   - max(min(x, 1.0), 0.0)

 Hmm, but are optimizing both of these to saturate OK? Shouldn't
 min(max(NaN, 0.0), 1.0) give 0.0, whereas max(min(NaN, 1.0), 0.0) give
 1.0?

 Under standard IEEE rules, wouldn't the NaN propagate through all of
 these expressions?
 not under ieee754-2008 rules, no. min/max return the other number if one
 of them is a NaN.
 
 I'm not entirely sure what IEEE754-2008 has to say about min/max, as
 GLSL already defines those operations (and states that IEEE754
 compliance isn't required). I was referring to the definition of the
 less-than comparison-operator, which GLSL seems to lean on. And that
 seems to always evaluate to 'false'.
Oh right I didn't notice min/max are defined that way. I guess that's
due to older specs - pre glsl 4 NaN support wasn't really required and
just about anything involving NaNs or Infs was ok to give undefined
results anyway (and some hw indeed didn't have NaN support).
Even current glsl version is not very strict there, still saying things
like NaNs are not required to be generated and Operations and
built-in functions that operate on a NaN are not required to return a
NaN as the result. I am not entirely sure why the spec wasn't tightened
up a bit, but maybe some glsl 4+ capable hw exists which doesn't
implement d3d10 rules.

 
 This is enforced by d3d10 so at least d3d10 capable
 hardware is going to honor that.
 https://urldefense.proofpoint.com/v1/url?u=http://msdn.microsoft.com/en-us/library/windows/desktop/cc308050%28v%3Dvs.85%29.aspxk=oIvRg1%2BdGAgOoM1BIlLLqw%3D%3D%0Ar=F4msKE2WxRzA%2BwN%2B25muztFm5TSPwE8HKJfWfR2NgfY%3D%0Am=9dVgmTYumlj5DwQp2o0tjv4fTQpBhA%2FCDiMURzjOadM%3D%0As=f153cbe35b73f7fdb09fc50f8cfab56651bf346257838bc32ad35a721110fe67
 The order of min/max though indeed matters.
 
 Right, this is very useful information, thanks. This makes me think
 that the perhaps the dependence on the less-than definition could be
 considered a spec-bug. A quick test reveals that NVIDIA's Windows
 driver implement both min(0.5, NaN) and min(NaN, 0.5) as returning
 0.5. So I'd say: raise an issue with Khronos, and implement DX10
 rules
 

I agree that this looks like a bug, at the very least (given the very
lax NaN requirements in general) I think it should allow the d3d10
result if it doesn't do already (it is not obvious to me if the glsl
less-than operator really needs to adhere to ieee754 spec wrt NaNs).

Roland
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 0/3] various exec_list things

2014-07-08 Thread Connor Abbott
This series adds a couple things I need to exec_list for my work, and
does some cleanups made possible. Only compile tested on i965.

Connor Abbott (3):
  exec_list: add a prepend function
  exec_list: add a function to count the size of a list
  exec_list: make various places use the new get_size() method

 src/glsl/ast_to_hir.cpp|  4 +--
 src/glsl/ir_reader.cpp |  7 ++--
 src/glsl/list.h| 40 +-
 src/glsl/opt_function_inlining.cpp |  7 ++--
 src/mesa/drivers/dri/i965/brw_fs.cpp   |  5 +--
 .../drivers/dri/i965/brw_fs_copy_propagation.cpp   |  4 +--
 src/mesa/program/ir_to_mesa.cpp|  5 +--
 7 files changed, 48 insertions(+), 24 deletions(-)

-- 
1.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/3] exec_list: add a prepend function

2014-07-08 Thread Connor Abbott
This complements the existing append function. It's implemented in a
rather simple way right now; it could be changed if performance is a
concern.

Signed-off-by: Connor Abbott connor.abb...@intel.com
---
 src/glsl/list.h | 20 +++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/src/glsl/list.h b/src/glsl/list.h
index 922bd68..ca6ee9d 100644
--- a/src/glsl/list.h
+++ b/src/glsl/list.h
@@ -345,9 +345,15 @@ struct exec_list {
void move_nodes_to(exec_list *target);
 
/**
-* Append all nodes from the source list to the target list
+* Append all nodes from the source list to the end of the target list
 */
void append_list(exec_list *source);
+   
+   /**
+* Prepend all nodes from the source list to the beginning of the target
+* list
+*/
+   void prepend_list(exec_list *source);
 #endif
 };
 
@@ -479,6 +485,13 @@ exec_list_append(struct exec_list *list, struct exec_list 
*source)
 }
 
 static inline void
+exec_list_prepend(struct exec_list *list, struct exec_list *source)
+{
+   exec_list_append(source, list);
+   exec_list_move_nodes_to(source, list);
+}
+
+static inline void
 exec_node_insert_list_before(struct exec_node *n, struct exec_list *before)
 {
if (exec_list_is_empty(before))
@@ -554,6 +567,11 @@ inline void exec_list::append_list(exec_list *source)
exec_list_append(this, source);
 }
 
+inline void exec_list::prepend_list(exec_list *source)
+{
+   exec_list_prepend(this, source);
+}
+
 inline void exec_node::insert_before(exec_list *before)
 {
exec_node_insert_list_before(this, before);
-- 
1.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/3] exec_list: add a function to count the size of a list

2014-07-08 Thread Connor Abbott
Signed-off-by: Connor Abbott connor.abb...@intel.com
---
 src/glsl/list.h | 20 
 1 file changed, 20 insertions(+)

diff --git a/src/glsl/list.h b/src/glsl/list.h
index ca6ee9d..68ab3fd 100644
--- a/src/glsl/list.h
+++ b/src/glsl/list.h
@@ -324,6 +324,8 @@ struct exec_list {
 
const exec_node *get_tail() const;
exec_node *get_tail();
+   
+   unsigned get_size();
 
void push_head(exec_node *n);
void push_tail(exec_node *n);
@@ -405,6 +407,19 @@ exec_list_get_tail(struct exec_list *list)
return !exec_list_is_empty(list) ? list-tail_pred : NULL;
 }
 
+static inline unsigned
+exec_list_get_size(struct exec_list *list)
+{
+   unsigned size = 0;
+   
+   for (struct exec_node *node = list-head; node-next != NULL;
+   node = node-next) {
+  size++;
+   }
+   
+   return size;
+}
+
 static inline void
 exec_list_push_head(struct exec_list *list, struct exec_node *n)
 {
@@ -537,6 +552,11 @@ inline exec_node *exec_list::get_tail()
return exec_list_get_tail(this);
 }
 
+inline unsigned exec_list::get_size()
+{
+   return exec_list_get_size(this);
+}
+
 inline void exec_list::push_head(exec_node *n)
 {
exec_list_push_head(this, n);
-- 
1.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/3] exec_list: make various places use the new get_size() method

2014-07-08 Thread Connor Abbott
Instead of hand-rolling it.

Signed-off-by: Connor Abbott connor.abb...@intel.com
---
 src/glsl/ast_to_hir.cpp   | 4 +---
 src/glsl/ir_reader.cpp| 7 +++
 src/glsl/opt_function_inlining.cpp| 7 ++-
 src/mesa/drivers/dri/i965/brw_fs.cpp  | 5 +
 src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp | 4 +---
 src/mesa/program/ir_to_mesa.cpp   | 5 +
 6 files changed, 9 insertions(+), 23 deletions(-)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 885bee5..a2ab26b 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -5007,9 +5007,7 @@ ast_process_structure_or_interface_block(exec_list 
*instructions,
 * 'declarations' list in each of the elements.
 */
foreach_list_typed (ast_declarator_list, decl_list, link, declarations) {
-  foreach_list_typed (ast_declaration, decl, link, 
decl_list-declarations) {
- decl_count++;
-  }
+  decl_count += decl_list-declarations.get_size();
}
 
/* Allocate storage for the fields and process the field
diff --git a/src/glsl/ir_reader.cpp b/src/glsl/ir_reader.cpp
index 4017bdd..4f141de 100644
--- a/src/glsl/ir_reader.cpp
+++ b/src/glsl/ir_reader.cpp
@@ -723,10 +723,9 @@ ir_reader::read_expression(s_expression *expr)
   ir_read_error(expr, invalid operator: %s, s_op-value());
   return NULL;
}
-
-   int num_operands = -3; /* skip expression type operation */
-   foreach_in_list(s_expression, e, ((s_list *) expr)-subexpressions)
-  ++num_operands;
+   
+   /* skip expression type operation */
+   int num_operands = (int) ((s_list *) expr)-subexpressions.get_size() - 3;
 
int expected_operands = ir_expression::get_num_operands(op);
if (num_operands != expected_operands) {
diff --git a/src/glsl/opt_function_inlining.cpp 
b/src/glsl/opt_function_inlining.cpp
index b84bb8e..9626639 100644
--- a/src/glsl/opt_function_inlining.cpp
+++ b/src/glsl/opt_function_inlining.cpp
@@ -100,16 +100,13 @@ ir_call::generate_inline(ir_instruction *next_ir)
 {
void *ctx = ralloc_parent(this);
ir_variable **parameters;
-   int num_parameters;
+   unsigned num_parameters;
int i;
struct hash_table *ht;
 
ht = hash_table_ctor(0, hash_table_pointer_hash, 
hash_table_pointer_compare);
 
-   num_parameters = 0;
-   foreach_in_list(ir_rvalue, param, this-callee-parameters)
-  num_parameters++;
-
+   num_parameters = this-callee-parameters.get_size();
parameters = new ir_variable *[num_parameters];
 
/* Generate the declarations for the parameters to our inlined code,
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index a3ad375..6ab82b9 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -2957,10 +2957,7 @@ fs_visitor::calculate_register_pressure()
invalidate_live_intervals();
calculate_live_intervals();
 
-   int num_instructions = 0;
-   foreach_in_list(fs_inst, inst, instructions) {
-  ++num_instructions;
-   }
+   unsigned num_instructions = instructions.get_size();
 
regs_live_at_ip = rzalloc_array(mem_ctx, int, num_instructions);
 
diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
index 28e59c6..ce8dc36 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
@@ -106,9 +106,7 @@ fs_copy_prop_dataflow::fs_copy_prop_dataflow(void *mem_ctx, 
cfg_t *cfg,
num_acp = 0;
for (int b = 0; b  cfg-num_blocks; b++) {
   for (int i = 0; i  ACP_HASH_SIZE; i++) {
- foreach_in_list(acp_entry, entry, out_acp[b][i]) {
-num_acp++;
- }
+num_acp += out_acp[b][i].get_size();
   }
}
 
diff --git a/src/mesa/program/ir_to_mesa.cpp b/src/mesa/program/ir_to_mesa.cpp
index 1109051..cc9c8ae 100644
--- a/src/mesa/program/ir_to_mesa.cpp
+++ b/src/mesa/program/ir_to_mesa.cpp
@@ -2814,10 +2814,7 @@ get_mesa_program(struct gl_context *ctx,
 
prog-NumTemporaries = v.next_temp;
 
-   int num_instructions = 0;
-   foreach_in_list(ir_instruction, node, v.instructions) {
-  num_instructions++;
-   }
+   unsigned num_instructions = v.instructions.get_size();
 
mesa_instructions =
   (struct prog_instruction *)calloc(num_instructions,
-- 
1.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 0/3] various exec_list things

2014-07-08 Thread Thomas Helland
I don't feel comfortable reviewing things
on here quite yet, so you should
probably get a reviewed-by by someone else.

That being said this series LGTM.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/3] various exec_list things

2014-07-08 Thread Matt Turner
On Tue, Jul 8, 2014 at 2:04 PM, Thomas Helland
thomashellan...@gmail.com wrote:
 I don't feel comfortable reviewing things
 on here quite yet, so you should
 probably get a reviewed-by by someone else.

 That being said this series LGTM.

Feel free to give a Reviewed-by tag if the code looks correct to you.
It's up to other people to decide how much that tag means. :)
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] glsl: add a mechanism to allow #extension directives in the middle of shaders

2014-07-08 Thread Brian Paul

On 07/08/2014 12:29 PM, Marek Olšák wrote:

From: Marek Olšák marek.ol...@amd.com

This is needed to make Unigine Heaven 4.0 and Unigine Valley 1.0 work
with sample shading.

Also, if this is disabled, the error message at least makes sense now.
---
  src/glsl/glsl_parser.yy | 8 
  src/glsl/glsl_parser_extras.cpp | 2 ++
  src/glsl/glsl_parser_extras.h   | 2 ++
  src/mesa/main/mtypes.h  | 5 +
  4 files changed, 17 insertions(+)

diff --git a/src/glsl/glsl_parser.yy b/src/glsl/glsl_parser.yy
index b989749..4c87163 100644
--- a/src/glsl/glsl_parser.yy
+++ b/src/glsl/glsl_parser.yy
@@ -376,6 +376,14 @@ external_declaration_list:
if ($2 != NULL)
   state-translation_unit.push_tail( $2-link);
 }
+   | external_declaration_list extension_statement {
+  if (!state-allow_extension_directive_midshader) {
+ _mesa_glsl_error( @2, state,
+  #extension directive is not allowed 
+  in the middle of a shader);
+ YYERROR;
+  }
+   }
 ;

  variable_identifier:
diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp
index b327c2b..890123a 100644
--- a/src/glsl/glsl_parser_extras.cpp
+++ b/src/glsl/glsl_parser_extras.cpp
@@ -210,6 +210,8 @@ _mesa_glsl_parse_state::_mesa_glsl_parse_state(struct 
gl_context *_ctx,
 this-early_fragment_tests = false;
 memset(this-atomic_counter_offsets, 0,
sizeof(this-atomic_counter_offsets));
+   this-allow_extension_directive_midshader =
+  ctx-Const.AllowGLSLExtensionDirectiveMidShader;
  }

  /**
diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h
index 1791816..f720d6a 100644
--- a/src/glsl/glsl_parser_extras.h
+++ b/src/glsl/glsl_parser_extras.h
@@ -490,6 +490,8 @@ struct _mesa_glsl_parse_state {

 /** Atomic counter offsets by binding */
 unsigned atomic_counter_offsets[MAX_COMBINED_ATOMIC_BUFFERS];
+
+   bool allow_extension_directive_midshader;
  };

  # define YYLLOC_DEFAULT(Current, Rhs, N)  \
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index a7126fd..b42a175 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -3349,6 +3349,11 @@ struct gl_constants
 GLuint ForceGLSLVersion;

 /**
+* Allow GLSL #extension directives in the middle of shaders.
+*/
+   GLboolean AllowGLSLExtensionDirectiveMidShader;
+
+   /**
  * Does the driver support real 32-bit integers?  (Otherwise, integers are
  * simulated via floats.)
  */




For both, Reviewed-by: Brian Paul bri...@vmware.com

There was a thread about this a few months ago.  Photoshop is another 
app that has this issue.


-Brian

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/3] various exec_list things

2014-07-08 Thread Thomas Helland
Well, if that's how things work then, for the series, :

Reviewed-by: Thomas Helland thomashellan...@gmail.com


2014-07-08 23:07 GMT+02:00 Matt Turner matts...@gmail.com:

 On Tue, Jul 8, 2014 at 2:04 PM, Thomas Helland
 thomashellan...@gmail.com wrote:
  I don't feel comfortable reviewing things
  on here quite yet, so you should
  probably get a reviewed-by by someone else.
 
  That being said this series LGTM.

 Feel free to give a Reviewed-by tag if the code looks correct to you.
 It's up to other people to decide how much that tag means. :)

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] st/mesa: fix geometry shader memory leak

2014-07-08 Thread Brian Paul
Spotted by Charmaine Lee.
Cc: 10.2 mesa-sta...@lists.freedesktop.org
---
 src/mesa/state_tracker/st_context.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/state_tracker/st_context.c 
b/src/mesa/state_tracker/st_context.c
index c7f3ec6..c805a09 100644
--- a/src/mesa/state_tracker/st_context.c
+++ b/src/mesa/state_tracker/st_context.c
@@ -307,6 +307,7 @@ void st_destroy_context( struct st_context *st )
cso_release_all(st-cso_context);
 
st_reference_fragprog(st, st-fp, NULL);
+   st_reference_geomprog(st, st-gp, NULL);
st_reference_vertprog(st, st-vp, NULL);
 
/* release framebuffer surfaces */
-- 
1.7.10.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] mesa: add fix geometry shader memory leaks

2014-07-08 Thread Brian Paul
Spotted by Charmaine Lee.
Cc: 10.2 mesa-sta...@lists.freedesktop.org
---
 src/mesa/main/context.c |3 +++
 src/mesa/main/shared.c  |1 +
 2 files changed, 4 insertions(+)

diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
index b082159..50aae8b 100644
--- a/src/mesa/main/context.c
+++ b/src/mesa/main/context.c
@@ -1215,6 +1215,9 @@ _mesa_free_context_data( struct gl_context *ctx )
_mesa_reference_vertprog(ctx, ctx-VertexProgram._Current, NULL);
_mesa_reference_vertprog(ctx, ctx-VertexProgram._TnlProgram, NULL);
 
+   _mesa_reference_geomprog(ctx, ctx-GeometryProgram.Current, NULL);
+   _mesa_reference_geomprog(ctx, ctx-GeometryProgram._Current, NULL);
+
_mesa_reference_fragprog(ctx, ctx-FragmentProgram.Current, NULL);
_mesa_reference_fragprog(ctx, ctx-FragmentProgram._Current, NULL);
_mesa_reference_fragprog(ctx, ctx-FragmentProgram._TexEnvProgram, NULL);
diff --git a/src/mesa/main/shared.c b/src/mesa/main/shared.c
index dc22025..5ae7014 100644
--- a/src/mesa/main/shared.c
+++ b/src/mesa/main/shared.c
@@ -312,6 +312,7 @@ free_shared_state(struct gl_context *ctx, struct 
gl_shared_state *shared)
_mesa_DeleteHashTable(shared-Programs);
 
_mesa_reference_vertprog(ctx, shared-DefaultVertexProgram, NULL);
+   _mesa_reference_geomprog(ctx, shared-DefaultGeometryProgram, NULL);
_mesa_reference_fragprog(ctx, shared-DefaultFragmentProgram, NULL);
 
_mesa_HashDeleteAll(shared-ATIShaders, delete_fragshader_cb, ctx);
-- 
1.7.10.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] A proposal for new testing requirements for stable releases

2014-07-08 Thread Carl Worth
I've been doing stable-branch release of mesa for close to a year now.

In all of that time, there's one thing that I've never been very
comfortable with. Namely, release candidates are not tested very
thoroughly prior to release.[*] I'm glad that we haven't had any major
problems yet with broken releases. But I'd like to improve our release
testing rather than just trust to future luck.

Here are the changes that I'm proposing now for comments:

  1. The release schedule needs to be predictable.

The current ideal is every 2 weeks for stable releases. I
believe that is healthy and feasible.

I propose that releases occur consistently every 2 weeks on
Friday, (in the release manager's timezone---currently
America/Los_Angeles).

  2. The release candidate needs to be made available in advance to
 allow for testing.

I propose that the release manager push a candidate branch to
the upstream repository by the end of the Monday prior to each
scheduled Friday release.

  3. I'd like to receive a testing report from each driver team

This is the meat of my proposal. I'm requesting that each driver
team designate one (or more) people that will be responsible for
running (or collecting) tests for each release-candidate and
reporting the results to me.

With a new release-candidate pushed by the end of the day on
Monday, and me starting the actual release work on Friday, that
gives 72 hours for each team to perform testing.

I'm happy to let each driver team decide its own testing
strategy. I would hope it would be based on running piglit
across a set of interesting hardware configurations, (and
augmenting piglit as necessary as bug fixes are performed). But
I do not plan to be involved in the specification of what those
configurations are. All I need is something along the lines of:

The radeon team is OK with releasing commit foo

sent to me before the scheduled day of the release.

Obviously, any negative report from any team can trigger changes
to the patches to be included in the release.

And in order to put some teeth into this scheme:

I propose that on the day of the release, the release manager
drop all driver-specific patches for any driver for which the
driver team has not provided an affirmative testing report.

Does the above sound like a reasonable scheme? If so,who from each driver
team is willing to volunteer to send me testing reports? I'll be happy
to send personal reminders to such volunteers as releases are getting
close and testing reports are missing, (before yanking out any
driver-specific patches).

Thanks in advance for any thoughts or comments,

-Carl

PS. This same testing scheme can obviously be used for major releases as
well, presumably with the same volunteers, (but different details on
release timing).

[*] For background, here's the testing scheme I've been using so far:

  1. Running each release candidate through piglit on my laptop

  2. Trying to leave the stable branch pushed out for at least a day or
 so to allow testing prior to release.

The testing I do on my laptop at least gives some sanity checking for
core, but very little driver-specific testing. Meanwhile, the majority
of stable-branch fixes seem to be in driver-specific code.

And I don't know if there is any serious testing of the stable branch by
anyone else between when I push it and when I release, (I haven't heard
much either way). I know it doesn't help that the timing of my releases
has not often been predictable.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/3] gallium: switch dedicated centroid field to interpolation location

2014-07-08 Thread Ilia Mirkin
On Sat, Jul 5, 2014 at 5:20 PM, Roland Scheidegger srol...@vmware.com wrote:
 Am 05.07.2014 06:07, schrieb Ilia Mirkin:
 The new location field can be either center, centroid, or sample, which
 indicates the location that the shader should interpolate at.
 Looks good though maybe it should be mentioned in the interpolator
 section in tgsi.rst (yeah centroid wasn't neither).

I added this bit:


+The Location field specifies the location inside the pixel that the
+interpolation should be done at, one of ``TGSI_INTERPOLATE_LOC_*``.


Let me know if you want me to resend.

 In any case,
 Reviewed-by: Roland Scheidegger srol...@vmware.com


 Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
 ---

 I tried to make sure I hit all the uses, but I guess a bunch of drivers don't
 look at the Centroid field at all?
 Well drivers not doing msaa or just fake one like llvmpipe aren't
 interested in it.
 And, this wasn't exposed via d3d9, though r300 reportedly can do it (the
 docs even mention it for r5xx), I suspect using some of the crazy
 backdoor mechanisms of d3d9. Some very quick look at the docs though
 don't indicate how this could be done (if it's even possible per
 interpolator) for these chips, which is I guess why centroid isn't handled.


  src/gallium/auxiliary/tgsi/tgsi_build.c   |  8 
  src/gallium/auxiliary/tgsi/tgsi_dump.c|  5 +++--
  src/gallium/auxiliary/tgsi/tgsi_scan.c|  2 +-
  src/gallium/auxiliary/tgsi/tgsi_scan.h|  2 +-
  src/gallium/auxiliary/tgsi/tgsi_strings.c |  7 +++
  src/gallium/auxiliary/tgsi/tgsi_strings.h |  2 ++
  src/gallium/auxiliary/tgsi/tgsi_ureg.c| 12 ++--
  src/gallium/auxiliary/tgsi/tgsi_ureg.h|  2 +-
  src/gallium/drivers/ilo/shader/toy_tgsi.c |  2 +-
  src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp |  2 +-
  src/gallium/drivers/r600/r600_shader.c|  4 ++--
  src/gallium/drivers/radeonsi/si_shader.c  |  6 +++---
  src/gallium/include/pipe/p_shader_tokens.h|  9 +++--
  src/mesa/state_tracker/st_glsl_to_tgsi.cpp|  5 +++--
  src/mesa/state_tracker/st_glsl_to_tgsi.h  |  2 +-
  src/mesa/state_tracker/st_program.c   | 13 +
  16 files changed, 52 insertions(+), 31 deletions(-)

 diff --git a/src/gallium/auxiliary/tgsi/tgsi_build.c 
 b/src/gallium/auxiliary/tgsi/tgsi_build.c
 index 7621b6a..bef5c75 100644
 --- a/src/gallium/auxiliary/tgsi/tgsi_build.c
 +++ b/src/gallium/auxiliary/tgsi/tgsi_build.c
 @@ -200,7 +200,7 @@ tgsi_default_declaration_interp( void )
 struct tgsi_declaration_interp di;

 di.Interpolate = TGSI_INTERPOLATE_CONSTANT;
 -   di.Centroid = 0;
 +   di.Location = TGSI_INTERPOLATE_LOC_CENTER;
 di.CylindricalWrap = 0;
 di.Padding = 0;

 @@ -209,7 +209,7 @@ tgsi_default_declaration_interp( void )

  static struct tgsi_declaration_interp
  tgsi_build_declaration_interp(unsigned interpolate,
 -  unsigned centroid,
 +  unsigned interpolate_location,
unsigned cylindrical_wrap,
struct tgsi_declaration *declaration,
struct tgsi_header *header)
 @@ -217,7 +217,7 @@ tgsi_build_declaration_interp(unsigned interpolate,
 struct tgsi_declaration_interp di;

 di.Interpolate = interpolate;
 -   di.Centroid = centroid;
 +   di.Location = interpolate_location;
 di.CylindricalWrap = cylindrical_wrap;
 di.Padding = 0;

 @@ -433,7 +433,7 @@ tgsi_build_full_declaration(
size++;

*di = tgsi_build_declaration_interp(full_decl-Interp.Interpolate,
 -  full_decl-Interp.Centroid,
 +  full_decl-Interp.Location,
full_decl-Interp.CylindricalWrap,
declaration,
header);
 diff --git a/src/gallium/auxiliary/tgsi/tgsi_dump.c 
 b/src/gallium/auxiliary/tgsi/tgsi_dump.c
 index 8e09bac..884d8cf 100644
 --- a/src/gallium/auxiliary/tgsi/tgsi_dump.c
 +++ b/src/gallium/auxiliary/tgsi/tgsi_dump.c
 @@ -349,8 +349,9 @@ iter_declaration(
   ENM( decl-Interp.Interpolate, tgsi_interpolate_names );
}

 -  if (decl-Interp.Centroid) {
 - TXT( , CENTROID );
 +  if (decl-Interp.Location != TGSI_INTERPOLATE_LOC_CENTER) {
 + TXT( ,  );
 + ENM( decl-Interp.Location, tgsi_interpolate_locations );
}

if (decl-Interp.CylindricalWrap) {
 diff --git a/src/gallium/auxiliary/tgsi/tgsi_scan.c 
 b/src/gallium/auxiliary/tgsi/tgsi_scan.c
 index 00fdcfb..563d2c5 100644
 --- a/src/gallium/auxiliary/tgsi/tgsi_scan.c
 +++ b/src/gallium/auxiliary/tgsi/tgsi_scan.c
 @@ 

Re: [Mesa-dev] [PATCH 2/3] mesa/st: add per sample shading state to fp key and set interpolation

2014-07-08 Thread Ilia Mirkin
So... I don't think we're going to figure this out here. At least I
have nothing enlightening to say. FWIW this is doing the same thing as
what i965 does wrt the persample_shading computation. It should be
pretty easy to change should we decide on a different interpretation
of the spec.

The only question is whether this is necessary at all. i.e. what
happens if you have a ARB_gs5-enabled shader with a regular varying
and sample varying. If the regular varying should just be interpolated
at the sample, perhaps all this stuff is stupid and I should just drop
it and keep the if per-sample shading, just interpolate at the sample
no matter what semantics that ARB_sample_shading introduced. I tried
to start a thread about this but no one bit :)

Hopefully people who have more experience than I will be able to
suggest how to proceed. As always, happy to do it whichever way.

  -ilia

On Sat, Jul 5, 2014 at 6:48 PM, Marek Olšák mar...@gmail.com wrote:
 What you say makes sense. I'm just asking what that sentence in the
 spec means if it isn't about interpolation. :)

 Marek

 On Sun, Jul 6, 2014 at 12:40 AM, Ilia Mirkin imir...@alum.mit.edu wrote:
 On Sat, Jul 5, 2014 at 6:22 PM, Marek Olšák mar...@gmail.com wrote:
 There is this vague statement in the sample shading spec:

 When the sample shading fraction is 1.0, a separate set of colors and
 other associated data are evaluated for each sample, each set of values
 are evaluated at the sample location.

 I thought it was about interpolation, meaning that the fraction must
 1.0 for the per-sample interpolation to be enabled, right?

 I guess so. I'm pretty much just doing what the intel driver is
 doing... see brw_wm.c.

 /* Gets the minimum number of shader invocations per fragment.
  * This function is useful to determine if we need to do per
  * sample shading or per fragment shading.
  */
 GLint
 _mesa_get_min_invocations_per_fragment(struct gl_context *ctx,
const struct gl_fragment_program 
 *prog,
bool ignore_sample_qualifier)

 So I guess instead of  1 this should be checking for ==
 ctx-DrawBuffer-Visual.samples? (and  1)

 Although in that case, I _really_ don't get the point of having
 non-1.0/0.0 fractions ever existing. Why would you want it to be
 shaded for e.g. 4/8 samples if all the inputs are going to get
 interpolated the same way?

   -ilia


 Marek


 On Sat, Jul 5, 2014 at 6:07 AM, Ilia Mirkin imir...@alum.mit.edu wrote:
 This enables a gallium driver not to care about the semantics of
 ARB_sample_shading vs ARB_gpu_shader5 sample attributes. When
 ARB_sample_shading-style sample shading is enabled, all of the fp inputs
 are marked for interpolation at the sample location.

 Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
 ---

 I _think_ I got this right... at least the tests still pass. But my
 understanding of all the various update atoms, etc is really weak... please
 read carefully :)

 Now that I understand all this interpolation stuff better, I see why it was
 suggested I add proper per-sample interpolation first and the
 ARB_sample_shading stuff second. Oh well...

  src/mesa/state_tracker/st_atom_shader.c | 6 +-
  src/mesa/state_tracker/st_program.c | 3 +++
  src/mesa/state_tracker/st_program.h | 3 +++
  3 files changed, 11 insertions(+), 1 deletion(-)

 diff --git a/src/mesa/state_tracker/st_atom_shader.c 
 b/src/mesa/state_tracker/st_atom_shader.c
 index 67c6157..6515a98 100644
 --- a/src/mesa/state_tracker/st_atom_shader.c
 +++ b/src/mesa/state_tracker/st_atom_shader.c
 @@ -89,6 +89,10 @@ update_fp( struct st_context *st )
 key.clamp_color = st-clamp_frag_color_in_shader 
   st-ctx-Color._ClampFragmentColor;

 +   /* Ignore sample qualifier while computing this flag. */
 +   key.persample_shading =
 +  _mesa_get_min_invocations_per_fragment(st-ctx, stfp-Base, true) 
  1;
 +
 st-fp_variant = st_get_fp_variant(st, stfp, key);

 st_reference_fragprog(st, st-fp, stfp);
 @@ -108,7 +112,7 @@ update_fp( struct st_context *st )
  const struct st_tracked_state st_update_fp = {
 st_update_fp, /* name */
 {   /* dirty */
 -  _NEW_BUFFERS,/* mesa */
 +  _NEW_BUFFERS | _NEW_MULTISAMPLE, /* mesa */
ST_NEW_FRAGMENT_PROGRAM   /* st */
 },
 update_fp   /* update */
 diff --git a/src/mesa/state_tracker/st_program.c 
 b/src/mesa/state_tracker/st_program.c
 index b603759..9d7b7c4 100644
 --- a/src/mesa/state_tracker/st_program.c
 +++ b/src/mesa/state_tracker/st_program.c
 @@ -548,6 +548,9 @@ st_translate_fragment_program(struct st_context *st,
   else
  interpLocation[slot] = TGSI_INTERPOLATE_LOC_CENTER;

 + if (key-persample_shading)
 +

Re: [Mesa-dev] [PATCH 2/3] mesa/st: add per sample shading state to fp key and set interpolation

2014-07-08 Thread Marek Olšák
Alright. For the patch:

Reviewed-by: Marek Olšák marek.ol...@amd.com

Marek

On Wed, Jul 9, 2014 at 2:48 AM, Ilia Mirkin imir...@alum.mit.edu wrote:
 So... I don't think we're going to figure this out here. At least I
 have nothing enlightening to say. FWIW this is doing the same thing as
 what i965 does wrt the persample_shading computation. It should be
 pretty easy to change should we decide on a different interpretation
 of the spec.

 The only question is whether this is necessary at all. i.e. what
 happens if you have a ARB_gs5-enabled shader with a regular varying
 and sample varying. If the regular varying should just be interpolated
 at the sample, perhaps all this stuff is stupid and I should just drop
 it and keep the if per-sample shading, just interpolate at the sample
 no matter what semantics that ARB_sample_shading introduced. I tried
 to start a thread about this but no one bit :)

 Hopefully people who have more experience than I will be able to
 suggest how to proceed. As always, happy to do it whichever way.

   -ilia

 On Sat, Jul 5, 2014 at 6:48 PM, Marek Olšák mar...@gmail.com wrote:
 What you say makes sense. I'm just asking what that sentence in the
 spec means if it isn't about interpolation. :)

 Marek

 On Sun, Jul 6, 2014 at 12:40 AM, Ilia Mirkin imir...@alum.mit.edu wrote:
 On Sat, Jul 5, 2014 at 6:22 PM, Marek Olšák mar...@gmail.com wrote:
 There is this vague statement in the sample shading spec:

 When the sample shading fraction is 1.0, a separate set of colors and
 other associated data are evaluated for each sample, each set of values
 are evaluated at the sample location.

 I thought it was about interpolation, meaning that the fraction must
 1.0 for the per-sample interpolation to be enabled, right?

 I guess so. I'm pretty much just doing what the intel driver is
 doing... see brw_wm.c.

 /* Gets the minimum number of shader invocations per fragment.
  * This function is useful to determine if we need to do per
  * sample shading or per fragment shading.
  */
 GLint
 _mesa_get_min_invocations_per_fragment(struct gl_context *ctx,
const struct gl_fragment_program 
 *prog,
bool ignore_sample_qualifier)

 So I guess instead of  1 this should be checking for ==
 ctx-DrawBuffer-Visual.samples? (and  1)

 Although in that case, I _really_ don't get the point of having
 non-1.0/0.0 fractions ever existing. Why would you want it to be
 shaded for e.g. 4/8 samples if all the inputs are going to get
 interpolated the same way?

   -ilia


 Marek


 On Sat, Jul 5, 2014 at 6:07 AM, Ilia Mirkin imir...@alum.mit.edu wrote:
 This enables a gallium driver not to care about the semantics of
 ARB_sample_shading vs ARB_gpu_shader5 sample attributes. When
 ARB_sample_shading-style sample shading is enabled, all of the fp inputs
 are marked for interpolation at the sample location.

 Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
 ---

 I _think_ I got this right... at least the tests still pass. But my
 understanding of all the various update atoms, etc is really weak... 
 please
 read carefully :)

 Now that I understand all this interpolation stuff better, I see why it 
 was
 suggested I add proper per-sample interpolation first and the
 ARB_sample_shading stuff second. Oh well...

  src/mesa/state_tracker/st_atom_shader.c | 6 +-
  src/mesa/state_tracker/st_program.c | 3 +++
  src/mesa/state_tracker/st_program.h | 3 +++
  3 files changed, 11 insertions(+), 1 deletion(-)

 diff --git a/src/mesa/state_tracker/st_atom_shader.c 
 b/src/mesa/state_tracker/st_atom_shader.c
 index 67c6157..6515a98 100644
 --- a/src/mesa/state_tracker/st_atom_shader.c
 +++ b/src/mesa/state_tracker/st_atom_shader.c
 @@ -89,6 +89,10 @@ update_fp( struct st_context *st )
 key.clamp_color = st-clamp_frag_color_in_shader 
   st-ctx-Color._ClampFragmentColor;

 +   /* Ignore sample qualifier while computing this flag. */
 +   key.persample_shading =
 +  _mesa_get_min_invocations_per_fragment(st-ctx, stfp-Base, true) 
  1;
 +
 st-fp_variant = st_get_fp_variant(st, stfp, key);

 st_reference_fragprog(st, st-fp, stfp);
 @@ -108,7 +112,7 @@ update_fp( struct st_context *st )
  const struct st_tracked_state st_update_fp = {
 st_update_fp, /* name */
 {   /* dirty */
 -  _NEW_BUFFERS,/* mesa */
 +  _NEW_BUFFERS | _NEW_MULTISAMPLE, /* mesa */
ST_NEW_FRAGMENT_PROGRAM   /* st */
 },
 update_fp   /* update */
 diff --git a/src/mesa/state_tracker/st_program.c 
 b/src/mesa/state_tracker/st_program.c
 index b603759..9d7b7c4 100644
 --- a/src/mesa/state_tracker/st_program.c
 +++ b/src/mesa/state_tracker/st_program.c
 @@ -548,6 +548,9 @@ 

Re: [Mesa-dev] [PATCH 2/2] st/mesa: fix geometry shader memory leak

2014-07-08 Thread Charmaine Lee
Changes looks good. Tested with svga driver.

Reviewed-by: Charmaine Lee charmai...@vmware.com

From: mesa-dev mesa-dev-boun...@lists.freedesktop.org on behalf of Brian Paul 
bri...@vmware.com
Sent: Tuesday, July 8, 2014 3:32 PM
To: mesa-dev@lists.freedesktop.org
Cc: 10.2
Subject: [Mesa-dev] [PATCH 2/2] st/mesa: fix geometry shader memory leak

Spotted by Charmaine Lee.
Cc: 10.2 mesa-sta...@lists.freedesktop.org
---
 src/mesa/state_tracker/st_context.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/state_tracker/st_context.c 
b/src/mesa/state_tracker/st_context.c
index c7f3ec6..c805a09 100644
--- a/src/mesa/state_tracker/st_context.c
+++ b/src/mesa/state_tracker/st_context.c
@@ -307,6 +307,7 @@ void st_destroy_context( struct st_context *st )
cso_release_all(st-cso_context);

st_reference_fragprog(st, st-fp, NULL);
+   st_reference_geomprog(st, st-gp, NULL);
st_reference_vertprog(st, st-vp, NULL);

/* release framebuffer surfaces */
--
1.7.10.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://urldefense.proofpoint.com/v1/url?u=http://lists.freedesktop.org/mailman/listinfo/mesa-devk=oIvRg1%2BdGAgOoM1BIlLLqw%3D%3D%0Ar=iVNYIcCaC9TDvyNBQU%2F5q5NVsC01tSgJb3oX27T14ck%3D%0Am=PGTeI%2BoujnkFXc9mu71bomZEroIqI0JU%2FWsluOF9vs4%3D%0As=f6d160459d8d239933fc9e34ab65f560b5b06adc65b8c4ad0e1a4acacedda314
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] mesa: add fix geometry shader memory leaks

2014-07-08 Thread Marek Olšák
For the series:

Reviewed-by: Marek Olšák marek.ol...@amd.com

Marek

On Wed, Jul 9, 2014 at 12:32 AM, Brian Paul bri...@vmware.com wrote:
 Spotted by Charmaine Lee.
 Cc: 10.2 mesa-sta...@lists.freedesktop.org
 ---
  src/mesa/main/context.c |3 +++
  src/mesa/main/shared.c  |1 +
  2 files changed, 4 insertions(+)

 diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
 index b082159..50aae8b 100644
 --- a/src/mesa/main/context.c
 +++ b/src/mesa/main/context.c
 @@ -1215,6 +1215,9 @@ _mesa_free_context_data( struct gl_context *ctx )
 _mesa_reference_vertprog(ctx, ctx-VertexProgram._Current, NULL);
 _mesa_reference_vertprog(ctx, ctx-VertexProgram._TnlProgram, NULL);

 +   _mesa_reference_geomprog(ctx, ctx-GeometryProgram.Current, NULL);
 +   _mesa_reference_geomprog(ctx, ctx-GeometryProgram._Current, NULL);
 +
 _mesa_reference_fragprog(ctx, ctx-FragmentProgram.Current, NULL);
 _mesa_reference_fragprog(ctx, ctx-FragmentProgram._Current, NULL);
 _mesa_reference_fragprog(ctx, ctx-FragmentProgram._TexEnvProgram, NULL);
 diff --git a/src/mesa/main/shared.c b/src/mesa/main/shared.c
 index dc22025..5ae7014 100644
 --- a/src/mesa/main/shared.c
 +++ b/src/mesa/main/shared.c
 @@ -312,6 +312,7 @@ free_shared_state(struct gl_context *ctx, struct 
 gl_shared_state *shared)
 _mesa_DeleteHashTable(shared-Programs);

 _mesa_reference_vertprog(ctx, shared-DefaultVertexProgram, NULL);
 +   _mesa_reference_geomprog(ctx, shared-DefaultGeometryProgram, NULL);
 _mesa_reference_fragprog(ctx, shared-DefaultFragmentProgram, NULL);

 _mesa_HashDeleteAll(shared-ATIShaders, delete_fragshader_cb, ctx);
 --
 1.7.10.4

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 01/11] nvc0: allocate more space before a counter is configured

2014-07-08 Thread Martin Peres

On 05/07/2014 20:49, Samuel Pitoiset wrote:

On nvc0, a counter can up to 6 sources instead of only one
for nve4+. This fixes a crash when a counter uses more than
one source.

The verb is missing in the first sentence :)
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 01/11] nvc0: allocate more space before a counter is configured

2014-07-08 Thread Ilia Mirkin
On Sat, Jul 5, 2014 at 7:41 PM, Martin Peres martin.pe...@free.fr wrote:
 On 05/07/2014 20:49, Samuel Pitoiset wrote:

 On nvc0, a counter can up to 6 sources instead of only one
 for nve4+. This fixes a crash when a counter uses more than
 one source.

 The verb is missing in the first sentence :)

Yeah, he accidentally the whole sentence[1]. But I fixed it up before
pushing it.

  -ilia

[1] http://ultimate-photos.com/share/i-just-accidentally-a-coca-cola-bottle.jpg
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] mesa/st: add support for dynamic ubo selection

2014-07-08 Thread Ilia Mirkin
Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
---

With ChrisF's patches to add support for this in core mesa, this generates
code like:

FRAG
DCL OUT[0], COLOR
DCL CONST[0]
DCL CONST[1][0]
DCL CONST[2][0]
DCL CONST[3][0]
DCL CONST[4][0]
DCL TEMP[0], LOCAL
DCL ADDR[0..1]
IMM[0] UINT32 {0, 0, 0, 0}
IMM[1] INT32 {1, 0, 0, 0}
  0: UADD TEMP[0].x, CONST[0]., IMM[1].
  1: UARL ADDR[1].x, TEMP[0].
  2: UARL ADDR[1].x, TEMP[0].
  3: MOV TEMP[0], CONST[ADDR[1].x][0]
  4: MOV OUT[0], TEMP[0]
  5: END

Not sure what the deal is with the two UARL's, but nouveau's backend removes
one of them pretty easily. I assume others handle this too.

Unfortunately the core patches aren't quite ready yet, but this patch doesn't
regress anything.

 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 20 +---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 9bc7500..3202c56 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -1947,16 +1947,16 @@ glsl_to_tgsi_visitor::visit(ir_expression *ir)
   break;
 
case ir_binop_ubo_load: {
-  ir_constant *uniform_block = ir-operands[0]-as_constant();
+  ir_constant *const_uniform_block = ir-operands[0]-as_constant();
   ir_constant *const_offset_ir = ir-operands[1]-as_constant();
   unsigned const_offset = const_offset_ir ? const_offset_ir-value.u[0] : 
0;
+  unsigned const_block = const_uniform_block ? 
const_uniform_block-value.u[0] + 1 : 0;
   st_src_reg index_reg = get_temp(glsl_type::uint_type);
   st_src_reg cbuf;
 
   cbuf.type = glsl_type::vec4_type-base_type;
   cbuf.file = PROGRAM_CONSTANT;
   cbuf.index = 0;
-  cbuf.index2D = uniform_block-value.u[0] + 1;
   cbuf.reladdr = NULL;
   cbuf.negate = 0;
   
@@ -1966,7 +1966,6 @@ glsl_to_tgsi_visitor::visit(ir_expression *ir)
  /* Constant index into constant buffer */
  cbuf.reladdr = NULL;
  cbuf.index = const_offset / 16;
- cbuf.has_index2 = true;
   }
   else {
  /* Relative/variable index into constant buffer */
@@ -1976,6 +1975,21 @@ glsl_to_tgsi_visitor::visit(ir_expression *ir)
  memcpy(cbuf.reladdr, index_reg, sizeof(index_reg));
   }
 
+  if (const_uniform_block) {
+ /* Constant constant buffer */
+ cbuf.reladdr2 = NULL;
+ cbuf.index2D = const_block;
+ cbuf.has_index2 = true;
+  }
+  else {
+ /* Relative/variable constant buffer */
+ emit(ir, TGSI_OPCODE_UADD, st_dst_reg(index_reg), op[0],
+  st_src_reg_for_int(1));
+ cbuf.reladdr2 = ralloc(mem_ctx, st_src_reg);
+ memcpy(cbuf.reladdr2, index_reg, sizeof(index_reg));
+ cbuf.has_index2 = true;
+  }
+
   cbuf.swizzle = swizzle_for_size(ir-type-vector_elements);
   cbuf.swizzle += MAKE_SWIZZLE4(const_offset % 16 / 4,
 const_offset % 16 / 4,
-- 
1.8.5.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] nvc0/ir: support 2d constbuf indexing

2014-07-08 Thread Ilia Mirkin
Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
---
 .../drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp   | 13 +
 1 file changed, 13 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
index 37e2f7e..95423d4 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
@@ -1606,6 +1606,19 @@ NVC0LoweringPass::visit(Instruction *i)
 i-op = OP_VFETCH;
 assert(prog-getType() != Program::TYPE_FRAGMENT); // INTERP
  }
+  } else if (i-src(0).getFile() == FILE_MEMORY_CONST) {
+ if (i-src(0).isIndirect(1)) {
+Value *ptr;
+if (i-src(0).isIndirect(0))
+   ptr = bld.mkOp3v(OP_INSBF, TYPE_U32, bld.getSSA(),
+i-getIndirect(0, 1), bld.mkImm(0x1010),
+i-getIndirect(0, 0));
+else
+   ptr = bld.mkOp2v(OP_SHL, TYPE_U32, bld.getSSA(),
+i-getIndirect(0, 1), bld.mkImm(16));
+i-setIndirect(0, 1, NULL);
+i-setIndirect(0, 0, ptr);
+ }
   }
   break;
case OP_ATOM:
-- 
1.8.5.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 80933] Fullscreen OpenGL programs (e.g. games) crash if focus lost then regained, something to do with automatic compositing suspension

2014-07-08 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=80933

Michel Dänzer mic...@daenzer.net changed:

   What|Removed |Added

 Status|NEW |NEEDINFO

--- Comment #2 from Michel Dänzer mic...@daenzer.net ---
We'll need to see a backtrace of an application crash, with symbols for as many
entries of the backtrace as possible.

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965: Initialize new chunks of realloc'd memory.

2014-07-08 Thread Matt Turner
Otherwise we'd compare uninitialized pointers with NULL and dereference,
leading to crashes.
---
 src/mesa/drivers/dri/i965/intel_asm_annotation.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/intel_asm_annotation.c 
b/src/mesa/drivers/dri/i965/intel_asm_annotation.c
index 4717baf..d524725 100644
--- a/src/mesa/drivers/dri/i965/intel_asm_annotation.c
+++ b/src/mesa/drivers/dri/i965/intel_asm_annotation.c
@@ -96,9 +96,13 @@ void annotate(struct brw_context *brw,
   struct backend_instruction *inst, unsigned offset)
 {
if (annotation-ann_size = annotation-ann_count) {
+  int old_size = annotation-ann_size;
   annotation-ann_size = MAX2(1024, annotation-ann_size * 2);
   annotation-ann = reralloc(annotation-mem_ctx, annotation-ann,
  struct annotation, annotation-ann_size);
+  memset(annotation-ann + old_size, 0,
+ (annotation-ann_size - old_size) * sizeof(struct annotation));
+
   if (!annotation-ann)
  return;
}
-- 
1.8.5.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Initialize new chunks of realloc'd memory.

2014-07-08 Thread Chris Forbes
I think you want to move the memset after the !annotation-ann bail
out. Currently, if that allocation were to fail (and we care enough to
check...) , you'll segfault.

On Wed, Jul 9, 2014 at 3:44 PM, Matt Turner matts...@gmail.com wrote:
 Otherwise we'd compare uninitialized pointers with NULL and dereference,
 leading to crashes.
 ---
  src/mesa/drivers/dri/i965/intel_asm_annotation.c | 4 
  1 file changed, 4 insertions(+)

 diff --git a/src/mesa/drivers/dri/i965/intel_asm_annotation.c 
 b/src/mesa/drivers/dri/i965/intel_asm_annotation.c
 index 4717baf..d524725 100644
 --- a/src/mesa/drivers/dri/i965/intel_asm_annotation.c
 +++ b/src/mesa/drivers/dri/i965/intel_asm_annotation.c
 @@ -96,9 +96,13 @@ void annotate(struct brw_context *brw,
struct backend_instruction *inst, unsigned offset)
  {
 if (annotation-ann_size = annotation-ann_count) {
 +  int old_size = annotation-ann_size;
annotation-ann_size = MAX2(1024, annotation-ann_size * 2);
annotation-ann = reralloc(annotation-mem_ctx, annotation-ann,
   struct annotation, annotation-ann_size);
 +  memset(annotation-ann + old_size, 0,
 + (annotation-ann_size - old_size) * sizeof(struct annotation));
 +
if (!annotation-ann)
   return;
 }
 --
 1.8.5.5

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Initialize new chunks of realloc'd memory.

2014-07-08 Thread Matt Turner
On Tue, Jul 8, 2014 at 9:51 PM, Chris Forbes chr...@ijw.co.nz wrote:
 I think you want to move the memset after the !annotation-ann bail
 out. Currently, if that allocation were to fail (and we care enough to
 check...) , you'll segfault.

Yeah... of course.

Thanks.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev