Re: [Mesa-dev] [PATCH 6/8] i965: Implement HiZ resolves on Broadwell.

2014-02-19 Thread Kenneth Graunke
On 02/18/2014 01:38 PM, Eric Anholt wrote:
> Kenneth Graunke kenn...@whitecape.org writes:
[snip]
>> diff --git a/src/mesa/drivers/dri/i965/gen8_depth_state.c b/src/mesa/drivers/dri/i965/gen8_depth_state.c
>> index f30ff28..3fa20c8 100644
>> --- a/src/mesa/drivers/dri/i965/gen8_depth_state.c
>> +++ b/src/mesa/drivers/dri/i965/gen8_depth_state.c
>> @@ -203,3 +203,108 @@ gen8_emit_depth_stencil_hiz(struct brw_context *brw,
>>                        brw->depthstencil.stencil_offset,
>>                        hiz, width, height, depth, lod, min_array_element);
>>  }
>> +
>> +/**
>> + * Emit packets to perform a depth/HiZ resolve or fast depth/stencil clear.
>> + *
>> + * See the Optimized Depth Buffer Clear and/or Stencil Buffer Clear section
>> + * of the hardware documentation for details.
>> + */
>> +void
>> +gen8_hiz_exec(struct brw_context *brw, struct intel_mipmap_tree *mt,
>> +              unsigned int level, unsigned int layer, enum gen6_hiz_op op)
>> +{
>> +   if (op == GEN6_HIZ_OP_NONE)
>> +      return;
>> +
>> +   assert(mt->first_level == 0);
>> +
>> +   struct intel_mipmap_level *miplevel = &mt->level[level];
>> +
>> +   /* The basic algorithm is:
>> +    * - If needed, emit 3DSTATE_{DEPTH,HIER_DEPTH,STENCIL}_BUFFER and
>> +    *   3DSTATE_CLEAR_PARAMS packets to set up the relevant buffers.
>> +    * - If needed, emit 3DSTATE_DRAWING_RECTANGLE.
>> +    * - Emit 3DSTATE_WM_HZ_OP with a bit set for the particular operation.
>> +    * - Do a special PIPE_CONTROL to trigger an implicit rectangle primitive.
>> +    * - Emit 3DSTATE_WM_HZ_OP with no bits set to return to normal rendering.
>> +    */
>> +   emit_depth_packets(brw, mt,
>> +                      brw_depth_format(brw, mt->format),
>> +                      BRW_SURFACE_2D,
>> +                      true, /* depth writes */
>> +                      NULL, false, 0, /* no stencil for now */
>> +                      true, /* hiz */
>> +                      mt->logical_width0,
>> +                      mt->logical_height0,
>> +                      MAX2(mt->logical_depth0, 1),
>
> Is logical_depth0 ever 0?  That seems like a bug.

No, I guess it isn't.  It looks like I copy and pasted this from BLORP,
or was being overly cautious for some reason.  Dropped.

>> +                      level,
>> +                      layer); /* min_array_element */
>> +
>> +   BEGIN_BATCH(4);
>> +   OUT_BATCH(_3DSTATE_DRAWING_RECTANGLE << 16 | (4 - 2));
>> +   OUT_BATCH(0);
>> +   OUT_BATCH(((mt->logical_width0 - 1) & 0xffff) |
>> +             ((mt->logical_height0 - 1) << 16));
>> +   OUT_BATCH(0);
>> +   ADVANCE_BATCH();
>
> The drawing rectangle should be using the level's size, not the level 0
> size.

Yes, this makes sense...we bind a specific miplevel of the depth buffer,
so presumably the (0, 0) origin is the start of that miplevel, not the
start of the whole tree.  I'll change that.

Since the drawing rectangle is just the bounds of where you can draw,
and not actually the clear/resolve rectangle, I think specifying one
that's too large shouldn't be harmful.  But specifying the right value
is trivial, so I agree we should do it.
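
To make the packing concrete, here is a minimal sketch of how an inclusive
maximum corner could be folded into a single dword, X in the low half and Y
in the high half.  The helper name is made up for illustration, and the
0xffff mask is my assumption about the 16-bit split; this is not the
driver's code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: pack an inclusive (x_max, y_max) corner with X in
 * bits 15:0 and Y in bits 31:16.  The 0xffff mask is an assumption.
 */
static uint32_t
pack_rect_max(uint32_t width, uint32_t height)
{
   assert(width >= 1 && height >= 1);
   return ((width - 1) & 0xffff) | ((height - 1) << 16);
}
```

Passing the per-level width and height instead of the level 0 values is the
only change the review comment asks for; the packing itself stays the same.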

>> +   uint32_t sample_mask = 0xFFFF;
>> +   if (mt->num_samples > 0) {
>> +      dw1 |= SET_FIELD(ffs(mt->num_samples) - 1, GEN8_WM_HZ_NUM_SAMPLES);
>> +      sample_mask = gen6_determine_sample_mask(brw);
>> +   }
>
> I don't think we want the user-set sample mask stuff to change the
> samples affected by our hiz/depth resolves.  I think you can just drop
> the if block.

Good point, whatever the user specified is probably unrelated to our
values.  I've dropped the sample_mask variable and just stuffed 0xFFFF
in the packet.

I kept the if-block for the dw1 |= ...num_samples... line.

>> +
>> +   BEGIN_BATCH(5);
>> +   OUT_BATCH(_3DSTATE_WM_HZ_OP << 16 | (5 - 2));
>> +   OUT_BATCH(dw1);
>> +   OUT_BATCH(0);
>> +   OUT_BATCH(SET_FIELD(miplevel->width, GEN8_WM_HZ_CLEAR_RECTANGLE_X_MAX) |
>> +             SET_FIELD(miplevel->height, GEN8_WM_HZ_CLEAR_RECTANGLE_Y_MAX));
>> +   OUT_BATCH(SET_FIELD(sample_mask, GEN8_WM_HZ_SAMPLE_MASK));
>> +   ADVANCE_BATCH();
>
> I think now the miplevel->width should be minify(mt->logical_width0,
> level).  Hope that helped

Yes, that's much nicer - and correct for MSAA buffers!  I'm unclear
whether we need to do:

   ALIGN(minify(mt->logical_width0,  level), 8)
   ALIGN(minify(mt->logical_height0, level), 4)

(both here and in the drawing rectangle)

I've read seemingly contradictory information...it sounds like it might
be necessary for depth resolves, but not otherwise...but I could be
misinterpreting it.  It seems to be working...
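
For anyone weighing the two options, a small sketch of what they compute.
I am assuming minify() halves once per level and clamps at 1, and that
ALIGN() rounds up to a power-of-two multiple; the function names below are
stand-ins for illustration, not the Mesa macros.

```c
#include <assert.h>

/* Assumed to behave like minify(): halve per level, never below 1. */
static unsigned
minify_sketch(unsigned value, unsigned levels)
{
   unsigned v = value >> levels;
   return v > 1 ? v : 1;
}

/* Round value up to a multiple of a power-of-two alignment. */
static unsigned
align_pot(unsigned value, unsigned alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}
```

With a 1000x600 level 0, level 3 would come out as 125x75 unaligned and
128x76 with the 8x4 alignment, so the two choices differ by only a few
pixels at the right and bottom edges.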

>> +
>> +   /* Emit a PIPE_CONTROL with Post-Sync Operation set to Write Immediate
>> +    * Data, and no other bits set.  This causes 3DSTATE_WM_HZ_OP's state to
>> +    * take effect, and spawns a rectangle primitive.
>> +    */
>> +   brw_emit_pipe_control_write(brw,
>> +                               PIPE_CONTROL_WRITE_IMMEDIATE,
>> +                               brw->batch.workaround_bo, 0, 0, 0);
>> +
>> +   /* Emit 3DSTATE_WM_HZ_OP again to disable the state overrides. */
>> +   BEGIN_BATCH(5);
>> +   OUT_BATCH(_3DSTATE_WM_HZ_OP << 16 | (5 

[Mesa-dev] [PATCH 01/13] i965: Simplify Broadwell's 3DSTATE_MULTISAMPLE sample count handling.

2014-02-19 Thread Kenneth Graunke
These enumerations are simply log2 of the number of multisamples shifted
by a bit, so we can calculate them using ffs() in a lot less code.

Suggested by Eric Anholt.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/gen8_multisample_state.c | 26 +++---
 1 file changed, 3 insertions(+), 23 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen8_multisample_state.c 
b/src/mesa/drivers/dri/i965/gen8_multisample_state.c
index 64c7208..bfe0d5b 100644
--- a/src/mesa/drivers/dri/i965/gen8_multisample_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_multisample_state.c
@@ -33,33 +33,13 @@
 void
 gen8_emit_3dstate_multisample(struct brw_context *brw, unsigned num_samples)
 {
-   uint32_t number_of_multisamples = 0;
+   assert(num_samples <= 16);
 
-   switch (num_samples) {
-   case 0:
-   case 1:
-      number_of_multisamples = MS_NUMSAMPLES_1;
-      break;
-   case 2:
-      number_of_multisamples = MS_NUMSAMPLES_2;
-      break;
-   case 4:
-      number_of_multisamples = MS_NUMSAMPLES_4;
-      break;
-   case 8:
-      number_of_multisamples = MS_NUMSAMPLES_8;
-      break;
-   case 16:
-      number_of_multisamples = MS_NUMSAMPLES_16;
-      break;
-   default:
-      assert(!"Unrecognized num_samples in gen8_emit_3dstate_multisample");
-      break;
-   }
+   unsigned log2_samples = ffs(MAX2(num_samples, 1)) - 1;
 
    BEGIN_BATCH(2);
    OUT_BATCH(GEN8_3DSTATE_MULTISAMPLE << 16 | (2 - 2));
-   OUT_BATCH(MS_PIXEL_LOCATION_CENTER | number_of_multisamples);
+   OUT_BATCH(MS_PIXEL_LOCATION_CENTER | log2_samples << 1);
    ADVANCE_BATCH();
 }
 
-- 
1.8.4.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
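
As an illustration of the ffs() trick in PATCH 01/13, here is a minimal
standalone sketch.  The function name is hypothetical; it only assumes what
the commit message states, namely that the field is log2 of the sample
count shifted left by one bit, with 0 and 1 samples treated alike.

```c
#include <assert.h>
#include <strings.h>   /* ffs() */

/* Hypothetical stand-in for the encoding described in the commit message:
 * log2(num_samples) << 1, where 0 and 1 samples both encode as 0.
 */
static unsigned
multisample_field(unsigned num_samples)
{
   assert(num_samples <= 16);
   unsigned n = num_samples > 1 ? num_samples : 1;
   unsigned log2_samples = (unsigned)ffs((int)n) - 1;
   return log2_samples << 1;
}
```

Note that ffs() finds the lowest set bit, so the result only equals log2
for power-of-two counts; a count such as 6 would quietly encode as 2x,
which is presumably why the caller is expected to pass quantized values.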


[Mesa-dev] [PATCH 05/13] i965/fs: Implement FS_OPCODE_SET_SAMPLE_ID on Broadwell.

2014-02-19 Thread Kenneth Graunke
Largely cut and paste from Gen7; it works the same way.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/brw_fs.h  |  4 
 src/mesa/drivers/dri/i965/gen8_fs_generator.cpp | 29 -
 2 files changed, 32 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index b5fb0eb..99c6298 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -704,6 +704,10 @@ private:
 struct brw_reg index,
 struct brw_reg offset);
void generate_mov_dispatch_to_flags(fs_inst *ir);
+   void generate_set_sample_id(fs_inst *ir,
+   struct brw_reg dst,
+   struct brw_reg src0,
+   struct brw_reg src1);
void generate_set_simd4x2_offset(fs_inst *ir,
 struct brw_reg dst,
 struct brw_reg offset);
diff --git a/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp 
b/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp
index e19d960..0078228 100644
--- a/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp
@@ -698,6 +698,33 @@ gen8_fs_generator::generate_set_simd4x2_offset(fs_inst *ir,
 }
 
 /**
+ * Do a special ADD with vstride=1, width=4, hstride=0 for src1.
+ */
+void
+gen8_fs_generator::generate_set_sample_id(fs_inst *ir,
+  struct brw_reg dst,
+  struct brw_reg src0,
+  struct brw_reg src1)
+{
+   assert(dst.type == BRW_REGISTER_TYPE_D || dst.type == BRW_REGISTER_TYPE_UD);
+   assert(src0.type == BRW_REGISTER_TYPE_D || src0.type == BRW_REGISTER_TYPE_UD);
+
+   struct brw_reg reg = retype(stride(src1, 1, 4, 0), BRW_REGISTER_TYPE_UW);
+
+   unsigned save_exec_size = default_state.exec_size;
+   default_state.exec_size = BRW_EXECUTE_8;
+
+   gen8_instruction *add = ADD(dst, src0, reg);
+   gen8_set_mask_control(add, BRW_MASK_DISABLE);
+   if (dispatch_width == 16) {
+      add = ADD(offset(dst, 1), offset(src0, 1), suboffset(reg, 2));
+      gen8_set_mask_control(add, BRW_MASK_DISABLE);
+   }
+
+   default_state.exec_size = save_exec_size;
+}
+
+/**
  * Change the register's data type from UD to HF, doubling the strides in order
  * to compensate for halving the data type width.
  */
@@ -1148,7 +1175,7 @@ gen8_fs_generator::generate_code(exec_list *instructions)
          break;
 
       case FS_OPCODE_SET_SAMPLE_ID:
-         assert(!"XXX: Missing Gen8 scalar support for SET_SAMPLE_ID");
+         generate_set_sample_id(ir, dst, src[0], src[1]);
          break;
 
       case FS_OPCODE_PACK_HALF_2x16_SPLIT:
-- 
1.8.4.2
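
A note on the vstride=1, width=4, hstride=0 region used in PATCH 05/13: as
I understand the register region rules, each execution channel reads the
element at row * vstride + column * hstride, where the row and column come
from dividing the channel number by the region width.  The helper below is
a hypothetical model of that rule only, not generator code.

```c
#include <assert.h>
#include <stddef.h>

/* Element index, relative to the region start, read by execution channel
 * `channel` for a source region <vstride; width, hstride>.
 */
static size_t
region_element(size_t channel, size_t vstride, size_t width, size_t hstride)
{
   assert(width != 0);
   size_t row = channel / width;
   size_t col = channel % width;
   return row * vstride + col * hstride;
}
```

Under that model, channels 0-3 of a SIMD8 instruction all read element 0
and channels 4-7 all read element 1, so each group of four pixels shares
one UW value.  The second ADD for SIMD16 starts two elements further on
via suboffset(reg, 2), which would pick up elements 2 and 3 the same way.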



[Mesa-dev] [PATCH 08/13] i965: Set Position XY Offset Select bits in 3DSTATE_PS on Broadwell.

2014-02-19 Thread Kenneth Graunke
Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/gen8_ps_state.c | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/gen8_ps_state.c 
b/src/mesa/drivers/dri/i965/gen8_ps_state.c
index e93668e..57bf053 100644
--- a/src/mesa/drivers/dri/i965/gen8_ps_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_ps_state.c
@@ -187,6 +187,24 @@ upload_ps_state(struct brw_context *brw)
    if (brw->wm.prog_data->prog_offset_16)
       dw6 |= GEN7_PS_16_DISPATCH_ENABLE;
 
+   /* From the documentation for this packet:
+    * If the PS kernel does not need the Position XY Offsets to
+    *  compute a Position Value, then this field should be programmed
+    *  to POSOFFSET_NONE.
+    *
+    * SW Recommendation: If the PS kernel needs the Position Offsets
+    *  to compute a Position XY value, this field should match Position
+    *  ZW Interpolation Mode to ensure a consistent position.xyzw
+    *  computation.
+    *
+    * We only require XY sample offsets. So, this recommendation doesn't
+    * look useful at the moment. We might need this in future.
+    */
+   if (brw->wm.prog_data->uses_pos_offset)
+      dw6 |= GEN7_PS_POSOFFSET_SAMPLE;
+   else
+      dw6 |= GEN7_PS_POSOFFSET_NONE;
+
    dw7 |=
       brw->wm.prog_data->first_curbe_grf << GEN7_PS_DISPATCH_START_GRF_SHIFT_0 |
       brw->wm.prog_data->first_curbe_grf_16 << GEN7_PS_DISPATCH_START_GRF_SHIFT_2;
-- 
1.8.4.2



[Mesa-dev] [PATCH 09/13] i965: Only use the SIMD16 program for per-sample shading on Broadwell.

2014-02-19 Thread Kenneth Graunke
This is a straight port from gen7_wm_state.c; I haven't looked into
whether we can do both.

v2: Actually do it right.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/gen8_ps_state.c | 38 ---
 1 file changed, 30 insertions(+), 8 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen8_ps_state.c 
b/src/mesa/drivers/dri/i965/gen8_ps_state.c
index 57bf053..a834b85 100644
--- a/src/mesa/drivers/dri/i965/gen8_ps_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_ps_state.c
@@ -183,10 +183,6 @@ upload_ps_state(struct brw_context *brw)
    if (brw->wm.prog_data->nr_params > 0)
       dw6 |= GEN7_PS_PUSH_CONSTANT_ENABLE;
 
-   dw6 |= GEN7_PS_8_DISPATCH_ENABLE;
-   if (brw->wm.prog_data->prog_offset_16)
-      dw6 |= GEN7_PS_16_DISPATCH_ENABLE;
-
    /* From the documentation for this packet:
     * If the PS kernel does not need the Position XY Offsets to
     *  compute a Position Value, then this field should be programmed
@@ -205,13 +201,39 @@ upload_ps_state(struct brw_context *brw)
    else
       dw6 |= GEN7_PS_POSOFFSET_NONE;
 
-   dw7 |=
-      brw->wm.prog_data->first_curbe_grf << GEN7_PS_DISPATCH_START_GRF_SHIFT_0 |
-      brw->wm.prog_data->first_curbe_grf_16 << GEN7_PS_DISPATCH_START_GRF_SHIFT_2;
+   /* In case of non 1x per sample shading, only one of SIMD8 and SIMD16
+    * should be enabled. We do 'SIMD16 only' dispatch if a SIMD16 shader
+    * is successfully compiled. In majority of the cases that bring us
+    * better performance than 'SIMD8 only' dispatch.
+    */
+   int min_invocations_per_fragment =
+      _mesa_get_min_invocations_per_fragment(ctx, brw->fragment_program, false);
+   assert(min_invocations_per_fragment >= 1);
+
+   if (brw->wm.prog_data->prog_offset_16) {
+      dw6 |= GEN7_PS_16_DISPATCH_ENABLE;
+      if (min_invocations_per_fragment == 1) {
+         dw6 |= GEN7_PS_8_DISPATCH_ENABLE;
+         dw7 |= (brw->wm.prog_data->first_curbe_grf <<
+                 GEN7_PS_DISPATCH_START_GRF_SHIFT_0);
+         dw7 |= (brw->wm.prog_data->first_curbe_grf_16 <<
+                 GEN7_PS_DISPATCH_START_GRF_SHIFT_2);
+      } else {
+         dw7 |= (brw->wm.prog_data->first_curbe_grf_16 <<
+                 GEN7_PS_DISPATCH_START_GRF_SHIFT_0);
+      }
+   } else {
+      dw6 |= GEN7_PS_8_DISPATCH_ENABLE;
+      dw7 |= (brw->wm.prog_data->first_curbe_grf <<
+              GEN7_PS_DISPATCH_START_GRF_SHIFT_0);
+   }
 
    BEGIN_BATCH(12);
    OUT_BATCH(_3DSTATE_PS << 16 | (12 - 2));
-   OUT_BATCH(brw->wm.base.prog_offset);
+   if (brw->wm.prog_data->prog_offset_16 && min_invocations_per_fragment > 1)
+      OUT_BATCH(brw->wm.base.prog_offset + brw->wm.prog_data->prog_offset_16);
+   else
+      OUT_BATCH(brw->wm.base.prog_offset);
    OUT_BATCH(0);
    OUT_BATCH(dw3);
    if (brw->wm.prog_data->total_scratch) {
-- 
1.8.4.2
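
The dispatch choice in PATCH 09/13 boils down to a small decision table.
The sketch below restates it outside the driver; the struct and function
names are invented for illustration, and the "kernel pointer" field is
just my shorthand for which program offset ends up in the packet.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of what the packet would enable. */
struct ps_dispatch {
   bool simd8;
   bool simd16;
   bool kernel_pointer_is_simd16;  /* base offset points at SIMD16 code */
};

static struct ps_dispatch
choose_dispatch(bool have_simd16, int min_invocations_per_fragment)
{
   assert(min_invocations_per_fragment >= 1);

   struct ps_dispatch d = { false, false, false };
   if (have_simd16) {
      d.simd16 = true;
      if (min_invocations_per_fragment == 1)
         d.simd8 = true;                      /* both widths enabled */
      else
         d.kernel_pointer_is_simd16 = true;   /* SIMD16 only */
   } else {
      d.simd8 = true;                         /* SIMD8 only */
   }
   return d;
}
```

So per-sample shading with a compiled SIMD16 program is the only case that
moves the kernel pointer; every other combination keeps the SIMD8 program
at the base offset.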



[Mesa-dev] [PATCH 10/13] i965: Thwack multisample enable bit in 3DSTATE_RASTER.

2014-02-19 Thread Kenneth Graunke
The meaning and effects of this bit are surprisingly complicated.

See Rasterization > Windower > Multisampling > Multisample ModesState.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/brw_defines.h   | 1 +
 src/mesa/drivers/dri/i965/gen8_sf_state.c | 4 
 2 files changed, 5 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
b/src/mesa/drivers/dri/i965/brw_defines.h
index dea0940..1cbbe67 100644
--- a/src/mesa/drivers/dri/i965/brw_defines.h
+++ b/src/mesa/drivers/dri/i965/brw_defines.h
@@ -1707,6 +1707,7 @@ enum brw_message_target {
 # define GEN8_RASTER_CULL_FRONT                         (2 << 16)
 # define GEN8_RASTER_CULL_BACK                          (3 << 16)
 # define GEN8_RASTER_SMOOTH_POINT_ENABLE                (1 << 13)
+# define GEN8_RASTER_API_MULTISAMPLE_ENABLE             (1 << 12)
 # define GEN8_RASTER_LINE_AA_ENABLE                     (1 << 2)
 # define GEN8_RASTER_SCISSOR_ENABLE                     (1 << 1)
 # define GEN8_RASTER_VIEWPORT_Z_CLIP_TEST_ENABLE        (1 << 0)
diff --git a/src/mesa/drivers/dri/i965/gen8_sf_state.c 
b/src/mesa/drivers/dri/i965/gen8_sf_state.c
index a5cd9f8..b31b17e 100644
--- a/src/mesa/drivers/dri/i965/gen8_sf_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_sf_state.c
@@ -209,6 +209,9 @@ upload_raster(struct brw_context *brw)
    if (ctx->Point.SmoothFlag)
       dw1 |= GEN8_RASTER_SMOOTH_POINT_ENABLE;
 
+   if (ctx->Multisample._Enabled)
+      dw1 |= GEN8_RASTER_API_MULTISAMPLE_ENABLE;
+
    if (ctx->Polygon.OffsetFill)
       dw1 |= GEN6_SF_GLOBAL_DEPTH_OFFSET_SOLID;
 
@@ -274,6 +277,7 @@ const struct brw_tracked_state gen8_raster_state = {
.dirty = {
   .mesa  = _NEW_BUFFERS |
_NEW_LINE |
+   _NEW_MULTISAMPLE |
_NEW_POINT |
_NEW_POLYGON |
_NEW_SCISSOR |
-- 
1.8.4.2



[Mesa-dev] [PATCH 11/13] i965: Enable smooth points when multisampling without point sprites.

2014-02-19 Thread Kenneth Graunke
According to the "Point Multisample Rasterization" section of the OpenGL
specification (3.0 or later), smooth points are supposed to be enabled
implicitly when multisampling, regardless of the GL_POINT_SMOOTH flag.

However, if GL_POINT_SPRITE is enabled, you get square points no matter
what.  Core contexts always enable point sprites, so this effectively
makes smooth points go away, even in the case of multisampling.

Fixes Piglit's EXT_framebuffer_multisample/point-smooth tests.
(Yes, that's right folks, we actually have Piglit tests for this.)

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/gen8_sf_state.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/gen8_sf_state.c 
b/src/mesa/drivers/dri/i965/gen8_sf_state.c
index b31b17e..0693fee 100644
--- a/src/mesa/drivers/dri/i965/gen8_sf_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_sf_state.c
@@ -139,8 +139,11 @@ upload_sf(struct brw_context *brw)
    if (!(ctx->VertexProgram.PointSizeEnabled || ctx->Point._Attenuated))
       dw3 |= GEN6_SF_USE_STATE_POINT_WIDTH;
 
-   if (ctx->Point.SmoothFlag)
+   /* _NEW_POINT | _NEW_MULTISAMPLE */
+   if ((ctx->Point.SmoothFlag || ctx->Multisample._Enabled) &&
+       !ctx->Point.PointSprite) {
       dw3 |= GEN8_SF_SMOOTH_POINT_ENABLE;
+   }
 
    dw3 |= GEN6_SF_LINE_AA_MODE_TRUE;
 
@@ -166,6 +169,7 @@ const struct brw_tracked_state gen8_sf_state = {
   .mesa  = _NEW_LIGHT |
_NEW_PROGRAM |
_NEW_LINE |
+   _NEW_MULTISAMPLE |
_NEW_POINT,
   .brw   = BRW_NEW_CONTEXT,
   .cache = 0,
-- 
1.8.4.2
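
The rule PATCH 11/13 describes can be written as a three-input predicate.
This is only a restatement for checking the corner cases; the function
name is made up and the booleans stand in for the GL context fields.

```c
#include <assert.h>
#include <stdbool.h>

/* Smooth points are wanted when either the smooth flag or multisampling
 * is on, unless point sprites force square points.
 */
static bool
smooth_point_enable(bool smooth_flag, bool multisample_enabled,
                    bool point_sprite)
{
   return (smooth_flag || multisample_enabled) && !point_sprite;
}
```

In a core context, where point sprites are always on, this is false for
every input, which matches the "smooth points go away" remark in the
commit message.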



[Mesa-dev] [PATCH 07/13] i965: Add missing sample shading bits to Gen8's 3DSTATE_PS_EXTRA.

2014-02-19 Thread Kenneth Graunke
v2: Also set the oMask Present to Render Target bit, which is required
for shaders that write oMask.  Otherwise the hardware won't expect
the extra data.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/gen8_ps_state.c | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/gen8_ps_state.c 
b/src/mesa/drivers/dri/i965/gen8_ps_state.c
index e0a1c9b..e93668e 100644
--- a/src/mesa/drivers/dri/i965/gen8_ps_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_ps_state.c
@@ -22,6 +22,7 @@
  */
 
 #include <stdbool.h>
+#include "program/program.h"
 #include "brw_state.h"
 #include "brw_defines.h"
 #include "intel_batchbuffer.h"
@@ -29,6 +30,7 @@
 static void
 upload_ps_extra(struct brw_context *brw)
 {
+   struct gl_context *ctx = &brw->ctx;
    /* BRW_NEW_FRAGMENT_PROGRAM */
    const struct brw_fragment_program *fp =
       brw_fragment_program_const(brw->fragment_program);
@@ -63,6 +65,18 @@ upload_ps_extra(struct brw_context *brw)
    if (fp->program.Base.InputsRead & VARYING_BIT_POS)
       dw1 |= GEN8_PSX_USES_SOURCE_DEPTH | GEN8_PSX_USES_SOURCE_W;
 
+   /* _NEW_BUFFERS */
+   bool multisampled_fbo = ctx->DrawBuffer->Visual.samples > 1;
+   if (multisampled_fbo &&
+       _mesa_get_min_invocations_per_fragment(ctx, &fp->program, false) > 1)
+      dw1 |= GEN8_PSX_SHADER_IS_PER_SAMPLE;
+
+   if (fp->program.Base.SystemValuesRead & SYSTEM_BIT_SAMPLE_MASK_IN)
+      dw1 |= GEN8_PSX_SHADER_USES_INPUT_COVERAGE_MASK;
+
+   if (brw->wm.prog_data->uses_omask)
+      dw1 |= GEN8_PSX_OMASK_TO_RENDER_TARGET;
+
    BEGIN_BATCH(2);
    OUT_BATCH(_3DSTATE_PS_EXTRA << 16 | (2 - 2));
    OUT_BATCH(dw1);
@@ -71,7 +85,7 @@ upload_ps_extra(struct brw_context *brw)
 
 const struct brw_tracked_state gen8_ps_extra = {
.dirty = {
-  .mesa  = 0,
+  .mesa  = _NEW_BUFFERS,
   .brw   = BRW_NEW_CONTEXT | BRW_NEW_FRAGMENT_PROGRAM,
   .cache = 0,
},
-- 
1.8.4.2



[Mesa-dev] [PATCH 02/13] i965: Use ffs() for sample counting in gen7_surface_msaa_bits().

2014-02-19 Thread Kenneth Graunke
The enumerations are just log2(num_samples) shifted by 3, which we can
easily compute via ffs().

This also makes it reusable for Broadwell, which has 2x MSAA.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 10 --
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
index 12d0fa9..154a0fd 100644
--- a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
@@ -82,12 +82,10 @@ gen7_surface_msaa_bits(unsigned num_samples, enum 
intel_msaa_layout layout)
 {
    uint32_t ss4 = 0;
 
-   if (num_samples > 4)
-      ss4 |= GEN7_SURFACE_MULTISAMPLECOUNT_8;
-   else if (num_samples > 1)
-      ss4 |= GEN7_SURFACE_MULTISAMPLECOUNT_4;
-   else
-      ss4 |= GEN7_SURFACE_MULTISAMPLECOUNT_1;
+   assert(num_samples <= 8);
+
+   /* The SURFACE_MULTISAMPLECOUNT_X enums are simply log2(num_samples) << 3. */
+   ss4 |= (ffs(MAX2(num_samples, 1)) - 1) << 3;
 
    if (layout == INTEL_MSAA_LAYOUT_IMS)
       ss4 |= GEN7_SURFACE_MSFMT_DEPTH_STENCIL;
-- 
1.8.4.2
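
To show why PATCH 02/13 matters for 2x MSAA, here is a side-by-side sketch
of the removed if/else chain and the ffs() form.  The enum names are
placeholders whose values simply follow the "log2(num_samples) << 3" rule
from the commit message; they are not copied from brw_defines.h.

```c
#include <assert.h>
#include <strings.h>   /* ffs() */

/* Placeholder values derived from the log2(num_samples) << 3 rule. */
enum {
   SKETCH_MULTISAMPLECOUNT_1 = 0 << 3,
   SKETCH_MULTISAMPLECOUNT_2 = 1 << 3,
   SKETCH_MULTISAMPLECOUNT_4 = 2 << 3,
   SKETCH_MULTISAMPLECOUNT_8 = 3 << 3,
};

/* The removed chain: anything above 1 but not above 4 became 4x. */
static unsigned
count_bits_old(unsigned num_samples)
{
   if (num_samples > 4)
      return SKETCH_MULTISAMPLECOUNT_8;
   else if (num_samples > 1)
      return SKETCH_MULTISAMPLECOUNT_4;
   else
      return SKETCH_MULTISAMPLECOUNT_1;
}

/* The replacement: distinct encodings for 1, 2, 4 and 8. */
static unsigned
count_bits_new(unsigned num_samples)
{
   assert(num_samples <= 8);
   unsigned n = num_samples > 1 ? num_samples : 1;
   return ((unsigned)ffs((int)n) - 1) << 3;
}
```

The two agree for 0, 1, 4 and 8 samples; only the 2x case differs, where
the old chain would have reported 4x.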



[Mesa-dev] [PATCH 06/13] i965/fs: Implement FS_OPCODE_SET_OMASK on Broadwell.

2014-02-19 Thread Kenneth Graunke
I made a few changes which I think simplify the code a bit compared to
the Gen7 implementation, but which are largely pointless.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/brw_fs.h  |  3 +++
 src/mesa/drivers/dri/i965/gen8_fs_generator.cpp | 36 -
 2 files changed, 38 insertions(+), 1 deletion(-)

Apologies for the differences between Gen7 and Gen8 code.  I think this
is cleaner, and as long as I'm reimplementing it...

diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index 99c6298..00ac577 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -704,6 +704,9 @@ private:
 struct brw_reg index,
 struct brw_reg offset);
void generate_mov_dispatch_to_flags(fs_inst *ir);
+   void generate_set_omask(fs_inst *ir,
+   struct brw_reg dst,
+   struct brw_reg sample_mask);
void generate_set_sample_id(fs_inst *ir,
struct brw_reg dst,
struct brw_reg src0,
diff --git a/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp 
b/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp
index 0078228..106c7f4 100644
--- a/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp
@@ -698,6 +698,40 @@ gen8_fs_generator::generate_set_simd4x2_offset(fs_inst *ir,
 }
 
 /**
+ * Sets vstride=16, width=8, hstride=2 or vstride=0, width=1, hstride=0
+ * (when mask is passed as a uniform) of register mask before moving it
+ * to register dst.
+ */
+void
+gen8_fs_generator::generate_set_omask(fs_inst *inst,
+  struct brw_reg dst,
+  struct brw_reg mask)
+{
+   assert(dst.type == BRW_REGISTER_TYPE_UW);
+
+   if (dispatch_width == 16)
+      dst = vec16(dst);
+
+   if (mask.vstride == BRW_VERTICAL_STRIDE_8 &&
+       mask.width == BRW_WIDTH_8 &&
+       mask.hstride == BRW_HORIZONTAL_STRIDE_1) {
+      mask = stride(mask, 16, 8, 2);
+   } else {
+      assert(mask.vstride == BRW_VERTICAL_STRIDE_0 &&
+             mask.width == BRW_WIDTH_1 &&
+             mask.hstride == BRW_HORIZONTAL_STRIDE_0);
+   }
+
+   unsigned save_exec_size = default_state.exec_size;
+   default_state.exec_size = BRW_EXECUTE_8;
+
+   gen8_instruction *mov = MOV(dst, retype(mask, dst.type));
+   gen8_set_mask_control(mov, BRW_MASK_DISABLE);
+
+   default_state.exec_size = save_exec_size;
+}
+
+/**
  * Do a special ADD with vstride=1, width=4, hstride=0 for src1.
  */
 void
@@ -1171,7 +1205,7 @@ gen8_fs_generator::generate_code(exec_list *instructions)
          break;
 
       case FS_OPCODE_SET_OMASK:
-         assert(!"XXX: Missing Gen8 scalar support for SET_OMASK");
+         generate_set_omask(ir, dst, src[0]);
          break;
 
       case FS_OPCODE_SET_SAMPLE_ID:
-- 
1.8.4.2



[Mesa-dev] [PATCH 04/13] Hack: Disable MCS on Broadwell for now.

2014-02-19 Thread Kenneth Graunke
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 6 ++
 1 file changed, 6 insertions(+)

I'm mostly sending this out as a placeholder.  Ultimately, we want to get
MCS working.  I'm not sure whether it would be valuable to push this (with
a proper commit message) in the meantime.

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index e604b70..43f51fc 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -84,6 +84,12 @@ compute_msaa_layout(struct brw_context *brw, mesa_format 
format, GLenum target)
    case GL_DEPTH_STENCIL:
       return INTEL_MSAA_LAYOUT_IMS;
    default:
+      /* Disable MCS on Broadwell for now.  We can enable it once things
+       * are working without it.
+       */
+      if (brw->gen >= 8)
+         return INTEL_MSAA_LAYOUT_UMS;
+
   /* From the Ivy Bridge PRM, Vol4 Part1 p77 (MCS Enable):
*
*   This field must be set to 0 for all SINT MSRTs when all RT channels
-- 
1.8.4.2



[Mesa-dev] [PATCH 13/13] i965: Actually claim to support MSAA on Broadwell.

2014-02-19 Thread Kenneth Graunke
We need to advertise 8x, 4x, and 2x multisamples.  Previously, we only
claimed to support 0/1 samples.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/brw_context.c  | 6 ++
 src/mesa/drivers/dri/i965/intel_screen.c | 5 -
 2 files changed, 10 insertions(+), 1 deletion(-)

This also makes WebGL work.

diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index bb194a7..5800092 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -79,6 +79,12 @@ brw_query_samples_for_format(struct gl_context *ctx, GLenum 
target,
    (void) target;
 
    switch (brw->gen) {
+   case 8:
+      samples[0] = 8;
+      samples[1] = 4;
+      samples[2] = 2;
+      return 3;
+
    case 7:
       samples[0] = 8;
       samples[1] = 4;
diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
b/src/mesa/drivers/dri/i965/intel_screen.c
index ba22971..b5b0294 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -,11 +,14 @@ intel_detect_swizzling(struct intel_screen *screen)
 const int*
 intel_supported_msaa_modes(const struct intel_screen  *screen)
 {
+   static const int gen8_modes[] = {8, 4, 2, 0, -1};
    static const int gen7_modes[] = {8, 4, 0, -1};
    static const int gen6_modes[] = {4, 0, -1};
    static const int gen4_modes[] = {0, -1};
 
-   if (screen->devinfo->gen >= 7) {
+   if (screen->devinfo->gen >= 8) {
+      return gen8_modes;
+   } else if (screen->devinfo->gen >= 7) {
       return gen7_modes;
    } else if (screen->devinfo->gen == 6) {
       return gen6_modes;
-- 
1.8.4.2



[Mesa-dev] [PATCH 03/13] i965: Use gen7_surface_msaa_bits in Broadwell SURFACE_STATE code.

2014-02-19 Thread Kenneth Graunke
We already set the number of samples, but were missing the MSAA layout
mode.  Reusing gen7_surface_msaa_bits makes it easy to set both.

This also lets us drop the Gen8 surface_num_multisamples function.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/gen8_surface_state.c | 16 ++--
 1 file changed, 2 insertions(+), 14 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c 
b/src/mesa/drivers/dri/i965/gen8_surface_state.c
index 22ffa78..594e531 100644
--- a/src/mesa/drivers/dri/i965/gen8_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c
@@ -83,18 +83,6 @@ horizontal_alignment(struct intel_mipmap_tree *mt)
}
 }
 
-static uint32_t
-surface_num_multisamples(unsigned num_samples)
-{
-   assert(num_samples >= 0 && num_samples <= 16);
-
-   if (num_samples == 0)
-      return GEN7_SURFACE_MULTISAMPLECOUNT_1;
-
-   /* The SURFACE_MULTISAMPLECOUNT_X enums are simply log2(num_samples) << 3. */
-   return (ffs(num_samples) - 1) << 3;
-}
-
 static void
 gen8_emit_buffer_surface_state(struct brw_context *brw,
uint32_t *out_offset,
@@ -180,7 +168,7 @@ gen8_update_texture_surface(struct gl_context *ctx,
    surf[3] = SET_FIELD(mt->logical_depth0 - 1, BRW_SURFACE_DEPTH) |
              (mt->region->pitch - 1);
 
-   surf[4] = surface_num_multisamples(mt->num_samples);
+   surf[4] = gen7_surface_msaa_bits(mt->num_samples, mt->msaa_layout);
 
    surf[5] = SET_FIELD(tObj->BaseLevel - mt->first_level, GEN7_SURFACE_MIN_LOD) |
              (intelObj->_MaxLevel - tObj->BaseLevel); /* mip count */
@@ -322,7 +310,7 @@ gen8_update_renderbuffer_surface(struct brw_context *brw,
    surf[3] = (depth - 1) << BRW_SURFACE_DEPTH_SHIFT |
              (region->pitch - 1); /* Surface Pitch */
 
-   surf[4] = surface_num_multisamples(mt->num_samples) |
+   surf[4] = gen7_surface_msaa_bits(mt->num_samples, mt->msaa_layout) |
              min_array_element << GEN7_SURFACE_MIN_ARRAY_ELEMENT_SHIFT |
              (depth - 1) << GEN7_SURFACE_RENDER_TARGET_VIEW_EXTENT_SHIFT;
 
-- 
1.8.4.2



[Mesa-dev] [PATCH 12/13] i965: Update physical width/height munging for 2x IMS MSAA.

2014-02-19 Thread Kenneth Graunke
I can't find any documentation to explain what ought to be done here, so
I simply guessed based on the pattern I observed in the 4x/8x cases.
It appears to work, but it could be totally wrong.

I was able to find the Sandybridge PRM quote from the comments in the
latest documentation: Shared Functions > 3D Sampler > Multisampled
Surface Behavior.  However, it only mentions 4x MSAA - not even 8x.

After a substantial amount more digging, I was able to find a second
page (incorrectly tagged) which confirmed the formulas in our code for
8x MSAA.  However, that page didn't mention 2x MSAA at all.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 43f51fc..07308dc 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -311,6 +311,11 @@ intel_miptree_create_layout(struct brw_context *brw,
   * sample 3 is in that bottom right 2x2 block.
   */
  switch (num_samples) {
+         case 2:
+            assert(brw->gen >= 8);
+            width0 = ALIGN(width0, 2) * 2;
+            height0 = ALIGN(height0, 2);
+            break;
  case 4:
 width0 = ALIGN(width0, 2) * 2;
 height0 = ALIGN(height0, 2) * 2;
@@ -320,7 +325,7 @@ intel_miptree_create_layout(struct brw_context *brw,
 height0 = ALIGN(height0, 2) * 2;
 break;
  default:
-/* num_samples should already have been quantized to 0, 1, 4, or
+/* num_samples should already have been quantized to 0, 1, 2, 4, or
  * 8.
  */
 assert(false);
-- 
1.8.4.2
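
For reference, the 2x guess in PATCH 12/13 next to the existing 4x case,
as a standalone sketch.  Only the two cases fully visible in the diff are
modelled; the struct and function names are invented, and align2() is a
stand-in for ALIGN(x, 2).

```c
#include <assert.h>

/* Stand-in for ALIGN(v, 2): round up to the next even value. */
static unsigned
align2(unsigned v)
{
   return (v + 1) & ~1u;
}

struct extent {
   unsigned width;
   unsigned height;
};

/* Physical size of an interleaved (IMS) surface for 2x and 4x only. */
static struct extent
ims_physical_size(unsigned width0, unsigned height0, unsigned num_samples)
{
   struct extent e = { width0, height0 };

   switch (num_samples) {
   case 2:
      e.width = align2(width0) * 2;    /* samples interleaved in X only */
      e.height = align2(height0);
      break;
   case 4:
      e.width = align2(width0) * 2;    /* samples interleaved in X and Y */
      e.height = align2(height0) * 2;
      break;
   default:
      assert(!"only the 2x and 4x cases are sketched here");
      break;
   }
   return e;
}
```

A 5x3 logical surface would become 12x4 at 2x and 12x8 at 4x under this
scheme, i.e. the 2x guess doubles the width but leaves the aligned height
alone.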



[Mesa-dev] [PATCH] i965: Use MOV, not OR for setting URB write channel enables on Gen8+.

2014-02-19 Thread Kenneth Graunke
On Broadwell, g0.5 contains the Scratch Space Pointer; using OR
puts some bits of that into ignored sections of our message header.

While this doesn't hurt, it's also not terribly /useful/.  Using MOV
is sufficient to set the only interesting bits in this part of the
message header.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp 
b/src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp
index d0f574a..7ed5d2a 100644
--- a/src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp
+++ b/src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp
@@ -173,11 +173,8 @@ gen8_vec4_generator::generate_urb_write(vec4_instruction *ir, bool vs)
    if (!(ir->urb_write_flags & BRW_URB_WRITE_USE_CHANNEL_MASKS)) {
       /* Enable Channel Masks in the URB_WRITE_OWORD message header */
       default_state.access_mode = BRW_ALIGN_1;
-      inst = OR(retype(brw_vec1_grf(GEN7_MRF_HACK_START + ir->base_mrf, 5),
-                       BRW_REGISTER_TYPE_UD),
-                retype(brw_vec1_grf(0, 5), BRW_REGISTER_TYPE_UD),
-                brw_imm_ud(0xff00));
-      gen8_set_mask_control(inst, BRW_MASK_DISABLE);
+      MOV_RAW(brw_vec1_grf(GEN7_MRF_HACK_START + ir->base_mrf, 5),
+              brw_imm_ud(0xff00));
       default_state.access_mode = BRW_ALIGN_16;
    }
 
-- 
1.8.4.2



[Mesa-dev] [PATCH] mesa: add missing DebugMessageControl types

2014-02-19 Thread Timothy Arceri
Signed-off-by: Timothy Arceri t_arc...@yahoo.com.au
---
 src/mesa/main/errors.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/mesa/main/errors.c b/src/mesa/main/errors.c
index 5f4eac6..c00c796 100644
--- a/src/mesa/main/errors.c
+++ b/src/mesa/main/errors.c
@@ -575,6 +575,11 @@ validate_params(struct gl_context *ctx, unsigned caller,
   /* this value is only valid for GL_KHR_debug functions */
   if (caller == CONTROL || caller == INSERT)
  break;
+   case GL_DEBUG_TYPE_PUSH_GROUP:
+   case GL_DEBUG_TYPE_POP_GROUP:
+  /* this value is only valid for GL_KHR_debug */
+  if (caller == CONTROL)
+ break;
case GL_DONT_CARE:
   if (caller == CONTROL || caller == CONTROL_ARB)
  break;
-- 
1.8.5.3



[Mesa-dev] [Bug 75212] New: Mesa selects wrong DRI driver

2014-02-19 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=75212

  Priority: medium
Bug ID: 75212
CC: e...@anholt.net, lem...@gmail.com
  Assignee: mesa-dev@lists.freedesktop.org
   Summary: Mesa selects wrong DRI driver
  Severity: normal
Classification: Unclassified
OS: Linux (All)
  Reporter: eero.t.tammi...@intel.com
  Hardware: x86 (IA32)
Status: NEW
   Version: git
 Component: GLX
   Product: Mesa

Test environment:
- HSW GT3e
- Up to date Ubuntu 13.10
- Latest versions of libdrm and mesa built (from today's git master),
  with only i965 driver enabled

Steps to reproduce:
1. Run glxgears with latest mesa

Expected output:
- i965 driver loaded & HW acceleration used

Actual output:
- Mesa complains that it can find neither i915 nor swrast
- glxgears runs really slowly

Bisecting Mesa identified this commit as culprit:
--
commit 7bd95ec437a5b1052fa17780a9d66677ec1fdc35
Author: Eric Anholt e...@anholt.net
Date:   Thu Jan 23 10:21:09 2014 -0800

dri2: Trust our own driver name lookup over the server's.

This allows Mesa to choose to rename driver .sos (or split drivers),
without needing a flag day with the corresponding 2D driver.

v2: Undo the loader-only-for-dri3 change.
--

If Mesa needs some new dependency or specific version of that to identify on
which HW it's running, it should check for that in configure.

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Mesa-dev] [Bug 75212] Mesa selects wrong DRI driver

2014-02-19 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=75212

--- Comment #1 from Emil Velikov emil.l.veli...@gmail.com ---
There are a couple of solutions for this [1] [2]. I would prefer the latter as
I've never been a fan of black/white listing.

Eric, are you leaning towards either solution? Can we get an ack, or your
thoughts if you are not keen on either one?

Thanks

[1] http://lists.freedesktop.org/archives/mesa-dev/2014-January/052888.html
[2] http://lists.freedesktop.org/archives/mesa-dev/2014-January/052981.html



[Mesa-dev] [Bug 75098] OpenGL ES2 with fbdev - link error

2014-02-19 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=75098

--- Comment #4 from Christian Prochaska christian.procha...@genode-labs.com ---
(In reply to comment #3)
> Created attachment 94319 [details] [review]
> configure: use shared-glapi when more than one gl* API is used
>
> Hmm forcing shared-glapi whenever more than one gl* api is used seems like
> the only sensible thing to do imho.
>
> This patch fixes the problem by convering the dri dependency gl*. Feel free
> to give the patch a try.


This patch works for me. Thanks.



[Mesa-dev] [PATCH] st/omx/enc: add multi scaling buffers for performance improvement

2014-02-19 Thread Leo Liu
From: Leo Liu leo@amd.com

Signed-off-by: Leo Liu leo@amd.com
---
 src/gallium/state_trackers/omx/vid_enc.c | 38 
 src/gallium/state_trackers/omx/vid_enc.h |  7 --
 2 files changed, 29 insertions(+), 16 deletions(-)

diff --git a/src/gallium/state_trackers/omx/vid_enc.c 
b/src/gallium/state_trackers/omx/vid_enc.c
index 6e65274..3f1d01c 100644
--- a/src/gallium/state_trackers/omx/vid_enc.c
+++ b/src/gallium/state_trackers/omx/vid_enc.c
@@ -273,8 +273,9 @@ static OMX_ERRORTYPE vid_enc_Destructor(OMX_COMPONENTTYPE *comp)
vl_compositor_cleanup_state(&priv->cstate);
vl_compositor_cleanup(&priv->compositor);
  
-   if (priv->scale_buffer)
- priv->scale_buffer->destroy(priv->scale_buffer);
+   for (i = 0; i < OMX_VID_ENC_NUM_SCALING_BUFFERS; ++i)
+  if (priv->scale_buffer[i])
+ priv->scale_buffer[i]->destroy(priv->scale_buffer[i]);
 
if (priv->s_pipe)
   priv->s_pipe->destroy(priv->s_pipe);
@@ -447,7 +448,8 @@ static OMX_ERRORTYPE vid_enc_SetConfig(OMX_HANDLETYPE handle, OMX_INDEXTYPE idx,
OMX_COMPONENTTYPE *comp = handle;
vid_enc_PrivateType *priv = comp->pComponentPrivate;
OMX_ERRORTYPE r;
-
+   int i;
+ 
if (!config)
   return OMX_ErrorBadParameter;
  
@@ -473,9 +475,11 @@ static OMX_ERRORTYPE vid_enc_SetConfig(OMX_HANDLETYPE handle, OMX_INDEXTYPE idx,
   if (scale->xWidth < 176 || scale->xHeight < 144)
  return OMX_ErrorBadParameter;
 
-  if (priv->scale_buffer) {
- priv->scale_buffer->destroy(priv->scale_buffer);
- priv->scale_buffer = NULL;
+  for (i = 0; i < OMX_VID_ENC_NUM_SCALING_BUFFERS; ++i) {
+ if (priv->scale_buffer[i]) {
+priv->scale_buffer[i]->destroy(priv->scale_buffer[i]);
+priv->scale_buffer[i] = NULL;
+ }
   }
 
   priv->scale = *scale;
@@ -487,9 +491,11 @@ static OMX_ERRORTYPE vid_enc_SetConfig(OMX_HANDLETYPE handle, OMX_INDEXTYPE idx,
  templat.width = priv->scale.xWidth;
  templat.height = priv->scale.xHeight;
  templat.interlaced = false;
- priv->scale_buffer = priv->s_pipe->create_video_buffer(priv->s_pipe, &templat);
- if (!priv->scale_buffer)
-return OMX_ErrorInsufficientResources;
+ for (i = 0; i < OMX_VID_ENC_NUM_SCALING_BUFFERS; ++i) {
+priv->scale_buffer[i] = priv->s_pipe->create_video_buffer(priv->s_pipe, &templat);
+if (!priv->scale_buffer[i])
+   return OMX_ErrorInsufficientResources;
+ }
   }
 
   break;
@@ -545,8 +551,10 @@ static OMX_ERRORTYPE vid_enc_MessageHandler(OMX_COMPONENTTYPE* comp, internalReq
  templat.profile = PIPE_VIDEO_PROFILE_MPEG4_AVC_BASELINE;
  templat.entrypoint = PIPE_VIDEO_ENTRYPOINT_ENCODE;
  templat.chroma_format = PIPE_VIDEO_CHROMA_FORMAT_420;
- templat.width = priv->scale_buffer ? priv->scale.xWidth : port->sPortParam.format.video.nFrameWidth;
- templat.height = priv->scale_buffer ? priv->scale.xHeight : port->sPortParam.format.video.nFrameHeight;
+ templat.width = priv->scale_buffer[priv->current_scale_buffer] ?
+priv->scale.xWidth : port->sPortParam.format.video.nFrameWidth;
+ templat.height = priv->scale_buffer[priv->current_scale_buffer] ?
+priv->scale.xHeight : port->sPortParam.format.video.nFrameHeight;
  templat.max_references = 1;
 
  priv->codec = priv->s_pipe->create_video_codec(priv->s_pipe, &templat);
@@ -736,7 +744,7 @@ static OMX_ERRORTYPE vid_enc_EncodeFrame(omx_base_PortType *port, OMX_BUFFERHEAD
 
/* -- scale input image - */
 
-   if (priv->scale_buffer) {
+   if (priv->scale_buffer[priv->current_scale_buffer]) {
   struct vl_compositor *compositor = &priv->compositor;
   struct vl_compositor_state *s = &priv->cstate;
   struct pipe_sampler_view **views;
@@ -744,7 +752,8 @@ static OMX_ERRORTYPE vid_enc_EncodeFrame(omx_base_PortType *port, OMX_BUFFERHEAD
   unsigned i;
 
   views = vbuf->get_sampler_view_planes(vbuf);
-  dst_surface = priv->scale_buffer->get_surfaces(priv->scale_buffer);
+  dst_surface = priv->scale_buffer[priv->current_scale_buffer]->get_surfaces
+   (priv->scale_buffer[priv->current_scale_buffer]);
   vl_compositor_clear_layers(s);
 
   for (i = 0; i < VL_MAX_SURFACES; ++i) {
@@ -768,7 +777,8 @@ static OMX_ERRORTYPE vid_enc_EncodeFrame(omx_base_PortType *port, OMX_BUFFERHEAD
   }
   
   size  = priv->scale.xWidth * priv->scale.xHeight * 2;
-  vbuf = priv->scale_buffer;
+  vbuf = priv->scale_buffer[priv->current_scale_buffer++];
+  priv->current_scale_buffer %= OMX_VID_ENC_NUM_SCALING_BUFFERS;
}
 
priv->s_pipe->flush(priv->s_pipe, NULL, 0);
diff --git a/src/gallium/state_trackers/omx/vid_enc.h 
b/src/gallium/state_trackers/omx/vid_enc.h
index 431ca91..a3fdfae 100644
--- a/src/gallium/state_trackers/omx/vid_enc.h
+++ 

[Mesa-dev] [PATCH] st/omx: fix prevFrameNumOffset handling

2014-02-19 Thread Christian König
From: Christian König christian.koe...@amd.com

Signed-off-by: Christian König christian.koe...@amd.com
---
 src/gallium/state_trackers/omx/vid_dec_h264.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/gallium/state_trackers/omx/vid_dec_h264.c 
b/src/gallium/state_trackers/omx/vid_dec_h264.c
index 5f4a261..7f1c2fa 100644
--- a/src/gallium/state_trackers/omx/vid_dec_h264.c
+++ b/src/gallium/state_trackers/omx/vid_dec_h264.c
@@ -765,6 +765,8 @@ static void slice_header(vid_dec_PrivateType *priv, struct vl_rbsp *rbsp,
   else
  FrameNumOffset = priv->codec_data.h264.prevFrameNumOffset;
 
+  priv->codec_data.h264.prevFrameNumOffset = FrameNumOffset;
+
   if (sps->num_ref_frames_in_pic_order_cnt_cycle != 0)
  absFrameNum = FrameNumOffset + frame_num;
   else
@@ -814,6 +816,8 @@ static void slice_header(vid_dec_PrivateType *priv, struct vl_rbsp *rbsp,
   else
  FrameNumOffset = priv->codec_data.h264.prevFrameNumOffset;
 
+  priv->codec_data.h264.prevFrameNumOffset = FrameNumOffset;
+
   if (IdrPicFlag)
  tempPicOrderCnt = 0;
   else if (nal_ref_idc == 0)
-- 
1.8.3.2



[Mesa-dev] [PATCH] build: Fix FTBFS bug introduced by ee55500c22

2014-02-19 Thread Kai Wasserbäch
The referenced commit set the with_dri_drivers variable to yes in the
auto case, which is an unknown classic DRI driver and leads to a FTBFS.

CC: Emil Velikov emil.l.veli...@gmail.com
Signed-off-by: Kai Wasserbäch k...@dev.carbon-project.org
---
 configure.ac | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/configure.ac b/configure.ac
index 8390d27..ad00d93 100644
--- a/configure.ac
+++ b/configure.ac
@@ -955,7 +955,7 @@ no) ;;
 auto)
 # classic DRI drivers
 if test x$enable_opengl = xyes; then
-with_dri_drivers=yes
+with_dri_drivers=swrast,i915,i965,radeon,r200,nouveau
 fi
 ;;
 esac
-- 
1.8.5.3



Re: [Mesa-dev] [PATCH-RFC] i965: do not advertise MESA_FORMAT_Z_UNORM16 support

2014-02-19 Thread Eric Anholt
Chia-I Wu olva...@gmail.com writes:

> Since 73bc6061f5c3b6a3bb7a8114bb2e1ab77d23cfdb, Z16 support is not advertised
> for OpenGL ES contexts due to the terrible performance.  It is still enabled
> for desktop GL because it was believed GL 3.0+ requires Z16.
>
> It turns out only GL 3.0 requires Z16, and that is corrected in later GL
> versions.  In light of that, and per Ian's suggestion, stop advertising Z16
> support by default, and add a drirc option, gl30_sized_format_rules, so that
> users can override.

I don't think having the driconf option will help anybody.  Let's just
unconditionally drop Z16.




Re: [Mesa-dev] [Mesa-stable] [PATCH 07/19] svga: update shader code for GBS

2014-02-19 Thread Ian Romanick
This patch didn't apply to the 10.1 branch.  I've picked most things to
the 10.1 branch except this series.  Could you put a branch up somewhere
and send me a pull request?  I'm sure you'd like to have these in the
release, and I don't want to mess them up. :)

2e0c90847f16a9cf2a40436beacb65c65535fa4a svga: split / update svga3d header files
024711385ec5333976b124d33a030c30f1345ed1 svga: update dumping code with new GBS commands, etc
d993ada50cf2f112bfff2bd7fbb5a6c25ca00306 svga: update svga_winsys interface for GBS
823fbfdca7165ac11eab2a7e168960f5874ebdc3 svga: add new GBS commands
31dfefc47f9f12c49fd3cfb27ba4fe384cb60380 svga: add svga_have_gb_objects/dma() functions
2f1fc8db108eb771414aa5440d4c439f63f4e7c1 svga: update constant buffer code for GBS
f84c830b144fd4d53f862fc6ad05541e5bf60a3b svga: update shader code for GBS
c1e60a61e8ca3bdac0530ad1aeb3c751f273b73d svga: add helpers for tracking rendering to textures
d0c22a6d53a9cce2d40006f3d4d7dd7e2f63aca9 svga: track which textures are rendered to
f8bbd8261d297be11f1f2eaf768c2a8ace0cb69d svga: adjust adjustment for point coordinates
6476bcbc5005b76e1494a201f92f3c76bd8e9727 svga: remove a couple unneeded assertions
e0a6fb09bdfde40253b924b6c9d1fdf3f16fed21 svga: add new helper functions for GBS buffers
72b0e959fc38cf4f01d8aaeabe7336cc88588f90 svga: update buffer code for GBS
3d1fd6df5315cfa4b9c8b1332f5078a89abc3ed8 svga: update texture code for GBS
c9e9b1862b472b2671b8d3b339f9f7624a272073 pipebuffer, winsys: Add a size match parameter to the cached buffer manager
8af358d8bc9f7563cd76313b16d7b149197a4b2c gallium/pipebuffer: Add a cache buffer manager bypass mask
59e7c596215155b556ba8cf06233b621b88f49c6 gallium/util: Add flush/map debug utility code
fe6a854477c2ed30c37c200668a4dc86512120f7 svga/winsys: implement GBS support
141e39a8936a7b19fd857a35ea2d200daf1777c7 svga/winsys: Propagate surface shared information to the winsys
e4a5a9fd2fdd5b5ae8b85ac743a228f409a21a70 gallium/pipebuffer: change pb_cache_manager_create() size_factor to float


On 02/13/2014 05:20 PM, Brian Paul wrote:
 Reviewed-by: Thomas Hellstrom thellst...@vmware.com
 Cc: 10.1 mesa-sta...@lists.freedesktop.org
 ---
  src/gallium/drivers/svga/svga_context.c  |4 +++
  src/gallium/drivers/svga/svga_context.h  |2 ++
  src/gallium/drivers/svga/svga_draw.c |   14 
  src/gallium/drivers/svga/svga_shader.c   |   21 ++-
  src/gallium/drivers/svga/svga_state.h|4 +++
  src/gallium/drivers/svga/svga_state_fs.c |   58 
 --
  src/gallium/drivers/svga/svga_state_vs.c |   56 -
  src/gallium/drivers/svga/svga_tgsi.h |3 ++
  8 files changed, 142 insertions(+), 20 deletions(-)
 
 diff --git a/src/gallium/drivers/svga/svga_context.c 
 b/src/gallium/drivers/svga/svga_context.c
 index de769ca..4da9a65 100644
 --- a/src/gallium/drivers/svga/svga_context.c
 +++ b/src/gallium/drivers/svga/svga_context.c
 @@ -197,6 +197,10 @@ void svga_context_flush( struct svga_context *svga,
  */
 svga->rebind.rendertargets = TRUE;
 svga->rebind.texture_samplers = TRUE;
 +   if (svga_have_gb_objects(svga)) {
 +  svga->rebind.vs = TRUE;
 +  svga->rebind.fs = TRUE;
 +   }
  
 if (SVGA_DEBUG & DEBUG_SYNC) {
if (fence)
 diff --git a/src/gallium/drivers/svga/svga_context.h 
 b/src/gallium/drivers/svga/svga_context.h
 index 71a8eea..0daab0b 100644
 --- a/src/gallium/drivers/svga/svga_context.h
 +++ b/src/gallium/drivers/svga/svga_context.h
 @@ -374,6 +374,8 @@ struct svga_context
 struct {
unsigned rendertargets:1;
unsigned texture_samplers:1;
 +  unsigned vs:1;
 +  unsigned fs:1;
 } rebind;
  
 struct svga_hwtnl *hwtnl;
 diff --git a/src/gallium/drivers/svga/svga_draw.c 
 b/src/gallium/drivers/svga/svga_draw.c
 index 80dbc35..fa0cac4 100644
 --- a/src/gallium/drivers/svga/svga_draw.c
 +++ b/src/gallium/drivers/svga/svga_draw.c
 @@ -213,6 +213,20 @@ svga_hwtnl_flush(struct svga_hwtnl *hwtnl)
   }
}
  
 +  if (svga->rebind.vs) {
 + ret = svga_reemit_vs_bindings(svga);
 + if (ret != PIPE_OK) {
 +return ret;
 + }
 +  }
 +
 +  if (svga->rebind.fs) {
 + ret = svga_reemit_fs_bindings(svga);
 + if (ret != PIPE_OK) {
 +return ret;
 + }
 +  }
 +
SVGA_DBG(DEBUG_DMA, "draw to sid %p, %d prims\n",
 svga->curr.framebuffer.cbufs[0] ?
 svga_surface(svga->curr.framebuffer.cbufs[0])->handle : NULL,
 diff --git a/src/gallium/drivers/svga/svga_shader.c 
 b/src/gallium/drivers/svga/svga_shader.c
 index 88877b2..6b6b441 100644
 --- a/src/gallium/drivers/svga/svga_shader.c
 +++ b/src/gallium/drivers/svga/svga_shader.c
 @@ -43,7 +43,17 @@ svga_define_shader(struct svga_context *svga,
  {
 unsigned codeLen = variant->nr_tokens * sizeof(variant->tokens[0]);
  
 -   {
 +   if (svga_have_gb_objects(svga)) {
 +  struct svga_winsys_screen *sws = 

Re: [Mesa-dev] [PATCH 1/3] glcpp: Only warn for macro names containing __

2014-02-19 Thread Ian Romanick
I'm hoping that Tapani or Darius will verify that this patch actually
fixes the problem.  That's why people CC other people on patches. :)

On 02/18/2014 10:19 AM, Ian Romanick wrote:
 From: Ian Romanick ian.d.roman...@intel.com
 
 Section 3.3 (Preprocessor) of the GLSL 1.30 spec (and later) and the
 GLSL ES spec (all versions) say:
 
 All macro names containing two consecutive underscores ( __ ) are
 reserved for future use as predefined macro names. All macro names
 prefixed with GL_ (GL followed by a single underscore) are also
 reserved.
 
 The intention is that names containing __ are reserved for internal use
 by the implementation, and names prefixed with GL_ are reserved for use
 by Khronos.  Since every extension adds a name prefixed with GL_ (i.e.,
 the name of the extension), that should be an error.  Names simply
 containing __ are dangerous to use, but should be allowed.  In similar
 cases, the C++ preprocessor specification says, "no diagnostic is
 required."
 
 Per the Khronos bug mentioned below, a future version of the GLSL
 specification will clarify this.
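
The rule described above can be sketched as a small classifier. This is only an illustration: the enum and the function name classify_macro_name are invented here, and unlike the patch (which runs both checks and may emit both diagnostics) the sketch reports just the most severe result for a name.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical diagnostic levels for this sketch. */
enum macro_name_diag {
   MACRO_NAME_OK,
   MACRO_NAME_WARNING,  /* contains "__": allowed, reserved for the implementation */
   MACRO_NAME_ERROR     /* starts with "GL_": reserved for Khronos */
};

static enum macro_name_diag
classify_macro_name(const char *identifier)
{
   assert(identifier != NULL);

   /* The error takes precedence when a name matches both rules. */
   if (strncmp(identifier, "GL_", 3) == 0)
      return MACRO_NAME_ERROR;
   if (strstr(identifier, "__") != NULL)
      return MACRO_NAME_WARNING;
   return MACRO_NAME_OK;
}
```

So "__FOO" and "FOO__BAR" would only warn, "GL_FOO" would be rejected, and "FOO_BAR" passes silently.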
 
 Signed-off-by: Ian Romanick ian.d.roman...@intel.com
 Cc: 9.2 10.0 10.1 mesa-sta...@lists.freedesktop.org
 Cc: Tapani Pälli lem...@gmail.com
 Cc: Kenneth Graunke kenn...@whitecape.org
 Cc: Darius Spitznagel d.spitzna...@goodbytez.de
 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71870
 Bugzilla: Khronos #11702
 ---
  src/glsl/glcpp/glcpp-parse.y   | 22 
 +++---
  .../tests/086-reserved-macro-names.c.expected  |  4 ++--
  2 files changed, 21 insertions(+), 5 deletions(-)
 
 diff --git a/src/glsl/glcpp/glcpp-parse.y b/src/glsl/glcpp/glcpp-parse.y
 index 5bb2891..bdc598f 100644
 --- a/src/glsl/glcpp/glcpp-parse.y
 +++ b/src/glsl/glcpp/glcpp-parse.y
 @@ -1770,11 +1770,27 @@ static void
  _check_for_reserved_macro_name (glcpp_parser_t *parser, YYLTYPE *loc,
   const char *identifier)
  {
 - /* According to the GLSL specification, macro names starting with __
 -  * or GL_ are reserved for future use.  So, don't allow them.
 + /* Section 3.3 (Preprocessor) of the GLSL 1.30 spec (and later) and
 +  * the GLSL ES spec (all versions) say:
 +  *
 +  * All macro names containing two consecutive underscores ( __ )
 +  * are reserved for future use as predefined macro names. All
 +  * macro names prefixed with GL_ (GL followed by a single
 +  * underscore) are also reserved.
 +  *
 +  * The intention is that names containing __ are reserved for internal
 +  * use by the implementation, and names prefixed with GL_ are reserved
 +  * for use by Khronos.  Since every extension adds a name prefixed
 +  * with GL_ (i.e., the name of the extension), that should be an
 +  * error.  Names simply containing __ are dangerous to use, but should
 +  * be allowed.
 +  *
 +  * A future version of the GLSL specification will clarify this.
*/
   if (strstr(identifier, "__")) {
 - glcpp_error (loc, parser, "Macro names containing \"__\" are reserved.\n");
 + glcpp_warning(loc, parser,
 +   "Macro names containing \"__\" are reserved "
 +   "for use by the implementation.\n");
   }
   if (strncmp(identifier, "GL_", 3) == 0) {
   glcpp_error (loc, parser, "Macro names starting with \"GL_\" are reserved.\n");
 diff --git a/src/glsl/glcpp/tests/086-reserved-macro-names.c.expected 
 b/src/glsl/glcpp/tests/086-reserved-macro-names.c.expected
 index d8aa9f0..5ca42a9 100644
 --- a/src/glsl/glcpp/tests/086-reserved-macro-names.c.expected
 +++ b/src/glsl/glcpp/tests/086-reserved-macro-names.c.expected
 @@ -1,8 +1,8 @@
 -0:1(10): preprocessor error: Macro names containing "__" are reserved.
 +0:1(10): preprocessor warning: Macro names containing "__" are reserved for use by the implementation.
  
  0:2(9): preprocessor error: Macro names starting with "GL_" are reserved.
  
 -0:3(9): preprocessor error: Macro names containing "__" are reserved.
 +0:3(9): preprocessor warning: Macro names containing "__" are reserved for use by the implementation.
  
  
  
 



Re: [Mesa-dev] [PATCH 6/8] i965: Implement HiZ resolves on Broadwell.

2014-02-19 Thread Eric Anholt
Kenneth Graunke kenn...@whitecape.org writes:

 On 02/18/2014 01:38 PM, Eric Anholt wrote:
 Kenneth Graunke kenn...@whitecape.org writes:
 [snip]
 diff --git a/src/mesa/drivers/dri/i965/gen8_depth_state.c 
 b/src/mesa/drivers/dri/i965/gen8_depth_state.c
 index f30ff28..3fa20c8 100644
 --- a/src/mesa/drivers/dri/i965/gen8_depth_state.c
 +++ b/src/mesa/drivers/dri/i965/gen8_depth_state.c
 @@ -203,3 +203,108 @@ gen8_emit_depth_stencil_hiz(struct brw_context *brw,
 brw->depthstencil.stencil_offset,
hiz, width, height, depth, lod, min_array_element);
  }
 +
 +/**
 + * Emit packets to perform a depth/HiZ resolve or fast depth/stencil clear.
 + *
 + * See the Optimized Depth Buffer Clear and/or Stencil Buffer Clear 
 section
 + * of the hardware documentation for details.
 + */
 +void
 +gen8_hiz_exec(struct brw_context *brw, struct intel_mipmap_tree *mt,
 +  unsigned int level, unsigned int layer, enum gen6_hiz_op op)
 +{
 +   if (op == GEN6_HIZ_OP_NONE)
 +  return;
 +
 +   assert(mt->first_level == 0);
 +
 +   struct intel_mipmap_level *miplevel = &mt->level[level];
 +
 +   /* The basic algorithm is:
 +* - If needed, emit 3DSTATE_{DEPTH,HIER_DEPTH,STENCIL}_BUFFER and
 +*   3DSTATE_CLEAR_PARAMS packets to set up the relevant buffers.
 +* - If needed, emit 3DSTATE_DRAWING_RECTANGLE.
 +* - Emit 3DSTATE_WM_HZ_OP with a bit set for the particular operation.
 +* - Do a special PIPE_CONTROL to trigger an implicit rectangle 
 primitive.
 +* - Emit 3DSTATE_WM_HZ_OP with no bits set to return to normal 
 rendering.
 +*/
 +   emit_depth_packets(brw, mt,
 +  brw_depth_format(brw, mt->format),
 +  BRW_SURFACE_2D,
 +  true, /* depth writes */
 +  NULL, false, 0, /* no stencil for now */
 +  true, /* hiz */
 +  mt->logical_width0,
 +  mt->logical_height0,
 +  MAX2(mt->logical_depth0, 1),
 
 Is logical_depth0 ever 0?  That seems like a bug.

 No, I guess it isn't.  It looks like I copy and pasted this from BLORP,
 or was being overly cautious for some reason.  Dropped.

 +  level,
 +  layer); /* min_array_element */
 +
 +   BEGIN_BATCH(4);
 +   OUT_BATCH(_3DSTATE_DRAWING_RECTANGLE << 16 | (4 - 2));
 +   OUT_BATCH(0);
 +   OUT_BATCH(((mt->logical_width0 - 1) & 0xffff) |
 + ((mt->logical_height0 - 1) << 16));
 +   OUT_BATCH(0);
 +   ADVANCE_BATCH();
 
 The drawing rectangle should be using the level's size, not the level 0
 size.

 Yes, this makes sense...we bind a specific miplevel of the depth buffer,
 so presumably the (0, 0) origin is the start of that miplevel, not the
 start of the whole tree.  I'll change that.

 Since the drawing rectangle is just the bounds of where you can draw,
 and not actually the clear/resolve rectangle, I think specifying one
 that's too large shouldn't be harmful.  But specifying the right value
 is trivial, so I agree we should do it.
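
A minimal sketch of what using the level's size could look like when packing the rectangle maximum. The helper names are invented for this example, minify_dim is a local stand-in for Mesa's minify(), and the 16-bit X field with Y in the upper half is an assumption read off the shift in the quoted code.

```c
#include <assert.h>
#include <stdint.h>

/* Size of one dimension at a given miplevel, never smaller than 1. */
static uint32_t minify_dim(uint32_t size0, unsigned level)
{
   assert(level < 32);
   uint32_t size = size0 >> level;
   return size > 0 ? size : 1;
}

/* Pack the inclusive maxima of the level's rectangle: X max in the low
 * 16 bits, Y max in the high 16 bits (assumed layout for this sketch).
 */
static uint32_t
drawing_rect_max_dw(uint32_t width0, uint32_t height0, unsigned level)
{
   uint32_t w = minify_dim(width0, level);
   uint32_t h = minify_dim(height0, level);

   return ((w - 1) & 0xffff) | ((h - 1) << 16);
}
```

For a 256x128 tree, level 0 packs to 0x007f00ff and level 1 to 0x003f007f, so the bounds shrink with the miplevel instead of staying at the level 0 size.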

 +   uint32_t sample_mask = 0x;
 +   if (mt->num_samples > 0) {
 +  dw1 |= SET_FIELD(ffs(mt->num_samples) - 1, GEN8_WM_HZ_NUM_SAMPLES);
 +  sample_mask = gen6_determine_sample_mask(brw);
 +   }
 
 I don't think we want the user-set sample mask stuff to change the
 samples affected by our hiz/depth resolves.  I think you can just drop
 the if block.

 Good point, whatever the user specified is probably unrelated to our
 values.  I've dropped the sample_mask variable and just stuffed 0x
 in the packet.

 I kept the if-block for the dw1 |= ...num_samples... line.

 +
 +   BEGIN_BATCH(5);
 +   OUT_BATCH(_3DSTATE_WM_HZ_OP << 16 | (5 - 2));
 +   OUT_BATCH(dw1);
 +   OUT_BATCH(0);
 +   OUT_BATCH(SET_FIELD(miplevel->width, GEN8_WM_HZ_CLEAR_RECTANGLE_X_MAX) |
 + SET_FIELD(miplevel->height, GEN8_WM_HZ_CLEAR_RECTANGLE_Y_MAX));
 +   OUT_BATCH(SET_FIELD(sample_mask, GEN8_WM_HZ_SAMPLE_MASK));
 +   ADVANCE_BATCH();
 
 I think now the miplevel->width should be minify(mt->logical_width0,
 level).  Hope that helped

 Yes, that's much nicer - and correct for MSAA buffers!  I'm unclear
 whether we need to do:

ALIGN(minify(mt->logical_width0,  level), 8)
ALIGN(minify(mt->logical_height0, level), 4)

 (both here and in the drawing rectangle)

 I've read seemingly contradictory information...it sounds like it might
 be necessary for depth resolves, but not otherwise...but I could be
 misinterpreting it.  It seems to be working...

Yeah, I'm not clear on how this ought to work for resolves.  For clears,
the strategy explained for the previous gens made sense: Round down your
coords to get 8x4 alignment, then do slow clears on the remaining strips
(if any).  But for a resolve, what else do you do?
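
For the clear case, the round-down strategy can be sketched as below. This assumes a rectangle anchored at (0, 0); the struct and function names are made up for illustration, and the sketch says nothing about what a resolve should do.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical split of a clear into an aligned part and leftover strips. */
struct clear_split {
   uint32_t fast_width;    /* columns covered by the 8x4-aligned fast clear */
   uint32_t fast_height;   /* rows covered by the 8x4-aligned fast clear */
   uint32_t right_strip;   /* leftover columns that need a slow clear */
   uint32_t bottom_strip;  /* leftover rows that need a slow clear */
};

static struct clear_split split_clear(uint32_t width, uint32_t height)
{
   struct clear_split s;

   s.fast_width = width & ~UINT32_C(7);    /* round down to a multiple of 8 */
   s.fast_height = height & ~UINT32_C(3);  /* round down to a multiple of 4 */
   s.right_strip = width - s.fast_width;
   s.bottom_strip = height - s.fast_height;

   assert(s.right_strip < 8 && s.bottom_strip < 4);
   return s;
}
```

A 100x50 rectangle splits into a 96x48 fast clear plus a 4-pixel-wide strip on the right and a 2-pixel-high strip at the bottom; a 64x32 rectangle needs no slow clears at all.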




Re: [Mesa-dev] [PATCH 13/23] i965/fs: Take into account reg_offset consistently for MRF regs.

2014-02-19 Thread Francisco Jerez
Paul Berry stereotype...@gmail.com writes:

 On 15 January 2014 14:01, Francisco Jerez curroje...@riseup.net wrote:

 Paul Berry stereotype...@gmail.com writes:

  On 2 December 2013 11:31, Francisco Jerez curroje...@riseup.net wrote:
 
  Until now it was only being taken into account in the VEC4 back-end
  but not in the FS back-end.  Do it in both cases.
  ---
   src/mesa/drivers/dri/i965/brw_fs.h |  2 +-
   src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 10 ++
   src/mesa/drivers/dri/i965/brw_shader.h |  7 ---
   3 files changed, 11 insertions(+), 8 deletions(-)
 
  diff --git a/src/mesa/drivers/dri/i965/brw_fs.h
  b/src/mesa/drivers/dri/i965/brw_fs.h
  index 2c36d9f..f918f7e 100644
  --- a/src/mesa/drivers/dri/i965/brw_fs.h
  +++ b/src/mesa/drivers/dri/i965/brw_fs.h
  @@ -615,4 +615,4 @@ bool brw_do_channel_expressions(struct exec_list
  *instructions);
   bool brw_do_vector_splitting(struct exec_list *instructions);
   bool brw_fs_precompile(struct gl_context *ctx, struct gl_shader_program
  *prog);
 
  -struct brw_reg brw_reg_from_fs_reg(fs_reg *reg);
  +struct brw_reg brw_reg_from_fs_reg(fs_reg *reg, unsigned
 dispatch_width);
  diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
  b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
  index 8d310a1..1de59eb 100644
  --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
  +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
  @@ -981,8 +981,9 @@ static uint32_t brw_file_from_reg(fs_reg *reg)
   }
 
   struct brw_reg
  -brw_reg_from_fs_reg(fs_reg *reg)
  +brw_reg_from_fs_reg(fs_reg *reg, unsigned dispatch_width)
   {
  +   const int reg_size = 4 * dispatch_width;
 
 
  What happens when reg.type is UW and dispatch_width is 16?  In that case,
  we would compute reg_size == 64, but the correct value seems like it's
  actually 32 in this case.
 
  Are we perhaps relying on reg.type being a 32-bit type?  If so, maybe we
  should add an assertion:
 
 assert(type_sz(reg.type) == 4);
 

 Nope, reg_size is supposed to be the size in bytes of a ::reg_offset
 unit, i.e. one hardware register in SIMD8 mode and two hardware
 registers in SIMD16 mode as the comment at the definition of
 ::reg_offset explains.  The fixed factor of four is intentional and
 correct no matter what the register type is.

 Thanks.


 Ok, I see.  It appears that both this function *and* the comment above
 reg_offset are assuming that the data type is 32 bit.  The comment above
 reg_offset says:

 * For pre-register-allocation GRFs and MRFs, this is in units of a
 * float per pixel (1 hardware register for SIMD8 or SIMD4x2 mode,
 * or 2 registers for SIMD16 mode).  For uniforms, this is in units
 * of 1 float.

 but when we get around to adding support for double-precision floats (a
 feature of GL 4.0), this will no longer work; for double precision types
 we'll need reg_offset to be measured in units of at least 4 hardware
 registers in SIMD16 mode to avoid overlap.

 Similarly, for types that are 16 bits, if we consider reg_offset to be
 measured in units of 2 hardware registers in SIMD16 mode, we're actually
 wasting registers, since all 16 values actually fit in a single hardware
 register.  That's not really a big deal right now, since we use 16-bit
 types so rarely in the FS back-end.
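
The arithmetic behind this can be spelled out in a short sketch. The function names are invented for illustration; the 32-byte register size follows from the comment quoted above (one hardware register holds one float per pixel in SIMD8).

```c
#include <assert.h>

#define GRF_SIZE_BYTES 32  /* one hardware register: 8 channels x 4 bytes */

/* Bytes per reg_offset unit as computed in the patch: a fixed 4 bytes
 * per channel regardless of the register type.
 */
static unsigned reg_offset_unit_bytes(unsigned dispatch_width)
{
   assert(dispatch_width == 8 || dispatch_width == 16);
   return 4 * dispatch_width;
}

/* Bytes needed to hold one value of the given type size per channel. */
static unsigned bytes_needed(unsigned type_size, unsigned dispatch_width)
{
   return type_size * dispatch_width;
}

/* Number of whole hardware registers needed to hold `bytes` bytes. */
static unsigned grfs_for(unsigned bytes)
{
   return (bytes + GRF_SIZE_BYTES - 1) / GRF_SIZE_BYTES;
}
```

In SIMD16 the fixed unit is 64 bytes (two registers), while an 8-byte type would need 128 bytes (four registers) and a 2-byte type only 32 bytes (one register).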

I see what you mean, but it seems rather problematic to me to have the
unit of reg_offset depend on the register data type.  E.g. bit-casting
the contents of a register to a type of different size would involve
non-trivial algebra on reg_offset.  IMHO the ideal solution would be to
settle on a fixed unit (e.g. bytes, and remove the subreg_offset field
completely) and use a helper function to get the array indexing that
seems to be the main use case of reg_offset (e.g. 'index(base_reg, i)'
that would take into account the type size of 'base_reg' to calculate
the byte offset of element 'i').
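
A rough sketch of that idea, with everything in it hypothetical: struct reg_ref, reg_index and reg_retype are invented names, and the assumption that one element occupies type_size bytes per channel is mine, not something settled in this thread.

```c
#include <assert.h>

/* Hypothetical register reference with a byte-granular offset. */
struct reg_ref {
   unsigned nr;           /* base register number */
   unsigned byte_offset;  /* offset from the start of the register, in bytes */
   unsigned type_size;    /* size in bytes of one channel's value */
   unsigned width;        /* channels per element (8 or 16) */
};

/* Reference to element i of base, counted in elements of base's type. */
static struct reg_ref reg_index(struct reg_ref base, unsigned i)
{
   assert(base.type_size > 0 && base.width > 0);
   base.byte_offset += i * base.type_size * base.width;
   return base;
}

/* Bit-cast to a different type size; the byte offset needs no adjustment. */
static struct reg_ref reg_retype(struct reg_ref base, unsigned type_size)
{
   base.type_size = type_size;
   return base;
}
```

With a SIMD8 float register, element 3 sits at byte 96. Retyping element 1 (byte 32) to a 16-bit type and stepping one more element lands at byte 48, with no unit conversion on the stored offset.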

 In fact, I bet we never use a nonzero reg_offset on a 16-bit type.

Not sure if we do already, but my surface packing/unpacking code might
in some situations.

 It still seems to me that it would be worth putting an assertion here to
 help alert us to what needs to change when we add double precision support
 (or if someday we have hardware that supports half float computation).  I'm
 not 100% sure what the assertion could be.  "assert(type_sz(reg->type) ==
 4);" was an optimistic guess.  We might have to do
 "assert(type_sz(reg->type) == 4 || reg->reg_offset == 0);" in order to
 avoid tripping on the rare cases where we currently use 16-bit types in
 fragment shaders.

I think that such an assertion would break my homogeneous
packing/unpacking code if it ever gets an argument with reg_offset != 0,
because it casts its argument into a smaller type (either 8 or 16 bits)
and then uses 'subreg_offset' to select the individual components of the
packed vector.

P.S.: Sorry for the late reply.



Re: [Mesa-dev] [PATCH] build: Fix FTBFS bug introduced by ee55500c22

2014-02-19 Thread Emil Velikov
On 19/02/14 18:12, Kai Wasserbäch wrote:
> The referenced commit set the with_dri_drivers variable to yes in the
> auto case, which is an unknown classic DRI driver and leads to a FTBFS.
>
Thanks for the patch Kai

The issue has been reported already[1] and a slightly more appropriate
patch has been suggested[2]. It will resolve a few more build cases than
the one you have in mind.

Feel free to give it a try.

-Emil

[1] https://bugs.freedesktop.org/show_bug.cgi?id=75126
[2] http://patchwork.freedesktop.org/patch/20467/

> CC: Emil Velikov emil.l.veli...@gmail.com
> Signed-off-by: Kai Wasserbäch k...@dev.carbon-project.org
> ---
>  configure.ac | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/configure.ac b/configure.ac
> index 8390d27..ad00d93 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -955,7 +955,7 @@ no) ;;
>  auto)
>  # classic DRI drivers
>  if test x$enable_opengl = xyes; then
> -with_dri_drivers=yes
> +with_dri_drivers=swrast,i915,i965,radeon,r200,nouveau
>  fi
>  ;;
>  esac
>



Re: [Mesa-dev] [PATCH 07/13] i965: Add missing sample shading bits to Gen8's 3DSTATE_PS_EXTRA.

2014-02-19 Thread Eric Anholt
Kenneth Graunke kenn...@whitecape.org writes:

 v2: Also set the oMask Present to Render Target bit, which is required
 for shaders that write oMask.  Otherwise the hardware won't expect
 the extra data.

 Signed-off-by: Kenneth Graunke kenn...@whitecape.org
 ---
  src/mesa/drivers/dri/i965/gen8_ps_state.c | 16 +++-
  1 file changed, 15 insertions(+), 1 deletion(-)

 diff --git a/src/mesa/drivers/dri/i965/gen8_ps_state.c 
 b/src/mesa/drivers/dri/i965/gen8_ps_state.c
 index e0a1c9b..e93668e 100644
 --- a/src/mesa/drivers/dri/i965/gen8_ps_state.c
 +++ b/src/mesa/drivers/dri/i965/gen8_ps_state.c
 @@ -22,6 +22,7 @@
   */
  
  #include <stdbool.h>
 +#include "program/program.h"
  #include "brw_state.h"
  #include "brw_defines.h"
  #include "intel_batchbuffer.h"
 @@ -29,6 +30,7 @@
  static void
  upload_ps_extra(struct brw_context *brw)
  {
 +   struct gl_context *ctx = &brw->ctx;
     /* BRW_NEW_FRAGMENT_PROGRAM */
     const struct brw_fragment_program *fp =
        brw_fragment_program_const(brw->fragment_program);
 @@ -63,6 +65,18 @@ upload_ps_extra(struct brw_context *brw)
     if (fp->program.Base.InputsRead & VARYING_BIT_POS)
        dw1 |= GEN8_PSX_USES_SOURCE_DEPTH | GEN8_PSX_USES_SOURCE_W;
  
 +   /* _NEW_BUFFERS */

_mesa_get_min_invocations_per_fragment also depends on _NEW_MULTISAMPLE
(for its test of Multisample.Enabled).
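
As a rough illustration of why the missing flag matters, here is a small C sketch of dirty-bit tracking. It assumes, as a simplification, that a state atom is re-emitted only when one of the bits it lists has been flagged; the flag values and struct are hypothetical, not Mesa's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values; the real _NEW_* bits are defined by Mesa. */
#define SKETCH_NEW_BUFFERS     (1u << 0)
#define SKETCH_NEW_MULTISAMPLE (1u << 1)

struct sketch_atom {
   uint32_t dirty_mesa;   /* state bits this atom's emit function reads */
};

/* An atom is re-emitted only if some state it listens to has changed. */
bool
sketch_atom_needs_emit(const struct sketch_atom *atom, uint32_t changed)
{
   return (atom->dirty_mesa & changed) != 0;
}
```

With only the buffers bit listed, toggling multisampling would not trigger a re-emit in this model, which is the stale-state risk being pointed out.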

 +   bool multisampled_fbo = ctx->DrawBuffer->Visual.samples > 1;
 +   if (multisampled_fbo &&
 +       _mesa_get_min_invocations_per_fragment(ctx, &fp->program, false) > 1)
 +      dw1 |= GEN8_PSX_SHADER_IS_PER_SAMPLE;
 +
 +   if (fp->program.Base.SystemValuesRead & SYSTEM_BIT_SAMPLE_MASK_IN)
 +      dw1 |= GEN8_PSX_SHADER_USES_INPUT_COVERAGE_MASK;
 +
 +   if (brw->wm.prog_data->uses_omask)
 +      dw1 |= GEN8_PSX_OMASK_TO_RENDER_TARGET;
 +
     BEGIN_BATCH(2);
     OUT_BATCH(_3DSTATE_PS_EXTRA << 16 | (2 - 2));
     OUT_BATCH(dw1);
 @@ -71,7 +85,7 @@ upload_ps_extra(struct brw_context *brw)
  
  const struct brw_tracked_state gen8_ps_extra = {
 .dirty = {
 -  .mesa  = 0,
 +  .mesa  = _NEW_BUFFERS,
.brw   = BRW_NEW_CONTEXT | BRW_NEW_FRAGMENT_PROGRAM,
.cache = 0,
 },
 -- 
 1.8.4.2



Re: [Mesa-dev] [PATCH 09/13] i965: Only use the SIMD16 program for per-sample shading on Broadwell.

2014-02-19 Thread Eric Anholt
Kenneth Graunke kenn...@whitecape.org writes:

 This is a straight port from gen7_wm_state.c; I haven't looked into
 whether we can do both.

 v2: Actually do it right.

 Signed-off-by: Kenneth Graunke kenn...@whitecape.org

 @@ -205,13 +201,39 @@ upload_ps_state(struct brw_context *brw)
 else
dw6 |= GEN7_PS_POSOFFSET_NONE;
  
 -   dw7 |=
 -      brw->wm.prog_data->first_curbe_grf << GEN7_PS_DISPATCH_START_GRF_SHIFT_0 |
 -      brw->wm.prog_data->first_curbe_grf_16 << GEN7_PS_DISPATCH_START_GRF_SHIFT_2;
 +   /* In case of non 1x per sample shading, only one of SIMD8 and SIMD16
 +    * should be enabled. We do 'SIMD16 only' dispatch if a SIMD16 shader
 +    * is successfully compiled. In majority of the cases that bring us
 +    * better performance than 'SIMD8 only' dispatch.
 +    */
 +   int min_invocations_per_fragment =
 +      _mesa_get_min_invocations_per_fragment(ctx, brw->fragment_program, false);
 +   assert(min_invocations_per_fragment >= 1);

Same _NEW_MULTISAMPLE comment.




Re: [Mesa-dev] [PATCH 04/13] Hack: Disable MCS on Broadwell for now.

2014-02-19 Thread Eric Anholt
Kenneth Graunke kenn...@whitecape.org writes:

 ---
  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 6 ++
  1 file changed, 6 insertions(+)

 I'm mostly sending this out as a placeholder.  Ultimately, we want to get
 MCS working.  I'm not sure whether it would be valuable to push this (with
 a proper commit message) in the meantime.

Seems reasonable for now.  Drop a perf_debug() in here to remind us, and
fix the _NEW_MULTISAMPLE comments in two later patches, and this series
is:

Reviewed-by: Eric Anholt e...@anholt.net




Re: [Mesa-dev] [PATCH-RFC] i965: do not advertise MESA_FORMAT_Z_UNORM16 support

2014-02-19 Thread Kenneth Graunke
On 02/18/2014 09:48 PM, Chia-I Wu wrote:
 Since 73bc6061f5c3b6a3bb7a8114bb2e1ab77d23cfdb, Z16 support is not advertised
 for OpenGL ES contexts due to the terrible performance.  It is still enabled
 for desktop GL because it was believed GL 3.0+ requires Z16.
 
 It turns out only GL 3.0 requires Z16, and that is corrected in later GL
 versions.  In light of that, and per Ian's suggestion, stop advertising Z16
 support by default, and add a drirc option, gl30_sized_format_rules, so that
 users can override.

I actually don't think that GL 3.0 requires Z16, either.

In glspec30.20080923.pdf, page 180, it says:
[...] memory allocation per texture component is assigned by the GL to
match the allocations listed in tables 3.16-3.18 as closely as possible.
[...]

Required Texture Formats
[...]
In addition, implementations are required to support the following sized
internal formats.  Requesting one of these internal formats for any
texture type will allocate exactly the internal component sizes and
types shown for that format in tables 3.16-3.17:

Notably, however, GL_DEPTH_COMPONENT16 does /not/ appear in table 3.16
or table 3.17.  It appears in table 3.18, where the "exact" rule doesn't
apply, and thus we fall back to the "closely as possible" rule.

The confusing part is that the ordering of the tables in the PDF is:

Table 3.16 (pages 182-184)
Table 3.18 (bottom of page 184 to top of 185)
Table 3.17 (page 185)

I'm guessing that people saw table 3.16, then saw the one after with
DEPTH_COMPONENT* formats, and assumed it was 3.17.  But it's not.

I think we should just drop Z16 support entirely, and I think we should
remove the requirement from the Piglit test.

 This regresses required-sized-texture-formats on GL 3.0.
 
 Signed-off-by: Chia-I Wu o...@lunarg.com
 Cc: Ian Romanick ian.d.roman...@intel.com
 ---
  src/mesa/drivers/dri/i965/brw_context.c | 3 +++
  src/mesa/drivers/dri/i965/brw_context.h | 1 +
  src/mesa/drivers/dri/i965/brw_surface_formats.c | 7 ---
  src/mesa/drivers/dri/i965/intel_screen.c| 4 
  4 files changed, 12 insertions(+), 3 deletions(-)
 
 diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
 b/src/mesa/drivers/dri/i965/brw_context.c
 index ffbdb94..8ecf80b 100644
 --- a/src/mesa/drivers/dri/i965/brw_context.c
 +++ b/src/mesa/drivers/dri/i965/brw_context.c
 @@ -553,6 +553,9 @@ brw_process_driconf_options(struct brw_context *brw)
     brw->disable_derivative_optimization =
        driQueryOptionb(&brw->optionCache, "disable_derivative_optimization");
  
 +   brw->enable_z16 =
 +      driQueryOptionb(&brw->optionCache, "gl30_sized_format_rules");
 +
     brw->precompile = driQueryOptionb(&brw->optionCache, "shader_precompile");
  
     ctx->Const.ForceGLSLExtensionsWarn =
 diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
 b/src/mesa/drivers/dri/i965/brw_context.h
 index 98e90e2..fd10884 100644
 --- a/src/mesa/drivers/dri/i965/brw_context.h
 +++ b/src/mesa/drivers/dri/i965/brw_context.h
 @@ -1093,6 +1093,7 @@ struct brw_context
 bool disable_throttling;
 bool precompile;
 bool disable_derivative_optimization;
 +   bool enable_z16;
  
 driOptionCache optionCache;
 /** @} */
 diff --git a/src/mesa/drivers/dri/i965/brw_surface_formats.c 
 b/src/mesa/drivers/dri/i965/brw_surface_formats.c
 index 6a7e00a..1d5f044 100644
 --- a/src/mesa/drivers/dri/i965/brw_surface_formats.c
 +++ b/src/mesa/drivers/dri/i965/brw_surface_formats.c
 @@ -623,10 +623,11 @@ brw_init_surface_formats(struct brw_context *brw)
  * increased depth stalls from a cacheline-based heuristic for detecting
  * depth stalls.
  *
 -* However, desktop GL 3.0+ require that you get exactly 16 bits when
 -* asking for DEPTH_COMPONENT16, so we have to respect that.
 +* However, desktop GL 3.0, and no other version, requires that you get
 +* exactly 16 bits when asking for DEPTH_COMPONENT16, so we have a drirc
 +* option to decide whether to respect that or not.
  */
 -   if (_mesa_is_desktop_gl(ctx))
 +   if (brw->enable_z16)
        ctx->TextureFormatSupported[MESA_FORMAT_Z_UNORM16] = true;
  
 /* On hardware that lacks support for ETC1, we map ETC1 to RGBX
 diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
 b/src/mesa/drivers/dri/i965/intel_screen.c
 index ba22971..087fc3c 100644
 --- a/src/mesa/drivers/dri/i965/intel_screen.c
 +++ b/src/mesa/drivers/dri/i965/intel_screen.c
 @@ -64,6 +64,10 @@ DRI_CONF_BEGIN
        DRI_CONF_OPT_BEGIN_B(disable_derivative_optimization, "false")
           DRI_CONF_DESC(en, "Derivatives with finer granularity by default")
        DRI_CONF_OPT_END
 +
 +      DRI_CONF_OPT_BEGIN_B(gl30_sized_format_rules, "false")
 +         DRI_CONF_DESC(en, "Honor GL 3.0 specific rules for sized formats")
 +      DRI_CONF_OPT_END
 DRI_CONF_SECTION_END
  
 DRI_CONF_SECTION_QUALITY
 





Re: [Mesa-dev] [PATCH 07/15] mesa/sso: Add pipeline container/state

2014-02-19 Thread Jordan Justen
On Fri, Feb 7, 2014 at 10:00 PM, Ian Romanick i...@freedesktop.org wrote:
 From: Gregory Hainaut gregory.hain...@gmail.com

 V1:
 * Extend gl_shader_state as pipeline object state
 * Add a new container gl_pipeline_shader_state that contains
binding point of the previous object
 * Update mesa init/free shader state due to the extension of
 the attribute
 * Add an init/free pipeline function for the context

 V2:
 * Rename gl_shader_state to gl_pipeline_object
 * Rename Pipeline.PipelineObj to Pipeline.Current
 * Formatting improvement

 V3 (idr):
 * Split out from previous uber patch.
 * Remove '#if 0' debug printfs.

 Reviewed-by: Ian Romanick ian.d.roman...@intel.com
 ---
  src/mesa/main/context.c |   3 +
  src/mesa/main/mtypes.h  |  22 +-
  src/mesa/main/pipelineobj.c | 161 
 
  src/mesa/main/pipelineobj.h |  25 +++
  4 files changed, 209 insertions(+), 2 deletions(-)

 diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
 index 8421a25..fe072ab 100644
 --- a/src/mesa/main/context.c
 +++ b/src/mesa/main/context.c
 @@ -106,6 +106,7 @@
  #include "matrix.h"
  #include "multisample.h"
  #include "performance_monitor.h"
 +#include "pipelineobj.h"
  #include "pixel.h"
  #include "pixelstore.h"
  #include "points.h"
 @@ -814,6 +815,7 @@ init_attrib_groups(struct gl_context *ctx)
 _mesa_init_matrix( ctx );
 _mesa_init_multisample( ctx );
 _mesa_init_performance_monitors( ctx );
 +   _mesa_init_pipeline( ctx );
 _mesa_init_pixel( ctx );
 _mesa_init_pixelstore( ctx );
 _mesa_init_point( ctx );
 @@ -1219,6 +1221,7 @@ _mesa_free_context_data( struct gl_context *ctx )
 _mesa_free_texture_data( ctx );
 _mesa_free_matrix_data( ctx );
 _mesa_free_viewport_data( ctx );
 +   _mesa_free_pipeline_data(ctx);
 _mesa_free_program_data(ctx);
 _mesa_free_shader_state(ctx);
 _mesa_free_queryobj_data(ctx);
 diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
 index 52aeb15..4b8749a 100644
 --- a/src/mesa/main/mtypes.h
 +++ b/src/mesa/main/mtypes.h
 @@ -2746,9 +2746,15 @@ struct gl_shader_program

  /**
   * Context state for GLSL vertex/fragment shaders.
 + * Extended to support pipeline object
   */
 -struct gl_shader_state
 +struct gl_pipeline_object
  {
 +   /** Name of the pipeline object as received from glGenProgramPipelines.
 +* It would be 0 for shaders without separate shader objects.
 +*/
 +   GLuint Name;
 +
 GLint RefCount;

 _glthread_Mutex Mutex;
 @@ -2774,6 +2780,17 @@ struct gl_shader_state
 GLbitfield Flags;/** Mask of GLSL_x flags */
  };

 +/**
 + * Context state for GLSL pipeline shaders.
 + */
 +struct gl_pipeline_shader_state
 +{
 +   /** Currently bound pipeline object. See _mesa_BindProgramPipeline() */
 +   struct gl_pipeline_object *Current;
 +
 +   /** Pipeline objects */
 +   struct _mesa_HashTable *Objects;
 +};

  /**
   * Compiler options for a single GLSL shaders type
 @@ -4075,7 +4092,8 @@ struct gl_context
 struct gl_geometry_program_state GeometryProgram;
 struct gl_ati_fragment_shader_state ATIFragmentShader;

 -   struct gl_shader_state Shader; /** GLSL shader object state */
 +   struct gl_pipeline_shader_state Pipeline; /** GLSL pipeline shader 
 object state */
 +   struct gl_pipeline_object Shader; /** GLSL shader object state */
 struct gl_shader_compiler_options 
 ShaderCompilerOptions[MESA_SHADER_STAGES];

 struct gl_query_state Query;  /** occlusion, timer queries */
 diff --git a/src/mesa/main/pipelineobj.c b/src/mesa/main/pipelineobj.c
 index 7454619..a82e3ed 100644
 --- a/src/mesa/main/pipelineobj.c
 +++ b/src/mesa/main/pipelineobj.c
 @@ -30,6 +30,9 @@
   * Implementation of pipeline object related API functions. Based on
   * GL_ARB_separate_shader_objects extension.
   *
 + * \todo
 + * Do we need to create CreatePipelineObject and DeletePipelineObject driver
 + * functions?
   */

I don't know. Another question .. do we need this todo comment? :)

  #include "main/glheader.h"
 @@ -50,6 +53,164 @@
  #include "../glsl/glsl_parser_extras.h"
  #include "../glsl/ir_uniform.h"

 +/**
 + * Delete a pipeline object.
 + */
 +void
 +_mesa_delete_pipeline_object(struct gl_context *ctx,
 + struct gl_pipeline_object *obj)
 +{
 +   unsigned i;
 +
 +   _mesa_reference_shader_program(ctx, &obj->_CurrentFragmentProgram, NULL);
 +
 +   for (i = 0; i < MESA_SHADER_STAGES; i++)
 +      _mesa_reference_shader_program(ctx, &obj->CurrentProgram[i], NULL);
 +
 +   _mesa_reference_shader_program(ctx, &obj->ActiveProgram, NULL);
 +   _glthread_DESTROY_MUTEX(obj->Mutex);
 +   ralloc_free(obj);
 +}
 +
 +/**
 + * Allocate and initialize a new pipeline object.
 + */
 +static struct gl_pipeline_object *
 +_mesa_new_pipeline_object(struct gl_context *ctx, GLuint name)
 +{
 +   struct gl_pipeline_object *obj = rzalloc(NULL, struct gl_pipeline_object);
 +   if (obj) {
 +      obj->Name = name;
 +  

Re: [Mesa-dev] [PATCH 6/8] i965: Implement HiZ resolves on Broadwell.

2014-02-19 Thread Kenneth Graunke
On 02/19/2014 11:12 AM, Eric Anholt wrote:
 Kenneth Graunke kenn...@whitecape.org writes:
 
 On 02/18/2014 01:38 PM, Eric Anholt wrote:
 Kenneth Graunke kenn...@whitecape.org writes:
 [snip]
 diff --git a/src/mesa/drivers/dri/i965/gen8_depth_state.c 
 b/src/mesa/drivers/dri/i965/gen8_depth_state.c
 index f30ff28..3fa20c8 100644
 --- a/src/mesa/drivers/dri/i965/gen8_depth_state.c
 +++ b/src/mesa/drivers/dri/i965/gen8_depth_state.c
 @@ -203,3 +203,108 @@ gen8_emit_depth_stencil_hiz(struct brw_context *brw,
brw-depthstencil.stencil_offset,
hiz, width, height, depth, lod, min_array_element);
  }
 +
 +/**
 + * Emit packets to perform a depth/HiZ resolve or fast depth/stencil 
 clear.
 + *
 + * See the Optimized Depth Buffer Clear and/or Stencil Buffer Clear 
 section
 + * of the hardware documentation for details.
 + */
 +void
 +gen8_hiz_exec(struct brw_context *brw, struct intel_mipmap_tree *mt,
 +  unsigned int level, unsigned int layer, enum gen6_hiz_op op)
 +{
 +   if (op == GEN6_HIZ_OP_NONE)
 +  return;
 +
 +   assert(mt-first_level == 0);
 +
 +   struct intel_mipmap_level *miplevel = mt-level[level];
 +
 +   /* The basic algorithm is:
 +* - If needed, emit 3DSTATE_{DEPTH,HIER_DEPTH,STENCIL}_BUFFER and
 +*   3DSTATE_CLEAR_PARAMS packets to set up the relevant buffers.
 +* - If needed, emit 3DSTATE_DRAWING_RECTANGLE.
 +* - Emit 3DSTATE_WM_HZ_OP with a bit set for the particular operation.
 +* - Do a special PIPE_CONTROL to trigger an implicit rectangle 
 primitive.
 +* - Emit 3DSTATE_WM_HZ_OP with no bits set to return to normal 
 rendering.
 +*/
 +   emit_depth_packets(brw, mt,
 +  brw_depth_format(brw, mt-format),
 +  BRW_SURFACE_2D,
 +  true, /* depth writes */
 +  NULL, false, 0, /* no stencil for now */
 +  true, /* hiz */
 +  mt-logical_width0,
 +  mt-logical_height0,
 +  MAX2(mt-logical_depth0, 1),

 Is logical_depth0 ever 0?  That seems like a bug.

 No, I guess it isn't.  It looks like I copy and pasted this from BLORP,
 or was being overly cautious for some reason.  Dropped.

 +  level,
 +  layer); /* min_array_element */
 +
 +   BEGIN_BATCH(4);
 +   OUT_BATCH(_3DSTATE_DRAWING_RECTANGLE << 16 | (4 - 2));
 +   OUT_BATCH(0);
 +   OUT_BATCH(((mt->logical_width0 - 1) & 0xffff) |
 +             ((mt->logical_height0 - 1) << 16));
 +   OUT_BATCH(0);
 +   ADVANCE_BATCH();

 The drawing rectangle should be using the level's size, not the level 0
 size.

 Yes, this makes sense...we bind a specific miplevel of the depth buffer,
 so presumably the (0, 0) origin is the start of that miplevel, not the
 start of the whole tree.  I'll change that.

 Since the drawing rectangle is just the bounds of where you can draw,
 and not actually the clear/resolve rectangle, I think specifying one
 that's too large shouldn't be harmful.  But specifying the right value
 is trivial, so I agree we should do it.
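
For reference, a minimal C sketch of computing a miplevel's size and packing an inclusive maximum into one dword. It assumes x in the low 16 bits and y in the high 16 bits, which is my reading of the quoted OUT_BATCH; the function names are made up for illustration and are not the driver's helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Size of a mip level: halve once per level, but never go below 1. */
unsigned
sketch_minify(unsigned size0, unsigned level)
{
   assert(level < 32);
   unsigned s = size0 >> level;
   return s > 0 ? s : 1;
}

/* Pack inclusive (xmax, ymax): x in bits 15:0, y in bits 31:16. */
uint32_t
sketch_pack_rect_max(unsigned width, unsigned height)
{
   assert(width >= 1 && height >= 1);
   return ((width - 1) & 0xffffu) | ((uint32_t)(height - 1) << 16);
}
```

For a 1024x512 tree at level 2 this gives a 256x128 rectangle instead of the level 0 bounds.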

 +   uint32_t sample_mask = 0xFFFF;
 +   if (mt->num_samples > 0) {
 +      dw1 |= SET_FIELD(ffs(mt->num_samples) - 1, GEN8_WM_HZ_NUM_SAMPLES);
 +      sample_mask = gen6_determine_sample_mask(brw);
 +   }

 I don't think we want the user-set sample mask stuff to change the
 samples affected by our hiz/depth resolves.  I think you can just drop
 the if block.

 Good point, whatever the user specified is probably unrelated to our
 values.  I've dropped the sample_mask variable and just stuffed 0xFFFF
 in the packet.

 I kept the if-block for the dw1 |= ...num_samples... line.

 +
 +   BEGIN_BATCH(5);
 +   OUT_BATCH(_3DSTATE_WM_HZ_OP << 16 | (5 - 2));
 +   OUT_BATCH(dw1);
 +   OUT_BATCH(0);
 +   OUT_BATCH(SET_FIELD(miplevel->width, GEN8_WM_HZ_CLEAR_RECTANGLE_X_MAX) |
 +             SET_FIELD(miplevel->height, GEN8_WM_HZ_CLEAR_RECTANGLE_Y_MAX));
 +   OUT_BATCH(SET_FIELD(sample_mask, GEN8_WM_HZ_SAMPLE_MASK));
 +   ADVANCE_BATCH();

 I think now the miplevel->width should be minify(mt->logical_width0,
 level).  Hope that helped

 Yes, that's much nicer - and correct for MSAA buffers!  I'm unclear
 whether we need to do:

    ALIGN(minify(mt->logical_width0,  level), 8)
    ALIGN(minify(mt->logical_height0, level), 4)

 (both here and in the drawing rectangle)

 I've read seemingly contradictory information...it sounds like it might
 be necessary for depth resolves, but not otherwise...but I could be
 misinterpreting it.  It seems to be working...
 
 Yeah, I'm not clear on how this ought to work for resolves.  For clears,
 the strategy explained for the previous gens made sense: Round down your
 coords to get 8x4 alignment, then do slow clears on the remaining strips
 (if any).  But for a resolve, what else do you do?

We only enable HiZ for miplevels that align to 8x4 boundaries.
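
A small C sketch of what "aligns to 8x4 boundaries" could mean for a level's dimensions. How the driver actually makes this decision is not shown in this thread, so treat the function names and the exact test as assumptions.

```c
#include <assert.h>
#include <stdbool.h>

/* Round value up to a power-of-two alignment. */
unsigned
sketch_align(unsigned value, unsigned alignment)
{
   return (value + alignment - 1) & ~(alignment - 1);
}

/* True if a level of this size already sits on 8x4 boundaries, i.e.
 * rounding up to 8 columns and 4 rows would not change it. */
bool
sketch_level_is_8x4_aligned(unsigned width, unsigned height)
{
   return sketch_align(width, 8) == width &&
          sketch_align(height, 4) == height;
}
```

Under this reading, a resolve on such a level never needs the extra ALIGN() calls because they would be no-ops.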

--Ken

Re: [Mesa-dev] [PATCH 1/3] glcpp: Only warn for macro names containing __

2014-02-19 Thread Kenneth Graunke
On 02/19/2014 11:09 AM, Ian Romanick wrote:
 I'm hoping that Tapani or Darius will verify that this patch actually
 fixes the problem.  That's why people CC other people on patches. :)

I have the game, and I can confirm that the lighting is awfully broken
with master, and looks correct after this patch.

Patches 1 and 2 are:
Tested-by: Kenneth Graunke kenn...@whitecape.org





[Mesa-dev] OpenCL Supported extensions for R600/SI ?

2014-02-19 Thread Dorrington, Albert
Currently clGetDeviceInfo() returns an empty string when queried for 
CL_DEVICE_EXTENSIONS.

Looking through both the Mesa and LLVM/Clang code I see references to the 
following extensions:
cl_khr_fp16
cl_khr_fp64
cl_khr_int64_base_atomics
cl_khr_int64_extended_atomics
cl_khr_gl_sharing
cl_khr_gl_event
cl_khr_d3d10_sharing
cl_khr_global_int32_base_atomics
cl_khr_global_int32_extended_atomics
cl_khr_local_int32_base_atomics
cl_khr_local_int32_extended_atomics
cl_khr_byte_addressable_store
cl_khr_3d_image_writes

So are any of these extensions supported within Mesa for the R600 or SI 
implementation?

I'm not finding the Khronos OpenCL spec to be completely clear on this, but it 
seems that extensions that are possible, even if not enabled, should be 
returned by clGetDeviceInfo()

Can anyone shed some light on this for me?

Thanks!

Al Dorrington
Software Engineer Sr
Lockheed Martin, Mission Systems and Training



Re: [Mesa-dev] [PATCH 01/13] i965: Simplify Broadwell's 3DSTATE_MULTISAMPLE sample count handling.

2014-02-19 Thread Anuj Phogat
On Wed, Feb 19, 2014 at 2:04 AM, Kenneth Graunke kenn...@whitecape.org wrote:
 These enumerations are simply log2 of the number of multisamples shifted
 by a bit, so we can calculate them using ffs() in a lot less code.
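
A standalone C sketch of the ffs() trick described here, assuming the field value is log2(num_samples) shifted left by one bit as the description says. The function names are illustrative, and ffs() only equals log2 + 1 for powers of two, which the valid sample counts are.

```c
#include <assert.h>
#include <strings.h>   /* ffs() is POSIX, not ISO C */

/* log2 of the sample count; 0 samples is treated like 1. */
unsigned
sketch_log2_samples(unsigned num_samples)
{
   assert(num_samples <= 16);
   unsigned n = num_samples > 1 ? num_samples : 1;
   return (unsigned)ffs((int)n) - 1;
}

/* The enumeration value: log2 of the sample count, shifted by one bit. */
unsigned
sketch_numsamples_field(unsigned num_samples)
{
   return sketch_log2_samples(num_samples) << 1;
}
```

This replaces a five-way switch with one expression, at the cost of silently accepting non-power-of-two counts unless they are rejected elsewhere.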

 Suggested by Eric Anholt.

 Signed-off-by: Kenneth Graunke kenn...@whitecape.org
 ---
  src/mesa/drivers/dri/i965/gen8_multisample_state.c | 26 
 +++---
  1 file changed, 3 insertions(+), 23 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/gen8_multisample_state.c 
 b/src/mesa/drivers/dri/i965/gen8_multisample_state.c
 index 64c7208..bfe0d5b 100644
 --- a/src/mesa/drivers/dri/i965/gen8_multisample_state.c
 +++ b/src/mesa/drivers/dri/i965/gen8_multisample_state.c
 @@ -33,33 +33,13 @@
  void
  gen8_emit_3dstate_multisample(struct brw_context *brw, unsigned num_samples)
  {
 -   uint32_t number_of_multisamples = 0;
 +   assert(num_samples <= 16);

 -   switch (num_samples) {
 -   case 0:
 -   case 1:
 -  number_of_multisamples = MS_NUMSAMPLES_1;
 -  break;
 -   case 2:
 -  number_of_multisamples = MS_NUMSAMPLES_2;
 -  break;
 -   case 4:
 -  number_of_multisamples = MS_NUMSAMPLES_4;
 -  break;
 -   case 8:
 -  number_of_multisamples = MS_NUMSAMPLES_8;
 -  break;
 -   case 16:
 -  number_of_multisamples = MS_NUMSAMPLES_16;
 -  break;
 -   default:
 -      assert(!"Unrecognized num_samples in gen8_emit_3dstate_multisample");
 -  break;
 -   }
 +   unsigned log2_samples = ffs(MAX2(num_samples, 1)) - 1;

 BEGIN_BATCH(2);
     OUT_BATCH(GEN8_3DSTATE_MULTISAMPLE << 16 | (2 - 2));
 -   OUT_BATCH(MS_PIXEL_LOCATION_CENTER | number_of_multisamples);
 +   OUT_BATCH(MS_PIXEL_LOCATION_CENTER | log2_samples << 1);
 ADVANCE_BATCH();
  }

 --
 1.8.4.2


This series is:
Reviewed-by: Anuj Phogat anuj.pho...@gmail.com


Re: [Mesa-dev] [PATCH 09/13] i965: Only use the SIMD16 program for per-sample shading on Broadwell.

2014-02-19 Thread Anuj Phogat
On Wed, Feb 19, 2014 at 2:04 AM, Kenneth Graunke kenn...@whitecape.org wrote:
 This is a straight port from gen7_wm_state.c; I haven't looked into
 whether we can do both.

Verified that restriction still holds true in BDW.
See 3D Pipeline Stages > Pixel > Pixel Shader Thread Generation >
Pixel Grouping (Dispatch Size) Control.
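
The restriction can be summarized in a few lines of C. This is a sketch of the decision the patch makes, with a hypothetical struct and function name rather than the driver's dw6/dw7 bit twiddling.

```c
#include <assert.h>
#include <stdbool.h>

struct sketch_dispatch {
   bool simd8;
   bool simd16;
};

/* With per-sample shading (more than one invocation per fragment) only
 * one dispatch width may be enabled; SIMD16 is preferred if it exists. */
struct sketch_dispatch
sketch_choose_dispatch(bool have_simd16_program,
                       int min_invocations_per_fragment)
{
   assert(min_invocations_per_fragment >= 1);

   struct sketch_dispatch d = { false, false };
   if (have_simd16_program) {
      d.simd16 = true;
      /* Both widths together are only allowed for per-pixel shading. */
      d.simd8 = (min_invocations_per_fragment == 1);
   } else {
      d.simd8 = true;
   }
   return d;
}
```

In the SIMD16-only case the patch also has to point the kernel start pointer and GRF start slot 0 at the SIMD16 program, which this sketch leaves out.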
 v2: Actually do it right.

 Signed-off-by: Kenneth Graunke kenn...@whitecape.org
 ---
  src/mesa/drivers/dri/i965/gen8_ps_state.c | 38 
 ---
  1 file changed, 30 insertions(+), 8 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/gen8_ps_state.c 
 b/src/mesa/drivers/dri/i965/gen8_ps_state.c
 index 57bf053..a834b85 100644
 --- a/src/mesa/drivers/dri/i965/gen8_ps_state.c
 +++ b/src/mesa/drivers/dri/i965/gen8_ps_state.c
 @@ -183,10 +183,6 @@ upload_ps_state(struct brw_context *brw)
     if (brw->wm.prog_data->nr_params > 0)
        dw6 |= GEN7_PS_PUSH_CONSTANT_ENABLE;

 -   dw6 |= GEN7_PS_8_DISPATCH_ENABLE;
 -   if (brw->wm.prog_data->prog_offset_16)
 -      dw6 |= GEN7_PS_16_DISPATCH_ENABLE;
 -
 /* From the documentation for this packet:
  * If the PS kernel does not need the Position XY Offsets to
  *  compute a Position Value, then this field should be programmed
 @@ -205,13 +201,39 @@ upload_ps_state(struct brw_context *brw)
 else
dw6 |= GEN7_PS_POSOFFSET_NONE;

 -   dw7 |=
 -      brw->wm.prog_data->first_curbe_grf << GEN7_PS_DISPATCH_START_GRF_SHIFT_0 |
 -      brw->wm.prog_data->first_curbe_grf_16 << GEN7_PS_DISPATCH_START_GRF_SHIFT_2;
 +   /* In case of non 1x per sample shading, only one of SIMD8 and SIMD16
 +    * should be enabled. We do 'SIMD16 only' dispatch if a SIMD16 shader
 +    * is successfully compiled. In majority of the cases that bring us
 +    * better performance than 'SIMD8 only' dispatch.
 +    */
 +   int min_invocations_per_fragment =
 +      _mesa_get_min_invocations_per_fragment(ctx, brw->fragment_program, false);
 +   assert(min_invocations_per_fragment >= 1);
 +
 +   if (brw->wm.prog_data->prog_offset_16) {
 +      dw6 |= GEN7_PS_16_DISPATCH_ENABLE;
 +      if (min_invocations_per_fragment == 1) {
 +         dw6 |= GEN7_PS_8_DISPATCH_ENABLE;
 +         dw7 |= (brw->wm.prog_data->first_curbe_grf <<
 +                 GEN7_PS_DISPATCH_START_GRF_SHIFT_0);
 +         dw7 |= (brw->wm.prog_data->first_curbe_grf_16 <<
 +                 GEN7_PS_DISPATCH_START_GRF_SHIFT_2);
 +      } else {
 +         dw7 |= (brw->wm.prog_data->first_curbe_grf_16 <<
 +                 GEN7_PS_DISPATCH_START_GRF_SHIFT_0);
 +      }
 +   } else {
 +      dw6 |= GEN7_PS_8_DISPATCH_ENABLE;
 +      dw7 |= (brw->wm.prog_data->first_curbe_grf <<
 +              GEN7_PS_DISPATCH_START_GRF_SHIFT_0);
 +   }

     BEGIN_BATCH(12);
     OUT_BATCH(_3DSTATE_PS << 16 | (12 - 2));
 -   OUT_BATCH(brw->wm.base.prog_offset);
 +   if (brw->wm.prog_data->prog_offset_16 && min_invocations_per_fragment > 1)
 +      OUT_BATCH(brw->wm.base.prog_offset + brw->wm.prog_data->prog_offset_16);
 +   else
 +      OUT_BATCH(brw->wm.base.prog_offset);
     OUT_BATCH(0);
     OUT_BATCH(dw3);
     if (brw->wm.prog_data->total_scratch) {
 --
 1.8.4.2



Re: [Mesa-dev] [PATCH 14/15] mesa/sso: Implement _mesa_ActiveShaderProgram

2014-02-19 Thread Jordan Justen
On Fri, Feb 7, 2014 at 10:00 PM, Ian Romanick i...@freedesktop.org wrote:
 From: Gregory Hainaut gregory.hain...@gmail.com

 This was originally included in another patch, but it was split out by
 Ian Romanick.

 Reviewed-by: Ian Romanick ian.d.roman...@intel.com
 ---
  src/mesa/main/pipelineobj.c | 24 
  1 file changed, 24 insertions(+)

 diff --git a/src/mesa/main/pipelineobj.c b/src/mesa/main/pipelineobj.c
 index b47dc7a..6e490bd 100644
 --- a/src/mesa/main/pipelineobj.c
 +++ b/src/mesa/main/pipelineobj.c
 @@ -227,6 +227,30 @@ _mesa_UseProgramStages(GLuint pipeline, GLbitfield 
 stages, GLuint program)
  void GLAPIENTRY
  _mesa_ActiveShaderProgram(GLuint pipeline, GLuint program)
  {
 +   GET_CURRENT_CONTEXT(ctx);
 +   struct gl_shader_program *shProg = (program != 0)
 +      ? _mesa_lookup_shader_program_err(ctx, program,
 +                                        "glActiveShaderProgram(program)")
 +      : NULL;

Seems like if/else would be more clear for this part.

If _mesa_lookup_shader_program_err returns NULL, should we exit early?
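
To show what I have in mind, here is a self-contained C sketch of the if/else form with an early exit. The types and the lookup function are stand-ins I made up so the example compiles on its own; they are not Mesa's, and the pipeline lookup from the patch is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_program {
   unsigned name;
   bool linked;
};

/* Stand-in for _mesa_lookup_shader_program_err(): returns NULL for an
 * unknown name (the real function would also record a GL error). */
struct sketch_program *
sketch_lookup_program(struct sketch_program *table, size_t count,
                      unsigned name)
{
   for (size_t i = 0; i < count; i++)
      if (table[i].name == name)
         return &table[i];
   return NULL;
}

/* Returns true if *active was updated. */
bool
sketch_active_shader_program(struct sketch_program *table, size_t count,
                             unsigned program,
                             struct sketch_program **active)
{
   struct sketch_program *prog = NULL;

   if (program != 0) {
      prog = sketch_lookup_program(table, count, program);
      if (prog == NULL)
         return false;   /* lookup failed: exit early */
      if (!prog->linked)
         return false;   /* not linked: leave the binding alone */
   }

   *active = prog;       /* program == 0 unbinds */
   return true;
}
```

Whether the early exit is the right behavior is exactly the open question above; the sketch only shows that it keeps a failed lookup from silently unbinding the active program.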

-Jordan

 +   struct gl_pipeline_object *pipe = lookup_pipeline_object(ctx, pipeline);
 +
 +   if (!pipe) {
 +      _mesa_error(ctx, GL_INVALID_OPERATION, "glActiveShaderProgram(pipeline)");
 +      return;
 +   }
 +
 +   /* Object is created by any Pipeline call but glGenProgramPipelines,
 +    * glIsProgramPipeline and GetProgramPipelineInfoLog
 +    */
 +   pipe->EverBound = GL_TRUE;
 +
 +   if ((shProg != NULL) && !shProg->LinkStatus) {
 +      _mesa_error(ctx, GL_INVALID_OPERATION,
 +            "glActiveShaderProgram(program %u not linked)", shProg->Name);
 +      return;
 +   }
 +
 +   _mesa_reference_shader_program(ctx, &pipe->ActiveProgram, shProg);
  }

  /**
 --
 1.8.1.4



Re: [Mesa-dev] [PATCH 00/15] The first half of GL_ARB_separate_shader_objects

2014-02-19 Thread Jordan Justen
I replied to 7 & 14. Series:
Reviewed-by: Jordan Justen jordan.l.jus...@intel.com

On Fri, Feb 7, 2014 at 10:00 PM, Ian Romanick i...@freedesktop.org wrote:
 I'm taking a patch from Paul's notebook, and I'm going to try to land a
 giant patch series in a small number of more manageable chunks.
 GL_ARB_separate_shader_objects has been work-in-progress for about 10
 months.  This represents about half of the final patch series.  The next
 block of patches will be about half of the remaining bits, and the third
 patch series should be the rest.

 The current state of things is also in the sso5 branch of
 git://people.freedesktop.org/~idr/mesa.  There are some smurf commits at
 the end, and there's still some work to be done, obviously.

 This is the easy half.  This series adds:

  - Extension tracking

  - Parser and compiler front-end support for the layout qualifiers added
by the extension.

  - Plumbing for shader pipeline objects.

  - The bulk of the API.

 I don't think there should be anything controversial here... that's all
 in the next batch.  All of this has been pretty well tested by piglit,
 and at least one ISV has been playing with it too.



Re: [Mesa-dev] [PATCH-RFC] i965: do not advertise MESA_FORMAT_Z_UNORM16 support

2014-02-19 Thread Eric Anholt
Kenneth Graunke kenn...@whitecape.org writes:

 On 02/18/2014 09:48 PM, Chia-I Wu wrote:
 Since 73bc6061f5c3b6a3bb7a8114bb2e1ab77d23cfdb, Z16 support is not advertised
 for OpenGL ES contexts due to the terrible performance.  It is still enabled
 for desktop GL because it was believed GL 3.0+ requires Z16.
 
 It turns out only GL 3.0 requires Z16, and that is corrected in later GL
 versions.  In light of that, and per Ian's suggestion, stop advertising Z16
 support by default, and add a drirc option, gl30_sized_format_rules, so that
 users can override.

 I actually don't think that GL 3.0 requires Z16, either.

 In glspec30.20080923.pdf, page 180, it says:
 [...] memory allocation per texture component is assigned by the GL to
 match the allocations listed in tables 3.16-3.18 as closely as possible.
 [...]

 Required Texture Formats
 [...]
 In addition, implementations are required to support the following sized
 internal formats.  Requesting one of these internal formats for any
 texture type will allocate exactly the internal component sizes and
 types shown for that format in tables 3.16-3.17:

 Notably, however, GL_DEPTH_COMPONENT16 does /not/ appear in table 3.16
 or table 3.17.  It appears in table 3.18, where the "exact" rule doesn't
 apply, and thus we fall back to the "closely as possible" rule.

 The confusing part is that the ordering of the tables in the PDF is:

 Table 3.16 (pages 182-184)
 Table 3.18 (bottom of page 184 to top of 185)
 Table 3.17 (page 185)

ffs.

Yeah, let's just drop Z16 from the driver and the sized depth stuff from
the 3.0 piglit test.




Re: [Mesa-dev] OpenCL Supported extensions for R600/SI ?

2014-02-19 Thread Tom Stellard
On Wed, Feb 19, 2014 at 09:20:22PM +, Dorrington, Albert wrote:
 Currently clGetDeviceInfo() returns an empty string when queried for 
 CL_DEVICE_EXTENSIONS.
 
 Looking through both the Mesa and LLVM/Clang code I see references to the 
 following extensions:
 cl_khr_fp64
 cl_khr_int64_base_atomics
 cl_khr_int64_extended_atomics
 cl_khr_gl_sharing
 cl_khr_gl_event
 cl_khr_d3d10_sharing
 cl_khr_global_int32_base_atomics
 cl_khr_global_int32_extended_atomics


 cl_khr_local_int32_base_atomics
 cl_khr_local_int32_extended_atomics

These two are partially supported.

 cl_khr_byte_addressable_store

This one is supported, but the clGetDeviceInfo implementation
has not been updated to reflect this.

 cl_khr_3d_image_writes
 
 So are any of these extensions supported within Mesa for the R600 or SI 
 implementation?
 
 I'm not finding the Khronos OpenCL spec to be completely clear on this, but 
 it seems that extensions that are possible, even if not enabled, should be 
 returned by clGetDeviceInfo()
 
 Can anyone shed some light on this for me?


The clGetDeviceInfo() implementation in clover is incomplete.  There are a lot
of values that are hard-coded which need to be replaced with driver queries.
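
Until clover reports its extensions, an application that checks the
CL_DEVICE_EXTENSIONS string will not see any of them.  For reference, a
minimal sketch of such a check in C, assuming ext_list is the
space-separated string returned by clGetDeviceInfo().  The helper name
has_extension is hypothetical and is not part of the OpenCL API or clover:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns true if `name` appears as a whole,
 * space-delimited token in `ext_list`. */
static bool
has_extension(const char *ext_list, const char *name)
{
   size_t len = strlen(name);
   const char *p = ext_list;

   if (len == 0)
      return false;

   while ((p = strstr(p, name)) != NULL) {
      /* Accept only whole-token matches, not prefixes or suffixes. */
      bool starts = (p == ext_list) || (p[-1] == ' ');
      bool ends = (p[len] == '\0') || (p[len] == ' ');

      if (starts && ends)
         return true;
      p += len;
   }
   return false;
}
```

With the empty string that clover currently returns, every such query
fails, including the one for cl_khr_byte_addressable_store.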

-Tom
 
 Thanks!
 
 Al Dorrington
 Software Engineer Sr
 Lockheed Martin, Mission Systems and Training
 




Re: [Mesa-dev] [PATCH] clover: Don't call pipe_loader_release() when deleting a device

2014-02-19 Thread Tom Stellard
On Tue, Feb 18, 2014 at 05:50:19PM +0100, Francisco Jerez wrote:
 Tom Stellard t...@stellard.net writes:
 
  From: Tom Stellard thomas.stell...@amd.com
 
  After pipe_loader_release() is called, if any of the pipe_* objects
  try to call into the gallium API the application will segfault.
 
  The only time devices are deleted is when the global _clover_platform
  object is deleted by the static destructor.  However, since application
  objects that are deleted by the static destructor *after*
  _clover_platform might try to make CL API calls from their destructors,
  it is never safe to call pipe_loader_release().
 
 Please have a look at the clover-internal-ref-counting branch [1] of my
 mesa tree, it should fix a number of memory management-related bugs,
 possibly the one you've encountered too, without the negative side
 effects of dropping the call to pipe_loader_release().
 

I came across one regression, but I'm still looking into whether or not it is a
bug in clover or an application bug.

-Tom

 Thanks.
 
 [1] 
 http://cgit.freedesktop.org/~currojerez/mesa/log/?h=clover-internal-ref-counting
 
  ---
   src/gallium/state_trackers/clover/core/device.cpp | 2 --
   1 file changed, 2 deletions(-)
 
  diff --git a/src/gallium/state_trackers/clover/core/device.cpp 
  b/src/gallium/state_trackers/clover/core/device.cpp
  index 76a49d0..2290366 100644
  --- a/src/gallium/state_trackers/clover/core/device.cpp
  +++ b/src/gallium/state_trackers/clover/core/device.cpp
  @@ -48,8 +48,6 @@ device::device(clover::platform platform, 
  pipe_loader_device *ldev) :
   device::~device() {
  if (pipe)
 pipe->destroy(pipe);
  -   if (ldev)
  -  pipe_loader_release(ldev, 1);
   }
   
   bool
  -- 
  1.8.1.4





Re: [Mesa-dev] [PATCH-RFC] i965: do not advertise MESA_FORMAT_Z_UNORM16 support

2014-02-19 Thread Kenneth Graunke
On 02/19/2014 02:27 PM, Ian Romanick wrote:
 On 02/19/2014 12:08 PM, Kenneth Graunke wrote:
 On 02/18/2014 09:48 PM, Chia-I Wu wrote:
 Since 73bc6061f5c3b6a3bb7a8114bb2e1ab77d23cfdb, Z16 support is
 not advertised for OpenGL ES contexts due to the terrible
 performance.  It is still enabled for desktop GL because it was
 believed GL 3.0+ requires Z16.

 It turns out only GL 3.0 requires Z16, and that is corrected in
 later GL versions.  In light of that, and per Ian's suggestion,
 stop advertising Z16 support by default, and add a drirc option,
 gl30_sized_format_rules, so that users can override.
 
 I actually don't think that GL 3.0 requires Z16, either.
 
 In glspec30.20080923.pdf, page 180, it says: [...] memory
 allocation per texture component is assigned by the GL to match the
 allocations listed in tables 3.16-3.18 as closely as possible. 
 [...]
 
 Required Texture Formats [...] In addition, implementations are
 required to support the following sized internal formats.
 Requesting one of these internal formats for any texture type will
 allocate exactly the internal component sizes and types shown for
 that format in tables 3.16-3.17:
 
 Notably, however, GL_DEPTH_COMPONENT16 does /not/ appear in table
 3.16 or table 3.17.  It appears in table 3.18, where the exact
 rule doesn't apply, and thus we fall back to the closely as
 possible rule.
 
 The confusing part is that the ordering of the tables in the PDF
 is:
 
 Table 3.16 (pages 182-184) Table 3.18 (bottom of page 184 to top of
 185) Table 3.17 (page 185)
 
 I'm guessing that people saw table 3.16, then saw the one after
 with DEPTH_COMPONENT* formats, and assumed it was 3.17.  But it's
 not.
 
 Yay latex!  Thank you for putting things in random order because it
 fit better. :(
 
 I think we should just drop Z16 support entirely, and I think we
 should remove the requirement from the Piglit test.
 
 If the test is wrong, and it sounds like it is, then I'm definitely in
 favor of changing it.
 
 The reason to have Z16 is low-bandwidth GPUs in resource constrained
 environments.  If an app specifically asks for Z16, then there's a
 non-zero (though possibly infinitesimal) probability they're doing it
 for a reason.  For at least some platforms, isn't there just a
 work-around to implement to fix the performance issue?  Doesn't the
 performance issue only affect some platforms to begin with?
 
 Maybe just change the check to
 
ctx->TextureFormatSupported[MESA_FORMAT_Z_UNORM16] =
   ! platform has z16 performance issues;

Currently, all platforms have Z16 performance issues.  On Haswell and
later, we could potentially implement the PMA stall optimization, which
I believe would reduce(?) the problem.  I'm not sure if it would
eliminate it though.

I think the best course of action is:
1. Fix the Piglit test to not require precise depth formats.
2. Disable Z16 on all generations.
3. Add a to do item for implementing the HSW+ PMA stall optimization.
4. Add a to do item for re-evaluating Z16 on HSW+ once that's done.
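
To make the plan concrete, here is a rough sketch in C of the check Ian
suggested.  Everything in it is hypothetical: struct platform_info and both
function names are made up for illustration and are not the i965 driver's
real data structures.  It also assumes that re-evaluation after the PMA
stall optimization (step 4) ends with a clear yes/no answer, which is not
known yet:

```c
#include <stdbool.h>

/* Hypothetical platform description, for illustration only. */
struct platform_info {
   bool is_hsw_or_later;         /* Haswell or a later generation */
   bool z16_verified_after_pma;  /* step 4: re-evaluation found Z16 acceptable */
};

static bool
platform_has_z16_performance_issues(const struct platform_info *p)
{
   /* Only HSW+ can be exempted, and only once re-evaluation passed. */
   if (p->is_hsw_or_later && p->z16_verified_after_pma)
      return false;

   /* Currently every generation is treated as having the issue. */
   return true;
}

/* Value that would be stored in TextureFormatSupported[MESA_FORMAT_Z_UNORM16]. */
static bool
advertise_z16(const struct platform_info *p)
{
   return !platform_has_z16_performance_issues(p);
}
```

As things stand today the second flag would be false everywhere, so this
reduces to step 2: Z16 is disabled on all generations.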

--Ken





Re: [Mesa-dev] [PATCH 1/3] glsl: Compile error if fs defines conflicting qualifiers for gl_FragCoord

2014-02-19 Thread Anuj Phogat
On Tue, Feb 18, 2014 at 5:28 PM, Ian Romanick i...@freedesktop.org wrote:
 On 02/18/2014 03:36 PM, Anuj Phogat wrote:
 On Tue, Feb 18, 2014 at 11:01 AM, Ian Romanick i...@freedesktop.org wrote:
 On 02/10/2014 05:29 PM, Anuj Phogat wrote:
 GLSL 1.50 spec says:
If gl_FragCoord is redeclared in any fragment shader in a program,
 it must be redeclared in all the fragment shaders in that
 program that have a static use gl_FragCoord. All redeclarations of
 gl_FragCoord in all fragment shaders in a single program must
 have the same set of qualifiers.

 This patch makes the GLSL compiler generate an error if a fragment
 shader redeclares gl_FragCoord with conflicting layout qualifiers.
 For example:

 layout(origin_upper_left, pixel_center_integer) in vec4 gl_FragCoord;
 layout(pixel_center_integer) in vec4 gl_FragCoord;
 void main()
 {
gl_FragColor = gl_FragCoord.xyzz;
 }

 Cc: mesa-sta...@lists.freedesktop.org
 Signed-off-by: Anuj Phogat anuj.pho...@gmail.com
 ---
  src/glsl/ast_to_hir.cpp | 39 
 +++
  src/glsl/glsl_parser_extras.cpp |  3 +++
  src/glsl/glsl_parser_extras.h   | 10 ++
  3 files changed, 52 insertions(+)

 diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
 index c89a26b..7d7d89b 100644
 --- a/src/glsl/ast_to_hir.cpp
 +++ b/src/glsl/ast_to_hir.cpp
 @@ -2374,6 +2374,45 @@ apply_type_qualifier_to_variable(const struct 
 ast_type_qualifier *qual,
  qual_string);
 }

 +   /* Make sure all gl_FragCoord redeclarations specify the same layout
 +* qualifier type.
 +*/
 +   bool conflicting_pixel_center_integer =
 +  state->fs_pixel_center_integer &&
 +  !qual->flags.q.pixel_center_integer;
 +
 +   bool conflicting_origin_upper_left =
 +  state->fs_origin_upper_left &&
 +  !qual->flags.q.origin_upper_left;

 I don't think this catches all the cases.  What about

 layout(origin_upper_left  ) in vec4 gl_FragCoord;
 layout(origin_upper_left, pixel_center_integer) in vec4 gl_FragCoord;

 Nice catch. I'll update my patch to include this case. What do you think
 about following two cases?
 case 1:
 in vec4 gl_FragCoord;
 layout(origin_upper_left, pixel_center_integer) in vec4 gl_FragCoord;

 AMD produces no compilation error. This patch matches the behavior.

 case 2:
 layout(origin_upper_left, pixel_center_integer) in vec4 gl_FragCoord;
 in vec4 gl_FragCoord;

 AMD produces compilation error. This patch matches the behavior.

 I don't think that's right.  I think they should both produce an error.
 The spec says, "All redeclarations of gl_FragCoord in all fragment
 shaders in a single program must have the same set of qualifiers."  I
 can't see any reason to give an error for case 2 but not for case 1.

I agree.
 We should also check NVIDIA.

The NVIDIA driver produces a compilation error in both cases. I'll add a
check to handle them correctly in Mesa.
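
A minimal sketch in C of the stricter rule discussed above, where every
redeclaration must carry exactly the same qualifier set as the first one.
The struct and function names are hypothetical and do not correspond to the
actual code in ast_to_hir.cpp; a plain "in vec4 gl_FragCoord;" is modelled
as both flags being false:

```c
#include <stdbool.h>

/* Layout qualifiers that a gl_FragCoord redeclaration can carry. */
struct fragcoord_layout {
   bool origin_upper_left;
   bool pixel_center_integer;
};

/* Hypothetical per-shader state: the qualifiers of the first redeclaration. */
struct fragcoord_state {
   bool redeclared;
   struct fragcoord_layout first;
};

/* Returns true if the redeclaration is consistent with earlier ones. */
static bool
check_fragcoord_redeclaration(struct fragcoord_state *st,
                              struct fragcoord_layout q)
{
   if (!st->redeclared) {
      /* First redeclaration: remember its qualifier set. */
      st->redeclared = true;
      st->first = q;
      return true;
   }

   /* Later redeclarations must match exactly, in both directions. */
   return st->first.origin_upper_left == q.origin_upper_left &&
          st->first.pixel_center_integer == q.pixel_center_integer;
}
```

Comparing for equality, rather than checking only for qualifiers that went
missing, is what makes case 1 and case 2 behave the same way.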


[Mesa-dev] [Bug 71870] Metro: Last Light rendering issues

2014-02-19 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=71870

Ian Romanick i...@freedesktop.org changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #45 from Ian Romanick i...@freedesktop.org ---
The __ issue should be fixed by the following commits on Mesa master.  These
are scheduled for inclusion in 10.1 and 10.0.4.

commit 2c85fd5a964a78c9f7a93994fb79f1723c6f45b5
Author: Ian Romanick ian.d.roman...@intel.com
Date:   Tue Feb 18 09:36:08 2014 -0800

glsl: Only warn for macro names containing __

From page 14 (page 20 of the PDF) of the GLSL 1.10 spec:

In addition, all identifiers containing two consecutive underscores
 (__) are reserved as possible future keywords.

The intention is that names containing __ are reserved for internal use
by the implementation, and names prefixed with GL_ are reserved for use
by Khronos.  Names simply containing __ are dangerous to use, but should
be allowed.

Per the Khronos bug mentioned below, a future version of the GLSL
specification will clarify this.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
Cc: 9.2 10.0 10.1 mesa-sta...@lists.freedesktop.org
Reviewed-by: Kenneth Graunke kenn...@whitecape.org
Tested-by: Kenneth Graunke kenn...@whitecape.org
Reviewed-by: Anuj Phogat anuj.pho...@gmail.com
Tested-by: Darius Spitznagel d.spitzna...@goodbytez.de
Cc: Tapani Pälli lem...@gmail.com
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71870
Bugzilla: Khronos #11702

commit 0bd78926304e72ef3566e977d0cb5a959d86b809
Author: Ian Romanick ian.d.roman...@intel.com
Date:   Tue Feb 18 09:10:36 2014 -0800

glcpp: Only warn for macro names containing __

Section 3.3 (Preprocessor) of the GLSL 1.30 spec (and later) and the
GLSL ES spec (all versions) say:

All macro names containing two consecutive underscores ( __ ) are
reserved for future use as predefined macro names. All macro names
prefixed with GL_ (GL followed by a single underscore) are also
reserved.

The intention is that names containing __ are reserved for internal use
by the implementation, and names prefixed with GL_ are reserved for use
by Khronos.  Since every extension adds a name prefixed with GL_ (i.e.,
the name of the extension), that should be an error.  Names simply
containing __ are dangerous to use, but should be allowed.  In similar
cases, the C++ preprocessor specification says, no diagnostic is
required.

Per the Khronos bug mentioned below, a future version of the GLSL
specification will clarify this.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
Cc: 9.2 10.0 10.1 mesa-sta...@lists.freedesktop.org
Reviewed-by: Kenneth Graunke kenn...@whitecape.org
Tested-by: Kenneth Graunke kenn...@whitecape.org
Reviewed-by: Anuj Phogat anuj.pho...@gmail.com
Tested-by: Darius Spitznagel d.spitzna...@goodbytez.de
Cc: Tapani Pälli lem...@gmail.com
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71870
Bugzilla: Khronos #11702
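
For illustration only, the macro-name rule described in the glcpp commit
message can be sketched in C as follows.  The enum and function names are
hypothetical and are not taken from the glcpp sources:

```c
#include <string.h>

enum macro_name_diag {
   MACRO_NAME_OK,
   MACRO_NAME_WARNING,  /* contains "__": dangerous, but allowed */
   MACRO_NAME_ERROR     /* prefixed with "GL_": reserved for Khronos */
};

/* Hypothetical classifier for a macro name seen in a #define. */
static enum macro_name_diag
classify_macro_name(const char *name)
{
   if (strncmp(name, "GL_", 3) == 0)
      return MACRO_NAME_ERROR;

   if (strstr(name, "__") != NULL)
      return MACRO_NAME_WARNING;

   return MACRO_NAME_OK;
}
```

The GL_ test comes first in this sketch so that a name matching both rules
is still reported as an error.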



Re: [Mesa-dev] [PATCH-RFC] i965: do not advertise MESA_FORMAT_Z_UNORM16 support

2014-02-19 Thread Ian Romanick
On 02/19/2014 03:03 PM, Kenneth Graunke wrote:
 Currently, all platforms have Z16 performance issues.  On Haswell and
 later, we could potentially implement the PMA stall optimization, which
 I believe would reduce(?) the problem.  I'm not sure if it would
 eliminate it though.
 
 I think the best course of action is:
 1. Fix the Piglit test to not require precise depth formats.
 2. Disable Z16 on all generations.
 3. Add a to do item for implementing the HSW+ PMA stall optimization.
 4. Add a to do item for re-evaluating Z16 on HSW+ once that's done.

I didn't realize all platforms had Z16 issues.  I thought it was just
HiZ platforms (ILK+).  Sounds like a good plan to me.

 --Ken






Re: [Mesa-dev] [PATCH 09/13] i965: Only use the SIMD16 program for per-sample shading on Broadwell.

2014-02-19 Thread Kenneth Graunke
On 02/19/2014 01:32 PM, Anuj Phogat wrote:
 On Wed, Feb 19, 2014 at 2:04 AM, Kenneth Graunke kenn...@whitecape.org 
 wrote:
 This is a straight port from gen7_wm_state.c; I haven't looked into
 whether we can do both.

 Verified that the restriction still holds on BDW.
 See 3D Pipeline Stages > Pixel > Pixel Shader Thread Generation >
 Pixel Grouping (Dispatch Size) Control.

Thanks for looking this up!






[Mesa-dev] [PATCH 10/12] geom-outlining-150: Use a vbo.

2014-02-19 Thread Fabian Bieler
Use a vbo for vertex data instead of client-side arrays.
Also bind a vertex array object.

This is necessary for the switch to a core profile context.

Signed-off-by: Fabian Bieler fabianbie...@fastmail.fm
Reviewed-by: Brian Paul bri...@vmware.com
Reviewed-by: Ian Romanick ian.d.roman...@intel.com
---
 src/glsl/geom-outlining-150.c | 25 ++---
 1 file changed, 18 insertions(+), 7 deletions(-)

diff --git a/src/glsl/geom-outlining-150.c b/src/glsl/geom-outlining-150.c
index 5c2b3c9..0bc20f0 100644
--- a/src/glsl/geom-outlining-150.c
+++ b/src/glsl/geom-outlining-150.c
@@ -23,6 +23,7 @@
 static GLint WinWidth = 500, WinHeight = 500;
 static GLint Win = 0;
 static GLuint VertShader, GeomShader, FragShader, Program;
+static GLuint vao, vbo;
 static GLboolean Anim = GL_TRUE;
 static int uViewportSize = -1, uModelViewProj = -1, uColor = -1;
 
@@ -112,11 +113,6 @@ mat_multiply(GLfloat product[16], const GLfloat a[16], 
const GLfloat b[16])
 static void
 Redisplay(void)
 {
-   static const GLfloat verts[3][2] = {
-  { -1, -1 },
-  {  1, -1 },
-  {  0,  1 }
-   };
GLfloat rot[4][4];
GLfloat trans[16], mvp[16];
 
@@ -131,8 +127,6 @@ Redisplay(void)
glUniformMatrix4fv(uModelViewProj, 1, GL_FALSE, (float *) mvp);
 
/* Draw */
-   glVertexAttribPointer(0, 2, GL_FLOAT, GL_FALSE, 0, verts);
-   glEnableVertexAttribArray(0);
glDrawArrays(GL_TRIANGLES, 0, 3);
 
glutSwapBuffers();
@@ -217,6 +211,8 @@ CleanUp(void)
glDeleteShader(VertShader);
glDeleteShader(GeomShader);
glDeleteProgram(Program);
+   glDeleteVertexArrays(1, &vao);
+   glDeleteBuffers(1, &vbo);
glutDestroyWindow(Win);
 }
 
@@ -304,6 +300,11 @@ Init(void)
  float m = min(d0, min(d1, d2)); \n
  FragColor = Color * smoothstep(0.0, LINE_WIDTH, m); \n
   } \n;
+   static const GLfloat verts[3][2] = {
+  { -1, -1 },
+  {  1, -1 },
+  {  0,  1 }
+   };
 
if (!ShadersSupported())
   exit(1);
@@ -351,6 +352,16 @@ Init(void)
 
glUniform4fv(uColor, 1, Orange);
 
+   glGenBuffers(1, &vbo);
+   glBindBuffer(GL_ARRAY_BUFFER, vbo);
+   glBufferData(GL_ARRAY_BUFFER, sizeof(verts), verts, GL_STATIC_DRAW);
+
+   glGenVertexArrays(1, &vao);
+   glBindVertexArray(vao);
+
+   glVertexAttribPointer(0, 2, GL_FLOAT, GL_FALSE, 0, NULL);
+   glEnableVertexAttribArray(0);
+
glClearColor(0.3f, 0.3f, 0.3f, 0.0f);
glEnable(GL_DEPTH_TEST);
 
-- 
1.8.3.2



[Mesa-dev] [PATCH 04/12] glsl/gsraytrace: Don't create new Buffer objects everytime the window is resized.

2014-02-19 Thread Fabian Bieler
Signed-off-by: Fabian Bieler fabianbie...@fastmail.fm
Reviewed-by: Brian Paul bri...@vmware.com
Reviewed-by: Ian Romanick ian.d.roman...@intel.com
---
 src/glsl/gsraytrace.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/glsl/gsraytrace.cpp b/src/glsl/gsraytrace.cpp
index c21c667..f156fdc 100644
--- a/src/glsl/gsraytrace.cpp
+++ b/src/glsl/gsraytrace.cpp
@@ -776,7 +776,6 @@ Reshape(int width, int height)
 
{
   size_t nElem = WinWidth*WinHeight*nRayGens;
-  glGenBuffers(1, &dst);
   glBindBuffer(GL_TRANSFORM_FEEDBACK_BUFFER_NV, dst);
   glBufferData(GL_TRANSFORM_FEEDBACK_BUFFER_NV, nElem*sizeof(GSRay), 0, 
GL_STREAM_DRAW);
   GSRay* d = (GSRay*)glMapBuffer(GL_TRANSFORM_FEEDBACK_BUFFER_NV, 
GL_READ_WRITE);
@@ -790,7 +789,6 @@ Reshape(int width, int height)
}
 
{
-  glGenBuffers(1, &eyeRaysAsPoints);
   glBindBuffer(GL_ARRAY_BUFFER, eyeRaysAsPoints);
   glBufferData(GL_ARRAY_BUFFER, WinWidth*WinHeight*sizeof(GSRay), 0, 
GL_STATIC_DRAW);
   GSRay* d = (GSRay*)glMapBuffer(GL_ARRAY_BUFFER, GL_READ_WRITE);
@@ -919,6 +917,8 @@ Init(void)
}
 
glGenQueries(1, &pgQuery);
+   glGenBuffers(1, &dst);
+   glGenBuffers(1, &eyeRaysAsPoints);
 
printf(\nESC = exit demo\nleft mouse + drag   = rotate 
camera\n\n);
 }
-- 
1.8.3.2



[Mesa-dev] [PATCH 00/12] DEMOS Use core profile in two GS demos (v3).

2014-02-19 Thread Fabian Bieler
Hello!

As Mesa only supports geometry shaders in core profile contexts, this patchset
adjusts the gsraytrace and the geom-outlining-150 demos to use the core
profile.

This is v3, with some comments by Ian Romanick addressed.

The series is reviewed by Brian Paul and Ian Romanick.

As I don't have git access, I'd appreciate it if someone could commit these
patches.

Thanks,
Fabian

Fabian Bieler (12):
  configure.ac: Check for freeglut.
  glut_wrapper: Include freeglut.h if available.
  glsl/gsraytrace: Use __LINE__ macro to set line numbers in GLSL source
strings.
  glsl/gsraytrace: Don't create new Buffer objects everytime the window
is resized.
  glsl/gsraytrace: Bind transform feedback buffer.
  glsl/gsraytrace: Use core GL3.0 transform feedback
  glsl/gsraytrace: Use GLSL 1.5 instead of 1.2.
  glsl/gsraytrace: Use core geometry shaders.
  glsl/gsraytrace: Switch to core profile.
  geom-outlining-150: Use a vbo.
  geom-outlining-150: Use core geometry shaders.
  geom-outlining-150: Switch to core profile.

 configure.ac  |   6 ++
 src/glsl/geom-outlining-150.c |  64 +--
 src/glsl/gsraytrace.cpp   | 185 ++
 src/util/glut_wrap.h  |   4 +-
 4 files changed, 147 insertions(+), 112 deletions(-)

-- 
1.8.3.2



[Mesa-dev] [PATCH 02/12] glut_wrapper: Include freeglut.h if available.

2014-02-19 Thread Fabian Bieler
The freeglut header only defines the extensions to request an OpenGL core
profile context if freeglut.h (rather than glut.h) is included.

Note that the header is installed to include/GL/freeglut.h on OS X, too.

Signed-off-by: Fabian Bieler fabianbie...@fastmail.fm
Reviewed-by: Brian Paul bri...@vmware.com
Reviewed-by: Ian Romanick ian.d.roman...@intel.com
---
 src/util/glut_wrap.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/util/glut_wrap.h b/src/util/glut_wrap.h
index a48a9e8..fa1b8f9 100644
--- a/src/util/glut_wrap.h
+++ b/src/util/glut_wrap.h
@@ -1,7 +1,9 @@
 #ifndef GLUT_WRAP_H
 #define GLUT_WRAP_H
 
-#ifdef __APPLE__
+#ifdef HAVE_FREEGLUT
+#  include <GL/freeglut.h>
+#elif defined __APPLE__
 #  include <GLUT/glut.h>
 #else
 #  include <GL/glut.h>
-- 
1.8.3.2



[Mesa-dev] [PATCH 12/12] geom-outlining-150: Switch to core profile.

2014-02-19 Thread Fabian Bieler
Signed-off-by: Fabian Bieler fabianbie...@fastmail.fm
Reviewed-by: Brian Paul bri...@vmware.com
Reviewed-by: Ian Romanick ian.d.roman...@intel.com
---
 src/glsl/geom-outlining-150.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/src/glsl/geom-outlining-150.c b/src/glsl/geom-outlining-150.c
index 3dffa16..2e2a54a 100644
--- a/src/glsl/geom-outlining-150.c
+++ b/src/glsl/geom-outlining-150.c
@@ -364,9 +364,22 @@ main(int argc, char *argv[])
 {
glutInit(&argc, argv);
glutInitWindowSize(WinWidth, WinHeight);
+#ifdef HAVE_FREEGLUT
+   glutInitContextVersion(3, 2);
+   glutInitContextProfile(GLUT_CORE_PROFILE);
glutInitDisplayMode(GLUT_RGB | GLUT_DEPTH | GLUT_DOUBLE);
+#elif defined __APPLE__
+   glutInitDisplayMode(GLUT_3_2_CORE_PROFILE | GLUT_RGB | GLUT_DEPTH | 
GLUT_DOUBLE);
+#else
+   glutInitDisplayMode(GLUT_RGB | GLUT_DEPTH | GLUT_DOUBLE);
+#endif
Win = glutCreateWindow(argv[0]);
+   /* glewInit requires glewExperimental set to true for core profiles.
+* Depending on the glew version it also generates a GL_INVALID_ENUM.
+*/
+   glewExperimental = GL_TRUE;
glewInit();
+   glGetError();
glutReshapeFunc(Reshape);
glutKeyboardFunc(Key);
glutDisplayFunc(Redisplay);
-- 
1.8.3.2



[Mesa-dev] [PATCH 05/12] glsl/gsraytrace: Bind transform feedback buffer.

2014-02-19 Thread Fabian Bieler
Bind the transform feedback buffer before drawing into it and unbind it
afterwards.

Signed-off-by: Fabian Bieler fabianbie...@fastmail.fm
Reviewed-by: Brian Paul bri...@vmware.com
Reviewed-by: Ian Romanick ian.d.roman...@intel.com
---
 src/glsl/gsraytrace.cpp | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/glsl/gsraytrace.cpp b/src/glsl/gsraytrace.cpp
index f156fdc..015bfcd 100644
--- a/src/glsl/gsraytrace.cpp
+++ b/src/glsl/gsraytrace.cpp
@@ -628,6 +628,7 @@ Draw(void)
printf("%d\n", i);
//gs.fpwQuery->beginQuery();
//gs.pgQuery->beginQuery();
+   glBindBufferBaseNV(GL_TRANSFORM_FEEDBACK_BUFFER_NV, 0, dst);
glBeginQuery(GL_PRIMITIVES_GENERATED_NV, pgQuery);
glBeginTransformFeedbackNV(GL_POINTS);
//gs.eyeRaysAsPoints->bindAs(ARRAY);
@@ -675,7 +676,7 @@ Draw(void)
 
 
swap(src, dst);
-   glBindBufferOffsetNV(GL_TRANSFORM_FEEDBACK_BUFFER_NV, 0, dst->getID(), 
0); pso_gl_check();
+   glBindBufferBaseNV(GL_TRANSFORM_FEEDBACK_BUFFER_NV, 0, 0);
 
clear();
 
-- 
1.8.3.2



[Mesa-dev] [PATCH 03/12] glsl/gsraytrace: Use __LINE__ macro to set line numbers in GLSL source strings.

2014-02-19 Thread Fabian Bieler
The hardcoded numbers are a few lines off at the moment.
Keeping track of the numbers through further modifications is inconvenient.
The __LINE__ constant takes care of this automatically.

v2: Don't set source-string-number to line number.
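
For readers unfamiliar with the idiom, the two-level macro is needed
because the preprocessor's # operator does not expand its argument.  A
small C sketch follows; the three macros match the ones added by this
patch, while the variable names one_level, two_level and shader_prologue
are only illustrative:

```c
#define STRINGIFY_(x) #x
#define STRINGIFY(x) STRINGIFY_(x)
#define S__LINE__ STRINGIFY(__LINE__)

/* '#' is applied directly, so __LINE__ is not expanded: "__LINE__". */
static const char *const one_level = STRINGIFY_(__LINE__);

/* The extra level expands __LINE__ first, giving this line's number
 * as a decimal string. */
static const char *const two_level = S__LINE__;

/* Adjacent string literals are concatenated, so the directive can be
 * spliced into a GLSL source string. */
static const char *const shader_prologue =
   "#version 120\n"
   "#line " S__LINE__ "\n";
```

Because the number is produced at the point of use, the #line value
follows the C++ source automatically when lines are added or removed above
the shader string.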

Signed-off-by: Fabian Bieler fabianbie...@fastmail.fm
Reviewed-by: Brian Paul bri...@vmware.com
Reviewed-by: Ian Romanick ian.d.roman...@intel.com
---
 src/glsl/gsraytrace.cpp | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/src/glsl/gsraytrace.cpp b/src/glsl/gsraytrace.cpp
index 62a584d..c21c667 100644
--- a/src/glsl/gsraytrace.cpp
+++ b/src/glsl/gsraytrace.cpp
@@ -37,6 +37,10 @@
 // TODO: use GL_EXT_transform_feedback or GL3 equivalent
 // TODO: port to piglit too
 
+#define STRINGIFY_(x) #x
+#define STRINGIFY(x) STRINGIFY_(x)
+#define S__LINE__ STRINGIFY(__LINE__)
+
 static const float INF=.9F;
 
 static int Win;
@@ -67,7 +71,7 @@ float rot[9] = {1,0,0,  0,1,0,   0,0,1};
 static const char* vsSource =
   \n
 #version 120  \n
-#line 63 63   \n
+#line  S__LINE__ \n
 #define SHADOWS   \n
 #define RECURSION \n
   \n
@@ -249,7 +253,7 @@ static const char* vsSource =
 
 static const char* gsSource = 
 #version 120 \n
-#line 245 245\n
+#line  S__LINE__ \n
 #extension GL_ARB_geometry_shader4: require  \n
  \n
 #define SHADOWS  \n
@@ -388,7 +392,7 @@ static const char* gsSource =
 
 static const char* fsSource = 
 #version 120 \n
-#line 384 384\n
+#line  S__LINE__ \n
  \n
 #define SHADOWS  \n
 #define RECURSION\n
-- 
1.8.3.2



[Mesa-dev] [PATCH 08/12] glsl/gsraytrace: Use core geometry shaders.

2014-02-19 Thread Fabian Bieler
v2: Don't remove ShaderSupported() test. It sets up some function pointers for
the CompileShader framework.

Signed-off-by: Fabian Bieler fabianbie...@fastmail.fm
Reviewed-by: Brian Paul bri...@vmware.com
Reviewed-by: Ian Romanick ian.d.roman...@intel.com
---
 src/glsl/gsraytrace.cpp | 24 +---
 1 file changed, 9 insertions(+), 15 deletions(-)

diff --git a/src/glsl/gsraytrace.cpp b/src/glsl/gsraytrace.cpp
index f9e708f..6df6543 100644
--- a/src/glsl/gsraytrace.cpp
+++ b/src/glsl/gsraytrace.cpp
@@ -255,7 +255,8 @@ static const char* vsSource =
 static const char* gsSource = 
 #version 150 \n
 #line  S__LINE__ \n
-#extension GL_ARB_geometry_shader4: require  \n
+layout(points) in;   \n
+layout(points, max_vertices = 3) out;\n
  \n
 #define SHADOWS  \n
 #define RECURSION\n
@@ -337,7 +338,7 @@ static const char* gsSource =
 return;  \n
  \n
   // emitPassThrough();  \n
-  gl_Position  = gl_PositionIn[0];   \n
+  gl_Position  = gl_in[0].gl_Position;   \n
   orig_t2  = orig_t1[0]; \n
   dir_idx2 = dir_idx1[0];\n
   uv_state2.xyw= uv_state1[0].xyw;   \n
@@ -362,7 +363,7 @@ static const char* gsSource =
   type = 1;\n
\n
   //emitShadowRay();   \n
-  gl_Position  = gl_PositionIn[0]; \n
+  gl_Position  = gl_in[0].gl_Position; \n
   orig_t2.xyz  = shadowRay.orig;   \n
   orig_t2.w= shadowHit.t;  \n
   dir_idx2.xyz = shadowRay.dir;\n
@@ -379,7 +380,7 @@ static const char* gsSource =
   type  = -1;   \n
 \n
   //emitReflRay();  \n
-  gl_Position  = gl_PositionIn[0];  \n
+  gl_Position  = gl_in[0].gl_Position;  \n
   orig_t2.xyz  = reflRay.orig;  \n
   orig_t2.w= reflHit.t; \n
   dir_idx2.xyz = reflRay.dir;   \n
@@ -844,24 +845,17 @@ Init(void)
   exit(-1);
}
 
-   if (!GLEW_ARB_geometry_shader4)
+   if (!GLEW_VERSION_3_2)
{
-  fprintf(stderr, GS Shaders are not supported!\n);
-  exit(-1);
-   }
-
-   if (!GLEW_VERSION_3_0)
-   {
-  fprintf(stderr, OpenGL 3.0 (needed for transform feedback) not 
-  supported!\n);
+  fprintf(stderr, "OpenGL 3.2 (needed for transform feedback and "
+  "geometry shaders) not supported!\n");
   exit(-1);
}
 
vertShader = CompileShaderText(GL_VERTEX_SHADER, vsSource);
geomShader = CompileShaderText(GL_GEOMETRY_SHADER_ARB, gsSource);
fragShader = CompileShaderText(GL_FRAGMENT_SHADER, fsSource);
-   program = LinkShaders3WithGeometryInfo(vertShader, geomShader, fragShader,
-  3, GL_POINTS, GL_POINTS);
+   program = LinkShaders3(vertShader, geomShader, fragShader);
 
const char *varyings[] = {
   gl_Position,
-- 
1.8.3.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 06/12] glsl/gsraytrace: Use core GL3.0 transform feedback

2014-02-19 Thread Fabian Bieler
NV_transform_feedback is not supported by mesa.
Use transform feedback from core OpenGL 3.0.

This necessitates binding the transform feedback varyings before linking the
shader.

Signed-off-by: Fabian Bieler fabianbie...@fastmail.fm
Reviewed-by: Brian Paul bri...@vmware.com
Reviewed-by: Ian Romanick ian.d.roman...@intel.com
---
 src/glsl/gsraytrace.cpp | 72 +
 1 file changed, 31 insertions(+), 41 deletions(-)

diff --git a/src/glsl/gsraytrace.cpp b/src/glsl/gsraytrace.cpp
index 015bfcd..ef67643 100644
--- a/src/glsl/gsraytrace.cpp
+++ b/src/glsl/gsraytrace.cpp
@@ -34,7 +34,6 @@
 #include <math.h>
 #include <stddef.h> // offsetof
 
-// TODO: use GL_EXT_transform_feedback or GL3 equivalent
 // TODO: port to piglit too
 
 #define STRINGIFY_(x) #x
@@ -604,33 +603,12 @@ Draw(void)
dir_idxAttribLoc = glGetAttribLocation(program, dir_idx);
uv_stateAttribLoc = glGetAttribLocation(program, uv_state);
 
-   posVaryingLoc = glGetVaryingLocationNV(program, gl_Position);
-   orig_tVaryingLoc = glGetVaryingLocationNV(program, orig_t2);
-   dir_idxVaryingLoc = glGetVaryingLocationNV(program, dir_idx2);
-   uv_stateVaryingLoc = glGetVaryingLocationNV(program, uv_state2);
-   //gs.gs-getVaryingLocation(gl_Position, gs.posVaryingLoc);
-   //gs.gs-getVaryingLocation(orig_t2, gs.orig_tVaryingLoc);
-   //gs.gs-getVaryingLocation(dir_idx2, gs.dir_idxVaryingLoc);
-   //gs.gs-getVaryingLocation(uv_state2, gs.uv_stateVaryingLoc);
-
-
-   glBindBufferOffsetNV(GL_TRANSFORM_FEEDBACK_BUFFER_NV, 0, dst, 0);
-   GLint varyings[4]= {
-  posVaryingLoc,
-  orig_tVaryingLoc,
-  dir_idxVaryingLoc,
-  uv_stateVaryingLoc
-   };
-   // I think it will be a performance win to use multiple buffer objects to 
write to
-   // instead of using the interleaved mode.
-   glTransformFeedbackVaryingsNV(program, 4, varyings, 
GL_INTERLEAVED_ATTRIBS_NV);
-
printf(%d\n, i);
//gs.fpwQuery-beginQuery();
//gs.pgQuery-beginQuery();
-   glBindBufferBaseNV(GL_TRANSFORM_FEEDBACK_BUFFER_NV, 0, dst);
-   glBeginQuery(GL_PRIMITIVES_GENERATED_NV, pgQuery);
-   glBeginTransformFeedbackNV(GL_POINTS);
+   glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 0, dst);
+   glBeginQuery(GL_PRIMITIVES_GENERATED, pgQuery);
+   glBeginTransformFeedback(GL_POINTS);
//gs.eyeRaysAsPoints-bindAs(ARRAY);
glBindBuffer(GL_ARRAY_BUFFER, eyeRaysAsPoints);
{
@@ -653,9 +631,9 @@ Draw(void)
   //gs.gs-set_uniform(emitNoMore, 1, 0);
   glUniform1i(glGetUniformLocation(program, emitNoMore), 0);
 
-  //glEnable(GL_RASTERIZER_DISCARD_NV);
+  //glEnable(GL_RASTERIZER_DISCARD);
   glDrawArrays(GL_POINTS, 0, WinWidth*WinHeight);
-  //glDisable(GL_RASTERIZER_DISCARD_NV);
+  //glDisable(GL_RASTERIZER_DISCARD);
 
   glDisableVertexAttribArray(uv_stateAttribLoc);
 
@@ -667,16 +645,16 @@ Draw(void)
}
//gs.eyeRaysAsPoints-unbindAs(ARRAY);
glBindBuffer(GL_ARRAY_BUFFER, 0);
-   glEndTransformFeedbackNV();
+   glEndTransformFeedback();
//gs.pgQuery-endQuery();
-   glEndQuery(GL_PRIMITIVES_GENERATED_NV);
+   glEndQuery(GL_PRIMITIVES_GENERATED);
//gs.fpwQuery-endQuery();
 
psoLog(LOG_RAW)  1st:   gs.fpwQuery-getQueryResult()  ,   
gs.pgQuery-getQueryResult()  \n;
 
 
swap(src, dst);
-   glBindBufferBaseNV(GL_TRANSFORM_FEEDBACK_BUFFER_NV, 0, 0);
+   glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 0, 0);
 
clear();
 
@@ -777,15 +755,15 @@ Reshape(int width, int height)
 
{
   size_t nElem = WinWidth*WinHeight*nRayGens;
-  glBindBuffer(GL_TRANSFORM_FEEDBACK_BUFFER_NV, dst);
-  glBufferData(GL_TRANSFORM_FEEDBACK_BUFFER_NV, nElem*sizeof(GSRay), 0, 
GL_STREAM_DRAW);
-  GSRay* d = (GSRay*)glMapBuffer(GL_TRANSFORM_FEEDBACK_BUFFER_NV, 
GL_READ_WRITE);
+  glBindBuffer(GL_TRANSFORM_FEEDBACK_BUFFER, dst);
+  glBufferData(GL_TRANSFORM_FEEDBACK_BUFFER, nElem*sizeof(GSRay), 0, 
GL_STREAM_DRAW);
+  GSRay* d = (GSRay*)glMapBuffer(GL_TRANSFORM_FEEDBACK_BUFFER, 
GL_READ_WRITE);
   for (size_t i = 0; i < nElem; i++)
   {
  d[i].dir_idx = vec4(0.0F, 0.0F, 0.0F, -1.0F);
   }
-  glUnmapBuffer(GL_TRANSFORM_FEEDBACK_BUFFER_NV);
-  glBindBuffer(GL_TRANSFORM_FEEDBACK_BUFFER_NV, 0);
+  glUnmapBuffer(GL_TRANSFORM_FEEDBACK_BUFFER);
+  glBindBuffer(GL_TRANSFORM_FEEDBACK_BUFFER, 0);
   //printf(Ping-pong VBO size 2x%d Kbytes.\n, 
(int)nElem*sizeof(GSRay)/1024);
}
 
@@ -866,12 +844,30 @@ Init(void)
   exit(-1);
}
 
+   if (!GLEW_VERSION_3_0)
+   {
+  fprintf(stderr, "OpenGL 3.0 (needed for transform feedback) not "
+  "supported!\n");
+  exit(-1);
+   }
+
vertShader = CompileShaderText(GL_VERTEX_SHADER, vsSource);
geomShader = CompileShaderText(GL_GEOMETRY_SHADER_ARB, gsSource);
fragShader = CompileShaderText(GL_FRAGMENT_SHADER, fsSource);
program = LinkShaders3WithGeometryInfo(vertShader, geomShader, fragShader,
   3, 

[Mesa-dev] [PATCH 11/12] geom-outlining-150: Use core geometry shaders.

2014-02-19 Thread Fabian Bieler
Signed-off-by: Fabian Bieler fabianbie...@fastmail.fm
Reviewed-by: Brian Paul bri...@vmware.com
Reviewed-by: Ian Romanick ian.d.roman...@intel.com
---
 src/glsl/geom-outlining-150.c | 26 --
 1 file changed, 8 insertions(+), 18 deletions(-)

diff --git a/src/glsl/geom-outlining-150.c b/src/glsl/geom-outlining-150.c
index 0bc20f0..3dffa16 100644
--- a/src/glsl/geom-outlining-150.c
+++ b/src/glsl/geom-outlining-150.c
@@ -256,7 +256,8 @@ Init(void)
   } \n;
static const char *geomShaderText =
   #version 150 \n
-  #extension GL_ARB_geometry_shader4: enable \n
+  layout(triangles) in; \n
+  layout(triangle_strip, max_vertices = 3) out; \n
   uniform vec2 ViewportSize; \n
   out vec2 Vert0, Vert1, Vert2; \n
   \n
@@ -271,11 +272,11 @@ Init(void)
  Vert0 = vpxform(gl_in[0].gl_Position); \n
  Vert1 = vpxform(gl_in[1].gl_Position); \n
  Vert2 = vpxform(gl_in[2].gl_Position); \n
- gl_Position = gl_PositionIn[0]; \n
+ gl_Position = gl_in[0].gl_Position; \n
  EmitVertex(); \n
- gl_Position = gl_PositionIn[1]; \n
+ gl_Position = gl_in[1].gl_Position; \n
  EmitVertex(); \n
- gl_Position = gl_PositionIn[2]; \n
+ gl_Position = gl_in[2].gl_Position; \n
  EmitVertex(); \n
   } \n;
static const char *fragShaderText =
@@ -309,15 +310,14 @@ Init(void)
if (!ShadersSupported())
   exit(1);
 
-   version = glGetString(GL_VERSION);
-   if (version[0] * 10 + version[2]  32) {
+   if (!GLEW_VERSION_3_2) {
   fprintf(stderr, "Sorry, OpenGL 3.2 or later required.\n");
   exit(1);
}
 
VertShader = CompileShaderText(GL_VERTEX_SHADER, vertShaderText);
FragShader = CompileShaderText(GL_FRAGMENT_SHADER, fragShaderText);
-   GeomShader = CompileShaderText(GL_GEOMETRY_SHADER_ARB, geomShaderText);
+   GeomShader = CompileShaderText(GL_GEOMETRY_SHADER, geomShaderText);
 
Program = LinkShaders3(VertShader, GeomShader, FragShader);
assert(Program);
@@ -326,18 +326,8 @@ Init(void)
glBindAttribLocation(Program, 0, "Vertex");
glBindFragDataLocation(Program, 0, "FragColor");
 
-   /*
-* The geometry shader will receive and emit triangles.
-*/
-   glProgramParameteriARB(Program, GL_GEOMETRY_INPUT_TYPE_ARB,
-  GL_TRIANGLES);
-   glProgramParameteriARB(Program, GL_GEOMETRY_OUTPUT_TYPE_ARB,
-  GL_TRIANGLE_STRIP);
-   glProgramParameteriARB(Program,GL_GEOMETRY_VERTICES_OUT_ARB, 3);
-   CheckError(__LINE__);
-
/* relink */
-   glLinkProgramARB(Program);
+   glLinkProgram(Program);
 
assert(glIsProgram(Program));
assert(glIsShader(FragShader));
-- 
1.8.3.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 07/12] glsl/gsraytrace: Use GLSL 1.5 instead of 1.2.

2014-02-19 Thread Fabian Bieler
This commit prepares the transition from extension to core geometry shaders.
(Core geometry shaders require GLSL version 1.5 or later.)
This includes using generic vertex attributes instead of built-ins.

Signed-off-by: Fabian Bieler fabianbie...@fastmail.fm
Reviewed-by: Brian Paul bri...@vmware.com
Reviewed-by: Ian Romanick ian.d.roman...@intel.com
---
 src/glsl/gsraytrace.cpp | 58 +++--
 1 file changed, 32 insertions(+), 26 deletions(-)

diff --git a/src/glsl/gsraytrace.cpp b/src/glsl/gsraytrace.cpp
index ef67643..f9e708f 100644
--- a/src/glsl/gsraytrace.cpp
+++ b/src/glsl/gsraytrace.cpp
@@ -56,6 +56,7 @@ static GLuint pgQuery;
 static GLuint dst;
 static GLuint eyeRaysAsPoints;
 
+int posAttribLoc;
 int orig_tAttribLoc;
 int dir_idxAttribLoc;
 int uv_stateAttribLoc;
@@ -69,7 +70,7 @@ float rot[9] = {1,0,0,  0,1,0,   0,0,1};
 
 static const char* vsSource =
   \n
-#version 120  \n
+#version 150  \n
 #line  S__LINE__ \n
 #define SHADOWS   \n
 #define RECURSION \n
@@ -83,9 +84,10 @@ static const char* vsSource =
 uniform vec4 backgroundColor; \n
 uniform int emitNoMore;   \n
   \n
-attribute vec4 orig_t;\n
-attribute vec4 dir_idx;   \n
-attribute vec4 uv_state;  \n
+in vec4 pos;  \n
+in vec4 orig_t;   \n
+in vec4 dir_idx;  \n
+in vec4 uv_state; \n
 // uv_state.z = state \n
 // uv_state.w = type (ray generation) \n
   \n
@@ -98,9 +100,9 @@ static const char* vsSource =
 //0: not shadow ray, eye ray  \n
 //1: shadow ray   \n
   \n
-varying vec4 orig_t1; \n
-varying vec4 dir_idx1;\n
-varying vec4 uv_state1;   \n
+out vec4 orig_t1; \n
+out vec4 dir_idx1;\n
+out vec4 uv_state1;   \n
   \n
   \n
 //\n
@@ -224,7 +226,7 @@ static const char* vsSource =
   if (state == 0)
\n
   { 
\n
 // generate eye rays\n
-ray = Ray(cameraPos, normalize(vec3(gl_Vertex.x, gl_Vertex.y, -1.0) * 
rot3));\n
+ray = Ray(cameraPos, normalize(vec3(pos.x, pos.y, -1.0) * rot3));   \n
 isec.t = INF;\n
 isec.idx = -1;\n
 state = 1;\n
@@ -240,7 +242,7 @@ static const char* vsSource =
   //else state == 3 \n
 \n
   //outVS();\n
-  gl_Position  = gl_Vertex; \n
+  gl_Position  = pos;   \n
   orig_t1.xyz  = ray.orig;  \n
   orig_t1.w= isec.t;\n
   dir_idx1.xyz = ray.dir;   \n
@@ -251,7 +253,7 @@ static const char* vsSource =
 
 
 static const char* gsSource = 
-#version 120 \n
+#version 150 \n
 #line  S__LINE__ \n
 #extension GL_ARB_geometry_shader4: require  \n
  \n
@@ -310,13 +312,13 @@ static const char* gsSource =
   return isec;   \n
 }\n
  \n
-varying in vec4 orig_t1[1];

[Mesa-dev] [PATCH 09/12] glsl/gsraytrace: Switch to core profile.

2014-02-19 Thread Fabian Bieler
v2: Remove redundant 'core' in GLSL version statement.

Signed-off-by: Fabian Bieler fabianbie...@fastmail.fm
Reviewed-by: Brian Paul bri...@vmware.com
Reviewed-by: Ian Romanick ian.d.roman...@intel.com
---
 src/glsl/gsraytrace.cpp | 34 ++
 1 file changed, 26 insertions(+), 8 deletions(-)

diff --git a/src/glsl/gsraytrace.cpp b/src/glsl/gsraytrace.cpp
index 6df6543..44f2674 100644
--- a/src/glsl/gsraytrace.cpp
+++ b/src/glsl/gsraytrace.cpp
@@ -408,6 +408,7 @@ static const char* fsSource =
 uniform vec4 backgroundColor;\n
 uniform int emitNoMore;  \n
  \n
+out vec4 frag_color; \n
  \n
 //---\n
  \n
@@ -493,7 +494,7 @@ static const char* fsSource =
 Isec eyeHit = isec;\n
 if (eyeHit.idx == -1)\n
 {\n
-  gl_FragColor = vec4(backgroundColor.rgb, 0.0);\n
+  frag_color = vec4(backgroundColor.rgb, 0.0);\n
   return;\n
 }\n
 vec3 eyeHitPosition = eyeRay.orig + eyeRay.dir * eyeHit.t;\n
@@ -503,7 +504,7 @@ static const char* fsSource =
 vec3  L  = normalize(lightVec);   
  \n
 float NdotL  = max(dot(N, L), 0.0);   
  \n
 vec3 diffuse = idx2color(eyeHit.idx); // material color of the visible 
point\n
-gl_FragColor = vec4(diffuse * NdotL, 1.0);
  \n
+frag_color = vec4(diffuse * NdotL, 1.0);  
\n
 return;\n
   }\n
 #ifdef SHADOWS \n
@@ -514,7 +515,7 @@ static const char* fsSource =
 { \n
   discard;\n
 } \n
-gl_FragColor = vec4(-1,-1,-1, 0.0);   \n
+frag_color = vec4(-1,-1,-1, 0.0);   \n
 return;   \n
   }   \n
 #endif\n
@@ -534,7 +535,7 @@ static const char* fsSource =
 vec3  L  = normalize(lightVec);   \n
 float NdotL  = max(dot(N, L), 0.0);   \n
 vec3 diffuse = idx2color(reflHit.idx);\n
-gl_FragColor = vec4(diffuse * NdotL * 0.25, 1.0); // material color of 
the visible point\n
+frag_color = vec4(diffuse * NdotL * 0.25, 1.0); // material color of the 
visible point\n
 return;   \n
   }   \n
 #endif\n
@@ -608,6 +609,8 @@ Draw(void)
dir_idxAttribLoc = glGetAttribLocation(program, dir_idx);
uv_stateAttribLoc = glGetAttribLocation(program, uv_state);
 
+   glBindFragDataLocation(program, 0, "frag_color");
+
printf(%d\n, i);
//gs.fpwQuery-beginQuery();
//gs.pgQuery-beginQuery();
@@ -755,10 +758,6 @@ Reshape(int width, int height)
WinWidth = width;
WinHeight = height;
glViewport(0, 0, width, height);
-   glMatrixMode(GL_PROJECTION);
-   glLoadIdentity();
-   glMatrixMode(GL_MODELVIEW);
-   glLoadIdentity();
 
{
   size_t nElem = WinWidth*WinHeight*nRayGens;
@@ -911,6 +910,10 @@ Init(void)
glGenBuffers(1, dst);
glGenBuffers(1, eyeRaysAsPoints);
 
+   GLuint vao;
+   glGenVertexArrays(1, &vao);
+   glBindVertexArray(vao);
+
printf(\nESC = exit demo\nleft mouse + drag   = rotate 
camera\n\n);
 }
 
@@ -920,9 +923,24 @@ main(int argc, char *argv[])
 {
glutInitWindowSize(WinWidth, WinHeight);
glutInit(argc, argv);
+
+#ifdef HAVE_FREEGLUT
+   glutInitContextVersion(3, 2);
+   glutInitContextProfile(GLUT_CORE_PROFILE);
glutInitDisplayMode(GLUT_RGB | GLUT_DOUBLE | GLUT_DEPTH);
+#elif defined __APPLE__
+   glutInitDisplayMode(GLUT_3_2_CORE_PROFILE | GLUT_RGB | GLUT_DOUBLE | 
GLUT_DEPTH);
+#else
+   glutInitDisplayMode(GLUT_RGB | GLUT_DOUBLE | GLUT_DEPTH);
+#endif
Win = glutCreateWindow(argv[0]);
+
+   // glewInit requires glewExperimental set to true for core profiles.
+   // Depending on the glew version it also generates GL_INVALID_ENUM.
+   glewExperimental = GL_TRUE;
glewInit();
+   glGetError();
+
glutReshapeFunc(Reshape);
  

[Mesa-dev] [PATCH 01/12] configure.ac: Check for freeglut.

2014-02-19 Thread Fabian Bieler
To get an OpenGL core profile context freeglut 2.6 or later is required.

Note that in spite of its name, HAVE_FREEGLUT is only defined if freeglut 2.6
(released in 2009) or later is found.

Signed-off-by: Fabian Bieler fabianbie...@fastmail.fm
Reviewed-by: Brian Paul bri...@vmware.com
Reviewed-by: Ian Romanick ian.d.roman...@intel.com
---
 configure.ac | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/configure.ac b/configure.ac
index 0c38f4d..cd523c1 100644
--- a/configure.ac
+++ b/configure.ac
@@ -83,6 +83,12 @@ AC_CHECK_LIB([glut],
[],
[glut_enabled=no])
 
+dnl Check for FreeGLUT 2.6 or later
+AC_EGREP_HEADER([glutInitContextProfile],
+   [GL/freeglut.h],
+   [AC_DEFINE(HAVE_FREEGLUT)],
+   [])
+
 dnl Check for GLEW
 PKG_CHECK_MODULES(GLEW, [glew = 1.5.4])
 DEMO_CFLAGS=$DEMO_CFLAGS $GLEW_CFLAGS
-- 
1.8.3.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/3] util: Add util_bswap64()

2014-02-19 Thread Tom Stellard
---
 src/gallium/auxiliary/util/u_math.h | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/src/gallium/auxiliary/util/u_math.h 
b/src/gallium/auxiliary/util/u_math.h
index b5e0663..49f8bda 100644
--- a/src/gallium/auxiliary/util/u_math.h
+++ b/src/gallium/auxiliary/util/u_math.h
@@ -741,6 +741,16 @@ util_bswap32(uint32_t n)
 #endif
 }
 
+/**
+ * Reverse byte order of a 64bit word.
+ */
+static INLINE uint64_t
+util_bswap64(uint64_t n)
+{
+   return ((uint64_t)util_bswap32(n & 0xffffffff) << 32) |
+  util_bswap32((n >> 32));
+}
+
 
 /**
  * Reverse byte order of a 16 bit word.
-- 
1.8.1.4


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/3] clover: Pass buffer offsets to the driver in set_global_binding() v2

2014-02-19 Thread Tom Stellard
The offsets will be stored in the handles parameter.  This makes
it possible to use sub-buffers.

v2:
  - Style fixes
  - Add support for constant sub-buffers
  - Store handles in device byte order
---
 src/gallium/drivers/r600/evergreen_compute.c  | 10 +-
 src/gallium/drivers/radeonsi/si_compute.c |  6 ++
 src/gallium/include/pipe/p_context.h  | 13 -
 src/gallium/state_trackers/clover/core/kernel.cpp | 16 +---
 4 files changed, 36 insertions(+), 9 deletions(-)

diff --git a/src/gallium/drivers/r600/evergreen_compute.c 
b/src/gallium/drivers/r600/evergreen_compute.c
index 70efe5c..efd7143 100644
--- a/src/gallium/drivers/r600/evergreen_compute.c
+++ b/src/gallium/drivers/r600/evergreen_compute.c
@@ -662,10 +662,18 @@ static void evergreen_set_global_binding(
 
for (int i = 0; i < n; i++)
{
+   uint32_t buffer_offset;
+   uint32_t handle;
assert(resources[i]->target == PIPE_BUFFER);
assert(resources[i]->bind & PIPE_BIND_GLOBAL);
 
-   *(handles[i]) = buffers[i]->chunk->start_in_dw * 4;
+   buffer_offset = util_le32_to_cpu(*(handles[i]));
+   handle = buffer_offset + buffers[i]->chunk->start_in_dw * 4;
+   if (R600_BIG_ENDIAN) {
+   handle = util_bswap32(handle);
+   }
+
+   *(handles[i]) = handle;
}
 
evergreen_set_rat(ctx->cs_shader_state.shader, 0, pool->bo, 0, 
pool->size_in_dw * 4);
diff --git a/src/gallium/drivers/radeonsi/si_compute.c 
b/src/gallium/drivers/radeonsi/si_compute.c
index a7f49e7..43d521b 100644
--- a/src/gallium/drivers/radeonsi/si_compute.c
+++ b/src/gallium/drivers/radeonsi/si_compute.c
@@ -107,8 +107,14 @@ static void si_set_global_binding(
 
for (i = first; i < first + n; i++) {
uint64_t va;
+   uint32_t offset;
program->global_buffers[i] = resources[i];
va = r600_resource_va(ctx->screen, resources[i]);
+   offset = util_le32_to_cpu(*handles[i]);
+   va += offset;
+   if (SI_BIG_ENDIAN) {
+   va = util_bswap64(va);
+   }
memcpy(handles[i], &va, sizeof(va));
}
 }
diff --git a/src/gallium/include/pipe/p_context.h 
b/src/gallium/include/pipe/p_context.h
index 8ef6e27..209ec9e 100644
--- a/src/gallium/include/pipe/p_context.h
+++ b/src/gallium/include/pipe/p_context.h
@@ -460,11 +460,14 @@ struct pipe_context {
 *   unless it's NULL, in which case no new
 *   resources will be bound.
 * \param handlesarray of pointers to the memory locations that
-*   will be filled with the respective base
-*   addresses each buffer will be mapped to.  It
-*   should contain at least \a count elements,
-*   unless \a resources is NULL in which case \a
-*   handles should be NULL as well.
+*   will be updated with the address each buffer
+*   will be mapped to.  The base memory address of
+*   each of the buffers will be added to the value
+*   pointed to by its corresponding handle to form
+*   the final address argument.  It should contain
+*   at least \a count elements, unless \a
+*   resources is NULL in which case \a handles
+*   should be NULL as well.
 *
 * Note that the driver isn't required to make any guarantees about
 * the contents of the \a handles array being valid anytime except
diff --git a/src/gallium/state_trackers/clover/core/kernel.cpp 
b/src/gallium/state_trackers/clover/core/kernel.cpp
index 6d894cd..b4d555c 100644
--- a/src/gallium/state_trackers/clover/core/kernel.cpp
+++ b/src/gallium/state_trackers/clover/core/kernel.cpp
@@ -337,8 +337,17 @@ kernel::global_argument::bind(exec_context ctx,
align(ctx.input, marg.target_align);
 
if (buf) {
-  ctx.g_handles.push_back(allocate(ctx.input, marg.target_size));
-  ctx.g_buffers.push_back(buf->resource(*ctx.q).pipe);
+  const resource &r = buf->resource(*ctx.q);
+  ctx.g_handles.push_back(ctx.input.size());
+  ctx.g_buffers.push_back(r.pipe);
+
+  // How to handle multi-dimensional offsets?
+  // We don't need to.  Buffer offsets are always
+  // one-dimensional.
+  auto v = bytes(r.offset[0]);
+  extend(v, marg.ext_type, marg.target_size);
+  byteswap(v, ctx.q->dev.endianness());
+  insert(ctx.input, v);
} else {
   // Null pointer.
   allocate(ctx.input, marg.target_size);
@@ -395,7 +404,8 @@ kernel::constant_argument::bind(exec_context ctx,
align(ctx.input, marg.target_align);
 
if (buf) {
-  auto v = bytes(ctx.resources.size()  24);
+  const resource r = 

[Mesa-dev] [PATCH 2/3] radeonsi: Use SI_BIG_ENDIAN now that it exists

2014-02-19 Thread Tom Stellard
---
 src/gallium/drivers/radeonsi/si_shader.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 54270cd..9fed751 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -2333,7 +2333,7 @@ int si_compile_llvm(struct si_context *sctx, struct 
si_pipe_shader *shader,
}
 
ptr = (uint32_t*)sctx-b.ws-buffer_map(shader-bo-cs_buf, 
sctx-b.rings.gfx.cs, PIPE_TRANSFER_WRITE);
-   if (0 /*SI_BIG_ENDIAN*/) {
+   if (SI_BIG_ENDIAN) {
for (i = 0; i  binary.code_size / 4; ++i) {
ptr[i] = util_bswap32(*(uint32_t*)(binary.code + i*4));
}
-- 
1.8.1.4


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 14/15] mesa/sso: Implement _mesa_ActiveShaderProgram

2014-02-19 Thread Ian Romanick
On 02/19/2014 01:44 PM, Jordan Justen wrote:
 On Fri, Feb 7, 2014 at 10:00 PM, Ian Romanick i...@freedesktop.org wrote:
 From: Gregory Hainaut gregory.hain...@gmail.com

 This was originally included in another patch, but it was split out by
 Ian Romanick.

 Reviewed-by: Ian Romanick ian.d.roman...@intel.com
 ---
  src/mesa/main/pipelineobj.c | 24 
  1 file changed, 24 insertions(+)

 diff --git a/src/mesa/main/pipelineobj.c b/src/mesa/main/pipelineobj.c
 index b47dc7a..6e490bd 100644
 --- a/src/mesa/main/pipelineobj.c
 +++ b/src/mesa/main/pipelineobj.c
 @@ -227,6 +227,30 @@ _mesa_UseProgramStages(GLuint pipeline, GLbitfield 
 stages, GLuint program)
  void GLAPIENTRY
  _mesa_ActiveShaderProgram(GLuint pipeline, GLuint program)
  {
 +   GET_CURRENT_CONTEXT(ctx);
 +   struct gl_shader_program *shProg = (program != 0)
 +  ? _mesa_lookup_shader_program_err(ctx, program, "glActiveShaderProgram(program)")
 +  : NULL;
 
 Seems like if/else would be more clear for this part.
 
 If _mesa_lookup_shader_program_err returns NULL, should we exit early?

Yes.  Good catch.  We should also have a piglit test for this.  I don't
think there is one already.

 - Bind a valid program to the pipeline.
 - Try to bind a nonexistent, non-zero program.
 - Verify the error is generated.
 - Verify that old program is still bound to the pipeline.
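
The behaviour such a test would check can be sketched with a small stand-alone model. This is not Mesa or piglit code: every toy_* name is hypothetical, and the error code chosen for a failed lookup is an assumption. The point is only that a failed lookup returns early and leaves the previously bound program untouched.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical error codes standing in for GL errors. */
enum toy_error { TOY_NO_ERROR, TOY_INVALID_VALUE, TOY_INVALID_OPERATION };

struct toy_program {
   unsigned name;
   int linked;
};

struct toy_pipeline {
   const struct toy_program *active;
   enum toy_error last_error;
};

/* Return the program with the given name, or NULL if there is none. */
static const struct toy_program *
toy_lookup_program(const struct toy_program *table, size_t count,
                   unsigned name)
{
   for (size_t i = 0; i < count; i++) {
      if (table[i].name == name)
         return &table[i];
   }
   return NULL;
}

/* Toy version of the entry point, with the early exit on failed lookup. */
static void
toy_active_shader_program(struct toy_pipeline *pipe,
                          const struct toy_program *table, size_t count,
                          unsigned name)
{
   const struct toy_program *prog = NULL;

   assert(pipe != NULL);

   if (name != 0) {
      prog = toy_lookup_program(table, count, name);
      if (prog == NULL) {
         /* Assumed error code; the old binding must survive. */
         pipe->last_error = TOY_INVALID_VALUE;
         return;
      }
   }

   if (prog != NULL && !prog->linked) {
      pipe->last_error = TOY_INVALID_OPERATION;
      return;
   }

   pipe->active = prog;
}
```

Without the early return, a failed lookup would fall through with a NULL program and silently unbind whatever was active before.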

 -Jordan
 
 +   struct gl_pipeline_object *pipe = lookup_pipeline_object(ctx, pipeline);
 +
 +   if (!pipe) {
 +  _mesa_error(ctx, GL_INVALID_OPERATION, "glActiveShaderProgram(pipeline)");
 +  return;
 +   }
 +
 +   /* Object is created by any Pipeline call but glGenProgramPipelines,
 +* glIsProgramPipeline and GetProgramPipelineInfoLog
 +*/
 +   pipe->EverBound = GL_TRUE;
 +
 +   if ((shProg != NULL) && !shProg->LinkStatus) {
 +  _mesa_error(ctx, GL_INVALID_OPERATION,
 +"glActiveShaderProgram(program %u not linked)", shProg->Name);
 +  return;
 +   }
 +
 +   _mesa_reference_shader_program(ctx, &pipe->ActiveProgram, shProg);
  }

  /**
 --
 1.8.1.4

 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/3] util: Add util_bswap64()

2014-02-19 Thread Alex Deucher
On Wed, Feb 19, 2014 at 6:09 PM, Tom Stellard thomas.stell...@amd.com wrote:
 ---
  src/gallium/auxiliary/util/u_math.h | 10 ++
  1 file changed, 10 insertions(+)

For the series:

Reviewed-by: Alex Deucher alexander.deuc...@amd.com


 diff --git a/src/gallium/auxiliary/util/u_math.h 
 b/src/gallium/auxiliary/util/u_math.h
 index b5e0663..49f8bda 100644
 --- a/src/gallium/auxiliary/util/u_math.h
 +++ b/src/gallium/auxiliary/util/u_math.h
 @@ -741,6 +741,16 @@ util_bswap32(uint32_t n)
  #endif
  }

 +/**
 + * Reverse byte order of a 64bit word.
 + */
 +static INLINE uint64_t
 +util_bswap64(uint64_t n)
 +{
 +   return ((uint64_t)util_bswap32(n & 0xffffffff) << 32) |
 +  util_bswap32((n >> 32));
 +}
 +

  /**
   * Reverse byte order of a 16 bit word.
 --
 1.8.1.4


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/3] util: Add util_bswap64()

2014-02-19 Thread Ilia Mirkin
On Wed, Feb 19, 2014 at 6:09 PM, Tom Stellard thomas.stell...@amd.com wrote:
 ---
  src/gallium/auxiliary/util/u_math.h | 10 ++
  1 file changed, 10 insertions(+)

 diff --git a/src/gallium/auxiliary/util/u_math.h 
 b/src/gallium/auxiliary/util/u_math.h
 index b5e0663..49f8bda 100644
 --- a/src/gallium/auxiliary/util/u_math.h
 +++ b/src/gallium/auxiliary/util/u_math.h
 @@ -741,6 +741,16 @@ util_bswap32(uint32_t n)
  #endif
  }

 +/**
 + * Reverse byte order of a 64bit word.
 + */
 +static INLINE uint64_t
 +util_bswap64(uint64_t n)
 +{
 +   return ((uint64_t)util_bswap32(n & 0xffffffff) << 32) |
 +  util_bswap32((n >> 32));

Perhaps use __builtin_bswap64 if it's available? Not sure when it
became available though.

 +}
 +

  /**
   * Reverse byte order of a 16 bit word.
 --
 1.8.1.4


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/3] util: Add util_bswap64()

2014-02-19 Thread Matt Turner
On Wed, Feb 19, 2014 at 3:32 PM, Ilia Mirkin imir...@alum.mit.edu wrote:
 On Wed, Feb 19, 2014 at 6:09 PM, Tom Stellard thomas.stell...@amd.com wrote:
 +/**
 + * Reverse byte order of a 64bit word.
 + */
 +static INLINE uint64_t
 +util_bswap64(uint64_t n)
 +{
 +   return ((uint64_t)util_bswap32(n & 0xffffffff) << 32) |
 +  util_bswap32((n >> 32));

 Perhaps use __builtin_bswap64 if it's available? Not sure when it
 became available though.

When I fixed up bswap stuff in the X server a few years ago, I
discovered that gcc was really good at detecting open-coded bswap, and
less good at recognizing when it could constant fold __builtin_bswap.

Do some experiments, but make sure your experiments include not using
__builtin_bswap32 in util_bswap32.
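
For anyone running that experiment, the two variants might look roughly like this. It is a sketch only: the sketch_* names are made up and are not the u_math.h functions, and the builtin variant is guarded on __GNUC__ because it is a GCC/Clang extension.

```c
#include <assert.h>
#include <stdint.h>

/* Open-coded 32-bit byte swap, with no builtin involved. */
static inline uint32_t
sketch_bswap32(uint32_t n)
{
   return (n >> 24) |
          ((n >> 8) & 0x0000ff00u) |
          ((n << 8) & 0x00ff0000u) |
          (n << 24);
}

/* Open-coded 64-bit byte swap built from two 32-bit swaps. */
static inline uint64_t
sketch_bswap64(uint64_t n)
{
   return ((uint64_t)sketch_bswap32((uint32_t)(n & 0xffffffffu)) << 32) |
          sketch_bswap32((uint32_t)(n >> 32));
}

#if defined(__GNUC__)
/* Builtin variant, to compare the generated code against the one above. */
static inline uint64_t
sketch_bswap64_builtin(uint64_t n)
{
   return __builtin_bswap64(n);
}
#endif

/* Both variants are expected to agree on every input. */
static inline void
sketch_bswap_check(void)
{
   assert(sketch_bswap64(0x0102030405060708ULL) == 0x0807060504030201ULL);
#if defined(__GNUC__)
   assert(sketch_bswap64_builtin(0x0102030405060708ULL) ==
          sketch_bswap64(0x0102030405060708ULL));
#endif
}
```

Comparing the assembly for both, with constant and non-constant arguments, should show whether the compiler recognizes the open-coded form and whether it folds the builtin.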
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/3] util: Add util_bswap64()

2014-02-19 Thread Francisco Jerez
Tom Stellard thomas.stell...@amd.com writes:

 ---
  src/gallium/auxiliary/util/u_math.h | 10 ++
  1 file changed, 10 insertions(+)

 diff --git a/src/gallium/auxiliary/util/u_math.h 
 b/src/gallium/auxiliary/util/u_math.h
 index b5e0663..49f8bda 100644
 --- a/src/gallium/auxiliary/util/u_math.h
 +++ b/src/gallium/auxiliary/util/u_math.h
 @@ -741,6 +741,16 @@ util_bswap32(uint32_t n)
  #endif
  }
  
 +/**
 + * Reverse byte order of a 64bit word.
 + */
 +static INLINE uint64_t
 +util_bswap64(uint64_t n)
 +{
 +   return ((uint64_t)util_bswap32(n & 0xffffffff) << 32) |
 +  util_bswap32((n >> 32));
 +}
 +
  
  /**
   * Reverse byte order of a 16 bit word.
 -- 
 1.8.1.4

Reviewed-by: Francisco Jerez curroje...@riseup.net


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/3] clover: Pass buffer offsets to the driver in set_global_binding() v2

2014-02-19 Thread Francisco Jerez
Tom Stellard thomas.stell...@amd.com writes:

 The offsets will be stored in the handles parameter.  This makes
 it possible to use sub-buffers.

 v2:
   - Style fixes
   - Add support for constant sub-buffers
   - Store handles in device byte order
 ---
  src/gallium/drivers/r600/evergreen_compute.c  | 10 +-
  src/gallium/drivers/radeonsi/si_compute.c |  6 ++
  src/gallium/include/pipe/p_context.h  | 13 -
  src/gallium/state_trackers/clover/core/kernel.cpp | 16 +---
  4 files changed, 36 insertions(+), 9 deletions(-)

 diff --git a/src/gallium/drivers/r600/evergreen_compute.c 
 b/src/gallium/drivers/r600/evergreen_compute.c
 index 70efe5c..efd7143 100644
 --- a/src/gallium/drivers/r600/evergreen_compute.c
 +++ b/src/gallium/drivers/r600/evergreen_compute.c
 @@ -662,10 +662,18 @@ static void evergreen_set_global_binding(
  
   for (int i = 0; i < n; i++)
   {
 + uint32_t buffer_offset;
 + uint32_t handle;
   assert(resources[i]->target == PIPE_BUFFER);
   assert(resources[i]->bind & PIPE_BIND_GLOBAL);
  
 - *(handles[i]) = buffers[i]->chunk->start_in_dw * 4;
 + buffer_offset = util_le32_to_cpu(*(handles[i]));
 + handle = buffer_offset + buffers[i]->chunk->start_in_dw * 4;
 + if (R600_BIG_ENDIAN) {
 + handle = util_bswap32(handle);
 + }
 +
 + *(handles[i]) = handle;

I guess you could just do *(handles[i]) = util_cpu_to_le32(handle)?
Oh, right, there isn't such a function -- though it would be trivial to
implement.
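
A rough sketch of what such a helper could look like follows. The name sketch_cpu_to_le32 is hypothetical, and the sketch builds the result byte by byte rather than testing an endianness macro, so it does not depend on any build-time configuration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Return a value whose in-memory representation is 'v' in little-endian
 * byte order, whatever the host byte order is.  On a little-endian host
 * this is the identity; on a big-endian host it amounts to a byte swap.
 */
static inline uint32_t
sketch_cpu_to_le32(uint32_t v)
{
   const unsigned char bytes[4] = {
      (unsigned char)(v & 0xff),
      (unsigned char)((v >> 8) & 0xff),
      (unsigned char)((v >> 16) & 0xff),
      (unsigned char)((v >> 24) & 0xff)
   };
   uint32_t out;

   assert(sizeof(out) == sizeof(bytes));
   memcpy(&out, bytes, sizeof(out));
   return out;
}
```

With something like this, the explicit big-endian branch in the loop would collapse into a single assignment.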

   }
  
   evergreen_set_rat(ctx-cs_shader_state.shader, 0, pool-bo, 0, 
 pool-size_in_dw * 4);
 diff --git a/src/gallium/drivers/radeonsi/si_compute.c 
 b/src/gallium/drivers/radeonsi/si_compute.c
 index a7f49e7..43d521b 100644
 --- a/src/gallium/drivers/radeonsi/si_compute.c
 +++ b/src/gallium/drivers/radeonsi/si_compute.c
 @@ -107,8 +107,14 @@ static void si_set_global_binding(
  
   for (i = first; i < first + n; i++) {
   uint64_t va;
 + uint32_t offset;
   program->global_buffers[i] = resources[i];
   va = r600_resource_va(ctx->screen, resources[i]);
 + offset = util_le32_to_cpu(*handles[i]);
 + va += offset;
 + if (SI_BIG_ENDIAN) {
 + va = util_bswap64(va);
 + }
   memcpy(handles[i], &va, sizeof(va));
   }
  }
 diff --git a/src/gallium/include/pipe/p_context.h 
 b/src/gallium/include/pipe/p_context.h
 index 8ef6e27..209ec9e 100644
 --- a/src/gallium/include/pipe/p_context.h
 +++ b/src/gallium/include/pipe/p_context.h
 @@ -460,11 +460,14 @@ struct pipe_context {
  *   unless it's NULL, in which case no new
  *   resources will be bound.
  * \param handlesarray of pointers to the memory locations that
 -*   will be filled with the respective base
 -*   addresses each buffer will be mapped to.  It
 -*   should contain at least \a count elements,
 -*   unless \a resources is NULL in which case \a
 -*   handles should be NULL as well.
 +*   will be updated with the address each buffer
 +*   will be mapped to.  The base memory address of
 +*   each of the buffers will be added to the value
 +*   pointed to by its corresponding handle to form
 +*   the final address argument.  It should contain
 +*   at least \a count elements, unless \a
 +*   resources is NULL in which case \a handles
 +*   should be NULL as well.
  *
  * Note that the driver isn't required to make any guarantees about
  * the contents of the \a handles array being valid anytime except
 diff --git a/src/gallium/state_trackers/clover/core/kernel.cpp 
 b/src/gallium/state_trackers/clover/core/kernel.cpp
 index 6d894cd..b4d555c 100644
 --- a/src/gallium/state_trackers/clover/core/kernel.cpp
 +++ b/src/gallium/state_trackers/clover/core/kernel.cpp
 @@ -337,8 +337,17 @@ kernel::global_argument::bind(exec_context ctx,
 align(ctx.input, marg.target_align);
  
 if (buf) {
 -  ctx.g_handles.push_back(allocate(ctx.input, marg.target_size));
 -  ctx.g_buffers.push_back(buf->resource(*ctx.q).pipe);
 +  const resource &r = buf->resource(*ctx.q);
 +  ctx.g_handles.push_back(ctx.input.size());
 +  ctx.g_buffers.push_back(r.pipe);
 +
 +  // How to handle multi-dimensional offsets?
 +  // We don't need to.  Buffer offsets are always
 +  // one-dimensional.
 +  auto v = bytes(r.offset[0]);
 +  extend(v, marg.ext_type, marg.target_size);
 +  byteswap(v, ctx.q->dev.endianness());
 +  insert(ctx.input, v);
 } else {
// Null pointer.
 

[Mesa-dev] [PATCH] i965: Implement a CS stall workaround on Broadwell.

2014-02-19 Thread Kenneth Graunke
According to the latest documentation, any PIPE_CONTROL with the
Command Streamer Stall bit set must also have another bit set,
with five different options:

   - Render Target Cache Flush
   - Depth Cache Flush
   - Stall at Pixel Scoreboard
   - Post-Sync Operation
   - Depth Stall

I chose Stall at Pixel Scoreboard since we've used it effectively
in the past, but the choice is fairly arbitrary.

Implementing this in the PIPE_CONTROL emit helpers ensures that the
workaround will always take effect when it ought to.

Apparently, this workaround may be necessary on older hardware as well;
for now I've only added it to Broadwell as it's absolutely necessary
there.  Subsequent patches could add it to older platforms, provided
someone tests it there.

v2: Only flag Stall at Pixel Scoreboard when none of the other bits
are set (suggested by Ian Romanick).

Cc: Ian Romanick i...@freedesktop.org
Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/intel_batchbuffer.c | 36 +++
 1 file changed, 36 insertions(+)

Sure, that seems reasonable, Ian.  I've updated the patch to only
add stall at scoreboard when one of the other bits isn't already present.

diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c 
b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
index 4624268..bdb7b6b 100644
--- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
+++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
@@ -432,6 +432,38 @@ intel_batchbuffer_data(struct brw_context *brw,
 }
 
 /**
+ * According to the latest documentation, any PIPE_CONTROL with the
+ * Command Streamer Stall bit set must also have another bit set,
+ * with five different options:
+ *
+ *  - Render Target Cache Flush
+ *  - Depth Cache Flush
+ *  - Stall at Pixel Scoreboard
+ *  - Post-Sync Operation
+ *  - Depth Stall
+ *
+ * I chose Stall at Pixel Scoreboard since we've used it effectively
+ * in the past, but the choice is fairly arbitrary.
+ */
+static void
+add_cs_stall_workaround_bits(uint32_t *flags)
+{
+   uint32_t wa_bits = PIPE_CONTROL_WRITE_FLUSH |
+  PIPE_CONTROL_DEPTH_CACHE_FLUSH |
+  PIPE_CONTROL_WRITE_IMMEDIATE |
+  PIPE_CONTROL_WRITE_DEPTH_COUNT |
+  PIPE_CONTROL_WRITE_TIMESTAMP |
+  PIPE_CONTROL_STALL_AT_SCOREBOARD |
+  PIPE_CONTROL_DEPTH_STALL;
+
+   /* If we're doing a CS stall, and don't already have one of the
+* workaround bits set, add Stall at Pixel Scoreboard.
+*/
+   if ((*flags & PIPE_CONTROL_CS_STALL) != 0 && (*flags & wa_bits) == 0)
+  *flags |= PIPE_CONTROL_STALL_AT_SCOREBOARD;
+}
+
+/**
  * Emit a PIPE_CONTROL with various flushing flags.
  *
  * The caller is responsible for deciding what flags are appropriate for the
@@ -441,6 +473,8 @@ void
 brw_emit_pipe_control_flush(struct brw_context *brw, uint32_t flags)
 {
if (brw->gen >= 8) {
+  add_cs_stall_workaround_bits(&flags);
+
   BEGIN_BATCH(6);
   OUT_BATCH(_3DSTATE_PIPE_CONTROL | (6 - 2));
   OUT_BATCH(flags);
@@ -481,6 +515,8 @@ brw_emit_pipe_control_write(struct brw_context *brw, 
uint32_t flags,
 uint32_t imm_lower, uint32_t imm_upper)
 {
if (brw->gen >= 8) {
+  add_cs_stall_workaround_bits(&flags);
+
   BEGIN_BATCH(6);
   OUT_BATCH(_3DSTATE_PIPE_CONTROL | (6 - 2));
   OUT_BATCH(flags);
-- 
1.8.4.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/3] glcpp: Only warn for macro names containing __

2014-02-19 Thread Darius Spitznagel

On 19.02.2014 20:09, Ian Romanick wrote:

I'm hoping that Tapani or Darius will verify that this patch actually
fixes the problem.  That's why people CC other people on patches. :)

On 02/18/2014 10:19 AM, Ian Romanick wrote:

From: Ian Romanick ian.d.roman...@intel.com

Section 3.3 (Preprocessor) of the GLSL 1.30 spec (and later) and the
GLSL ES spec (all versions) say:

 All macro names containing two consecutive underscores ( __ ) are
 reserved for future use as predefined macro names. All macro names
 prefixed with "GL_" ("GL" followed by a single underscore) are also
 reserved.

The intention is that names containing __ are reserved for internal use
by the implementation, and names prefixed with GL_ are reserved for use
by Khronos.  Since every extension adds a name prefixed with GL_ (i.e.,
the name of the extension), that should be an error.  Names simply
containing __ are dangerous to use, but should be allowed.  In similar
cases, the C++ preprocessor specification says, "no diagnostic is
required."

Per the Khronos bug mentioned below, a future version of the GLSL
specification will clarify this.
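
The resulting policy can be sketched as a small stand-alone classifier. This is not the glcpp code: the sketch_* names are hypothetical, and unlike the parser, which may report both diagnostics for one name, the sketch returns only the most severe one.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical severity levels for a macro-name check. */
enum sketch_diag { SKETCH_OK, SKETCH_WARNING, SKETCH_ERROR };

/*
 * Names starting with "GL_" are rejected; names merely containing "__"
 * only draw a warning; everything else is accepted.
 */
static enum sketch_diag
sketch_check_macro_name(const char *identifier)
{
   assert(identifier != NULL);

   if (strncmp(identifier, "GL_", 3) == 0)
      return SKETCH_ERROR;

   if (strstr(identifier, "__") != NULL)
      return SKETCH_WARNING;

   return SKETCH_OK;
}
```

So a name like FOO__BAR still preprocesses, with a warning, while GL_FOO remains an error.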

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
Cc: 9.2 10.0 10.1 mesa-sta...@lists.freedesktop.org
Cc: Tapani Pälli lem...@gmail.com
Cc: Kenneth Graunke kenn...@whitecape.org
Cc: Darius Spitznagel d.spitzna...@goodbytez.de
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71870
Bugzilla: Khronos #11702
---
  src/glsl/glcpp/glcpp-parse.y   | 22 +++---
  .../tests/086-reserved-macro-names.c.expected  |  4 ++--
  2 files changed, 21 insertions(+), 5 deletions(-)

diff --git a/src/glsl/glcpp/glcpp-parse.y b/src/glsl/glcpp/glcpp-parse.y
index 5bb2891..bdc598f 100644
--- a/src/glsl/glcpp/glcpp-parse.y
+++ b/src/glsl/glcpp/glcpp-parse.y
@@ -1770,11 +1770,27 @@ static void
  _check_for_reserved_macro_name (glcpp_parser_t *parser, YYLTYPE *loc,
const char *identifier)
  {
-   /* According to the GLSL specification, macro names starting with __
-* or GL_ are reserved for future use.  So, don't allow them.
+   /* Section 3.3 (Preprocessor) of the GLSL 1.30 spec (and later) and
+* the GLSL ES spec (all versions) say:
+*
+* All macro names containing two consecutive underscores ( __ )
+* are reserved for future use as predefined macro names. All
+* macro names prefixed with GL_ (GL followed by a single
+* underscore) are also reserved.
+*
+* The intention is that names containing __ are reserved for internal
+* use by the implementation, and names prefixed with GL_ are reserved
+* for use by Khronos.  Since every extension adds a name prefixed
+* with GL_ (i.e., the name of the extension), that should be an
+* error.  Names simply containing __ are dangerous to use, but should
+* be allowed.
+*
+* A future version of the GLSL specification will clarify this.
 */
if (strstr(identifier, "__")) {
-   glcpp_error (loc, parser, "Macro names containing \"__\" are reserved.\n");
+   glcpp_warning(loc, parser,
+ "Macro names containing \"__\" are reserved "
+ "for use by the implementation.\n");
}
if (strncmp(identifier, "GL_", 3) == 0) {
glcpp_error (loc, parser, "Macro names starting with \"GL_\" are reserved.\n");
diff --git a/src/glsl/glcpp/tests/086-reserved-macro-names.c.expected 
b/src/glsl/glcpp/tests/086-reserved-macro-names.c.expected
index d8aa9f0..5ca42a9 100644
--- a/src/glsl/glcpp/tests/086-reserved-macro-names.c.expected
+++ b/src/glsl/glcpp/tests/086-reserved-macro-names.c.expected
@@ -1,8 +1,8 @@
-0:1(10): preprocessor error: Macro names containing "__" are reserved.
+0:1(10): preprocessor warning: Macro names containing "__" are reserved for use by the implementation.
  
  0:2(9): preprocessor error: Macro names starting with "GL_" are reserved.
  
-0:3(9): preprocessor error: Macro names containing "__" are reserved.
+0:3(9): preprocessor warning: Macro names containing "__" are reserved for use by the implementation.
  
  
  



All three patches worked as expected with mesa-10.0.3, which I am using
right now.

Thank you.

Kind regards
Darius
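
For readers following the thread, the behavior the patch settles on can be sketched outside glcpp. This is only a standalone illustration, not the glcpp code: check_macro_name() and the diag_level enum are hypothetical names, and the real parser emits diagnostics through glcpp_warning() and glcpp_error() rather than returning a value. Only the two string tests mirror the patch.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical severity levels, used here only so the result can be
 * inspected; glcpp reports diagnostics instead of returning them. */
enum diag_level { DIAG_NONE, DIAG_WARNING, DIAG_ERROR };

/* Classify a macro name the way the patched check does:
 *  - a name that merely contains "__" gets a warning,
 *  - a name that starts with "GL_" is an error.
 * When both apply, this sketch reports the more severe one. */
static enum diag_level
check_macro_name(const char *identifier)
{
   enum diag_level level = DIAG_NONE;

   assert(identifier != NULL);

   if (strstr(identifier, "__"))
      level = DIAG_WARNING;
   if (strncmp(identifier, "GL_", 3) == 0)
      level = DIAG_ERROR;

   return level;
}
```

With this, a name such as FOO__BAR is only flagged with a warning, while GL_FOO is still rejected.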



[Mesa-dev] [PATCH][RFC] dri3: Add support for the GLX_EXT_buffer_age extension

2014-02-19 Thread Adel Gadllah

Hi,

The attached patch adds support for the GLX_EXT_buffer_age extension, which is mostly used by compositors for efficient 
sub-screen updates.


The extension should not be reported as supported when running DRI2, but it seems to show up when I try to disable DRI3 
with LIBGL_DRI3_DISABLE. I am not sure why; suggestions welcome.



P.S: Please CC me when replying as I am not subscribed to the list.


From: Adel Gadllah adel.gadl...@gmail.com
Date: Sun, 16 Feb 2014 13:40:42 +0100
Subject: [PATCH] dri3: Add GLX_EXT_buffer_age support

---
 include/GL/glx.h  |  5 +
 include/GL/glxext.h   |  5 +
 src/glx/dri2_glx.c|  1 +
 src/glx/dri3_glx.c| 17 +
 src/glx/dri3_priv.h   |  2 ++
 src/glx/glx_pbuffer.c |  7 +++
 src/glx/glxclient.h   |  1 +
 src/glx/glxextensions.c   |  1 +
 src/glx/glxextensions.h   |  1 +
 src/mesa/drivers/x11/glxapi.c |  3 +++
 10 files changed, 43 insertions(+)

diff --git a/include/GL/glx.h b/include/GL/glx.h
index 234abc0..b8b4d75 100644
--- a/include/GL/glx.h
+++ b/include/GL/glx.h
@@ -161,6 +161,11 @@ extern C {
 #define GLX_SAMPLES 0x186a1 /*11*/


+/*
+ * GLX_EXT_buffer_age
+ */
+#define GLX_BACK_BUFFER_AGE_EXT 0x20F4
+

 typedef struct __GLXcontextRec *GLXContext;
 typedef XID GLXPixmap;
diff --git a/include/GL/glxext.h b/include/GL/glxext.h
index 8c642f3..36e92dc 100644
--- a/include/GL/glxext.h
+++ b/include/GL/glxext.h
@@ -383,6 +383,11 @@ void glXReleaseTexImageEXT (Display *dpy, GLXDrawable 
drawable, int buffer);
 #define GLX_FLIP_COMPLETE_INTEL   0x8182
 #endif /* GLX_INTEL_swap_event */

+#ifndef GLX_EXT_buffer_age
+#define GLX_EXT_buffer_age 1
+#define GLX_BACK_BUFFER_AGE_EXT 0x20F4
+#endif /* GLX_EXT_buffer_age */
+
 #ifndef GLX_MESA_agp_offset
 #define GLX_MESA_agp_offset 1
 typedef unsigned int ( *PFNGLXGETAGPOFFSETMESAPROC) (const void *pointer);
diff --git a/src/glx/dri2_glx.c b/src/glx/dri2_glx.c
index 67fe9c1..007f449 100644
--- a/src/glx/dri2_glx.c
+++ b/src/glx/dri2_glx.c
@@ -1288,6 +1288,7 @@ dri2CreateScreen(int screen, struct glx_display * priv)
psp->waitForSBC = NULL;
psp->setSwapInterval = NULL;
psp->getSwapInterval = NULL;
+   psp->queryBufferAge = NULL;

if (pdp->driMinor >= 2) {
   psp->getDrawableMSC = dri2DrawableGetMSC;
diff --git a/src/glx/dri3_glx.c b/src/glx/dri3_glx.c
index 70ec057..07120e1 100644
--- a/src/glx/dri3_glx.c
+++ b/src/glx/dri3_glx.c
@@ -1345,6 +1345,8 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, int64_t 
target_msc, int64_t divisor,
  target_msc = priv->msc + priv->swap_interval * (priv->send_sbc - priv->recv_sbc);

   priv->buffers[buf_id]->busy = 1;
+  priv->buffers[buf_id]->last_swap = priv->swap_count;
+
   xcb_present_pixmap(c,
  priv->base.xDrawable,
  priv->buffers[buf_id]->pixmap,
@@ -1379,11 +1381,23 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, int64_t target_msc, int64_t divisor,
   xcb_flush(c);
   if (priv->stamp)
  ++(*priv->stamp);
+
+   priv->swap_count++;
}

return ret;
 }

+static int
+dri3_query_buffer_age(__GLXDRIdrawable *pdraw)
+{
+  struct dri3_drawable *priv = (struct dri3_drawable *) pdraw;
+  int buf_id = DRI3_BACK_ID(priv->cur_back);
+  if (!priv->buffers[buf_id]->last_swap)
+    return 0;
+  return priv->swap_count - priv->buffers[buf_id]->last_swap;
+}
+
 /** dri3_open
  *
  * Wrapper around xcb_dri3_open
@@ -1742,6 +1756,9 @@ dri3_create_screen(int screen, struct glx_display * priv)
psp->copySubBuffer = dri3_copy_sub_buffer;
__glXEnableDirectExtension(&psc->base, "GLX_MESA_copy_sub_buffer");

+   psp->queryBufferAge = dri3_query_buffer_age;
+   __glXEnableDirectExtension(&psc->base, "GLX_EXT_buffer_age");
+
free(driverName);
free(deviceName);

diff --git a/src/glx/dri3_priv.h b/src/glx/dri3_priv.h
index 1d124f8..d00440a 100644
--- a/src/glx/dri3_priv.h
+++ b/src/glx/dri3_priv.h
@@ -97,6 +97,7 @@ struct dri3_buffer {
uint32_t cpp;
uint32_t flags;
uint32_t width, height;
+   uint32_t last_swap;

enum dri3_buffer_typebuffer_type;
 };
@@ -184,6 +185,7 @@ struct dri3_drawable {
struct dri3_buffer *buffers[DRI3_NUM_BUFFERS];
int cur_back;
int num_back;
+   uint32_t swap_count;

uint32_t *stamp;

diff --git a/src/glx/glx_pbuffer.c b/src/glx/glx_pbuffer.c
index 411d6e5..a87a0a4 100644
--- a/src/glx/glx_pbuffer.c
+++ b/src/glx/glx_pbuffer.c
@@ -365,6 +365,13 @@ GetDrawableAttribute(Display * dpy, GLXDrawable drawable,
 #if defined(GLX_DIRECT_RENDERING) && !defined(GLX_USE_APPLEGL)
  {
 __GLXDRIdrawable *pdraw = GetGLXDRIDrawable(dpy, drawable);
+struct glx_screen *psc = pdraw->psc;
+
+if (attribute == GLX_BACK_BUFFER_AGE_EXT && pdraw != NULL &&
+psc->driScreen->queryBufferAge != NULL) {
+
+*value = psc->driScreen->queryBufferAge (pdraw);
+}

 if 

Re: [Mesa-dev] [PATCH-RFC] i965: do not advertise MESA_FORMAT_Z_UNORM16 support

2014-02-19 Thread Ian Romanick

On 02/19/2014 12:08 PM, Kenneth Graunke wrote:
 On 02/18/2014 09:48 PM, Chia-I Wu wrote:
 Since 73bc6061f5c3b6a3bb7a8114bb2e1ab77d23cfdb, Z16 support is
 not advertised for OpenGL ES contexts due to the terrible
 performance.  It is still enabled for desktop GL because it was
 believed GL 3.0+ requires Z16.
 
 It turns out only GL 3.0 requires Z16, and that is corrected in
 later GL versions.  In light of that, and per Ian's suggestion,
 stop advertising Z16 support by default, and add a drirc option,
 gl30_sized_format_rules, so that users can override.
 
 I actually don't think that GL 3.0 requires Z16, either.
 
 In glspec30.20080923.pdf, page 180, it says: "[...] memory
 allocation per texture component is assigned by the GL to match the
 allocations listed in tables 3.16-3.18 as closely as possible.
 [...]
 
 Required Texture Formats [...] In addition, implementations are
 required to support the following sized internal formats.
 Requesting one of these internal formats for any texture type will
 allocate exactly the internal component sizes and types shown for
 that format in tables 3.16-3.17:"
 
 Notably, however, GL_DEPTH_COMPONENT16 does /not/ appear in table
 3.16 or table 3.17.  It appears in table 3.18, where the "exact"
 rule doesn't apply, and thus we fall back to the "closely as
 possible" rule.
 
 The confusing part is that the ordering of the tables in the PDF
 is:
 
 Table 3.16 (pages 182-184)
 Table 3.18 (bottom of page 184 to top of 185)
 Table 3.17 (page 185)
 
 I'm guessing that people saw table 3.16, then saw the one after
 with DEPTH_COMPONENT* formats, and assumed it was 3.17.  But it's
 not.

Yay latex!  Thank you for putting things in random order because it
fit better. :(

 I think we should just drop Z16 support entirely, and I think we
 should remove the requirement from the Piglit test.

If the test is wrong, and it sounds like it is, then I'm definitely in
favor of changing it.

The reason to have Z16 is low-bandwidth GPUs in resource constrained
environments.  If an app specifically asks for Z16, then there's a
non-zero (though possibly infinitesimal) probability they're doing it
for a reason.  For at least some platforms, isn't there just a
work-around to implement to fix the performance issue?  Doesn't the
performance issue only affect some platforms to begin with?

Maybe just change the check to

   ctx->TextureFormatSupported[MESA_FORMAT_Z_UNORM16] =
  ! platform has z16 performance issues;

 This regresses required-sized-texture-formats on GL 3.0.
 
 Signed-off-by: Chia-I Wu o...@lunarg.com
 Cc: Ian Romanick ian.d.roman...@intel.com
 ---
  src/mesa/drivers/dri/i965/brw_context.c         | 3 +++
  src/mesa/drivers/dri/i965/brw_context.h         | 1 +
  src/mesa/drivers/dri/i965/brw_surface_formats.c | 7 ++++---
  src/mesa/drivers/dri/i965/intel_screen.c        | 4 ++++
  4 files changed, 12 insertions(+), 3 deletions(-)
 
 diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
 index ffbdb94..8ecf80b 100644
 --- a/src/mesa/drivers/dri/i965/brw_context.c
 +++ b/src/mesa/drivers/dri/i965/brw_context.c
 @@ -553,6 +553,9 @@ brw_process_driconf_options(struct brw_context *brw)
     brw->disable_derivative_optimization =
        driQueryOptionb(&brw->optionCache, "disable_derivative_optimization");
 
 +   brw->enable_z16 =
 +      driQueryOptionb(&brw->optionCache, "gl30_sized_format_rules");
 +
     brw->precompile = driQueryOptionb(&brw->optionCache, "shader_precompile");
 
     ctx->Const.ForceGLSLExtensionsWarn =
 diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
 index 98e90e2..fd10884 100644
 --- a/src/mesa/drivers/dri/i965/brw_context.h
 +++ b/src/mesa/drivers/dri/i965/brw_context.h
 @@ -1093,6 +1093,7 @@ struct brw_context
     bool disable_throttling;
     bool precompile;
     bool disable_derivative_optimization;
 +   bool enable_z16;
 
     driOptionCache optionCache;
     /** @} */
 diff --git a/src/mesa/drivers/dri/i965/brw_surface_formats.c b/src/mesa/drivers/dri/i965/brw_surface_formats.c
 index 6a7e00a..1d5f044 100644
 --- a/src/mesa/drivers/dri/i965/brw_surface_formats.c
 +++ b/src/mesa/drivers/dri/i965/brw_surface_formats.c
 @@ -623,10 +623,11 @@ brw_init_surface_formats(struct brw_context *brw)
      * increased depth stalls from a cacheline-based heuristic for detecting
      * depth stalls.
      *
 -    * However, desktop GL 3.0+ require that you get exactly 16 bits when
 -    * asking for DEPTH_COMPONENT16, so we have to respect that.
 +    * However, desktop GL 3.0, and no other version, requires that you get
 +    * exactly 16 bits when asking for DEPTH_COMPONENT16, so we have an drirc
 +    * option to decide whether to respect that or not.
      */
 -   if (_mesa_is_desktop_gl(ctx))
 +   if (brw->enable_z16)
        ctx->TextureFormatSupported[MESA_FORMAT_Z_UNORM16] = true;
 
     /* On hardware that lacks support for ETC1, we map ETC1 to RGBX
 diff --git a/src/mesa/drivers/dri/i965/intel_screen.c
 

[Mesa-dev] [PATCH] st/omx/enc: add multi scaling buffers for performance improvement

2014-02-19 Thread Leo Liu
From: Leo Liu leoxs...@gmail.com

Signed-off-by: Leo Liu leoxs...@gmail.com
---
 src/gallium/state_trackers/omx/vid_enc.c | 39 
 src/gallium/state_trackers/omx/vid_enc.h |  7 --
 2 files changed, 29 insertions(+), 17 deletions(-)

diff --git a/src/gallium/state_trackers/omx/vid_enc.c 
b/src/gallium/state_trackers/omx/vid_enc.c
index 6e65274..fcdb305 100644
--- a/src/gallium/state_trackers/omx/vid_enc.c
+++ b/src/gallium/state_trackers/omx/vid_enc.c
@@ -273,8 +273,9 @@ static OMX_ERRORTYPE vid_enc_Destructor(OMX_COMPONENTTYPE 
*comp)
vl_compositor_cleanup_state(priv-cstate);
vl_compositor_cleanup(priv-compositor);
  
-   if (priv->scale_buffer)
- priv->scale_buffer->destroy(priv->scale_buffer);
+   for (i = 0; i < OMX_VID_ENC_NUM_SCALING_BUFFERS; ++i)
+  if (priv->scale_buffer[i])
+ priv->scale_buffer[i]->destroy(priv->scale_buffer[i]);
 
if (priv->s_pipe)
   priv->s_pipe->destroy(priv->s_pipe);
@@ -447,7 +448,8 @@ static OMX_ERRORTYPE vid_enc_SetConfig(OMX_HANDLETYPE 
handle, OMX_INDEXTYPE idx,
OMX_COMPONENTTYPE *comp = handle;
vid_enc_PrivateType *priv = comp-pComponentPrivate;
OMX_ERRORTYPE r;
-
+   int i;
+
if (!config)
   return OMX_ErrorBadParameter;
  
@@ -473,11 +475,12 @@ static OMX_ERRORTYPE vid_enc_SetConfig(OMX_HANDLETYPE 
handle, OMX_INDEXTYPE idx,
   if (scale->xWidth < 176 || scale->xHeight < 144)
  return OMX_ErrorBadParameter;
 
-  if (priv->scale_buffer) {
- priv->scale_buffer->destroy(priv->scale_buffer);
- priv->scale_buffer = NULL;
+  for (i = 0; i < OMX_VID_ENC_NUM_SCALING_BUFFERS; ++i) {
+ if (priv->scale_buffer[i]) {
+priv->scale_buffer[i]->destroy(priv->scale_buffer[i]);
+priv->scale_buffer[i] = NULL;
+ }
   }
-
   priv-scale = *scale;
   if (priv-scale.xWidth != 0x  priv-scale.xHeight != 
0x) {
  struct pipe_video_buffer templat = {};
@@ -487,9 +490,11 @@ static OMX_ERRORTYPE vid_enc_SetConfig(OMX_HANDLETYPE 
handle, OMX_INDEXTYPE idx,
  templat.width = priv-scale.xWidth; 
  templat.height = priv-scale.xHeight; 
  templat.interlaced = false;
- priv-scale_buffer = priv-s_pipe-create_video_buffer(priv-s_pipe, 
templat);
- if (!priv-scale_buffer)
-return OMX_ErrorInsufficientResources;
+ for (i = 0; i  OMX_VID_ENC_NUM_SCALING_BUFFERS; ++i) {
+priv-scale_buffer[i] = 
priv-s_pipe-create_video_buffer(priv-s_pipe, templat);
+if (!priv-scale_buffer[i])
+   return OMX_ErrorInsufficientResources;
+ }
   }
 
   break;
@@ -545,8 +550,10 @@ static OMX_ERRORTYPE 
vid_enc_MessageHandler(OMX_COMPONENTTYPE* comp, internalReq
  templat.profile = PIPE_VIDEO_PROFILE_MPEG4_AVC_BASELINE;
  templat.entrypoint = PIPE_VIDEO_ENTRYPOINT_ENCODE;
  templat.chroma_format = PIPE_VIDEO_CHROMA_FORMAT_420;
- templat.width = priv-scale_buffer ? priv-scale.xWidth : 
port-sPortParam.format.video.nFrameWidth;
- templat.height = priv-scale_buffer ? priv-scale.xHeight : 
port-sPortParam.format.video.nFrameHeight;
+ templat.width = priv-scale_buffer[priv-current_scale_buffer] ?
+priv-scale.xWidth : 
port-sPortParam.format.video.nFrameWidth;
+ templat.height = priv-scale_buffer[priv-current_scale_buffer] ?
+priv-scale.xHeight : 
port-sPortParam.format.video.nFrameHeight;
  templat.max_references = 1;
 
  priv-codec = priv-s_pipe-create_video_codec(priv-s_pipe, 
templat);
@@ -736,7 +743,7 @@ static OMX_ERRORTYPE vid_enc_EncodeFrame(omx_base_PortType 
*port, OMX_BUFFERHEAD
 
/* -- scale input image - */
 
-   if (priv->scale_buffer) {
+   if (priv->scale_buffer[priv->current_scale_buffer]) {
   struct vl_compositor *compositor = priv-compositor;
   struct vl_compositor_state *s = priv-cstate;
   struct pipe_sampler_view **views;
@@ -744,7 +751,8 @@ static OMX_ERRORTYPE vid_enc_EncodeFrame(omx_base_PortType 
*port, OMX_BUFFERHEAD
   unsigned i;
 
   views = vbuf-get_sampler_view_planes(vbuf);
-  dst_surface = priv->scale_buffer->get_surfaces(priv->scale_buffer);
+  dst_surface = priv->scale_buffer[priv->current_scale_buffer]->get_surfaces
+   (priv->scale_buffer[priv->current_scale_buffer]);
   vl_compositor_clear_layers(s);
 
   for (i = 0; i  VL_MAX_SURFACES; ++i) {
@@ -768,7 +776,8 @@ static OMX_ERRORTYPE vid_enc_EncodeFrame(omx_base_PortType 
*port, OMX_BUFFERHEAD
   }
   
   size  = priv->scale.xWidth * priv->scale.xHeight * 2; 
-  vbuf = priv->scale_buffer; 
+  vbuf = priv->scale_buffer[priv->current_scale_buffer++];
+  priv->current_scale_buffer %= OMX_VID_ENC_NUM_SCALING_BUFFERS;
}
 
priv->s_pipe->flush(priv->s_pipe, NULL, 0);
diff --git a/src/gallium/state_trackers/omx/vid_enc.h 

Re: [Mesa-dev] [PATCH][RFC] dri3: Add support for the GLX_EXT_buffer_age extension

2014-02-19 Thread Ilia Mirkin
On Wed, Feb 19, 2014 at 5:49 PM, Adel Gadllah adel.gadl...@gmail.com wrote:
 [snip]

Re: [Mesa-dev] [PATCH] i965: Implement a CS stall workaround on Broadwell.

2014-02-19 Thread Ian Romanick
Reviewed-by: Ian Romanick ian.d.roman...@intel.com

On 02/19/2014 04:28 PM, Kenneth Graunke wrote:
 According to the latest documentation, any PIPE_CONTROL with the
 Command Streamer Stall bit set must also have another bit set,
 with five different options:
 
- Render Target Cache Flush
- Depth Cache Flush
- Stall at Pixel Scoreboard
- Post-Sync Operation
- Depth Stall
 
 I chose Stall at Pixel Scoreboard since we've used it effectively
 in the past, but the choice is fairly arbitrary.
 
 Implementing this in the PIPE_CONTROL emit helpers ensures that the
 workaround will always take effect when it ought to.
 
 Apparently, this workaround may be necessary on older hardware as well;
 for now I've only added it to Broadwell as it's absolutely necessary
 there.  Subsequent patches could add it to older platforms, provided
 someone tests it there.
 
 v2: Only flag Stall at Pixel Scoreboard when none of the other bits
 are set (suggested by Ian Romanick).
 
 Cc: Ian Romanick i...@freedesktop.org
 Signed-off-by: Kenneth Graunke kenn...@whitecape.org
 ---
  src/mesa/drivers/dri/i965/intel_batchbuffer.c | 36 
 +++
  1 file changed, 36 insertions(+)
 
 Sure, that seems reasonable, Ian.  I've updated the patch to only
 add stall at scoreboard when one of the other bits isn't already present.
 
 diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c 
 b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
 index 4624268..bdb7b6b 100644
 --- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
 +++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
 @@ -432,6 +432,38 @@ intel_batchbuffer_data(struct brw_context *brw,
  }
  
  /**
 + * According to the latest documentation, any PIPE_CONTROL with the
 + * Command Streamer Stall bit set must also have another bit set,
 + * with five different options:
 + *
 + *  - Render Target Cache Flush
 + *  - Depth Cache Flush
 + *  - Stall at Pixel Scoreboard
 + *  - Post-Sync Operation
 + *  - Depth Stall
 + *
 + * I chose Stall at Pixel Scoreboard since we've used it effectively
 + * in the past, but the choice is fairly arbitrary.
 + */
 +static void
 +add_cs_stall_workaround_bits(uint32_t *flags)
 +{
 +   uint32_t wa_bits = PIPE_CONTROL_WRITE_FLUSH |
 +  PIPE_CONTROL_DEPTH_CACHE_FLUSH |
 +  PIPE_CONTROL_WRITE_IMMEDIATE |
 +  PIPE_CONTROL_WRITE_DEPTH_COUNT |
 +  PIPE_CONTROL_WRITE_TIMESTAMP |
 +  PIPE_CONTROL_STALL_AT_SCOREBOARD |
 +  PIPE_CONTROL_DEPTH_STALL;
 +
 +   /* If we're doing a CS stall, and don't already have one of the
 +* workaround bits set, add Stall at Pixel Scoreboard.
 +*/
 +   if ((*flags & PIPE_CONTROL_CS_STALL) != 0 && (*flags & wa_bits) == 0)
 +  *flags |= PIPE_CONTROL_STALL_AT_SCOREBOARD;
 +}
 +
 +/**
   * Emit a PIPE_CONTROL with various flushing flags.
   *
   * The caller is responsible for deciding what flags are appropriate for the
 @@ -441,6 +473,8 @@ void
  brw_emit_pipe_control_flush(struct brw_context *brw, uint32_t flags)
  {
 if (brw->gen >= 8) {
 +  add_cs_stall_workaround_bits(&flags);
 +
BEGIN_BATCH(6);
OUT_BATCH(_3DSTATE_PIPE_CONTROL | (6 - 2));
OUT_BATCH(flags);
 @@ -481,6 +515,8 @@ brw_emit_pipe_control_write(struct brw_context *brw, 
 uint32_t flags,
  uint32_t imm_lower, uint32_t imm_upper)
  {
 if (brw->gen >= 8) {
 +  add_cs_stall_workaround_bits(&flags);
 +
BEGIN_BATCH(6);
OUT_BATCH(_3DSTATE_PIPE_CONTROL | (6 - 2));
OUT_BATCH(flags);
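
The v2 rule (add Stall at Pixel Scoreboard only when a CS stall is requested and none of the acceptable companion bits is already set) can be exercised on its own. The sketch below is a standalone model, not driver code: the CTL_* bit positions are placeholders chosen for illustration, the three post-sync write flags are collapsed into a single CTL_POST_SYNC_OP bit, and with_cs_stall_workaround() is a hypothetical name that returns the adjusted flags instead of modifying them through a pointer.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder bit positions for illustration only; the real values come
 * from the driver's register definitions. */
#define CTL_DEPTH_CACHE_FLUSH    (1u << 0)
#define CTL_STALL_AT_SCOREBOARD  (1u << 1)
#define CTL_RT_CACHE_FLUSH       (1u << 2)
#define CTL_DEPTH_STALL          (1u << 3)
#define CTL_POST_SYNC_OP         (1u << 4)
#define CTL_CS_STALL             (1u << 5)

/* Return flags with the workaround applied: a CS stall must be paired
 * with at least one of the five companion bits. */
static uint32_t
with_cs_stall_workaround(uint32_t flags)
{
   const uint32_t wa_bits = CTL_RT_CACHE_FLUSH |
                            CTL_DEPTH_CACHE_FLUSH |
                            CTL_STALL_AT_SCOREBOARD |
                            CTL_POST_SYNC_OP |
                            CTL_DEPTH_STALL;

   if ((flags & CTL_CS_STALL) != 0 && (flags & wa_bits) == 0)
      flags |= CTL_STALL_AT_SCOREBOARD;

   return flags;
}
```

A bare CS stall gains the scoreboard stall, while a CS stall that already carries, say, a depth stall is left unchanged.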
 



[Mesa-dev] [PATCH 6/6] meta: Add support for integer blits.

2014-02-19 Thread Eric Anholt
Compared to i965, the code generated doesn't use the AVG instruction.  But
I'm not sure that multisampled integer resolves are really that important
to worry about.
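
For illustration, the integer averaging that stands in for AVG can be written in plain C. This is a sketch under assumptions, not the generated shader: merge_u32() and resolve_u32() are hypothetical names, only the unsigned case is shown, and the pairwise reduction assumes a power-of-two sample count. Because every merge rounds down, the result approximates the true mean rather than matching it exactly.

```c
#include <assert.h>
#include <stdint.h>

/* Average of two unsigned values, rounded down, without forming a + b
 * (which could overflow): halve each operand, then add back the 1 that
 * is lost when both low bits are set. */
static uint32_t
merge_u32(uint32_t a, uint32_t b)
{
   return (a >> 1) + (b >> 1) + (a & b & 1u);
}

/* Reduce a power-of-two number of samples by merging pairs in place;
 * the result ends up in samples[0]. */
static uint32_t
resolve_u32(uint32_t *samples, unsigned count)
{
   assert(count != 0 && (count & (count - 1)) == 0);

   for (unsigned step = 1; step < count; step *= 2)
      for (unsigned i = 0; i + step < count; i += 2 * step)
         samples[i] = merge_u32(samples[i], samples[i + step]);

   return samples[0];
}
```

The point of the shift-and-carry form is that it stays correct even when both inputs are at the top of the integer range.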
---
 src/mesa/drivers/common/meta.h  | 10 ++
 src/mesa/drivers/common/meta_blit.c | 68 +
 2 files changed, 71 insertions(+), 7 deletions(-)

diff --git a/src/mesa/drivers/common/meta.h b/src/mesa/drivers/common/meta.h
index c7a21fc..fcf45c4 100644
--- a/src/mesa/drivers/common/meta.h
+++ b/src/mesa/drivers/common/meta.h
@@ -221,9 +221,19 @@ struct blit_shader_table {
struct blit_shader sampler_cubemap_array;
 };
 
+/**
+ * Indices in the blit_state-msaa_shaders[] array
+ *
+ * Note that setup_glsl_msaa_blit_shader() assumes that the _INT enums are one
+ * more than the non-_INT version and _UINT is one beyond that.
+ */
 enum blit_msaa_shader {
BLIT_MSAA_SHADER_2D_MULTISAMPLE_RESOLVE,
+   BLIT_MSAA_SHADER_2D_MULTISAMPLE_RESOLVE_INT,
+   BLIT_MSAA_SHADER_2D_MULTISAMPLE_RESOLVE_UINT,
BLIT_MSAA_SHADER_2D_MULTISAMPLE_COPY,
+   BLIT_MSAA_SHADER_2D_MULTISAMPLE_COPY_INT,
+   BLIT_MSAA_SHADER_2D_MULTISAMPLE_COPY_UINT,
BLIT_MSAA_SHADER_2D_MULTISAMPLE_DEPTH_RESOLVE,
BLIT_MSAA_SHADER_2D_MULTISAMPLE_DEPTH_COPY,
BLIT_MSAA_SHADER_COUNT,
diff --git a/src/mesa/drivers/common/meta_blit.c 
b/src/mesa/drivers/common/meta_blit.c
index 7f5416d..34b58d9 100644
--- a/src/mesa/drivers/common/meta_blit.c
+++ b/src/mesa/drivers/common/meta_blit.c
@@ -95,9 +95,24 @@ setup_glsl_msaa_blit_shader(struct gl_context *ctx,
enum blit_msaa_shader shader_index;
const char *samplers[] = {
   [BLIT_MSAA_SHADER_2D_MULTISAMPLE_RESOLVE] = "sampler2DMS",
+  [BLIT_MSAA_SHADER_2D_MULTISAMPLE_RESOLVE_INT] = "isampler2DMS",
+  [BLIT_MSAA_SHADER_2D_MULTISAMPLE_RESOLVE_UINT] = "usampler2DMS",
   [BLIT_MSAA_SHADER_2D_MULTISAMPLE_COPY] = "sampler2DMS",
+  [BLIT_MSAA_SHADER_2D_MULTISAMPLE_COPY_INT] = "isampler2DMS",
+  [BLIT_MSAA_SHADER_2D_MULTISAMPLE_COPY_UINT] = "usampler2DMS",
};
bool dst_is_msaa = false;
+   GLenum src_datatype;
+   const char *vec4_prefix;
+
+   if (src_rb) {
+  src_datatype = _mesa_get_format_datatype(src_rb->Format);
+   } else {
+  /* depth-or-color glCopyTexImage fallback path that passes a NULL rb and
+   * doesn't handle integer.
+   */
+  src_datatype = GL_UNSIGNED_NORMALIZED;
+   }
 
if (ctx->DrawBuffer->Visual.samples > 1) {
   /* If you're calling meta_BlitFramebuffer with the destination
@@ -135,6 +150,21 @@ setup_glsl_msaa_blit_shader(struct gl_context *ctx,
   shader_index = BLIT_MSAA_SHADER_2D_MULTISAMPLE_RESOLVE;
}
 
+   /* We rely on the enum being sorted this way. */
+   STATIC_ASSERT(BLIT_MSAA_SHADER_2D_MULTISAMPLE_RESOLVE_INT ==
+ BLIT_MSAA_SHADER_2D_MULTISAMPLE_RESOLVE + 1);
+   STATIC_ASSERT(BLIT_MSAA_SHADER_2D_MULTISAMPLE_RESOLVE_UINT ==
+ BLIT_MSAA_SHADER_2D_MULTISAMPLE_RESOLVE + 2);
+   if (src_datatype == GL_INT) {
+  shader_index++;
+  vec4_prefix = "i";
+   } else if (src_datatype == GL_UNSIGNED_INT) {
+  shader_index += 2;
+  vec4_prefix = "u";
+   } else {
+  vec4_prefix = "";
+   }
+
if (blit->msaa_shaders[shader_index]) {
   _mesa_UseProgram(blit->msaa_shaders[shader_index]);
   return;
@@ -199,11 +229,25 @@ setup_glsl_msaa_blit_shader(struct gl_context *ctx,
   int samples = MAX2(src_rb-NumSamples, 1);
   char *sample_resolve;
   const char *arb_sample_shading_extension_string;
+  const char *merge_function;
 
   if (dst_is_msaa) {
  arb_sample_shading_extension_string = "#extension GL_ARB_sample_shading : enable";
  sample_resolve = ralloc_asprintf(mem_ctx, "out_color = texelFetch(texSampler, ivec2(texCoords), gl_SampleID);");
+ merge_function = "";
   } else {
+ if (src_datatype == GL_INT) {
+merge_function =
+   "ivec4 merge(ivec4 a, ivec4 b) { return (a >> ivec4(1)) + (b >> ivec4(1)) + (a & b & ivec4(1)); }\n";
+ } else if (src_datatype == GL_UNSIGNED_INT) {
+merge_function =
+   "uvec4 merge(uvec4 a, uvec4 b) { return (a >> uvec4(1)) + (b >> uvec4(1)) + (a & b & uvec4(1)); }\n";
+ } else {
+/* The divide will happen at the end for floats. */
+merge_function =
+   "vec4 merge(vec4 a, vec4 b) { return (a + b); }\n";
+ }
+
  arb_sample_shading_extension_string = "";
 
  /* We're assuming power of two samples for this resolution procedure.
@@ -218,8 +262,8 @@ setup_glsl_msaa_blit_shader(struct gl_context *ctx,
  sample_resolve = rzalloc_size(mem_ctx, 1);
  for (int i = 0; i  samples; i++) {
 ralloc_asprintf_append(sample_resolve,
-  vec4 sample_1_%d = 
texelFetch(texSampler, ivec2(texCoords), %d);\n,
-   i, i);
+  %svec4 sample_1_%d = 
texelFetch(texSampler, 

[Mesa-dev] [PATCH 5/6] meta: Add support for doing MSAA to MSAA blits.

2014-02-19 Thread Eric Anholt
These are non-stretched, non-resolving blits, so it's just a matter of
sampling once from our gl_SampleID and storing that to our color/depth.
---
 src/mesa/drivers/common/meta.h  |   6 +-
 src/mesa/drivers/common/meta_blit.c | 147 
 2 files changed, 104 insertions(+), 49 deletions(-)

diff --git a/src/mesa/drivers/common/meta.h b/src/mesa/drivers/common/meta.h
index 7d4474e..c7a21fc 100644
--- a/src/mesa/drivers/common/meta.h
+++ b/src/mesa/drivers/common/meta.h
@@ -222,8 +222,10 @@ struct blit_shader_table {
 };
 
 enum blit_msaa_shader {
-   BLIT_MSAA_SHADER_2D_MULTISAMPLE,
-   BLIT_MSAA_SHADER_2D_MULTISAMPLE_DEPTH,
+   BLIT_MSAA_SHADER_2D_MULTISAMPLE_RESOLVE,
+   BLIT_MSAA_SHADER_2D_MULTISAMPLE_COPY,
+   BLIT_MSAA_SHADER_2D_MULTISAMPLE_DEPTH_RESOLVE,
+   BLIT_MSAA_SHADER_2D_MULTISAMPLE_DEPTH_COPY,
BLIT_MSAA_SHADER_COUNT,
 };
 
diff --git a/src/mesa/drivers/common/meta_blit.c 
b/src/mesa/drivers/common/meta_blit.c
index 65e2692..7f5416d 100644
--- a/src/mesa/drivers/common/meta_blit.c
+++ b/src/mesa/drivers/common/meta_blit.c
@@ -35,6 +35,7 @@
 #include "main/fbobject.h"
 #include "main/macros.h"
 #include "main/matrix.h"
+#include "main/multisample.h"
 #include "main/readpix.h"
 #include "main/shaderapi.h"
 #include "main/texobj.h"
@@ -93,22 +94,45 @@ setup_glsl_msaa_blit_shader(struct gl_context *ctx,
void *mem_ctx;
enum blit_msaa_shader shader_index;
const char *samplers[] = {
-  [BLIT_MSAA_SHADER_2D_MULTISAMPLE] = "sampler2DMS",
+  [BLIT_MSAA_SHADER_2D_MULTISAMPLE_RESOLVE] = "sampler2DMS",
+  [BLIT_MSAA_SHADER_2D_MULTISAMPLE_COPY] = "sampler2DMS",
};
+   bool dst_is_msaa = false;
+
+   if (ctx->DrawBuffer->Visual.samples > 1) {
+  /* If you're calling meta_BlitFramebuffer with the destination
+   * multisampled, this is the only path that will work -- swrast and
+   * CopyTexImage won't work on it either.
+   */
+  assert(ctx->Extensions.ARB_sample_shading);
+
+  dst_is_msaa = true;
+
+  /* We need shader invocation per sample, not per pixel */
+  _mesa_set_enable(ctx, GL_MULTISAMPLE, GL_TRUE);
+  _mesa_set_enable(ctx, GL_SAMPLE_SHADING, GL_TRUE);
+  _mesa_MinSampleShading(1.0);
+   }
 
switch (target) {
case GL_TEXTURE_2D_MULTISAMPLE:
   if (src_rb->_BaseFormat == GL_DEPTH_COMPONENT ||
   src_rb->_BaseFormat == GL_DEPTH_STENCIL) {
- shader_index = BLIT_MSAA_SHADER_2D_MULTISAMPLE_DEPTH;
+ if (dst_is_msaa)
+shader_index = BLIT_MSAA_SHADER_2D_MULTISAMPLE_DEPTH_COPY;
+ else
+shader_index = BLIT_MSAA_SHADER_2D_MULTISAMPLE_DEPTH_RESOLVE;
   } else {
- shader_index = BLIT_MSAA_SHADER_2D_MULTISAMPLE;
+ if (dst_is_msaa)
+shader_index = BLIT_MSAA_SHADER_2D_MULTISAMPLE_COPY;
+ else
+shader_index = BLIT_MSAA_SHADER_2D_MULTISAMPLE_RESOLVE;
   }
   break;
default:
   _mesa_problem(ctx, "Unkown texture target %s\n",
 _mesa_lookup_enum_by_nr(target));
-  shader_index = BLIT_MSAA_SHADER_2D_MULTISAMPLE;
+  shader_index = BLIT_MSAA_SHADER_2D_MULTISAMPLE_RESOLVE;
}
 
 if (blit->msaa_shaders[shader_index]) {
@@ -118,17 +142,32 @@ setup_glsl_msaa_blit_shader(struct gl_context *ctx,
 
mem_ctx = ralloc_context(NULL);
 
-   if (shader_index == BLIT_MSAA_SHADER_2D_MULTISAMPLE_DEPTH) {
-  /* From the GL 4.3 spec:
-   *
-   * If there is a multisample buffer (the value of SAMPLE_BUFFERS is
-   *  one), then values are obtained from the depth samples in this
-   *  buffer. It is recommended that the depth value of the centermost
-   *  sample be used, though implementations may choose any function
-   *  of the depth sample values at each pixel.
-   *
-   * We're slacking and instead of choosing centermost, we've got 0.
-   */
+   if (shader_index == BLIT_MSAA_SHADER_2D_MULTISAMPLE_DEPTH_RESOLVE ||
+   shader_index == BLIT_MSAA_SHADER_2D_MULTISAMPLE_DEPTH_COPY) {
+  char *sample_index;
+  const char *arb_sample_shading_extension_string;
+
+  if (dst_is_msaa) {
+ arb_sample_shading_extension_string = "#extension GL_ARB_sample_shading : enable";
+ sample_index = "gl_SampleID";
+  } else {
+ /* Don't need that extension, since we're drawing to a single-sampled
+  * destination.
+  */
+ arb_sample_shading_extension_string = "";
+ /* From the GL 4.3 spec:
+  *
+  * If there is a multisample buffer (the value of SAMPLE_BUFFERS
+  *  is one), then values are obtained from the depth samples in
+  *  this buffer. It is recommended that the depth value of the
+  *  centermost sample be used, though implementations may choose
+  *  any function of the depth sample values at each pixel.
+  *
+  * We're slacking and instead of choosing centermost, we've got 0.
+  */

[Mesa-dev] [PATCH 2/6] meta: Add support for doing multisample resolves.

2014-02-19 Thread Eric Anholt
Note that this doesn't handle GL_EXT_multisample_scaled_blit yet.  The
i965 code for that extension bakes in knowledge of the sample positions
(well, knowledge of the sample positions aligned to a lower-resolution
grid), which we would have to do at runtime somehow for meta.
---
 src/mesa/drivers/common/meta.h  |   7 ++
 src/mesa/drivers/common/meta_blit.c | 202 +---
 2 files changed, 197 insertions(+), 12 deletions(-)

diff --git a/src/mesa/drivers/common/meta.h b/src/mesa/drivers/common/meta.h
index 822bfa1..5d79253 100644
--- a/src/mesa/drivers/common/meta.h
+++ b/src/mesa/drivers/common/meta.h
@@ -221,6 +221,12 @@ struct blit_shader_table {
struct blit_shader sampler_cubemap_array;
 };
 
+enum blit_msaa_shader {
+   BLIT_MSAA_SHADER_2D_MULTISAMPLE,
+   BLIT_MSAA_SHADER_2D_MULTISAMPLE_DEPTH,
+   BLIT_MSAA_SHADER_COUNT,
+};
+
 /**
  * State for glBlitFramebufer()
  */
@@ -230,6 +236,7 @@ struct blit_state
GLuint VBO;
GLuint DepthFP;
struct blit_shader_table shaders;
+   GLuint msaa_shaders[BLIT_MSAA_SHADER_COUNT];
struct temp_texture depthTex;
 };
 
diff --git a/src/mesa/drivers/common/meta_blit.c 
b/src/mesa/drivers/common/meta_blit.c
index a2b284b..be91247 100644
--- a/src/mesa/drivers/common/meta_blit.c
+++ b/src/mesa/drivers/common/meta_blit.c
@@ -31,6 +31,7 @@
 #include "main/condrender.h"
 #include "main/depth.h"
 #include "main/enable.h"
+#include "main/enums.h"
 #include "main/fbobject.h"
 #include "main/macros.h"
 #include "main/matrix.h"
@@ -81,8 +82,158 @@ init_blit_depth_pixels(struct gl_context *ctx)
 }
 
 static void
+setup_glsl_msaa_blit_shader(struct gl_context *ctx,
+struct blit_state *blit,
+struct gl_renderbuffer *src_rb,
+GLenum target)
+{
+   const char *vs_source;
+   char *fs_source;
+   GLuint vs, fs;
+   void *mem_ctx;
+   enum blit_msaa_shader shader_index;
+   const char *samplers[] = {
+  [BLIT_MSAA_SHADER_2D_MULTISAMPLE] = "sampler2DMS",
+   };
+
+   switch (target) {
+   case GL_TEXTURE_2D_MULTISAMPLE:
+  if (src_rb->_BaseFormat == GL_DEPTH_COMPONENT ||
+  src_rb->_BaseFormat == GL_DEPTH_STENCIL) {
+ shader_index = BLIT_MSAA_SHADER_2D_MULTISAMPLE_DEPTH;
+  } else {
+ shader_index = BLIT_MSAA_SHADER_2D_MULTISAMPLE;
+  }
+  break;
+   default:
+  _mesa_problem(ctx, "Unkown texture target %s\n",
+_mesa_lookup_enum_by_nr(target));
+  shader_index = BLIT_MSAA_SHADER_2D_MULTISAMPLE;
+   }
+
+   if (blit->msaa_shaders[shader_index]) {
+  _mesa_UseProgram(blit->msaa_shaders[shader_index]);
+  return;
+   }
+
+   mem_ctx = ralloc_context(NULL);
+
+   if (shader_index == BLIT_MSAA_SHADER_2D_MULTISAMPLE_DEPTH) {
+  /* From the GL 4.3 spec:
+   *
+   * "If there is a multisample buffer (the value of SAMPLE_BUFFERS is
+   *  one), then values are obtained from the depth samples in this
+   *  buffer. It is recommended that the depth value of the centermost
+   *  sample be used, though implementations may choose any function
+   *  of the depth sample values at each pixel."
+   *
+   * We're slacking and instead of choosing centermost, we've got 0.
+   */
+  vs_source = ralloc_asprintf(mem_ctx,
+  "#version 130\n"
+  "in vec2 position;\n"
+  "in vec2 textureCoords;\n"
+  "out vec2 texCoords;\n"
+  "void main()\n"
+  "{\n"
+  "   texCoords = textureCoords;\n"
+  "   gl_Position = vec4(position, 0.0, 1.0);\n"
+  "}\n");
+  fs_source = ralloc_asprintf(mem_ctx,
+  "#version 130\n"
+  "#extension GL_ARB_texture_multisample : enable\n"
+  "uniform sampler2DMS texSampler;\n"
+  "in vec2 texCoords;\n"
+  "out vec4 out_color;\n"
+  "\n"
+  "void main()\n"
+  "{\n"
+  "   gl_FragDepth = texelFetch(texSampler, ivec2(texCoords), 0).r;\n"
+  "}\n");
+   } else if (shader_index == BLIT_MSAA_SHADER_2D_MULTISAMPLE) {
+  char *sample_resolve;
+  /* You can create 2D_MULTISAMPLE textures with 0 sample count (meaning 1
+   * sample).  Yes, this is ridiculous.
+   */
+  int samples = MAX2(src_rb->NumSamples, 1);
+
+  /* We're assuming power of two samples for this resolution procedure.
+   *
+   * To avoid losing any floating point precision if the samples all
+   * happen to have the same value, we merge pairs of values at a time (so
+   * the 
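
The archived patch is cut off in the middle of the comment above. As a rough illustration of the pairwise merging it starts to describe, here is a minimal CPU-side sketch in C. It is not the GLSL the patch generates; the helper name resolve_pairwise and the in-place array layout are assumptions made only for this example, and the sample count is assumed to be a power of two, as the comment states.

```c
#include <stddef.h>

/* Hypothetical helper (not part of Mesa): average `count` samples by
 * merging adjacent pairs, halving the number of live values each pass.
 * `count` must be a power of two.  The array is overwritten in place.
 */
static float
resolve_pairwise(float *samples, size_t count)
{
   while (count > 1) {
      for (size_t i = 0; i < count / 2; i++) {
         /* Merge one pair; if both values are equal the result is
          * exactly that value, since (a + a) / 2 == a in binary
          * floating point (barring overflow).
          */
         samples[i] = (samples[2 * i] + samples[2 * i + 1]) / 2.0f;
      }
      count /= 2;
   }
   return samples[0];
}
```

With four identical samples the result is bit-identical to the input value, which is the property the comment is concerned with.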

[Mesa-dev] [PATCH 4/6] meta: Save and restore a bunch of MSAA state.

2014-02-19 Thread Eric Anholt
We're disabling GL_MULTISAMPLE, so we didn't need to worry about a lot of
that state.  But to do MSAA to MSAA blits, we need to start handling more
state.

v2: Fix pasteo caught by Kenneth.
---
 src/mesa/drivers/common/meta.c | 40 +---
 src/mesa/drivers/common/meta.h |  2 +-
 2 files changed, 38 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
index a0613f2..2dec2c3 100644
--- a/src/mesa/drivers/common/meta.c
+++ b/src/mesa/drivers/common/meta.c
@@ -51,6 +51,7 @@
 #include "main/macros.h"
 #include "main/matrix.h"
 #include "main/mipmap.h"
+#include "main/multisample.h"
 #include "main/pixel.h"
 #include "main/pbo.h"
 #include "main/polygon.h"
@@ -719,9 +720,20 @@ _mesa_meta_begin(struct gl_context *ctx, GLbitfield state)
}
 
 if (state & MESA_META_MULTISAMPLE) {
-  save->MultisampleEnabled = ctx->Multisample.Enabled;
+  save->Multisample = ctx->Multisample; /* struct copy */
+
   if (ctx->Multisample.Enabled)
  _mesa_set_multisample(ctx, GL_FALSE);
+  if (ctx->Multisample.SampleCoverage)
+ _mesa_set_enable(ctx, GL_SAMPLE_COVERAGE, GL_FALSE);
+  if (ctx->Multisample.SampleAlphaToCoverage)
+ _mesa_set_enable(ctx, GL_SAMPLE_ALPHA_TO_COVERAGE, GL_FALSE);
+  if (ctx->Multisample.SampleAlphaToOne)
+ _mesa_set_enable(ctx, GL_SAMPLE_ALPHA_TO_ONE, GL_FALSE);
+  if (ctx->Multisample.SampleShading)
+ _mesa_set_enable(ctx, GL_SAMPLE_SHADING, GL_FALSE);
+  if (ctx->Multisample.SampleMask)
+ _mesa_set_enable(ctx, GL_SAMPLE_MASK, GL_FALSE);
 }
 
 if (state & MESA_META_FRAMEBUFFER_SRGB) {
@@ -1059,8 +1071,30 @@ _mesa_meta_end(struct gl_context *ctx)
}
 
 if (state & MESA_META_MULTISAMPLE) {
-  if (ctx->Multisample.Enabled != save->MultisampleEnabled)
- _mesa_set_multisample(ctx, save->MultisampleEnabled);
+  struct gl_multisample_attrib *ctx_ms = &ctx->Multisample;
+  struct gl_multisample_attrib *save_ms = &save->Multisample;
+
+  if (ctx_ms->Enabled != save_ms->Enabled)
+ _mesa_set_multisample(ctx, save_ms->Enabled);
+  if (ctx_ms->SampleCoverage != save_ms->SampleCoverage)
+ _mesa_set_enable(ctx, GL_SAMPLE_COVERAGE, save_ms->SampleCoverage);
+  if (ctx_ms->SampleAlphaToCoverage != save_ms->SampleAlphaToCoverage)
+ _mesa_set_enable(ctx, GL_SAMPLE_ALPHA_TO_COVERAGE, save_ms->SampleAlphaToCoverage);
+  if (ctx_ms->SampleAlphaToOne != save_ms->SampleAlphaToOne)
+ _mesa_set_enable(ctx, GL_SAMPLE_ALPHA_TO_ONE, save_ms->SampleAlphaToOne);
+  if (ctx_ms->SampleCoverageValue != save_ms->SampleCoverageValue ||
+  ctx_ms->SampleCoverageInvert != save_ms->SampleCoverageInvert) {
+ _mesa_SampleCoverage(save_ms->SampleCoverageValue,
+  save_ms->SampleCoverageInvert);
+  }
+  if (ctx_ms->SampleShading != save_ms->SampleShading)
+ _mesa_set_enable(ctx, GL_SAMPLE_SHADING, save_ms->SampleShading);
+  if (ctx_ms->SampleMask != save_ms->SampleMask)
+ _mesa_set_enable(ctx, GL_SAMPLE_MASK, save_ms->SampleMask);
+  if (ctx_ms->SampleMaskValue != save_ms->SampleMaskValue)
+ _mesa_SampleMaski(0, save_ms->SampleMaskValue);
+  if (ctx_ms->MinSampleShadingValue != save_ms->MinSampleShadingValue)
+ _mesa_MinSampleShading(save_ms->MinSampleShadingValue);
 }
 
 if (state & MESA_META_FRAMEBUFFER_SRGB) {
diff --git a/src/mesa/drivers/common/meta.h b/src/mesa/drivers/common/meta.h
index 5d79253..7d4474e 100644
--- a/src/mesa/drivers/common/meta.h
+++ b/src/mesa/drivers/common/meta.h
@@ -168,7 +168,7 @@ struct save_state
struct gl_feedback Feedback;
 
/** MESA_META_MULTISAMPLE */
-   GLboolean MultisampleEnabled;
+   struct gl_multisample_attrib Multisample;
 
/** MESA_META_FRAMEBUFFER_SRGB */
GLboolean sRGBEnabled;
-- 
1.9.rc1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/6] meta: Try to do blending of sRGB values in linear colorspace.

2014-02-19 Thread Eric Anholt
Blending of values would occur when doing GL_LINEAR filtering with
scaling, and in an upcoming commit when doing MSAA resolves.
---
 src/mesa/drivers/common/meta_blit.c | 30 +-
 1 file changed, 25 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/common/meta_blit.c 
b/src/mesa/drivers/common/meta_blit.c
index be91247..65e2692 100644
--- a/src/mesa/drivers/common/meta_blit.c
+++ b/src/mesa/drivers/common/meta_blit.c
@@ -375,13 +375,33 @@ blitframebuffer_texture(struct gl_context *ctx,
_mesa_SamplerParameteri(sampler, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
_mesa_SamplerParameteri(sampler, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
 
-   /* Always do our blits with no sRGB decode or encode.  Note that
-* GL_FRAMEBUFFER_SRGB has already been disabled by
-* _mesa_meta_begin().
+   /* Always do our blits with no net sRGB decode or encode.
+*
+* However, if both the src and dst can be srgb decode/encoded, enable them
+* so that we do any blending (from scaling or from MSAA resolves) in the
+* right colorspace.
+*
+* Our choice of not doing any net encode/decode is from the GL 3.0
+* specification:
+*
+* "Blit operations bypass the fragment pipeline. The only fragment
+*  operations which affect a blit are the pixel ownership test and the
+*  scissor test."
+*
+* The GL 4.4 specification disagrees and says that the sRGB part of the
+* fragment pipeline applies, but this was found to break applications.
 */
 if (ctx->Extensions.EXT_texture_sRGB_decode) {
-  _mesa_SamplerParameteri(sampler, GL_TEXTURE_SRGB_DECODE_EXT,
-  GL_SKIP_DECODE_EXT);
+  if (_mesa_get_format_color_encoding(rb->Format) == GL_SRGB &&
+  ctx->DrawBuffer->Visual.sRGBCapable) {
+ _mesa_SamplerParameteri(sampler, GL_TEXTURE_SRGB_DECODE_EXT,
+ GL_DECODE_EXT);
+ _mesa_set_framebuffer_srgb(ctx, GL_TRUE);
+  } else {
+ _mesa_SamplerParameteri(sampler, GL_TEXTURE_SRGB_DECODE_EXT,
+ GL_SKIP_DECODE_EXT);
+ /* set_framebuffer_srgb was set by _mesa_meta_begin(). */
+  }
 }
 
if (!glsl_version) {
-- 
1.9.rc1



[Mesa-dev] [PATCH 1/6] i965: Fix miptree matching for multisampled, non-interleaved miptrees.

2014-02-19 Thread Eric Anholt
We haven't been executing this code before the meta-blit case, because
we've been flagging the miptree as validated at texstorage time, and never
having to revalidate.
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 15 ++-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  2 ++
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 5461562..355f7cd 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -876,13 +876,26 @@ intel_miptree_match_image(struct intel_mipmap_tree *mt,
 if (mt->target == GL_TEXTURE_CUBE_MAP)
   depth = 6;
 
+   int level_depth = mt->level[level].depth;
+   if (mt->num_samples > 1) {
+  switch (mt->msaa_layout) {
+  case INTEL_MSAA_LAYOUT_NONE:
+  case INTEL_MSAA_LAYOUT_IMS:
+ break;
+  case INTEL_MSAA_LAYOUT_UMS:
+  case INTEL_MSAA_LAYOUT_CMS:
+ level_depth /= mt->num_samples;
+ break;
+  }
+   }
+
/* Test image dimensions against the base level image adjusted for
 * minification.  This will also catch images not present in the
 * tree, changed targets, etc.
 */
 if (width != minify(mt->logical_width0, level) ||
 height != minify(mt->logical_height0, level) ||
-   depth != mt->level[level].depth) {
+   depth != level_depth) {
   return false;
}
 
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
index 6c45cfd..c274994 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
@@ -115,6 +115,8 @@ struct intel_mipmap_level
 *- For GL_TEXTURE_3D, it is the texture's depth at this miplevel. Its
 *  value, like width and height, varies with miplevel.
 *- For other texture types, depth is 1.
+*- Additionally, for UMS and CMS miptrees, depth is multiplied by
+*  sample count.
 */
GLuint depth;
 
-- 
1.9.rc1
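
The depth adjustment in this patch can be illustrated in isolation. The following is a minimal sketch, assuming a standalone enum and function (sketch_msaa_layout and logical_level_depth are made-up names for this example, not the i965 definitions): for UMS and CMS layouts the stored per-level depth has been multiplied by the sample count, so it is divided back out before comparing against the image's logical depth.

```c
/* Stand-in for the driver's MSAA layout enum; names are hypothetical. */
enum sketch_msaa_layout {
   SKETCH_MSAA_LAYOUT_NONE,
   SKETCH_MSAA_LAYOUT_IMS,
   SKETCH_MSAA_LAYOUT_UMS,
   SKETCH_MSAA_LAYOUT_CMS,
};

/* Return the logical depth of a miplevel given the depth stored in the
 * miptree.  Only UMS and CMS layouts store depth * num_samples.
 */
static unsigned
logical_level_depth(unsigned stored_depth, unsigned num_samples,
                    enum sketch_msaa_layout layout)
{
   if (num_samples > 1 &&
       (layout == SKETCH_MSAA_LAYOUT_UMS ||
        layout == SKETCH_MSAA_LAYOUT_CMS)) {
      return stored_depth / num_samples;
   }
   return stored_depth;
}
```

For example, a 4x CMS 2D texture stores a depth of 4, which maps back to the logical depth of 1 that the image reports.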



[Mesa-dev] [PATCH] i965: Create a hardware context before initializing state module.

2014-02-19 Thread Kenneth Graunke
brw_init_state() calls brw_upload_initial_gpu_state().  If hardware
contexts are enabled (brw->hw_ctx != NULL), this will upload some
initial invariant state for the GPU.  Without hardware contexts, we
rely on this state being uploaded via atoms that subscribe to the
BRW_NEW_CONTEXT bit.

Commit 46d3c2bf4ddd227193b98861f1e632498fe547d8 accidentally moved
the call to brw_init_state() before creating a hardware context.
This meant brw_upload_initial_gpu_state would always early return.
Except on Gen6+, we stopped uploading the initial GPU state via
state atoms, so it never happened.

Fixes a regression since 46d3c2bf4ddd227193b98861f1e632498fe547d8.

Cc: 10.0 10.1 mesa-sta...@lists.freedesktop.org
Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/brw_context.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

The diff looks weird - I actually moved the hardware context initialization
block up a few lines.  Diff instead decided that I moved these three lines
down below it.  Which is equivalent, but...odd.

diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index 5800092..9791a49 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -700,12 +700,6 @@ brwCreateContext(gl_api api,
 
intel_batchbuffer_init(brw);
 
-   brw_init_state(brw);
-
-   intelInitExtensions(ctx);
-
-   intel_fbo_init(brw);
-
 if (brw->gen >= 6) {
   /* Create a new hardware context.  Using a hardware context means that
* our GPU state will be saved/restored on context switch, allowing us
@@ -723,6 +717,12 @@ brwCreateContext(gl_api api,
   }
}
 
+   brw_init_state(brw);
+
+   intelInitExtensions(ctx);
+
+   intel_fbo_init(brw);
+
brw_init_surface_formats(brw);
 
 if (brw->is_g4x || brw->gen >= 5) {
-- 
1.8.4.2
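
The ordering problem described in the commit message can be reproduced with a toy model. This is only a sketch under stated assumptions: the struct and function names below (sketch_context, sketch_init_state, and so on) are invented for illustration and merely mimic the early return on a NULL hardware context that the message describes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the driver context; all names are hypothetical. */
struct sketch_context {
   void *hw_ctx;                  /* NULL until a hardware context exists */
   bool initial_state_uploaded;
};

static int sketch_dummy_hw_ctx;

/* Mimics the early return: nothing is uploaded without a hw context. */
static void
sketch_upload_initial_gpu_state(struct sketch_context *c)
{
   if (c->hw_ctx == NULL)
      return;
   c->initial_state_uploaded = true;
}

static void
sketch_init_state(struct sketch_context *c)
{
   sketch_upload_initial_gpu_state(c);
}

static void
sketch_create_hw_context(struct sketch_context *c)
{
   c->hw_ctx = &sketch_dummy_hw_ctx;
}

/* Returns whether the initial state ended up being uploaded. */
static bool
sketch_create_context(bool init_state_first)
{
   struct sketch_context c = { NULL, false };

   if (init_state_first) {
      sketch_init_state(&c);        /* regressed order: upload is skipped */
      sketch_create_hw_context(&c);
   } else {
      sketch_create_hw_context(&c); /* fixed order */
      sketch_init_state(&c);
   }
   return c.initial_state_uploaded;
}
```

Calling sketch_create_context(true) models the regressed order and reports that nothing was uploaded; sketch_create_context(false) models the order this patch restores.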



Re: [Mesa-dev] [PATCH][RFC] dri3: Add support for the GLX_EXT_buffer_age extension

2014-02-19 Thread Ian Romanick
On 02/19/2014 02:49 PM, Adel Gadllah wrote:
 Hi,
 
 The attached patch adds support for the GLX_EXT_buffer_age extension,
 which is mostly used by compositors for efficient sub-screen updates.
 
 The extension should not be reported as supported when running DRI2, but
 it seems to show up when I try to disable DRI3 with LIBGL_DRI3_DISABLE ...
 not sure why; suggestions welcome.
 
 
 P.S: Please CC me when replying as I am not subscribed to the list.

You'll need to fix that. :)

You didn't send this patch with git-send-email.  Whatever you used to
send it also mangled it, so it won't apply.

 From: Adel Gadllah adel.gadl...@gmail.com
 Date: Sun, 16 Feb 2014 13:40:42 +0100
 Subject: [PATCH] dri3: Add GLX_EXT_buffer_age support
 
 ---
  include/GL/glx.h  |  5 +
  include/GL/glxext.h   |  5 +
  src/glx/dri2_glx.c|  1 +
  src/glx/dri3_glx.c| 17 +
  src/glx/dri3_priv.h   |  2 ++
  src/glx/glx_pbuffer.c |  7 +++
  src/glx/glxclient.h   |  1 +
  src/glx/glxextensions.c   |  1 +
  src/glx/glxextensions.h   |  1 +
  src/mesa/drivers/x11/glxapi.c |  3 +++
  10 files changed, 43 insertions(+)
 
 diff --git a/include/GL/glx.h b/include/GL/glx.h
 index 234abc0..b8b4d75 100644
 --- a/include/GL/glx.h
 +++ b/include/GL/glx.h
 @@ -161,6 +161,11 @@ extern "C" {
  #define GLX_SAMPLES 0x186a1 /*11*/
 
 
 +/*
 + * GLX_EXT_buffer_age
 + */
 +#define GLX_BACK_BUFFER_AGE_EXT 0x20F4
 +
 
  typedef struct __GLXcontextRec *GLXContext;
  typedef XID GLXPixmap;
 diff --git a/include/GL/glxext.h b/include/GL/glxext.h
 index 8c642f3..36e92dc 100644
 --- a/include/GL/glxext.h
 +++ b/include/GL/glxext.h
 @@ -383,6 +383,11 @@ void glXReleaseTexImageEXT (Display *dpy,
 GLXDrawable drawable, int buffer);
  #define GLX_FLIP_COMPLETE_INTEL   0x8182
  #endif /* GLX_INTEL_swap_event */
 
 +#ifndef GLX_EXT_buffer_age
 +#define GLX_EXT_buffer_age 1
 +#define GLX_BACK_BUFFER_AGE_EXT 0x20F4
 +#endif /* GLX_EXT_buffer_age */
 +

We get glxext.h directly from Khronos, so it should not be modified...
except to import new versions from upstream.  It looks like the upstream
glxext.h has this, so the first patch in the series should be "glx:
Update glxext.h to revision 25407."  And drop the change to glx.h.

  #ifndef GLX_MESA_agp_offset
  #define GLX_MESA_agp_offset 1
  typedef unsigned int ( *PFNGLXGETAGPOFFSETMESAPROC) (const void *pointer);
 diff --git a/src/glx/dri2_glx.c b/src/glx/dri2_glx.c
 index 67fe9c1..007f449 100644
 --- a/src/glx/dri2_glx.c
 +++ b/src/glx/dri2_glx.c
 @@ -1288,6 +1288,7 @@ dri2CreateScreen(int screen, struct glx_display * priv)
 psp->waitForSBC = NULL;
 psp->setSwapInterval = NULL;
 psp->getSwapInterval = NULL;
 +   psp->queryBufferAge = NULL;
 
 if (pdp->driMinor >= 2) {
psp->getDrawableMSC = dri2DrawableGetMSC;
 diff --git a/src/glx/dri3_glx.c b/src/glx/dri3_glx.c
 index 70ec057..07120e1 100644
 --- a/src/glx/dri3_glx.c
 +++ b/src/glx/dri3_glx.c
 @@ -1345,6 +1345,8 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, int64_t target_msc, int64_t divisor,
   target_msc = priv->msc + priv->swap_interval * (priv->send_sbc - priv->recv_sbc);
 
priv->buffers[buf_id]->busy = 1;
 +  priv->buffers[buf_id]->last_swap = priv->swap_count;
 +
xcb_present_pixmap(c,
   priv->base.xDrawable,
   priv->buffers[buf_id]->pixmap,
 @@ -1379,11 +1381,23 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, int64_t target_msc, int64_t divisor,
xcb_flush(c);
if (priv->stamp)
   ++(*priv->stamp);
 +
 +   priv->swap_count++;
 }
 
 return ret;
  }
 
 +static int
 +dri3_query_buffer_age(__GLXDRIdrawable *pdraw)
 +{
 +  struct dri3_drawable *priv = (struct dri3_drawable *) pdraw;
 +  int buf_id = DRI3_BACK_ID(priv->cur_back);

Blank line here.

Also maybe use dri3_back_buffer instead?

   const struct dri3_buffer *const back = dri3_back_buffer(priv);

   /* A buffer that has never been swapped has an undefined age of 0. */
   if (back->last_swap == 0)
  return 0;
   else
  return priv->swap_count - back->last_swap;

 +  if (!priv->buffers[buf_id]->last_swap)
 +return 0;

And here.

 +  return priv->swap_count - priv->buffers[buf_id]->last_swap;
 +}
 +
  /** dri3_open
   *
   * Wrapper around xcb_dri3_open
 @@ -1742,6 +1756,9 @@ dri3_create_screen(int screen, struct glx_display * priv)
 psp->copySubBuffer = dri3_copy_sub_buffer;
 __glXEnableDirectExtension(&psc->base, "GLX_MESA_copy_sub_buffer");
 
 +   psp->queryBufferAge = dri3_query_buffer_age;
 +   __glXEnableDirectExtension(&psc->base, "GLX_EXT_buffer_age");
 +
 +
 free(driverName);
 free(deviceName);
 
 diff --git a/src/glx/dri3_priv.h b/src/glx/dri3_priv.h
 index 1d124f8..d00440a 100644
 --- a/src/glx/dri3_priv.h
 +++ b/src/glx/dri3_priv.h
 @@ -97,6 +97,7 @@ struct dri3_buffer {
 uint32_t cpp;
 uint32_t flags;
 uint32_t width, height;
 +   uint32_t last_swap;
 
 enum dri3_buffer_type buffer_type;
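
The swap-count bookkeeping reviewed above can be sketched outside the driver. This is a minimal model, assuming a fixed pool of three buffers and a swap counter that starts at 1 so that a last_swap of 0 can mean "never swapped"; neither assumption is confirmed by the quoted patch, and all names here are made up for the example.

```c
#include <stdint.h>

#define SKETCH_NUM_BUFFERS 3

/* Hypothetical stand-ins for the dri3 drawable and buffer structs. */
struct sketch_buffer {
   uint32_t last_swap;   /* swap count when last presented, 0 = never */
};

struct sketch_drawable {
   uint32_t swap_count;  /* assumed to start at 1 */
   struct sketch_buffer buffers[SKETCH_NUM_BUFFERS];
};

/* Record that buffer `id` is being presented now. */
static void
sketch_swap(struct sketch_drawable *d, int id)
{
   d->buffers[id].last_swap = d->swap_count;
   d->swap_count++;
}

/* Age in frames of buffer `id`; 0 means its contents are undefined. */
static int
sketch_buffer_age(const struct sketch_drawable *d, int id)
{
   const struct sketch_buffer *const back = &d->buffers[id];

   if (back->last_swap == 0)
      return 0;

   return (int)(d->swap_count - back->last_swap);
}
```

A buffer presented in the most recent swap reports age 1, one presented the swap before reports age 2, and a buffer that was never presented reports 0, which matches how GLX_EXT_buffer_age defines the value.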

Re: [Mesa-dev] [Mesa-stable] [PATCH] i965: Create a hardware context before initializing state module.

2014-02-19 Thread Ian Romanick
On 02/19/2014 05:40 PM, Kenneth Graunke wrote:
 brw_init_state() calls brw_upload_initial_gpu_state().  If hardware
 contexts are enabled (brw->hw_ctx != NULL), this will upload some
 initial invariant state for the GPU.  Without hardware contexts, we
 rely on this state being uploaded via atoms that subscribe to the
 BRW_NEW_CONTEXT bit.
 
 Commit 46d3c2bf4ddd227193b98861f1e632498fe547d8 accidentally moved
 the call to brw_init_state() before creating a hardware context.
 This meant brw_upload_initial_gpu_state would always early return.
 Except on Gen6+, we stopped uploading the initial GPU state via
 state atoms, so it never happened.
 
 Fixes a regression since 46d3c2bf4ddd227193b98861f1e632498fe547d8.

This seems like a rational change... but why didn't 46d3c2b blow up the
world on IVB and HSW?  ...and only cause heisenbugs on BDW?

 Cc: 10.0 10.1 mesa-sta...@lists.freedesktop.org
 Signed-off-by: Kenneth Graunke kenn...@whitecape.org

Either way,

Reviewed-by: Ian Romanick ian.d.roman...@intel.com

 ---
  src/mesa/drivers/dri/i965/brw_context.c | 12 ++--
  1 file changed, 6 insertions(+), 6 deletions(-)
 
 The diff looks weird - I actually moved the hardware context initialization
 block up a few lines.  Diff instead decided that I moved these three lines
 down below it.  Which is equivalent, but...odd.
 
 diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
 b/src/mesa/drivers/dri/i965/brw_context.c
 index 5800092..9791a49 100644
 --- a/src/mesa/drivers/dri/i965/brw_context.c
 +++ b/src/mesa/drivers/dri/i965/brw_context.c
 @@ -700,12 +700,6 @@ brwCreateContext(gl_api api,
  
 intel_batchbuffer_init(brw);
  
 -   brw_init_state(brw);
 -
 -   intelInitExtensions(ctx);
 -
 -   intel_fbo_init(brw);
 -
 if (brw->gen >= 6) {
/* Create a new hardware context.  Using a hardware context means that
 * our GPU state will be saved/restored on context switch, allowing us
 @@ -723,6 +717,12 @@ brwCreateContext(gl_api api,
}
 }
  
 +   brw_init_state(brw);
 +
 +   intelInitExtensions(ctx);
 +
 +   intel_fbo_init(brw);
 +
 brw_init_surface_formats(brw);
  
 if (brw->is_g4x || brw->gen >= 5) {
 



Re: [Mesa-dev] [PATCH 1/3] util: Add util_bswap64()

2014-02-19 Thread Michel Dänzer
On Wed, 2014-02-19 at 15:09 -0800, Tom Stellard wrote:
 ---
  src/gallium/auxiliary/util/u_math.h | 10 ++
  1 file changed, 10 insertions(+)
 
 diff --git a/src/gallium/auxiliary/util/u_math.h 
 b/src/gallium/auxiliary/util/u_math.h
 index b5e0663..49f8bda 100644
 --- a/src/gallium/auxiliary/util/u_math.h
 +++ b/src/gallium/auxiliary/util/u_math.h
 @@ -741,6 +741,16 @@ util_bswap32(uint32_t n)
  #endif
  }
  
 +/**
 + * Reverse byte order of a 64bit word.
 + */
 +static INLINE uint64_t
 +util_bswap64(uint64_t n)
 +{

Please use __builtin_bswap64() when available, as per util_bswap32().


 +   return ((uint64_t)util_bswap32(n & 0xffffffff) << 32) |

There is no need for "& 0xffffffff".


 +  util_bswap32((n >> 32));
 +}
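
A possible shape for the helper with both review comments applied is sketched below. It is not the committed Mesa code: the sketch_ names are placeholders, and the __GNUC__ check is only one assumed way of detecting that the GCC/Clang builtins __builtin_bswap32() and __builtin_bswap64() are available.

```c
#include <stdint.h>

/* Reverse the byte order of a 32-bit word. */
static inline uint32_t
sketch_bswap32(uint32_t n)
{
#if defined(__GNUC__)
   return __builtin_bswap32(n);
#else
   return (n >> 24) |
          ((n >> 8) & 0x0000ff00) |
          ((n << 8) & 0x00ff0000) |
          (n << 24);
#endif
}

/* Reverse the byte order of a 64-bit word. */
static inline uint64_t
sketch_bswap64(uint64_t n)
{
#if defined(__GNUC__)
   return __builtin_bswap64(n);
#else
   /* The conversion to uint32_t already drops the high half, so no
    * explicit mask is needed on the low word.
    */
   return ((uint64_t)sketch_bswap32((uint32_t)n) << 32) |
          sketch_bswap32((uint32_t)(n >> 32));
#endif
}
```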


-- 
Earthling Michel Dänzer|  http://www.amd.com
Libre software enthusiast  |Mesa and X developer



Re: [Mesa-dev] [PATCH 3/3] clover: Pass buffer offsets to the driver in set_global_binding() v2

2014-02-19 Thread Michel Dänzer
On Thu, 2014-02-20 at 00:53 +0100, Francisco Jerez wrote:
 Tom Stellard thomas.stell...@amd.com writes:
 
  diff --git a/src/gallium/drivers/r600/evergreen_compute.c 
  b/src/gallium/drivers/r600/evergreen_compute.c
  index 70efe5c..efd7143 100644
  --- a/src/gallium/drivers/r600/evergreen_compute.c
  +++ b/src/gallium/drivers/r600/evergreen_compute.c
  @@ -662,10 +662,18 @@ static void evergreen_set_global_binding(
   
  for (int i = 0; i < n; i++)
  {
  +   uint32_t buffer_offset;
  +   uint32_t handle;
  assert(resources[i]->target == PIPE_BUFFER);
  assert(resources[i]->bind & PIPE_BIND_GLOBAL);
   
  -   *(handles[i]) = buffers[i]->chunk->start_in_dw * 4;
  +   buffer_offset = util_le32_to_cpu(*(handles[i]));
  +   handle = buffer_offset + buffers[i]->chunk->start_in_dw * 4;
  +   if (R600_BIG_ENDIAN) {
  +   handle = util_bswap32(handle);
  +   }
  +
  +   *(handles[i]) = handle;
 
 I guess you could just do *(handles[i]) = util_cpu_to_le32(handle)?
 Oh, right, there isn't such a function -- though it would be trivial to
 implement.

Right, just add:

#define util_cpu_to_le64 util_le64_to_cpu [0]
#define util_cpu_to_le32 util_le32_to_cpu
#define util_cpu_to_le16 util_le16_to_cpu

[0] and add util_le64_to_cpu in the first place :)
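
The aliasing works because a byte swap is its own inverse, so the same routine converts in both directions. Below is a self-contained sketch of that idea. It assumes a runtime endianness probe purely to keep the example standalone (the real helpers would be selected at compile time), and every sketch_ name is hypothetical.

```c
#include <stdint.h>

/* Portable 32-bit byte swap. */
static inline uint32_t
sketch_swap32(uint32_t n)
{
   return (n >> 24) |
          ((n >> 8) & 0x0000ff00) |
          ((n << 8) & 0x00ff0000) |
          (n << 24);
}

/* Runtime probe: nonzero when the host stores the high byte first. */
static inline int
sketch_host_is_big_endian(void)
{
   const uint16_t probe = 1;
   return *(const unsigned char *)&probe == 0;
}

/* Convert a little-endian 32-bit value to host order. */
static inline uint32_t
sketch_le32_to_cpu(uint32_t x)
{
   return sketch_host_is_big_endian() ? sketch_swap32(x) : x;
}

/* Host order to little-endian is the same operation. */
#define sketch_cpu_to_le32 sketch_le32_to_cpu
```

With such a helper, the conditional swap in the quoted hunk could collapse to a single store of the converted handle, as suggested in the review.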


-- 
Earthling Michel Dänzer|  http://www.amd.com
Libre software enthusiast  |Mesa and X developer




Re: [Mesa-dev] [PATCH 2/3] radeonsi: Use SI_BIG_ENDIAN now that it exists

2014-02-19 Thread Michel Dänzer
On Wed, 2014-02-19 at 15:09 -0800, Tom Stellard wrote:
 ---
  src/gallium/drivers/radeonsi/si_shader.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
 
 diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
 b/src/gallium/drivers/radeonsi/si_shader.c
 index 54270cd..9fed751 100644
 --- a/src/gallium/drivers/radeonsi/si_shader.c
 +++ b/src/gallium/drivers/radeonsi/si_shader.c
 @@ -2333,7 +2333,7 @@ int si_compile_llvm(struct si_context *sctx, struct si_pipe_shader *shader,
   }
  
   ptr = (uint32_t*)sctx->b.ws->buffer_map(shader->bo->cs_buf, sctx->b.rings.gfx.cs, PIPE_TRANSFER_WRITE);
 - if (0 /*SI_BIG_ENDIAN*/) {
 + if (SI_BIG_ENDIAN) {
   for (i = 0; i < binary.code_size / 4; ++i) {
   ptr[i] = util_bswap32(*(uint32_t*)(binary.code + i*4));
   }

I would prefer using util_cpu_to_le32() here as well.


-- 
Earthling Michel Dänzer|  http://www.amd.com
Libre software enthusiast  |Mesa and X developer



Re: [Mesa-dev] [PATCH 11/13] i965: Enable smooth points when multisampling without point sprites.

2014-02-19 Thread Roland Scheidegger
On 19.02.2014 11:04, Kenneth Graunke wrote:
 According to the "Point Multisample Rasterization" section of the OpenGL
 specification (3.0 or later), smooth points are supposed to be enabled
 implicitly when multisampling, regardless of the GL_POINT_SMOOTH flag.
 
 However, if GL_POINT_SPRITE is enabled, you get square points no matter
 what.  Core contexts always enable point sprites, so this effectively
 makes smooth points go away, even in the case of multisampling.
 
 Fixes Piglit's EXT_framebuffer_multisample/point-smooth tests.
 (Yes, that's right folks, we actually have Piglit tests for this.)
 
 Signed-off-by: Kenneth Graunke kenn...@whitecape.org
 ---
  src/mesa/drivers/dri/i965/gen8_sf_state.c | 6 +-
  1 file changed, 5 insertions(+), 1 deletion(-)
 
 diff --git a/src/mesa/drivers/dri/i965/gen8_sf_state.c 
 b/src/mesa/drivers/dri/i965/gen8_sf_state.c
 index b31b17e..0693fee 100644
 --- a/src/mesa/drivers/dri/i965/gen8_sf_state.c
 +++ b/src/mesa/drivers/dri/i965/gen8_sf_state.c
 @@ -139,8 +139,11 @@ upload_sf(struct brw_context *brw)
 if (!(ctx->VertexProgram.PointSizeEnabled || ctx->Point._Attenuated))
dw3 |= GEN6_SF_USE_STATE_POINT_WIDTH;
  
 -   if (ctx->Point.SmoothFlag)
 +   /* _NEW_POINT | _NEW_MULTISAMPLE */
 +   if ((ctx->Point.SmoothFlag || ctx->Multisample._Enabled) &&
 +   !ctx->Point.PointSprite) {
dw3 |= GEN8_SF_SMOOTH_POINT_ENABLE;
 +   }
  
 dw3 |= GEN6_SF_LINE_AA_MODE_TRUE;
  
 @@ -166,6 +169,7 @@ const struct brw_tracked_state gen8_sf_state = {
.mesa  = _NEW_LIGHT |
 _NEW_PROGRAM |
 _NEW_LINE |
 +   _NEW_MULTISAMPLE |
 _NEW_POINT,
.brw   = BRW_NEW_CONTEXT,
.cache = 0,
 

Wow, your hw can rasterize round points directly? At least ten years too
late to be useful, but that's a slick feature!
In any case, the logic looks right to me. I don't know much about the
hw, though: do you also need to set the matching raster bit
(GEN8_RASTER_SMOOTH_POINT_ENABLE) for this?

Roland
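
The condition under review reduces to a small predicate, sketched here with plain booleans in place of the context fields. The function name is made up for the example; the logic is taken from the quoted hunk.

```c
#include <stdbool.h>

/* Smooth (round) points are requested either explicitly or implicitly
 * by multisampling, but point sprites always win and force squares.
 */
static bool
sketch_smooth_point_enabled(bool smooth_flag, bool multisample_enabled,
                            bool point_sprite)
{
   return (smooth_flag || multisample_enabled) && !point_sprite;
}
```

In a core context, where point sprites are always on, this evaluates to false for every combination of the other two inputs, which is the behaviour the commit message describes.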


[Mesa-dev] [Bug 71870] Metro: Last Light rendering issues

2014-02-19 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=71870

Tapani Pälli lem...@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |VERIFIED

--- Comment #46 from Tapani Pälli lem...@gmail.com ---
I've verified with git master that issues 1, 3 and 4 are gone. 2 and 5 still
hold but let's handle those separately from this bug.

-- 
You are receiving this mail because:
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH-RFC] i965: do not advertise MESA_FORMAT_Z_UNORM16 support

2014-02-19 Thread Chia-I Wu
On Thu, Feb 20, 2014 at 7:03 AM, Kenneth Graunke kenn...@whitecape.org wrote:
 On 02/19/2014 02:27 PM, Ian Romanick wrote:
 On 02/19/2014 12:08 PM, Kenneth Graunke wrote:
 On 02/18/2014 09:48 PM, Chia-I Wu wrote:
 Since 73bc6061f5c3b6a3bb7a8114bb2e1ab77d23cfdb, Z16 support is
 not advertised for OpenGL ES contexts due to the terrible
 performance.  It is still enabled for desktop GL because it was
 believed GL 3.0+ requires Z16.

 It turns out only GL 3.0 requires Z16, and that is corrected in
 later GL versions.  In light of that, and per Ian's suggestion,
 stop advertising Z16 support by default, and add a drirc option,
 gl30_sized_format_rules, so that users can override.

 I actually don't think that GL 3.0 requires Z16, either.

 In glspec30.20080923.pdf, page 180, it says: "[...] memory
 allocation per texture component is assigned by the GL to match the
 allocations listed in tables 3.16-3.18 as closely as possible.
 [...]

 Required Texture Formats [...] In addition, implementations are
 required to support the following sized internal formats.
 Requesting one of these internal formats for any texture type will
 allocate exactly the internal component sizes and types shown for
 that format in tables 3.16-3.17:"

 Notably, however, GL_DEPTH_COMPONENT16 does /not/ appear in table
 3.16 or table 3.17.  It appears in table 3.18, where the "exact"
 rule doesn't apply, and thus we fall back to the "closely as
 possible" rule.

 The confusing part is that the ordering of the tables in the PDF
 is:

 Table 3.16 (pages 182-184)
 Table 3.18 (bottom of page 184 to top of 185)
 Table 3.17 (page 185)

 I'm guessing that people saw table 3.16, then saw the one after
 with DEPTH_COMPONENT* formats, and assumed it was 3.17.  But it's
 not.

 Yay latex!  Thank you for putting things in random order because it
 fit better. :(

 I think we should just drop Z16 support entirely, and I think we
 should remove the requirement from the Piglit test.

 If the test is wrong, and it sounds like it is, then I'm definitely in
 favor of changing it.

 The reason to have Z16 is low-bandwidth GPUs in resource constrained
 environments.  If an app specifically asks for Z16, then there's a
 non-zero (though possibly infinitesimal) probability they're doing it
 for a reason.  For at least some platforms, isn't there just a
 work-around to implement to fix the performance issue?  Doesn't the
 performance issue only affect some platforms to begin with?

 Maybe just change the check to

ctx->TextureFormatSupported[MESA_FORMAT_Z_UNORM16] =
   ! platform has z16 performance issues;

 Currently, all platforms have Z16 performance issues.  On Haswell and
 later, we could potentially implement the PMA stall optimization, which
 I believe would reduce(?) the problem.  I'm not sure if it would
 eliminate it though.

 I think the best course of action is:
 1. Fix the Piglit test to not require precise depth formats.
 2. Disable Z16 on all generations.
 3. Add a to do item for implementing the HSW+ PMA stall optimization.
 4. Add a to do item for re-evaluating Z16 on HSW+ once that's done.
I've sent a fix for the piglit test.  What is PMA stall
optimization?  I could not find any reference to it in the public
docs.
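
For readers following along, the gating discussed above could be sketched roughly as follows. This is only an illustration, not the actual i965 code: the struct, the enum, and every sketch_* name are hypothetical stand-ins, and only MESA_FORMAT_Z_UNORM16 and the gl30_sized_format_rules option name come from this thread. The fallback to a 24-bit format is an assumption that reflects the "closely as possible" rule quoted earlier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for driver state; not the real Mesa types. */
enum sketch_format {
   SKETCH_FORMAT_Z_UNORM16,   /* mirrors MESA_FORMAT_Z_UNORM16 */
   SKETCH_FORMAT_Z24,         /* assumed wider fallback format */
   SKETCH_FORMAT_COUNT
};

struct sketch_context {
   bool has_z16_perf_issue;      /* per the thread: true on all platforms today */
   bool gl30_sized_format_rules; /* drirc override proposed by the patch */
   bool format_supported[SKETCH_FORMAT_COUNT];
};

/* Advertise Z16 only when the platform is not affected, or when the
 * user explicitly asked for the strict sized-format behaviour. */
void
sketch_init_depth_formats(struct sketch_context *ctx)
{
   assert(ctx != NULL);
   ctx->format_supported[SKETCH_FORMAT_Z24] = true;
   ctx->format_supported[SKETCH_FORMAT_Z_UNORM16] =
      !ctx->has_z16_perf_issue || ctx->gl30_sized_format_rules;
}

/* A request for a 16-bit depth format falls back to the closest
 * supported format when Z16 is not advertised. */
enum sketch_format
sketch_choose_depth16(const struct sketch_context *ctx)
{
   assert(ctx != NULL);
   return ctx->format_supported[SKETCH_FORMAT_Z_UNORM16]
      ? SKETCH_FORMAT_Z_UNORM16
      : SKETCH_FORMAT_Z24;
}
```

With has_z16_perf_issue set and the option left off, a 16-bit request ends up on the wider format; turning the option on restores Z16.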



 --Ken




-- 
o...@lunarg.com


Re: [Mesa-dev] [Mesa-stable] [PATCH] i965: Create a hardware context before initializing state module.

2014-02-19 Thread Kenneth Graunke
On 02/19/2014 06:05 PM, Ian Romanick wrote:
 On 02/19/2014 05:40 PM, Kenneth Graunke wrote:
 brw_init_state() calls brw_upload_initial_gpu_state().  If hardware
 contexts are enabled (brw->hw_ctx != NULL), this will upload some
 initial invariant state for the GPU.  Without hardware contexts, we
 rely on this state being uploaded via atoms that subscribe to the
 BRW_NEW_CONTEXT bit.

 Commit 46d3c2bf4ddd227193b98861f1e632498fe547d8 accidentally moved
 the call to brw_init_state() before creating a hardware context.
 This meant brw_upload_initial_gpu_state would always early return.
 Except on Gen6+, we stopped uploading the initial GPU state via
 state atoms, so it never happened.

 Fixes a regression since 46d3c2bf4ddd227193b98861f1e632498fe547d8.
 
 This seems like a rational change... but why didn't 46d3c2b blow up the
 world on IVB and HSW?  ...and only cause heisenbugs on BDW?

Presumably because it doesn't do much.  On Gen6+, all it does is:
- PIPELINE_SELECT (probably already render, unless you're doing media/gpgpu)
- STATE_SIP (basically only matters if you hit breakpoints or invalid
operations)
- VF_STATISTICS (we don't use the counters anyway)

On Broadwell, it also uploads 3DSTATE_SAMPLE_PATTERN, which meant that
any Piglit test that relied on legitimate sample positions would fail.
That is, until I ran with a branch that actually emitted
3DSTATE_SAMPLE_PATTERN---after that, the sample positions remained
programmed, and tests continued to work fine until reboot.
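
The ordering problem can be modelled in a few lines of C. This is a simplified sketch and not the driver's code: all sketch_* names and the fixed_order flag are hypothetical, and the only behaviour taken from the commit message is that the initial-state upload returns early when no hardware context exists yet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, minimal stand-ins for the driver structures. */
struct sketch_hw_ctx { int id; };

struct sketch_brw {
   struct sketch_hw_ctx *hw_ctx;  /* NULL until a hardware context exists */
   bool initial_state_uploaded;
};

/* Models brw_upload_initial_gpu_state(): a no-op without a hardware context. */
void
sketch_upload_initial_gpu_state(struct sketch_brw *brw)
{
   if (!brw->hw_ctx)
      return;   /* the early return described in the commit message */
   brw->initial_state_uploaded = true;
}

/* Models brw_init_state(), which calls the upload function. */
void
sketch_init_state(struct sketch_brw *brw)
{
   sketch_upload_initial_gpu_state(brw);
}

/* Context creation in either order: fixed_order creates the hardware
 * context first, otherwise state init runs first (the regression). */
void
sketch_context_create(struct sketch_brw *brw, struct sketch_hw_ctx *hw,
                      bool fixed_order)
{
   assert(brw != NULL && hw != NULL);
   if (fixed_order) {
      brw->hw_ctx = hw;
      sketch_init_state(brw);
   } else {
      sketch_init_state(brw);
      brw->hw_ctx = hw;
   }
}
```

In the regressed order the context ends up with a valid hw_ctx but initial_state_uploaded still false, which matches the symptom of state such as 3DSTATE_SAMPLE_PATTERN never being emitted.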

 Cc: 10.0 10.1 mesa-sta...@lists.freedesktop.org
 Signed-off-by: Kenneth Graunke kenn...@whitecape.org
 
 Either way,
 
 Reviewed-by: Ian Romanick ian.d.roman...@intel.com






[Mesa-dev] [PATCH] nv50: enable txg where supported

2014-02-19 Thread Ilia Mirkin
Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
---

This applies on top of Dave Airlie's r600g-texture-gather branch. Ran piglit
with -t gather, passed all 1057 tests. Can't say I fully understand what all
the arguments to handleTEX in the Converter are but... seems to work. Will
probably require some care for nvc0 support which should have SM5 caps.

 src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp | 3 ++-
 src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 4 
 src/gallium/drivers/nouveau/nv50/nv50_screen.c| 3 ++-
 3 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp
index bef103f..e2f93bb 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp
@@ -1447,7 +1447,7 @@ CodeEmitterNV50::emitTEX(const TexInstruction *i)
   code[0] |= 0x0100;
   break;
case OP_TXG:
-  code[0] = 0x0100;
+  code[0] |= 0x0100;
   code[1] = 0x8000;
   break;
default:
@@ -1790,6 +1790,7 @@ CodeEmitterNV50::emitInstruction(Instruction *insn)
case OP_TXB:
case OP_TXL:
case OP_TXF:
+   case OP_TXG:
   emitTEX(insn->asTex());
   break;
case OP_TXQ:
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
index d226d0c..ccddb9a 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
@@ -558,6 +558,7 @@ static nv50_ir::operation translateOpcode(uint opcode)
NV50_IR_OPCODE_CASE(SAD, SAD);
NV50_IR_OPCODE_CASE(TXF, TXF);
NV50_IR_OPCODE_CASE(TXQ, TXQ);
+   NV50_IR_OPCODE_CASE(TG4, TXG);
 
NV50_IR_OPCODE_CASE(EMIT, EMIT);
NV50_IR_OPCODE_CASE(ENDPRIM, RESTART);
@@ -2434,6 +2435,9 @@ Converter::handleInstruction(const struct tgsi_full_instruction *insn)
case TGSI_OPCODE_TXD:
   handleTEX(dst0, 3, 3, 0x03, 0x0f, 0x10, 0x20);
   break;
+   case TGSI_OPCODE_TG4:
+  handleTEX(dst0, 2, 2, 0x03, 0x0f, 0x00, 0x00);
+  break;
case TGSI_OPCODE_TEX2:
   handleTEX(dst0, 2, 2, 0x03, 0x10, 0x00, 0x00);
   break;
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
index 488642a..9aafe94 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_screen.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
@@ -193,11 +193,12 @@ nv50_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
case PIPE_CAP_ENDIANNESS:
   return PIPE_ENDIAN_LITTLE;
case PIPE_CAP_TGSI_VS_LAYER:
-   case PIPE_CAP_MAX_TEXTURE_GATHER_COMPONENTS:
case PIPE_CAP_TEXTURE_GATHER_SM5:
   return 0;
case PIPE_CAP_MAX_VIEWPORTS:
   return NV50_MAX_VIEWPORTS;
+   case PIPE_CAP_MAX_TEXTURE_GATHER_COMPONENTS:
+  return (class_3d >= NVA3_3D_CLASS) ? 4 : 0;
default:
   NOUVEAU_ERR("unknown PIPE_CAP %d\n", param);
   return 0;
-- 
1.8.3.2
