Re: [Mesa-dev] [PATCH] Add initial Haiku build support

2011-12-22 Thread Maarten Lankhorst
Hey Alexander,

On 12/21/2011 07:16 PM, Alexander von Gluck wrote:

> * Doesn't reintroduce legacy drivers
> * Adds Haiku mklib code
> * Removes some broken PIPE_OS_HAIKU defines
> * Removes an NDEBUG ifdef in link_uniforms.cpp,
>   there is an item that uses the union without
>   checking NDEBUG below.
> * Haiku has an opengl kit that will wrap all of
>   these build binaries (pretty much an external beos
>   mesa driver)

Smells like this patch should be split up to address each point separately.

~Maarten
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] vbo: count min/max_index before vbo->draw_prims

2011-12-22 Thread Yuanhan Liu
When the index data is stored in an element array buffer object and the
user calls glMultiDrawElements, count min/max_index before calling
vbo->draw_prims. vbo_get_minmax_index() isn't friendly to this case, so
do it while building the prim info.

Signed-off-by: Yuanhan Liu yuanhan@linux.intel.com
---
 src/mesa/vbo/vbo_exec_array.c |   14 +-
 1 files changed, 13 insertions(+), 1 deletions(-)

diff --git a/src/mesa/vbo/vbo_exec_array.c b/src/mesa/vbo/vbo_exec_array.c
index a6e41e9..70efd3f 100644
--- a/src/mesa/vbo/vbo_exec_array.c
+++ b/src/mesa/vbo/vbo_exec_array.c
@@ -1147,11 +1147,18 @@ vbo_validated_multidrawelements(struct gl_context *ctx, 
GLenum mode,
   fallback = GL_TRUE;
 
if (!fallback) {
+  struct _mesa_index_buffer tmp_ib;
+  GLuint min_index = ~0;
+  GLuint max_index = 0;
+  GLuint tmp_min, tmp_max;
+
   ib.count = (max_index_ptr - min_index_ptr) / index_type_size;
   ib.type = type;
   ib.obj = ctx->Array.ArrayObj->ElementArrayBufferObj;
   ib.ptr = (void *)min_index_ptr;
 
+  tmp_ib = ib;
+
   for (i = 0; i < primcount; i++) {
 prim[i].begin = (i == 0);
 prim[i].end = (i == primcount - 1);
@@ -1166,11 +1173,16 @@ vbo_validated_multidrawelements(struct gl_context *ctx, 
GLenum mode,
prim[i].basevertex = basevertex[i];
 else
prim[i].basevertex = 0;
+
+ tmp_ib.ptr = indices[i];
+ vbo_get_minmax_index(ctx, &prim[i], &tmp_ib, &tmp_min, &tmp_max);
+ min_index = MIN2(min_index, tmp_min);
+ max_index = MAX2(max_index, tmp_max);
   }
 
   check_buffers_are_unmapped(exec->array.inputs);
   vbo->draw_prims(ctx, exec->array.inputs, prim, primcount, &ib,
- GL_FALSE, ~0, ~0, NULL);
+  GL_TRUE, min_index, max_index, NULL);
} else {
   /* render one prim at a time */
   for (i = 0; i < primcount; i++) {
-- 
1.7.4.4
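
A minimal sketch of the accumulation this patch performs, assuming 16-bit indices held in ordinary memory rather than in a buffer object. The names `index_range`, `range_minmax` and `multi_minmax` are hypothetical and are not Mesa functions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the index list of one primitive. */
struct index_range {
   const uint16_t *indices;
   size_t count;
};

/* Min/max of a single index list (the per-primitive step). */
static void
range_minmax(const struct index_range *r, uint32_t *min_out, uint32_t *max_out)
{
   uint32_t lo = ~0u, hi = 0;
   for (size_t i = 0; i < r->count; i++) {
      if (r->indices[i] < lo)
         lo = r->indices[i];
      if (r->indices[i] > hi)
         hi = r->indices[i];
   }
   *min_out = lo;
   *max_out = hi;
}

/* Fold the per-primitive results into one range, starting from
 * min = ~0 and max = 0 as the patch does. */
static void
multi_minmax(const struct index_range *prims, size_t primcount,
             uint32_t *min_out, uint32_t *max_out)
{
   uint32_t min_index = ~0u, max_index = 0;
   for (size_t i = 0; i < primcount; i++) {
      uint32_t tmp_min, tmp_max;
      range_minmax(&prims[i], &tmp_min, &tmp_max);
      if (tmp_min < min_index)
         min_index = tmp_min;
      if (tmp_max > max_index)
         max_index = tmp_max;
   }
   *min_out = min_index;
   *max_out = max_index;
}
```

The combined range is what would then be handed to the draw call in place of the "unknown" ~0, ~0 pair.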



Re: [Mesa-dev] [PATCH 1/3] vl: Only initialize vlc once

2011-12-22 Thread Christian König

Hi Maarten,

On 21.12.2011 21:52, Maarten Lankhorst wrote:

> It would be nice if you inlined patches for easier reviewing. :)

Well I can try, but I can't promise that Thunderbird isn't badly fucking
up all whitespaces; the newest version of the patch is inlined below.



> I'm spotting an overflow that could be triggered with 64 single-byte unaligned
> buffers, maybe this is better:

Should be fixed.

> With all the pointer math, maybe change the type for 'end' and 'data' to
> uint8_t? Then you would only need that single cast in fillbits (which
> I did above) and you can kill all the casts everywhere.

Yeah, we now use it as int8_t more often, that saves at least some casts.


> Another thought, this would prevent the need to read past end of file which could
> show up erroneously in valgrind, with something like this: if (end-data >= 4) {
> // Nom nom 4 bytes } else { // Read at most 3 bytes }

Ok, take a look below. I now use an if (bytes_left == 0) {...} else if
(bytes_left >= 4) {...} else {...} construct. Should this work, or do you
see any more corner cases I missed?
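
The three-way branch on the remaining byte count can be sketched as follows. This only illustrates the idea and is not the code from vl_vlc.h; `reader`, `reader_fill` and their fields are hypothetical names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical byte reader: 'data' is the read position, 'end' is one
 * past the last valid byte. */
struct reader {
   const uint8_t *data;
   const uint8_t *end;
};

/* Return up to 32 bits, most significant byte first, without touching
 * memory at or past 'end'.  *nbits receives the number of valid bits. */
static uint32_t
reader_fill(struct reader *r, unsigned *nbits)
{
   size_t bytes_left = (size_t)(r->end - r->data);
   uint32_t value = 0;

   if (bytes_left == 0) {
      /* nothing left to read */
      *nbits = 0;
   } else if (bytes_left >= 4) {
      /* common case: a full 32-bit word is available */
      value = ((uint32_t)r->data[0] << 24) | ((uint32_t)r->data[1] << 16) |
              ((uint32_t)r->data[2] << 8) | (uint32_t)r->data[3];
      r->data += 4;
      *nbits = 32;
   } else {
      /* tail: read the remaining 1 to 3 bytes one at a time */
      *nbits = (unsigned)bytes_left * 8;
      for (size_t i = 0; i < bytes_left; i++)
         value |= (uint32_t)r->data[i] << (24 - 8 * i);
      r->data += bytes_left;
   }
   return value;
}
```

Because the tail case never reads a whole word, a tool such as valgrind has no out-of-bounds access to report at the end of the buffer.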


I also inverted the valid_bits handling to invalid_bits and changed the
range in which invalid_bits moves to between -32 and 32; that should
save a few more CPU cycles.


Last but not least I added at least one sentence of documentation to 
each function. Any more thoughts or should we commit that now?




Only initialize vlc in MPEG2 decoding once for all slices,
add more sanity checks to vlc decoding functions, support
multiple vlc input buffer, improve documentation of the
vlc functions.

v2: also implement multiple inputs for the vlc functions
v3: some bug fixes for buffer size and alignment corner cases
v4: rework of the patch, add some more improvements and documentation

Signed-off-by: Maarten Lankhorst <m.b.lankho...@gmail.com>
Signed-off-by: Christian König <deathsim...@vodafone.de>
---
 src/gallium/auxiliary/vl/vl_mpeg12_bitstream.c |   46 +++
 src/gallium/auxiliary/vl/vl_vlc.h  |  169 +++-
 2 files changed, 156 insertions(+), 59 deletions(-)

diff --git a/src/gallium/auxiliary/vl/vl_mpeg12_bitstream.c 
b/src/gallium/auxiliary/vl/vl_mpeg12_bitstream.c
index 936cf2c..7e20d71 100644
--- a/src/gallium/auxiliary/vl/vl_mpeg12_bitstream.c
+++ b/src/gallium/auxiliary/vl/vl_mpeg12_bitstream.c
@@ -786,7 +786,7 @@ entry:
}
 }

-static INLINE bool
+static INLINE void
 decode_slice(struct vl_mpg12_bs *bs)
 {
struct pipe_mpeg12_macroblock mb;
@@ -800,6 +800,7 @@ decode_slice(struct vl_mpg12_bs *bs)
mb.blocks = dct_blocks;

reset_predictor(bs);
+   vl_vlc_fillbits(&bs->vlc);
    dct_scale = quant_scale[bs->desc.q_scale_type][vl_vlc_get_uimsbf(&bs->vlc, 5)];

    if (vl_vlc_get_uimsbf(&bs->vlc, 1))
@@ -807,13 +808,15 @@ decode_slice(struct vl_mpg12_bs *bs)
       vl_vlc_fillbits(&bs->vlc);

    vl_vlc_fillbits(&bs->vlc);
+   assert(vl_vlc_bits_left(&bs->vlc) > 23 && vl_vlc_peekbits(&bs->vlc, 23));
    do {
       int inc = 0;

-      while (vl_vlc_peekbits(&bs->vlc, 11) == 15) {
-         vl_vlc_eatbits(&bs->vlc, 11);
-         vl_vlc_fillbits(&bs->vlc);
-      }
+      if (bs->decoder->profile == PIPE_VIDEO_PROFILE_MPEG1)
+         while (vl_vlc_peekbits(&bs->vlc, 11) == 15) {
+            vl_vlc_eatbits(&bs->vlc, 11);
+            vl_vlc_fillbits(&bs->vlc);
+         }

       while (vl_vlc_peekbits(&bs->vlc, 11) == 8) {
          vl_vlc_eatbits(&bs->vlc, 11);
@@ -928,7 +931,6 @@ decode_slice(struct vl_mpg12_bs *bs)

mb.num_skipped_macroblocks = 0;
    bs->decoder->decode_macroblock(bs->decoder, &mb.base, 1);
-   return true;
 }

 void
@@ -959,32 +961,22 @@ void
 vl_mpg12_bs_decode(struct vl_mpg12_bs *bs, unsigned num_bytes, const uint8_t *buffer)
 {
    assert(bs);
-   assert(buffer && num_bytes);

-   while(num_bytes > 2) {
-      if (buffer[0] == 0x00 && buffer[1] == 0x00 && buffer[2] == 0x01
-          && buffer[3] >= 0x01 && buffer[3] < 0xAF) {
-         unsigned consumed;
+   vl_vlc_init(&bs->vlc, 1, (const void * const *)&buffer, &num_bytes);
+   while (vl_vlc_bits_left(&bs->vlc) > 32) {
+      uint32_t code = vl_vlc_peekbits(&bs->vlc, 32);

-         buffer += 3;
-         num_bytes -= 3;
+      if (code >= 0x101 && code <= 0x1AF) {
+         vl_vlc_eatbits(&bs->vlc, 24);
+         decode_slice(bs);

-         vl_vlc_init(&bs->vlc, buffer, num_bytes);
-
-         if (!decode_slice(bs))
-            return;
-
-         consumed = num_bytes - vl_vlc_bits_left(&bs->vlc) / 8;
-
-         /* crap, this is a bug we have consumed more bytes than left in the buffer */
-         assert(consumed <= num_bytes);
-
-         num_bytes -= consumed;
-         buffer += consumed;
+         /* align to a byte again */
+         vl_vlc_eatbits(&bs->vlc, vl_vlc_valid_bits(&bs->vlc) & 7);

       } else {
-         ++buffer;
-         --num_bytes;
+         vl_vlc_eatbits(&bs->vlc, 8);
       }
+
+      vl_vlc_fillbits(&bs->vlc);
    }
 }
diff --git a/src/gallium/auxiliary/vl/vl_vlc.h 
b/src/gallium/auxiliary/vl/vl_vlc.h
index dc4faed..5e5e64c 100644
--- a/src/gallium/auxiliary/vl/vl_vlc.h
+++ 

[Mesa-dev] [PATCH] glsl_to_tgsi: v2 Invalidate and revalidate uniform backing storage

2011-12-22 Thread Vadim Girlin
If glUniform1i and friends are going to dump data directly in
driver-allocated storage, the pointers have to be updated when the storage
moves.  This should fix the regressions seen with commit 7199096.

I'm not sure if this is the only place that needs this treatment.  I'm
a little uncertain about the various functions in st_glsl_to_tgsi that
modify the TGSI IR and try to propagate changes about that up to the
gl_program.  That seems sketchy to me.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com

v2:

Revalidate when shader_program is not NULL.
Update the pointers for all _LinkedShaders.
Init glsl_to_tgsi_visitor::shader_program to NULL in the
get_pixel_transfer_visitor & get_bitmap_visitor.

Signed-off-by: Vadim Girlin vadimgir...@gmail.com
---

Based on the patch from Ian Romanick:
http://lists.freedesktop.org/archives/mesa-dev/2011-November/014675.html

Fixes uniform regressions with r600g (and probably other drivers)
after commit 719909698c67c287a393d2380278e7b7495ae018

Tested on evergreen with r600.tests: no regressions.

 src/mesa/state_tracker/st_glsl_to_tgsi.cpp |   25 +
 1 files changed, 25 insertions(+), 0 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 77aa0d1..fce92bb 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -3708,6 +3708,7 @@ get_pixel_transfer_visitor(struct st_fragment_program *fp,
    /* Copy attributes of the glsl_to_tgsi_visitor in the original shader. */
    v->ctx = original->ctx;
    v->prog = prog;
+   v->shader_program = NULL;
    v->glsl_version = original->glsl_version;
    v->native_integers = original->native_integers;
    v->options = original->options;
@@ -3837,6 +3838,7 @@ get_bitmap_visitor(struct st_fragment_program *fp,
    /* Copy attributes of the glsl_to_tgsi_visitor in the original shader. */
    v->ctx = original->ctx;
    v->prog = prog;
+   v->shader_program = NULL;
    v->glsl_version = original->glsl_version;
    v->native_integers = original->native_integers;
    v->options = original->options;
@@ -4550,6 +4552,15 @@ st_translate_program(
    t->pointSizeOutIndex = -1;
    t->prevInstWrotePointSize = GL_FALSE;

+   if (program->shader_program) {
+      for (i = 0; i < program->shader_program->NumUserUniformStorage; i++) {
+         struct gl_uniform_storage *const storage =
+               &program->shader_program->UniformStorage[i];
+
+         _mesa_uniform_detach_all_driver_storage(storage);
+      }
+   }
+
/*
 * Declare input attributes.
 */
@@ -4776,6 +4787,20 @@ st_translate_program(
                       t->insn[t->labels[i].branch_target]);
}
 
+   if (program->shader_program) {
+      /* This has to be done last.  Any operation that can cause
+       * prog->ParameterValues to get reallocated (e.g., anything that adds a
+       * program constant) has to happen before creating this linkage.
+       */
+      for (unsigned i = 0; i < MESA_SHADER_TYPES; i++) {
+         if (program->shader_program->_LinkedShaders[i] == NULL)
+            continue;
+
+         _mesa_associate_uniform_storage(ctx, program->shader_program,
+               program->shader_program->_LinkedShaders[i]->Program->Parameters);
+      }
+   }
+
 out:
if (t) {
      FREE(t->insn);
-- 
1.7.7.4
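
The underlying hazard is a cached pointer into storage that can be reallocated. A minimal sketch of the detach, grow, re-associate ordering, assuming a plain realloc-backed array; `params`, `uniform`, `detach`, `associate` and `add_constant` are hypothetical stand-ins, not the Mesa API:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable parameter array. */
struct params {
   float *values;
   size_t count;
};

/* Hypothetical uniform that caches a pointer into params.values. */
struct uniform {
   size_t slot;
   float *driver_storage;
};

/* Drop the cached pointer before anything can move the storage. */
static void
detach(struct uniform *u)
{
   u->driver_storage = NULL;
}

/* Recompute the cached pointer once the storage has stopped moving. */
static void
associate(struct uniform *u, struct params *p)
{
   u->driver_storage = &p->values[u->slot];
}

/* Append a constant; realloc may move p->values.  Returns 0 on success. */
static int
add_constant(struct params *p, float v)
{
   float *grown = realloc(p->values, (p->count + 1) * sizeof *grown);
   if (!grown)
      return -1;
   grown[p->count] = v;
   p->values = grown;
   p->count++;
   return 0;
}
```

Any pointer taken before `add_constant` must be treated as stale afterwards, which is why the re-association has to happen last.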



[Mesa-dev] [PATCH] mesa: consolidate texstore functions

2011-12-22 Thread Brian Paul
From: Brian Paul bri...@vmware.com

The code for storing 1D, 2D and 3D tex images (whole or sub-images) was
all pretty similar.  This consolidates those six paths.

v2: rework switch statement to catch unexpected targets
---
 src/mesa/main/texstore.c |  484 +++---
 1 files changed, 153 insertions(+), 331 deletions(-)

diff --git a/src/mesa/main/texstore.c b/src/mesa/main/texstore.c
index fb1ad04..86c35d3 100644
--- a/src/mesa/main/texstore.c
+++ b/src/mesa/main/texstore.c
@@ -4565,9 +4565,137 @@ get_read_write_mode(GLenum userFormat, gl_format 
texFormat)
   return GL_MAP_WRITE_BIT;
 }
 
+
+/**
+ * Helper function for storing 1D, 2D, 3D whole and subimages into texture
+ * memory.
+ * The source of the image data may be user memory or a PBO.  In the latter
+ * case, we'll map the PBO, copy from it, then unmap it.
+ */
+static void
+store_texsubimage(struct gl_context *ctx,
+  struct gl_texture_image *texImage,
+  GLint xoffset, GLint yoffset, GLint zoffset,
+  GLint width, GLint height, GLint depth,
+  GLenum format, GLenum type, const GLvoid *pixels,
+  const struct gl_pixelstore_attrib *packing,
+  const char *caller)
+
+{
+   const GLbitfield mapMode = get_read_write_mode(format, texImage->TexFormat);
+   const GLenum target = texImage->TexObject->Target;
+   GLboolean success = GL_FALSE;
+   GLuint dims, slice, numSlices = 1, sliceOffset = 0;
+   GLint srcImageStride = 0;
+   const GLubyte *src;
+
+   assert(xoffset + width <= texImage->Width);
+   assert(yoffset + height <= texImage->Height);
+   assert(zoffset + depth <= texImage->Depth);
+
+   switch (target) {
+   case GL_TEXTURE_1D:
+  dims = 1;
+  break;
+   case GL_TEXTURE_2D_ARRAY:
+   case GL_TEXTURE_3D:
+  dims = 3;
+  break;
+   default:
+  dims = 2;
+   }
+
+   /* get pointer to src pixels (may be in a pbo which we'll map here) */
+   src = (const GLubyte *)
+  _mesa_validate_pbo_teximage(ctx, dims, width, height, depth,
+  format, type, pixels, packing, caller);
+   if (!src)
+  return;
+
+   /* compute slice info (and do some sanity checks) */
+   switch (target) {
+   case GL_TEXTURE_2D:
+   case GL_TEXTURE_RECTANGLE:
+   case GL_TEXTURE_CUBE_MAP:
+  /* one image slice, nothing special needs to be done */
+  break;
+   case GL_TEXTURE_1D:
+  assert(height == 1);
+  assert(depth == 1);
+  assert(yoffset == 0);
+  assert(zoffset == 0);
+  break;
+   case GL_TEXTURE_1D_ARRAY:
+  assert(depth == 1);
+  assert(zoffset == 0);
+  numSlices = height;
+  sliceOffset = yoffset;
+  height = 1;
+  yoffset = 0;
+  srcImageStride = _mesa_image_row_stride(packing, width, format, type);
+  break;
+   case GL_TEXTURE_2D_ARRAY:
+  numSlices = depth;
+  sliceOffset = zoffset;
+  depth = 1;
+  zoffset = 0;
+  srcImageStride = _mesa_image_image_stride(packing, width, height,
+format, type);
+  break;
+   case GL_TEXTURE_3D:
+  /* we'll store 3D images as a series of slices */
+  numSlices = depth;
+  sliceOffset = zoffset;
+  srcImageStride = _mesa_image_image_stride(packing, width, height,
+format, type);
+  break;
+   default:
+      _mesa_warning(ctx, "Unexpected target 0x%x in store_texsubimage()", target);
+  return;
+   }
+
+   assert(numSlices == 1 || srcImageStride != 0);
+
+   for (slice = 0; slice < numSlices; slice++) {
+  GLubyte *dstMap;
+  GLint dstRowStride;
+
+      ctx->Driver.MapTextureImage(ctx, texImage,
+                                  slice + sliceOffset,
+                                  xoffset, yoffset, width, height,
+                                  mapMode, &dstMap, &dstRowStride);
+  if (dstMap) {
+ /* Note: we're only storing a 2D (or 1D) slice at a time but we need
+  * to pass the right 'dims' value so that GL_UNPACK_SKIP_IMAGES is
+  * used for 3D images.
+  */
+ success = _mesa_texstore(ctx, dims, texImage->_BaseFormat,
+  texImage->TexFormat,
+  0, 0, 0,  /* dstX/Y/Zoffset */
+  dstRowStride,
+  dstMap,
+  width, height, 1,  /* w, h, d */
+  format, type, src, packing);
+
+ ctx->Driver.UnmapTextureImage(ctx, texImage, slice + sliceOffset);
+  }
+
+  src += srcImageStride;
+
+  if (!success)
+ break;
+   }
+
+   if (!success)
+      _mesa_error(ctx, GL_OUT_OF_MEMORY, "%s", caller);
+
+   _mesa_unmap_teximage_pbo(ctx, packing);
+}
+
+
+
 /**
- * This is the software fallback for Driver.TexImage1D().
- * \sa _mesa_store_teximage2d()
+ * This is the fallback for 
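
The slice bookkeeping in store_texsubimage() can be summarised with a small sketch. It mirrors only the numSlices/sliceOffset logic visible in the patch; the enum, struct and function names are hypothetical and stand in for the GL target values:

```c
#include <assert.h>

/* Hypothetical stand-ins for the GL texture targets. */
enum tex_target { TEX_1D, TEX_2D, TEX_1D_ARRAY, TEX_2D_ARRAY, TEX_3D };

struct slice_info {
   int num_slices;   /* how many 1D/2D slices to store */
   int slice_offset; /* index of the first destination slice */
   int height;       /* height of each stored slice */
   int yoffset;      /* y offset within each stored slice */
};

static struct slice_info
compute_slices(enum tex_target target, int yoffset, int zoffset,
               int height, int depth)
{
   struct slice_info s = { 1, 0, height, yoffset };

   switch (target) {
   case TEX_1D_ARRAY:
      /* each row of the source is its own 1D slice */
      s.num_slices = height;
      s.slice_offset = yoffset;
      s.height = 1;
      s.yoffset = 0;
      break;
   case TEX_2D_ARRAY:
   case TEX_3D:
      /* each depth layer is its own 2D slice */
      s.num_slices = depth;
      s.slice_offset = zoffset;
      break;
   default:
      /* 1D and 2D images: a single slice, nothing special */
      break;
   }
   return s;
}
```

For a 1D array the y range selects layers, while for 2D arrays and 3D textures the z range does.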

Re: [Mesa-dev] [PATCH] mesa: consolidate texstore functions

2011-12-22 Thread Jose Fonseca
Looks good to me.

Jose


Re: [Mesa-dev] [PATCH 02/22] swrast: do fast_copy_pixels() with Map/UnmapRenderbuffer()

2011-12-22 Thread Brian Paul
On Wed, Dec 21, 2011 at 12:58 PM, Eric Anholt e...@anholt.net wrote:
> > -   temp = malloc(width * MAX_PIXEL_BYTES);
> > -   if (!temp) {
> > -      _mesa_error(ctx, GL_OUT_OF_MEMORY, "glCopyPixels");
> > -      return GL_FALSE;
> > +      /* different src/dst buffers */
> > +      ctx->Driver.MapRenderbuffer(ctx, srcRb, srcX, srcY,
> > +                                  width, height,
> > +                                  GL_MAP_READ_BIT, &srcMap, &srcRowStride);
> > +      if (!srcMap) {
> > +         _mesa_error(ctx, GL_OUT_OF_MEMORY, "glCopyPixels");
> > +         return GL_TRUE; /* don't retry with slow path */
> > +      }
> > +      ctx->Driver.MapRenderbuffer(ctx, dstRb, dstX, dstY,
> > +                                  width, height,
> > +                                  GL_MAP_WRITE_BIT, &dstMap, &dstRowStride);
> > +      if (!dstMap) {
> > +         ctx->Driver.UnmapRenderbuffer(ctx, srcRb);
> > +         _mesa_error(ctx, GL_OUT_OF_MEMORY, "glCopyPixels");
> > +         return GL_TRUE; /* don't retry with slow path */
> > +      }
> >     }
> >
> >     for (row = 0; row < height; row++) {
> > -      srcRb->GetRow(ctx, srcRb, width, srcX, srcY, temp);
> > -      dstRb->PutRow(ctx, dstRb, width, dstX, dstY, temp, NULL);
> > -      srcY += yStep;
> > -      dstY += yStep;
> > +      memcpy(dstMap, srcMap, widthInBytes);
> > +      dstMap += dstRowStride;
> > +      srcMap += srcRowStride;
> >     }
>
> So, previously we didn't have to worry about X direction for overlap
> because we used a temp between the Get and Put.  Now, I think you need
> to use memmove instead of memcpy.
>
> Patch 1, and 3-7 are:
> Reviewed-by: Eric Anholt e...@anholt.net
>
> this one is too if memmove is the solution.

I'll fix that.  Thanks.

-Brian
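
A small sketch of why memmove matters once the temporary row is gone: when source and destination share a row and the X ranges overlap, memcpy has undefined behaviour while memmove copies as if through a temporary. `copy_within_row` is a hypothetical helper, not swrast code:

```c
#include <assert.h>
#include <string.h>

/* Copy 'width' bytes inside one row from column src_x to column dst_x.
 * The two ranges may overlap, so memmove is required here. */
static void
copy_within_row(unsigned char *row, size_t dst_x, size_t src_x, size_t width)
{
   memmove(row + dst_x, row + src_x, width);
}
```

Shifting five bytes two columns to the right in the row 0 1 2 3 4 5 6 7 yields 0 1 0 1 2 3 4 7, the same result the old GetRow/PutRow pair produced via its temp buffer.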


Re: [Mesa-dev] [PATCH 08/22] swrast: rewrite _swrast_read_stencil_span()

2011-12-22 Thread Brian Paul
On Wed, Dec 21, 2011 at 1:01 PM, Eric Anholt e...@anholt.net wrote:
> On Sun, 18 Dec 2011 20:08:13 -0700, Brian Paul bri...@vmware.com wrote:
> > Use format pack/unpack functions instead of deprecated renderbuffer
> > GetRow/PutRow functions.
> > ---
> >  src/mesa/swrast/s_stencil.c |   31 ++-
> >  1 files changed, 26 insertions(+), 5 deletions(-)
> >
> > diff --git a/src/mesa/swrast/s_stencil.c b/src/mesa/swrast/s_stencil.c
> > index 17b3b12..1d78e97 100644
> > --- a/src/mesa/swrast/s_stencil.c
> > +++ b/src/mesa/swrast/s_stencil.c
> >
> >  /**
> > + * Return the address of a stencil value in a renderbuffer.
> > + */
> > +static inline GLubyte *
> > +get_stencil_address(struct gl_renderbuffer *rb, GLint x, GLint y)
> > +{
> > +   const GLint bpp = _mesa_get_format_bytes(rb->Format);
> > +   const GLint rowStride = rb->RowStride * bpp;
> > +   assert(rb->Data);
> > +   return (GLubyte *) rb->Data + y * rowStride + x * bpp;
> > +}
> > +
> > +
> > +
> > +/**
> >   * Apply the given stencil operator to the array of stencil values.
> >   * Don't touch stencil[i] if mask[i] is zero.
> >   * Input:  n - size of stencil array
> > @@ -1075,6 +1090,8 @@ _swrast_read_stencil_span(struct gl_context *ctx, struct gl_renderbuffer *rb,
> >                            GLint n, GLint x, GLint y, GLubyte stencil[])
> >  {
> >     GLubyte *src;
> > +   const GLuint bpp = _mesa_get_format_bytes(rb->Format);
> > +   const GLuint rowStride = rb->RowStride * bpp;
> >
> >     if (y < 0 || y >= (GLint) rb->Height ||
> >         x + n <= 0 || x >= (GLint) rb->Width) {
> > @@ -1096,7 +1113,7 @@ _swrast_read_stencil_span(struct gl_context *ctx, struct gl_renderbuffer *rb,
> >        return;
> >     }
> >
> > -   src = (GLubyte *) rb->Data + y * rb->RowStride +x;
> > +   src = (GLubyte *) rb->Data + y * rowStride + x * bpp;
> >     _mesa_unpack_ubyte_stencil_row(rb->Format, n, src, stencil);
> >  }
>
> Don't you want to just reuse get_stencil_address here?

Yup.


> > @@ -1115,9 +1132,10 @@ _swrast_write_stencil_span(struct gl_context *ctx, GLint n, GLint x, GLint y,
> >                             const GLubyte stencil[] )
> >  {
> >     struct gl_framebuffer *fb = ctx->DrawBuffer;
> > -   struct gl_renderbuffer *rb = fb->_StencilBuffer;
> > +   struct gl_renderbuffer *rb = fb->Attachment[BUFFER_STENCIL].Renderbuffer;
> >     const GLuint stencilMax = (1 << fb->Visual.stencilBits) - 1;
> >     const GLuint stencilMask = ctx->Stencil.WriteMask[0];
> > +   GLubyte *stencilBuf;
> >
> >     if (y < 0 || y >= (GLint) rb->Height ||
> >         x + n <= 0 || x >= (GLint) rb->Width) {
> > @@ -1138,19 +1156,22 @@ _swrast_write_stencil_span(struct gl_context *ctx, GLint n, GLint x, GLint y,
> >        return;
> >     }
> >
> > +   stencilBuf = get_stencil_address(rb, x, y);
> > +
> >     if ((stencilMask & stencilMax) != stencilMax) {
> >        /* need to apply writemask */
> >        GLubyte destVals[MAX_WIDTH], newVals[MAX_WIDTH];
> >        GLint i;
> > -      rb->GetRow(ctx, rb, n, x, y, destVals);
> > +
> > +      _mesa_unpack_ubyte_stencil_row(rb->Format, n, stencilBuf, destVals);
> >        for (i = 0; i < n; i++) {
> >           newVals[i]
> >              = (stencil[i] & stencilMask) | (destVals[i] & ~stencilMask);
> >        }
> > -      rb->PutRow(ctx, rb, n, x, y, newVals, NULL);
> > +      _mesa_pack_ubyte_stencil_row(rb->Format, n, destVals, stencilBuf);
>
> s/destVals/newVals/ ?

Will do.  R-b?

-Brian
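
The write-mask merge under review can be sketched in isolation. This assumes 8-bit stencil values and merges in place, so the values written back are always the merged ones (the point of the s/destVals/newVals/ comment); `apply_stencil_writemask` is a hypothetical name:

```c
#include <assert.h>
#include <stdint.h>

/* Merge incoming stencil values into 'dest': bits set in 'mask' come
 * from 'stencil', all other bits keep the existing value in 'dest'. */
static void
apply_stencil_writemask(uint8_t *dest, const uint8_t *stencil, int n,
                        uint8_t mask)
{
   for (int i = 0; i < n; i++)
      dest[i] = (uint8_t)((stencil[i] & mask) | (dest[i] & ~mask));
}
```

With a mask of 0x0F, an existing value 0xAA and an incoming value 0x55 combine to 0xA5: the low nibble is replaced and the high nibble is preserved.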


[Mesa-dev] Dropping multiple driver support in EGL?

2011-12-22 Thread Chia-I Wu
Hi list,

Multiple driver support in EGL is hard to get right, if not impossible.
On Linux desktop, we almost always want to use egl_dri2.  It allows EGL
to load DRI2 drivers, thus allowing it to share DRI2 drivers with
libGL.

In the one case where the app wants to use OpenVG, libEGL needs to load
egl_gallium instead.  The problem is that we cannot know that an
OpenVG context is to be created until it is created.  But before a
context can be created, EGL needs to be initialized and an EGLConfig
needs to be chosen.  So when EGL is to be initialized, we need to load
and initialize all EGL drivers.  When an EGLConfig is to be picked, we
need to pick it from all drivers.

But this also introduces new problems.   For example, when the vendor
string or the extension string is queried, whose string of all EGL
drivers should be returned?

My proposal is to simply drop multiple driver support from EGL.
Instead, we will provide four libEGL implementations:

 - libEGL_dri2: derived from egl_dri2
 - libEGL_gallium: derived from egl_gallium
 - libEGL_glx: derived from egl_glx
 - libEGL_loader: see below

All of them are conformant EGL implementations.  That is, any one of
them can be installed as /usr/lib/libEGL.so.

libEGL_loader is new.  It is basically a wrapper that loads another
implementation to do the real work.  As such, the problems we face with
multiple driver support will remain in libEGL_loader.

Distros may choose to install libEGL_loader as libEGL and let it pick
the real implementation.  Or they may choose to have the first three
installed as libEGL, and package them separately.

Thoughts?

-- 
o...@lunarg.com


Re: [Mesa-dev] [PATCH 16/22] swrast: fast_draw_depth_stencil() for glDrawPixels(GL_DEPTH_STENCIL)

2011-12-22 Thread Brian Paul
On Wed, Dec 21, 2011 at 1:30 PM, Eric Anholt e...@anholt.net wrote:
> On Sun, 18 Dec 2011 20:08:21 -0700, Brian Paul bri...@vmware.com wrote:
> > Stop using deprecated renderbuffer PutRow() function.  Note that we
> > aren't using Map/UnmapRenderbuffer() yet because this call is inside
> > a swrast_render_start/finish() pair.
> > ---
> >  src/mesa/swrast/s_drawpix.c |   64
> >  ---
> >  1 files changed, 48 insertions(+), 16 deletions(-)
> >
> > diff --git a/src/mesa/swrast/s_drawpix.c b/src/mesa/swrast/s_drawpix.c
> > index 4a661a0..19b43f6 100644
> > --- a/src/mesa/swrast/s_drawpix.c
> > +++ b/src/mesa/swrast/s_drawpix.c
> > @@ -551,6 +551,49 @@ draw_rgba_pixels( struct gl_context *ctx, GLint x, GLint y,
> >
> >
> >  /**
> > + * Draw depth+stencil values into a MESA_FORMAT_Z24_S8 or MESA_FORMAT_S8_Z24
> > + * renderbuffer.  No masking, zooming, scaling, etc.
> > + */
> > +static void
> > +fast_draw_depth_stencil(struct gl_context *ctx, GLint x, GLint y,
> > +                        GLsizei width, GLsizei height,
> > +                        const struct gl_pixelstore_attrib *unpack,
> > +                        const GLvoid *pixels)
> > +{
> >
> > +   for (i = 0; i < height; i++) {
> > +      if (rb->Format == MESA_FORMAT_Z24_S8) {
> > +         memcpy(dst, src, width * 4);
> > +      }
> > +      else {
> > +         /* swap Z24_S8 <-> S8_Z24 */
> > +         GLuint j, *dst4 = (GLuint *) dst, *src4 = (GLuint *) src;
> > +         for (j = 0; j < width; j++) {
> > +            dst4[j] = (src4[j] << 24) | (src4[j] >> 8);
> > +         }
> > +      }
>
> Reuse _mesa_pack_uint_24_8_depth_stencil_row() here?  Other than that,
> looks good.

Yeah, I'll fix that.

-Brian
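
The per-pixel swap in the quoted loop is a rotation of the 32-bit word by 8 bits. A minimal standalone sketch, assuming the Z24_S8 layout keeps depth in the upper 24 bits and stencil in the lower 8; `swap_z24s8_row` is a hypothetical name, not the Mesa pack helper mentioned in the review:

```c
#include <assert.h>
#include <stdint.h>

/* Convert a row of Z24_S8 pixels (depth in bits 31..8, stencil in bits
 * 7..0) to S8_Z24 (stencil in bits 31..24, depth in bits 23..0). */
static void
swap_z24s8_row(uint32_t *dst, const uint32_t *src, unsigned width)
{
   for (unsigned j = 0; j < width; j++)
      dst[j] = (src[j] << 24) | (src[j] >> 8);
}
```

For example, depth 0x123456 with stencil 0x78 is stored as 0x12345678 in Z24_S8 and becomes 0x78123456 in S8_Z24.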


Re: [Mesa-dev] [PATCH 1/5] mesa: Save and restore GL_RASTERIZER_DISCARD state during meta ops.

2011-12-22 Thread Brian Paul
On Wed, Dec 21, 2011 at 2:39 PM, Paul Berry stereotype...@gmail.com wrote:
> During meta-operations (such as _mesa_meta_GenerateMipmap()), we need
> to be able to draw even if GL_RASTERIZER_DISCARD is enabled.  This
> patch causes _mesa_meta_begin() to save the state of
> GL_RASTERIZER_DISCARD and disable it (so that drawing can be done
> during the meta-op), and causes _mesa_meta_end() to restore it.
>
> Fixes piglit test "EXT_transform_feedback/generatemipmap discard" on
> i965 Gen6.


Reviewed-by: Brian Paul bri...@vmare.com
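
The save/disable/restore pattern described in the patch can be sketched with a toy context. The structs and the `meta_begin`/`meta_end` functions below are hypothetical simplifications, not the _mesa_meta_begin()/_mesa_meta_end() implementation:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical context holding only the state of interest. */
struct ctx {
   bool rasterizer_discard;
};

/* Hypothetical save area filled in by meta_begin(). */
struct meta_save {
   bool rasterizer_discard;
};

/* Remember the current setting, then disable discard so that the
 * meta-op can actually draw. */
static void
meta_begin(struct ctx *c, struct meta_save *save)
{
   save->rasterizer_discard = c->rasterizer_discard;
   c->rasterizer_discard = false;
}

/* Put back whatever the application had set. */
static void
meta_end(struct ctx *c, const struct meta_save *save)
{
   c->rasterizer_discard = save->rasterizer_discard;
}
```

The application-visible state is identical before and after the pair, whichever value it started with.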


Re: [Mesa-dev] [PATCH 2/5] mesa: Ensure that Paused is reset to false on EndTransformFeedback.

2011-12-22 Thread Brian Paul
On Wed, Dec 21, 2011 at 2:39 PM, Paul Berry stereotype...@gmail.com wrote:
> If a client calls BeginTransformFeedback(), then
> PauseTransformFeedback(), then EndTransformFeedback(), we need to make
> sure that the transform feedback object is not left in a paused
> state, otherwise the next call to BeginTransformFeedback() will leave
> transform feedback paused.
> ---
>  src/mesa/main/transformfeedback.c |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/src/mesa/main/transformfeedback.c b/src/mesa/main/transformfeedback.c
> index 53c09e2..fea711a 100644
> --- a/src/mesa/main/transformfeedback.c
> +++ b/src/mesa/main/transformfeedback.c
> @@ -387,6 +387,7 @@ _mesa_EndTransformFeedback(void)
>
>     FLUSH_VERTICES(ctx, _NEW_TRANSFORM_FEEDBACK);
>     ctx->TransformFeedback.CurrentObject->Active = GL_FALSE;
> +   ctx->TransformFeedback.CurrentObject->Paused = GL_FALSE;
>     ctx->TransformFeedback.CurrentObject->EndedAnytime = GL_TRUE;
>
>     assert(ctx->Driver.EndTransformFeedback);

Reviewed-by: Brian Paul bri...@vmare.com
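
The begin/pause/end sequence from the commit message can be modelled as a tiny state machine. The `xfb` struct and its functions are hypothetical and assume, as the message implies, that begin does not touch the paused flag:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical transform feedback object with the two flags involved. */
struct xfb {
   bool active;
   bool paused;
};

static void
xfb_begin(struct xfb *o)
{
   /* begin only activates; it does not clear 'paused' */
   o->active = true;
}

static void
xfb_pause(struct xfb *o)
{
   o->paused = true;
}

static void
xfb_end(struct xfb *o)
{
   o->active = false;
   /* the one-line fix: never leave the object paused after end */
   o->paused = false;
}
```

Without the reset in `xfb_end`, a begin/pause/end/begin sequence would leave the object both active and paused.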


Re: [Mesa-dev] [PATCH] vbo: signal _NEW_ARRAY when transitioning between glBegin/End, glDrawArrays

2011-12-22 Thread Brian Paul
On Wed, Dec 21, 2011 at 10:36 AM, Marek Olšák mar...@gmail.com wrote:
> Hi Brian,
>
> Is there a reason to set _NEW_ARRAY when transitioning between
> DrawArrays and DrawElements?

Probably not, actually.  I was just being overly paranoid.  I can fix that.

-Brian


Re: [Mesa-dev] [PATCH 08/22] swrast: rewrite _swrast_read_stencil_span()

2011-12-22 Thread Eric Anholt
On Thu, 22 Dec 2011 09:31:59 -0700, Brian Paul brian.e.p...@gmail.com wrote:
> On Wed, Dec 21, 2011 at 1:01 PM, Eric Anholt e...@anholt.net wrote:
> > On Sun, 18 Dec 2011 20:08:13 -0700, Brian Paul bri...@vmware.com wrote:
> > >     if ((stencilMask & stencilMax) != stencilMax) {
> > >        /* need to apply writemask */
> > >        GLubyte destVals[MAX_WIDTH], newVals[MAX_WIDTH];
> > >        GLint i;
> > > -      rb->GetRow(ctx, rb, n, x, y, destVals);
> > > +
> > > +      _mesa_unpack_ubyte_stencil_row(rb->Format, n, stencilBuf, destVals);
> > >        for (i = 0; i < n; i++) {
> > >           newVals[i]
> > >              = (stencil[i] & stencilMask) | (destVals[i] & ~stencilMask);
> > >        }
> > > -      rb->PutRow(ctx, rb, n, x, y, newVals, NULL);
> > > +      _mesa_pack_ubyte_stencil_row(rb->Format, n, destVals, stencilBuf);
> >
> > s/destVals/newVals/ ?
>
> Will do.  R-b?

Yeah.




Re: [Mesa-dev] [PATCH 14/22] swrast: stop using depth/stencil wrappers in CopyPixels code

2011-12-22 Thread Eric Anholt
On Thu, 22 Dec 2011 09:49:33 -0700, Brian Paul brian.e.p...@gmail.com wrote:
 On Wed, Dec 21, 2011 at 1:20 PM, Eric Anholt e...@anholt.net wrote:
  On Sun, 18 Dec 2011 20:08:19 -0700, Brian Paul bri...@vmware.com wrote:
  The functions that read depth/stencil values understand all (packed)
  depth/stencil buffer formats now so there's no reason to use the
  wrappers.
 
  Also, improve the format checks in fast_copy_pixels() to catch mismatched
  depth/stencil cases.
 
  +   if (type == GL_STENCIL || type == GL_DEPTH_COMPONENT) {
  +      /* can't handle packed depth+stencil here */
  +      if (_mesa_is_format_packed_depth_stencil(srcRb->Format) ||
  +          _mesa_is_format_packed_depth_stencil(dstRb->Format))
  +         return GL_FALSE;
  +   }
  +   else if (type == GL_DEPTH_STENCIL) {
  +      /* can't handle separate depth/stencil buffers */
  +      if (!_mesa_is_format_packed_depth_stencil(srcRb->Format) ||
  +          !_mesa_is_format_packed_depth_stencil(dstRb->Format))
  +         return GL_FALSE;
  +   }
 
  I think the GL_DEPTH_STENCIL test here wants
  srcRb != srcFb->Attachment[BUFFER_STENCIL].Renderbuffer and same for
  dst.  Other than that, looks good.
 
 And remove the _mesa_is_format_packed_depth_stencil() calls, right?
 If Att[BUFFER_DEPTH] == Att[BUFFER_STENCIL] we clearly have a combined
 depth+stencil buffer.

Yeah.




Re: [Mesa-dev] [PATCH 21/22] swrast: stop using _DepthBuffer in triangle code

2011-12-22 Thread Brian Paul
On Wed, Dec 21, 2011 at 2:16 PM, Eric Anholt e...@anholt.net wrote:
 On Sun, 18 Dec 2011 20:08:26 -0700, Brian Paul bri...@vmware.com wrote:
 The only consequence is we can only use the occlusion_zless_16_triangle()
 function with MESA_FORMAT_Z16.

 I'm not following that conclusion, probably due to ignorance of swrast
 spans code.  I would think that Z32 would still work for that other path
 -- or is span.z stored as something other than 32 bits of Z in that
 case, too?

Z32 would still work, but in practice that format of depth buffer is
seldom used with swrast.  Z16 is the default.  I was tempted to remove
the function entirely.

-Brian


Re: [Mesa-dev] [PATCH 1/2] meta: Disable GL_TEXTURE_EXTERNAL_OES in meta_begin()

2011-12-22 Thread Brian Paul
On Wed, Dec 21, 2011 at 7:34 PM, Chad Versace
chad.vers...@linux.intel.com wrote:
 If the meta flag MESA_META_TEXTURE is present, then disable the texture
 target GL_TEXTURE_EXTERNAL_OES.

 Signed-off-by: Chad Versace chad.vers...@linux.intel.com
 ---
  src/mesa/drivers/common/meta.c |    2 ++
  1 files changed, 2 insertions(+), 0 deletions(-)

 diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
 index c5c59eb..5673205 100644
 --- a/src/mesa/drivers/common/meta.c
 +++ b/src/mesa/drivers/common/meta.c
 @@ -576,6 +576,8 @@ _mesa_meta_begin(struct gl_context *ctx, GLbitfield state)
                _mesa_set_enable(ctx, GL_TEXTURE_CUBE_MAP, GL_FALSE);
             if (ctx->Extensions.NV_texture_rectangle)
                _mesa_set_enable(ctx, GL_TEXTURE_RECTANGLE, GL_FALSE);
 +            if (ctx->Extensions.OES_EGL_image_external)
 +               _mesa_set_enable(ctx, GL_TEXTURE_EXTERNAL_OES, GL_FALSE);
             _mesa_set_enable(ctx, GL_TEXTURE_GEN_S, GL_FALSE);
             _mesa_set_enable(ctx, GL_TEXTURE_GEN_T, GL_FALSE);
             _mesa_set_enable(ctx, GL_TEXTURE_GEN_R, GL_FALSE);


Reviewed-by: Brian Paul bri...@vmare.com


[Mesa-dev] [PATCH] swrast: stop using depth/stencil wrappers in CopyPixels code

2011-12-22 Thread Brian Paul
From: Brian Paul bri...@vmware.com

The functions that read depth/stencil values understand all (packed)
depth/stencil buffer formats now so there's no reason to use the
wrappers.

Also, improve the format checks in fast_copy_pixels() to catch mismatched
depth/stencil cases.

v2: fix the test for combined depth+stencil buffers, per Eric.
---
 src/mesa/swrast/s_copypix.c |   29 +
 1 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/src/mesa/swrast/s_copypix.c b/src/mesa/swrast/s_copypix.c
index 9769a47..907645e 100644
--- a/src/mesa/swrast/s_copypix.c
+++ b/src/mesa/swrast/s_copypix.c
@@ -245,7 +245,7 @@ copy_depth_pixels( struct gl_context *ctx, GLint srcx, GLint srcy,
GLint destx, GLint desty )
 {
struct gl_framebuffer *fb = ctx->ReadBuffer;
-   struct gl_renderbuffer *readRb = fb->_DepthBuffer;
+   struct gl_renderbuffer *readRb = fb->Attachment[BUFFER_DEPTH].Renderbuffer;
GLfloat *p, *tmpImage;
GLint sy, dy, stepy;
GLint j;
@@ -339,7 +339,7 @@ copy_stencil_pixels( struct gl_context *ctx, GLint srcx, GLint srcy,
  GLint destx, GLint desty )
 {
struct gl_framebuffer *fb = ctx->ReadBuffer;
-   struct gl_renderbuffer *rb = fb->_StencilBuffer;
+   struct gl_renderbuffer *rb = fb->Attachment[BUFFER_STENCIL].Renderbuffer;
GLint sy, dy, stepy;
GLint j;
GLubyte *p, *tmpImage;
@@ -446,7 +446,7 @@ copy_depth_stencil_pixels(struct gl_context *ctx,
 
depthDrawRb = ctx->DrawBuffer->_DepthBuffer;
depthReadRb = ctx->ReadBuffer->_DepthBuffer;
-   stencilReadRb = ctx->ReadBuffer->_StencilBuffer;
+   stencilReadRb = ctx->ReadBuffer->Attachment[BUFFER_STENCIL].Renderbuffer;
 
ASSERT(depthDrawRb);
ASSERT(depthReadRb);
@@ -599,7 +599,7 @@ copy_depth_stencil_pixels(struct gl_context *ctx,
 
 
 /**
- * Try to do a fast copy pixels.
+ * Try to do a fast copy pixels with memcpy.
  * \return GL_TRUE if successful, GL_FALSE otherwise.
  */
 static GLboolean
@@ -630,12 +630,12 @@ fast_copy_pixels(struct gl_context *ctx,
   dstRb = dstFb->_ColorDrawBuffers[0];
}
else if (type == GL_STENCIL) {
-  srcRb = srcFb->_StencilBuffer;
-  dstRb = dstFb->_StencilBuffer;
+  srcRb = srcFb->Attachment[BUFFER_STENCIL].Renderbuffer;
+  dstRb = dstFb->Attachment[BUFFER_STENCIL].Renderbuffer;
}
else if (type == GL_DEPTH) {
-  srcRb = srcFb->_DepthBuffer;
-  dstRb = dstFb->_DepthBuffer;
+  srcRb = srcFb->Attachment[BUFFER_DEPTH].Renderbuffer;
+  dstRb = dstFb->Attachment[BUFFER_DEPTH].Renderbuffer;
}
else {
   ASSERT(type == GL_DEPTH_STENCIL_EXT);
@@ -649,6 +649,19 @@ fast_copy_pixels(struct gl_context *ctx,
   return GL_FALSE;
}
 
+   if (type == GL_STENCIL || type == GL_DEPTH_COMPONENT) {
+  /* can't handle packed depth+stencil here */
+  if (_mesa_is_format_packed_depth_stencil(srcRb->Format) ||
+  _mesa_is_format_packed_depth_stencil(dstRb->Format))
+ return GL_FALSE;
+   }
+   else if (type == GL_DEPTH_STENCIL) {
+  /* can't handle separate depth/stencil buffers */
+  if (srcRb != srcFb->Attachment[BUFFER_STENCIL].Renderbuffer ||
+  dstRb != dstFb->Attachment[BUFFER_STENCIL].Renderbuffer)
+ return GL_FALSE;
+   }
+
/* clipping not supported */
if (srcX < 0 || srcX + width > (GLint) srcFb->Width ||
srcY < 0 || srcY + height > (GLint) srcFb->Height ||
-- 
1.7.1



[Mesa-dev] [PATCH] swrast: new fast_draw_depth_stencil() for glDrawPixels(GL_DEPTH_STENCIL)

2011-12-22 Thread Brian Paul
From: Brian Paul bri...@vmware.com

Stop using deprecated renderbuffer PutRow() function.  Note that we
aren't using Map/UnmapRenderbuffer() yet because this call is inside
a swrast_render_start/finish() pair.

v2: use _mesa_pack_uint_24_8_depth_stencil_row(), per Eric.
---
 src/mesa/swrast/s_drawpix.c |   56 ++
 1 files changed, 40 insertions(+), 16 deletions(-)

diff --git a/src/mesa/swrast/s_drawpix.c b/src/mesa/swrast/s_drawpix.c
index 4a661a0..e9136d5 100644
--- a/src/mesa/swrast/s_drawpix.c
+++ b/src/mesa/swrast/s_drawpix.c
@@ -551,6 +551,41 @@ draw_rgba_pixels( struct gl_context *ctx, GLint x, GLint y,
 
 
 /**
+ * Draw depth+stencil values into a MESA_FORMAT_Z24_S8 or MESA_FORMAT_S8_Z24
+ * renderbuffer.  No masking, zooming, scaling, etc.
+ */
+static void
+fast_draw_depth_stencil(struct gl_context *ctx, GLint x, GLint y,
+GLsizei width, GLsizei height,
+const struct gl_pixelstore_attrib *unpack,
+const GLvoid *pixels)
+{
+   const GLenum format = GL_DEPTH_STENCIL_EXT;
+   const GLenum type = GL_UNSIGNED_INT_24_8;
+   struct gl_renderbuffer *rb =
+  ctx->DrawBuffer->Attachment[BUFFER_DEPTH].Renderbuffer;
+   GLubyte *src, *dst;
+   GLint srcRowStride, dstRowStride;
+   GLint i;
+
+   src = _mesa_image_address2d(unpack, pixels, width, height,
+   format, type, 0, 0);
+   srcRowStride = _mesa_image_row_stride(unpack, width, format, type);
+
+   dst = _swrast_pixel_address(rb, x, y);
+   dstRowStride = rb->RowStride * 4;
+
+   for (i = 0; i < height; i++) {
+  _mesa_pack_uint_24_8_depth_stencil_row(rb->Format, width,
+ (const GLuint *) src, dst);
+  dst += dstRowStride;
+  src += srcRowStride;
+   }
+}
+
+
+
+/**
  * This is a bit different from drawing GL_DEPTH_COMPONENT pixels.
  * The only per-pixel operations that apply are depth scale/bias,
  * stencil offset/shift, GL_DEPTH_WRITEMASK and GL_STENCIL_WRITEMASK,
@@ -587,27 +622,16 @@ draw_depth_stencil_pixels(struct gl_context *ctx, GLint x, GLint y,
ASSERT(depthRb);
ASSERT(stencilRb);
 
-   if (depthRb->_BaseFormat == GL_DEPTH_STENCIL_EXT &&
-   depthRb->Format == MESA_FORMAT_Z24_S8 &&
+   if (depthRb == stencilRb &&
+   (depthRb->Format == MESA_FORMAT_Z24_S8 ||
+depthRb->Format == MESA_FORMAT_S8_Z24) &&
type == GL_UNSIGNED_INT_24_8 &&
-   depthRb == stencilRb &&
-   depthRb->GetRow &&  /* May be null if depthRb is a wrapper around
-   * separate depth and stencil buffers. */
!scaleOrBias &&
!zoom &&
ctx->Depth.Mask &&
(stencilMask & 0xff) == 0xff) {
-  /* This is the ideal case.
-   * Drawing GL_DEPTH_STENCIL pixels into a combined depth/stencil buffer.
-   * Plus, no pixel transfer ops, zooming, or masking needed.
-   */
-  GLint i;
-  for (i = 0; i < height; i++) {
- const GLuint *src = (const GLuint *)
-_mesa_image_address2d(clippedUnpack, pixels, width, height,
-  GL_DEPTH_STENCIL_EXT, type, i, 0);
- depthRb->PutRow(ctx, depthRb, width, x, y + i, src, NULL);
-  }
+  fast_draw_depth_stencil(ctx, x, y, width, height,
+  clippedUnpack, pixels);
}
else {
   /* sub-optimal cases:
-- 
1.7.1
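
For readers unfamiliar with the two packed layouts, here is a rough standalone sketch of the per-row conversion that the fast path relies on. It is not the Mesa implementation of _mesa_pack_uint_24_8_depth_stencil_row(); the enum and function names are hypothetical, and the bit layouts are assumptions: GL_UNSIGNED_INT_24_8 and the Z24_S8 format keep depth in the upper 24 bits and stencil in the lower 8, while S8_Z24 keeps stencil in the upper 8 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two packed formats named in the patch. */
enum toy_ds_format { TOY_FORMAT_Z24_S8, TOY_FORMAT_S8_Z24 };

/* Convert one row of 24/8 depth+stencil values (depth in the upper 24 bits,
 * stencil in the lower 8) into the assumed destination layout. */
void
toy_pack_uint_24_8_row(enum toy_ds_format format, size_t n,
                       const uint32_t *src, uint32_t *dst)
{
   size_t i;

   assert(src && dst);
   for (i = 0; i < n; i++) {
      if (format == TOY_FORMAT_Z24_S8)
         dst[i] = src[i];                          /* layouts already match */
      else
         dst[i] = (src[i] << 24) | (src[i] >> 8);  /* swap the two fields */
   }
}
```

With no masking, zooming or scale/bias in play, each row needs no more work than this, which is why the patch can drop the per-row PutRow() call.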



Re: [Mesa-dev] [PATCH 6/8] i965: get the jmp distance by instruction index

2011-12-22 Thread Eric Anholt
On Thu, 22 Dec 2011 10:18:05 +0800, Yuanhan Liu yuanhan@linux.intel.com wrote:
 On Wed, Dec 21, 2011 at 05:57:35AM -0800, Eric Anholt wrote:
  On Wed, 21 Dec 2011 17:33:41 +0800, Yuanhan Liu yuanhan@linux.intel.com wrote:
   If dynamic instruction store size is enabled, while after the brw_JMPI()
   and before the brw_land_fwd_jump() function, the eu instruction store
   base address(p->store) may change. Thus, the safe way to reference the
   jmp instruction is by index instead of by the instruction address.
  
  Our other instructions return the instruction pointer, I don't think
  jmpi should be special in that respect.
 
 Right. Fixed and how about the following patch?

Reviewed-by: Eric Anholt e...@anholt.net
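
The hazard discussed in this thread can be shown with a toy instruction store that grows with realloc(). Everything below is hypothetical (toy_* names, the jump_count meaning) and much simpler than the real brw_eu code. The emit function still returns a pointer, as Eric asks, but the caller turns it into an index before emitting anything else and re-derives the pointer when it lands the jump.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_insn { int opcode; int jump_count; };

struct toy_compile {
   struct toy_insn *store;   /* base address, may move on growth */
   size_t nr_insn;
   size_t store_size;
};

/* Append an instruction, growing the store when it is full.  realloc() may
 * move the array, so pointers returned by earlier calls can go stale. */
struct toy_insn *
toy_next_insn(struct toy_compile *p, int opcode)
{
   if (p->nr_insn == p->store_size) {
      size_t new_size = p->store_size ? p->store_size * 2 : 4;
      struct toy_insn *grown = realloc(p->store, new_size * sizeof(*grown));
      if (!grown)
         return NULL;
      p->store = grown;
      p->store_size = new_size;
   }
   p->store[p->nr_insn].opcode = opcode;
   p->store[p->nr_insn].jump_count = 0;
   return &p->store[p->nr_insn++];
}

/* Land a forward jump recorded by index; the pointer is looked up again
 * from the current base address.  jump_count here is simply the number of
 * instructions emitted after the jump. */
void
toy_land_fwd_jump(struct toy_compile *p, size_t jmp_index)
{
   struct toy_insn *jmp;

   assert(jmp_index < p->nr_insn);
   jmp = &p->store[jmp_index];
   jmp->jump_count = (int) (p->nr_insn - jmp_index - 1);
}
```

A caller would compute the index as (jmp - p->store) immediately after the emit call, while the returned pointer is still known to be valid.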




Re: [Mesa-dev] [PATCH] i965: Don't use BRW_DEPTHFORMAT_D24_UNORM_X8_UINT on Gen4.

2011-12-22 Thread Eric Anholt
On Wed, 21 Dec 2011 16:36:45 -0800, Kenneth Graunke kenn...@whitecape.org wrote:
 X8 depth formats weren't supported until Ironlake (Gen 5).
 
 Fixes GPU hangs introduced in d84a180417d1eabd680554970f1eaaa93abcd41e.
 One example test case was fbo-missing-attachment-blit from.
 
 Signed-off-by: Kenneth Graunke kenn...@whitecape.org

Reviewed-by: Eric Anholt e...@anholt.net




Re: [Mesa-dev] [PATCH 0/5] i965 gen6: Fix interactions between transform feedback and meta-ops.

2011-12-22 Thread Eric Anholt
On Wed, 21 Dec 2011 13:39:05 -0800, Paul Berry stereotype...@gmail.com wrote:
 This patch series ensures that meta-ops (such as glClear or
 glGenerateMipmapEXT) function properly when transform feedback or
 rasterizer discard is enabled.
 
 Most of the code changes necessary to make this work are in core mesa:
 patches 1/5 and 5/5 ensure that meta ops properly pause transform
 feedback and disable rasterizer discard (and restore the state
 properly when the meta op is over).  Patch 2/5 ensures that
 PauseTransformFeedback interacts properly with BeginTransformFeedback
 and EndTransformFeedback (so that there is no danger of transform
 feedback being in a paused state after a call to
 BeginTransformFeedback).  Patch 3/5 ensures that while transform
 feedback is paused, it's possible to switch programs and do drawing
 that isn't compatible with the transform feedback mode.
 
 Patch 4/5 implements transform feedback pause/resume functionality in
 the i965 driver.  We don't expose this functionality to the user yet,
 but we need it for meta ops to work correctly.

The series is

Reviewed-by: Eric Anholt e...@anholt.net
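
As a rough picture of what the series description amounts to, the following sketch saves the affected state when a meta op begins and restores it when the op ends. The structures and toy_* functions are invented for illustration and do not match Mesa's gl_context or meta.c; only the Active/Paused idea comes from the series.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily simplified state. */
struct toy_xfb_object { bool Active; bool Paused; };

struct toy_context {
   struct toy_xfb_object xfb;
   bool rasterizer_discard;
};

struct toy_meta_save {
   bool xfb_was_paused;
   bool rasterizer_discard;
};

/* Entering a meta op: remember the state, then pause transform feedback
 * and turn rasterizer discard off so the internal draw behaves normally. */
void
toy_meta_begin(struct toy_context *ctx, struct toy_meta_save *save)
{
   save->xfb_was_paused = ctx->xfb.Paused;
   save->rasterizer_discard = ctx->rasterizer_discard;
   if (ctx->xfb.Active)
      ctx->xfb.Paused = true;
   ctx->rasterizer_discard = false;
}

/* Leaving a meta op: put everything back the way the application had it. */
void
toy_meta_end(struct toy_context *ctx, const struct toy_meta_save *save)
{
   if (ctx->xfb.Active)
      ctx->xfb.Paused = save->xfb_was_paused;
   ctx->rasterizer_discard = save->rasterizer_discard;
}

/* The driver-side test: record only when active and not paused. */
bool
toy_xfb_is_recording(const struct toy_context *ctx)
{
   return ctx->xfb.Active && !ctx->xfb.Paused;
}
```

The last helper mirrors the "Active and not Paused" condition that patch 4/5 adds in the i965 driver.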




Re: [Mesa-dev] Dropping multiple driver support in EGL?

2011-12-22 Thread Kenneth Graunke
On 12/22/2011 09:01 AM, Chia-I Wu wrote:
 Hi list,
 
 Multiple driver support in EGL is hard to get right, if not impossible.
 On Linux desktop, we almost always want to use egl_dri2.  It allows EGL
 to load DRI2 drivers, thus allowing it to share DRI2 drivers with
 libGL.
 
 In one case where the app wants to use OpenVG, libEGL needs to load
 egl_gallium instead.  The problem comes from that we cannot know that an
 OpenVG context is to be created until it is created.  But before a
 context can be created, EGL needs to be initialized and an EGLConfig
 needs to be chosen.  So when EGL is to be initialized, we need to load
 and initialize all EGL drivers.  When an EGLConfig is to be picked, we
 need to pick it from all drivers.
 
 But this also introduces new problems.   For example, when the vendor
 string or the extension string is queried, whose string of all EGL
 drivers should be returned?
 
 My proposal is to simply drop multiple driver support from EGL.
 Instead, we will provide four libEGL implementations:
 
  - libEGL_dri2: derived from egl_dri2
  - libEGL_gallium: derived from egl_gallium
  - libEGL_glx: derived from egl_glx
  - libEGL_loader: see below

Somewhat tangentially...what is the advantage of egl_glx?  Does anybody
use it?   Why?  Is it being tested?

I'm mostly curious, as I've always used egl_dri2.

 All of them are conformant EGL implementations.  That is, any one of
 them can be installed as /usr/lib/libEGL.so.
 
 libEGL_loader is new.  It is basically a wrapper that loads another
 implementation to do the real work.  As such, the problems we face with
 multiple driver support will remain in libEGL_loader.
 
 Distros may choose to install libEGL_loader as libEGL and let it pick
 the real implementation.  Or they may choose to have the first three
 installed as libEGL, and package them separately.
 
 Thoughts?



Re: [Mesa-dev] [PATCH 0/5] i965 gen6: Fix interactions between transform feedback and meta-ops.

2011-12-22 Thread Kenneth Graunke
On 12/21/2011 01:39 PM, Paul Berry wrote:
 This patch series ensures that meta-ops (such as glClear or
 glGenerateMipmapEXT) function properly when transform feedback or
 rasterizer discard is enabled.
 
 Most of the code changes necessary to make this work are in core mesa:
 patches 1/5 and 5/5 ensure that meta ops properly pause transform
 feedback and disable rasterizer discard (and restore the state
 properly when the meta op is over).  Patch 2/5 ensures that
 PauseTransformFeedback interacts properly with BeginTransformFeedback
 and EndTransformFeedback (so that there is no danger of transform
 feedback being in a paused state after a call to
 BeginTransformFeedback).  Patch 3/5 ensures that while transform
 feedback is paused, it's possible to switch programs and do drawing
 that isn't compatible with the transform feedback mode.
 
 Patch 4/5 implements transform feedback pause/resume functionality in
 the i965 driver.  We don't expose this functionality to the user yet,
 but we need it for meta ops to work correctly.
 
 [PATCH 1/5] mesa: Save and restore GL_RASTERIZER_DISCARD state during meta ops.
 [PATCH 2/5] mesa: Ensure that Paused is reset to false on EndTransformFeedback.
 [PATCH 3/5] mesa: Disable certain error checks when transform feedback is paused
 [PATCH 4/5] i965 gen6: Implement transform feedback pause/resume functionality.
 [PATCH 5/5] mesa: Pause transform feedback during meta ops.

Patches 1-3 and 5 are
Reviewed-by: Kenneth Graunke kenn...@whitecape.org

I have questions about patch 4.


Re: [Mesa-dev] [PATCH 4/5] i965 gen6: Implement transform feedback pause/resume functionality.

2011-12-22 Thread Kenneth Graunke
On 12/21/2011 01:39 PM, Paul Berry wrote:
 Although i965 gen6 does not yet support ARB_transform_feedback2 or
 NV_transform_feedback2, it needs to support pause/resume functionality
 so that meta-ops will work correctly.
 ---
  src/mesa/drivers/dri/i965/brw_draw.c |3 ++-
  src/mesa/drivers/dri/i965/brw_gs.c   |3 ++-
  src/mesa/drivers/dri/i965/gen6_sol.c |3 ++-
  3 files changed, 6 insertions(+), 3 deletions(-)
 
 diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c
 index 082bb9a..93f27d7 100644
 --- a/src/mesa/drivers/dri/i965/brw_draw.c
 +++ b/src/mesa/drivers/dri/i965/brw_draw.c
 @@ -389,7 +389,8 @@ brw_update_primitive_count(struct brw_context *brw,
  {
 uint32_t count = count_tessellated_primitives(prim);
 brw->sol.primitives_generated += count;
 -   if (brw->intel.ctx.TransformFeedback.CurrentObject->Active) {
 +   if (brw->intel.ctx.TransformFeedback.CurrentObject->Active &&
 +   !brw->intel.ctx.TransformFeedback.CurrentObject->Paused) {
/* Update brw->sol.svbi_0_max_index to reflect the amount by which the
 * hardware is going to increment SVBI 0 when this drawing operation
 * occurs.  This is necessary because the kernel does not (yet) save and
 diff --git a/src/mesa/drivers/dri/i965/brw_gs.c b/src/mesa/drivers/dri/i965/brw_gs.c
 index 886bf98..850d7b4 100644
 --- a/src/mesa/drivers/dri/i965/brw_gs.c
 +++ b/src/mesa/drivers/dri/i965/brw_gs.c
 @@ -183,7 +183,8 @@ static void populate_key( struct brw_context *brw,
 } else if (intel->gen == 6) {
/* On Gen6, GS is used for transform feedback. */
/* _NEW_TRANSFORM_FEEDBACK */
 -  if (ctx->TransformFeedback.CurrentObject->Active) {
 +  if (ctx->TransformFeedback.CurrentObject->Active &&
 +  !ctx->TransformFeedback.CurrentObject->Paused) {
   const struct gl_shader_program *shaderprog =
  ctx->Shader.CurrentVertexProgram;
   const struct gl_transform_feedback_info *linked_xfb_info =

Nevermind, I answered my own question.  I was wondering if Paused needed
to be in the key, and how you got updates about it changing.  But no,
putting it in the key would be bizarre, and the _NEW_TRANSFORM_FEEDBACK
dirty bit covers this.

For the whole series:
Reviewed-by: Kenneth Graunke kenn...@whitecape.org

 diff --git a/src/mesa/drivers/dri/i965/gen6_sol.c b/src/mesa/drivers/dri/i965/gen6_sol.c
 index 5d11481..32f56d3 100644
 --- a/src/mesa/drivers/dri/i965/gen6_sol.c
 +++ b/src/mesa/drivers/dri/i965/gen6_sol.c
 @@ -47,7 +47,8 @@ gen6_update_sol_surfaces(struct brw_context *brw)
  
 for (i = 0; i < BRW_MAX_SOL_BINDINGS; ++i) {
const int surf_index = SURF_INDEX_SOL_BINDING(i);
 -  if (xfb_obj->Active && i < linked_xfb_info->NumOutputs) {
 +  if (xfb_obj->Active && !xfb_obj->Paused &&
 +  i < linked_xfb_info->NumOutputs) {
   unsigned buffer = linked_xfb_info->Outputs[i].OutputBuffer;
   unsigned buffer_offset =
  xfb_obj->Offset[buffer] / 4 +



Re: [Mesa-dev] [PATCH 7/8] i965: call next_insn() before referencing a instruction by index

2011-12-22 Thread Kenneth Graunke
On 12/21/2011 01:33 AM, Yuanhan Liu wrote:
[snip]
 +   int emit_endif = 1;

Please use bool and true/false rather than int.

 /* In single program flow mode, we can express IF and ELSE instructions
  * equivalently as ADD instructions that operate on IP.  On platforms prior
 @@ -1219,14 +1211,32 @@ brw_ENDIF(struct brw_compile *p)
  * instructions to conditional ADDs.  So we only do this trick on Gen4 and
  * Gen5.
  */
 -   if (intel->gen < 6 && p->single_program_flow) {
 +   if (intel->gen < 6 && p->single_program_flow)
 +  emit_endif = 0;

You could actually just do this:

/* In single program flow mode, we can express IF and ELSE ...
 */
bool emit_endif = !(intel->gen < 6 && p->single_program_flow);

But I'm fine with bool emit_endif = true and emit_endif = false if
you prefer that.

Assuming you make one of those changes, this patch is
Reviewed-by: Kenneth Graunke kenn...@whitecape.org

 +   /*
 +* A single next_insn() may change the base address of instruction store
 +* memory(p->store), so call it first before referencing the instruction
 +* store pointer from an index
 +*/
 +   if (emit_endif)
 +  insn = next_insn(p, BRW_OPCODE_ENDIF);
 +
 +   /* Pop the IF and (optional) ELSE instructions from the stack */
 +   p->if_depth_in_loop[p->loop_stack_depth]--;
 +   tmp = pop_if_stack(p);
 +   if (tmp->header.opcode == BRW_OPCODE_ELSE) {
 +  else_inst = tmp;
 +  tmp = pop_if_stack(p);
 +   }
 +   if_inst = tmp;
 +
 +   if (!emit_endif) {
/* ENDIF is useless; don't bother emitting it. */
convert_IF_ELSE_to_ADD(p, if_inst, else_inst);
return;
 }
  
 -   insn = next_insn(p, BRW_OPCODE_ENDIF);
 -
 if (intel->gen < 6) {
brw_set_dest(p, insn, retype(brw_vec4_grf(0,0), BRW_REGISTER_TYPE_UD));
brw_set_src0(p, insn, retype(brw_vec4_grf(0,0), BRW_REGISTER_TYPE_UD));
 @@ -1393,13 +1403,12 @@ struct brw_instruction *brw_WHILE(struct brw_compile *p)
 struct brw_instruction *insn, *do_insn;
 GLuint br = 1;
  
 -   do_insn = get_inner_do_insn(p);
 -
 if (intel->gen >= 5)
br = 2;
  
 if (intel->gen >= 7) {
insn = next_insn(p, BRW_OPCODE_WHILE);
 +  do_insn = get_inner_do_insn(p);
  
brw_set_dest(p, insn, retype(brw_null_reg(), BRW_REGISTER_TYPE_D));
brw_set_src0(p, insn, retype(brw_null_reg(), BRW_REGISTER_TYPE_D));
 @@ -1409,6 +1418,7 @@ struct brw_instruction *brw_WHILE(struct brw_compile *p)
insn->header.execution_size = BRW_EXECUTE_8;
 } else if (intel->gen == 6) {
insn = next_insn(p, BRW_OPCODE_WHILE);
 +  do_insn = get_inner_do_insn(p);
  
brw_set_dest(p, insn, brw_imm_w(0));
insn->bits1.branch_gen6.jump_count = br * (do_insn - insn);
 @@ -1419,6 +1429,7 @@ struct brw_instruction *brw_WHILE(struct brw_compile *p)
 } else {
if (p->single_program_flow) {
insn = next_insn(p, BRW_OPCODE_ADD);
 + do_insn = get_inner_do_insn(p);
  
brw_set_dest(p, insn, brw_ip_reg());
brw_set_src0(p, insn, brw_ip_reg());
 @@ -1426,6 +1437,7 @@ struct brw_instruction *brw_WHILE(struct brw_compile *p)
insn->header.execution_size = BRW_EXECUTE_1;
} else {
insn = next_insn(p, BRW_OPCODE_WHILE);
 + do_insn = get_inner_do_insn(p);
  
assert(do_insn->header.opcode == BRW_OPCODE_DO);
  



[Mesa-dev] [PATCH] gallivm: Close a memory leak

2011-12-22 Thread Lauri Kasanen
Hi all

This fixes a memory leak of 32 bytes on exit.

From 924f8fdccb41b011f372bc57252005bcdb096105 Mon Sep 17 00:00:00 2001
From: Lauri Kasanen cur...@operamail.com
Date: Thu, 22 Dec 2011 21:28:33 +0200
Subject: [PATCH] gallivm: Close a memory leak

As reported by valgrind --leak-check=full glxgears.

Signed-off-by: Lauri Kasanen cur...@operamail.com
---
 src/gallium/auxiliary/gallivm/lp_bld_init.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_init.c b/src/gallium/auxiliary/gallivm/lp_bld_init.c
index 45addee..503c04e 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_init.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_init.c
@@ -345,6 +345,7 @@ gallivm_remove_garbage_collector_callback(garbage_collect_callback_func func,
   if (cb->func == func && cb->cb_data == cb_data) {
  /* found, remove it */
  remove_from_list(cb);
+ FREE(cb);
  return;
   }
}
-- 
1.7.2.1
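
The pattern behind this fix, unlinking a heap-allocated node and then freeing it, can be sketched with a small circular list. All names here (toy_callback, toy_add_callback, and so on) are hypothetical and only loosely modelled on the gallivm callback list; unlinking alone is what leaves the node behind for valgrind to report.

```c
#include <assert.h>
#include <stdlib.h>

typedef void (*toy_callback_func)(void *cb_data);

struct toy_callback {
   toy_callback_func func;
   void *cb_data;
   struct toy_callback *prev, *next;
};

/* No-op callback used only as an example registration. */
void toy_noop(void *cb_data) { (void) cb_data; }

/* Circular doubly linked list with a sentinel head node. */
void toy_list_init(struct toy_callback *head)
{
   head->prev = head->next = head;
}

int toy_add_callback(struct toy_callback *head,
                     toy_callback_func func, void *cb_data)
{
   struct toy_callback *cb = malloc(sizeof(*cb));
   if (!cb)
      return 0;
   cb->func = func;
   cb->cb_data = cb_data;
   cb->next = head->next;
   cb->prev = head;
   head->next->prev = cb;
   head->next = cb;
   return 1;
}

/* Returns 1 if a matching callback was found, unlinked and freed. */
int toy_remove_callback(struct toy_callback *head,
                        toy_callback_func func, void *cb_data)
{
   struct toy_callback *cb;

   for (cb = head->next; cb != head; cb = cb->next) {
      if (cb->func == func && cb->cb_data == cb_data) {
         cb->prev->next = cb->next;   /* unlink from the list... */
         cb->next->prev = cb->prev;
         free(cb);                    /* ...and release the node itself */
         return 1;
      }
   }
   return 0;
}
```

Returning right after free() matters too: the loop must not read cb->next from freed memory.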



[Mesa-dev] [PATCH] mesa: add back glGetnUniformfv() overflow error reporting

2011-12-22 Thread nobled
The error was erroneously removed in this commit:

719909698c67c287a393d2380278e7b7495ae018
mesa: Rewrite the way uniforms are tracked and handled

You also aren't even supposed to truncate the output to 'bufSize',
so just return like before.

Also fixup an old comment and add an assert.
---
(This function has a random mixture of tabs+spaces and pure spaces for
indentation, so I had no idea which style to use...)

 src/mesa/main/uniform_query.cpp |   16 
 src/mesa/main/uniforms.c|2 +-
 2 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/src/mesa/main/uniform_query.cpp b/src/mesa/main/uniform_query.cpp
index 33ba53c..8e58fc0 100644
--- a/src/mesa/main/uniform_query.cpp
+++ b/src/mesa/main/uniform_query.cpp
@@ -203,10 +203,18 @@ _mesa_get_uniform(struct gl_context *ctx, GLuint program, GLint location,
   const union gl_constant_value *const src =
 &uni->storage[offset * elements];

-  unsigned bytes = sizeof(uni->storage[0]) * elements;
-  if (bytes > (unsigned) bufSize) {
-elements = bufSize / sizeof(uni->storage[0]);
-bytes = bufSize;
+  assert(returnType == GLSL_TYPE_FLOAT || returnType == GLSL_TYPE_INT ||
+ returnType == GLSL_TYPE_UINT);
+  /* The three (currently) supported types all have the same size,
+   * which is of course the same as their union. That'll change
+   * with glGetUniformdv()...
+   */
+  unsigned bytes = sizeof(src[0]) * elements;
+  if (bufSize < 0 || bytes > (unsigned) bufSize) {
+_mesa_error( ctx, GL_INVALID_OPERATION,
+"glGetnUniformfvARB(out of bounds: bufSize is %d,"
+ " but %u bytes are required)", bufSize, bytes );
+return;
   }

   /* If the return type and the uniform's native type are compatible,
diff --git a/src/mesa/main/uniforms.c b/src/mesa/main/uniforms.c
index 685c0f1..981874e 100644
--- a/src/mesa/main/uniforms.c
+++ b/src/mesa/main/uniforms.c
@@ -478,7 +478,7 @@ _mesa_GetnUniformdvARB(GLhandleARB program, GLint location,
(void) params;

/*
-   _mesa_get_uniform(ctx, program, location, bufSize, GL_DOUBLE, params);
+   _mesa_get_uniform(ctx, program, location, bufSize, GLSL_TYPE_DOUBLE, params);
*/
_mesa_error(ctx, GL_INVALID_OPERATION, "glGetUniformdvARB"
"(GL_ARB_gpu_shader_fp64 not implemented)");
-- 
1.7.4.1
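
The behaviour this patch restores, rejecting the query outright instead of truncating the result, looks roughly like the sketch below. toy_get_uniform() and union toy_value are hypothetical stand-ins; the real function raises GL_INVALID_OPERATION through _mesa_error() where this one just returns false.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for a uniform storage slot; all members are assumed
 * to have the same size, as the patch comment notes for the real union. */
union toy_value { float f; int i; unsigned u; };

/* Copy 'elements' values into params only if bufSize can hold all of them.
 * On failure nothing is written. */
bool
toy_get_uniform(const union toy_value *src, unsigned elements,
                int bufSize, void *params)
{
   const size_t bytes = sizeof(src[0]) * elements;

   /* Check the sign first so the cast below never sees a negative value. */
   if (bufSize < 0 || bytes > (size_t) bufSize)
      return false;

   memcpy(params, src, bytes);
   return true;
}
```

Testing bufSize < 0 before the unsigned comparison is the part the old code lacked: a negative size cast to unsigned would otherwise pass as a huge buffer.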


[Mesa-dev] [PATCH 1/3] i965: Rename BRW_NEW_WM_SURFACES to BRW_NEW_SURFACES.

2011-12-22 Thread Paul Berry
The surface states tracked by BRW_NEW_WM_SURFACES are no longer used
for just WM.  They are also used for vertex texturing and transform
feedback.  To avoid confusion, this patch renames BRW_NEW_WM_SURFACES
to BRW_NEW_SURFACES.
---
 src/mesa/drivers/dri/i965/brw_context.h  |4 ++--
 src/mesa/drivers/dri/i965/brw_state_upload.c |2 +-
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c |   12 ++--
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index 15a781b..fb41fd1 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -131,7 +131,7 @@ enum brw_state_id {
BRW_STATE_CONTEXT,
BRW_STATE_WM_INPUT_DIMENSIONS,
BRW_STATE_PSP,
-   BRW_STATE_WM_SURFACES,
+   BRW_STATE_SURFACES,
BRW_STATE_VS_BINDING_TABLE,
BRW_STATE_GS_BINDING_TABLE,
BRW_STATE_PS_BINDING_TABLE,
@@ -158,7 +158,7 @@ enum brw_state_id {
 #define BRW_NEW_CONTEXT (1 << BRW_STATE_CONTEXT)
 #define BRW_NEW_WM_INPUT_DIMENSIONS (1 << BRW_STATE_WM_INPUT_DIMENSIONS)
 #define BRW_NEW_PSP (1 << BRW_STATE_PSP)
-#define BRW_NEW_WM_SURFACES (1 << BRW_STATE_WM_SURFACES)
+#define BRW_NEW_SURFACES   (1 << BRW_STATE_SURFACES)
 #define BRW_NEW_VS_BINDING_TABLE   (1 << BRW_STATE_VS_BINDING_TABLE)
 #define BRW_NEW_GS_BINDING_TABLE   (1 << BRW_STATE_GS_BINDING_TABLE)
 #define BRW_NEW_PS_BINDING_TABLE   (1 << BRW_STATE_PS_BINDING_TABLE)
diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c b/src/mesa/drivers/dri/i965/brw_state_upload.c
index 74d01d8..a8bda5a 100644
--- a/src/mesa/drivers/dri/i965/brw_state_upload.c
+++ b/src/mesa/drivers/dri/i965/brw_state_upload.c
@@ -360,7 +360,7 @@ static struct dirty_bit_map brw_bits[] = {
DEFINE_BIT(BRW_NEW_WM_INPUT_DIMENSIONS),
DEFINE_BIT(BRW_NEW_PROGRAM_CACHE),
DEFINE_BIT(BRW_NEW_PSP),
-   DEFINE_BIT(BRW_NEW_WM_SURFACES),
+   DEFINE_BIT(BRW_NEW_SURFACES),
DEFINE_BIT(BRW_NEW_INDICES),
DEFINE_BIT(BRW_NEW_INDEX_BUFFER),
DEFINE_BIT(BRW_NEW_VERTICES),
diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 3801c09..e908430 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -828,7 +828,7 @@ brw_upload_wm_pull_constants(struct brw_context *brw)
 drm_intel_bo_unreference(brw->wm.const_bo);
 brw->wm.const_bo = NULL;
 brw->bind.surf_offset[surf_index] = 0;
-brw->state.dirty.brw |= BRW_NEW_WM_SURFACES;
+brw->state.dirty.brw |= BRW_NEW_SURFACES;
   }
   return;
}
@@ -850,7 +850,7 @@ brw_upload_wm_pull_constants(struct brw_context *brw)
   params->NumParameters,
   &brw->bind.surf_offset[surf_index]);
 
-   brw->state.dirty.brw |= BRW_NEW_WM_SURFACES;
+   brw->state.dirty.brw |= BRW_NEW_SURFACES;
 }
 
 const struct brw_tracked_state brw_wm_pull_constants = {
@@ -1004,7 +1004,7 @@ brw_update_renderbuffer_surfaces(struct brw_context *brw)
} else {
   intel->vtbl.update_null_renderbuffer_surface(brw, 0);
}
-   brw->state.dirty.brw |= BRW_NEW_WM_SURFACES;
+   brw->state.dirty.brw |= BRW_NEW_SURFACES;
 }
 
 const struct brw_tracked_state brw_renderbuffer_surfaces = {
@@ -1046,7 +1046,7 @@ brw_update_texture_surfaces(struct brw_context *brw)
   }
}
 
-   brw->state.dirty.brw |= BRW_NEW_WM_SURFACES;
+   brw->state.dirty.brw |= BRW_NEW_SURFACES;
 }
 
 const struct brw_tracked_state brw_texture_surfaces = {
@@ -1075,7 +1075,7 @@ brw_upload_binding_table(struct brw_context *brw)
  sizeof(uint32_t) * BRW_MAX_SURFACES,
  32, &brw->bind.bo_offset);
 
-   /* BRW_NEW_WM_SURFACES and BRW_NEW_VS_CONSTBUF */
+   /* BRW_NEW_SURFACES and BRW_NEW_VS_CONSTBUF */
for (i = 0; i < BRW_MAX_SURFACES; i++) {
   bind[i] = brw->bind.surf_offset[i];
}
@@ -1089,7 +1089,7 @@ const struct brw_tracked_state brw_binding_table = {
   .mesa = 0,
   .brw = (BRW_NEW_BATCH |
  BRW_NEW_VS_CONSTBUF |
- BRW_NEW_WM_SURFACES),
+ BRW_NEW_SURFACES),
   .cache = 0
},
.emit = brw_upload_binding_table,
-- 
1.7.6.4
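
For context on what the renamed flag does, here is a simplified model of dirty-bit tracking. It is a sketch only: the TOY_* identifiers are invented, and the real driver walks a table of brw_tracked_state atoms rather than calling one function. Each state id gets one bit, each atom lists the bits it depends on, and an atom is re-emitted only when one of those bits is set.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical state ids; each one maps to a single dirty bit. */
enum toy_state_id {
   TOY_STATE_CONTEXT,
   TOY_STATE_SURFACES,
   TOY_STATE_VS_CONSTBUF,
   TOY_STATE_BATCH
};

#define TOY_NEW_CONTEXT     (1u << TOY_STATE_CONTEXT)
#define TOY_NEW_SURFACES    (1u << TOY_STATE_SURFACES)
#define TOY_NEW_VS_CONSTBUF (1u << TOY_STATE_VS_CONSTBUF)
#define TOY_NEW_BATCH       (1u << TOY_STATE_BATCH)

struct toy_tracked_state {
   uint32_t dirty_deps;   /* flags this atom depends on */
   int emit_count;        /* how many times it has been re-emitted */
};

/* Re-emit the atom only if one of its dependencies is flagged dirty.
 * Returns 1 when the atom was emitted, 0 when it was skipped. */
int
toy_upload_state(struct toy_tracked_state *atom, uint32_t dirty)
{
   if ((atom->dirty_deps & dirty) == 0)
      return 0;
   atom->emit_count++;
   return 1;
}
```

In this model, code that rewrites surface offsets has to OR TOY_NEW_SURFACES into the dirty mask, otherwise the binding-table atom is skipped and the new entries never reach the hardware.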



[Mesa-dev] [PATCH 2/3] i965 gen6: Resend binding table pointer after updating SOL bindings.

2011-12-22 Thread Paul Berry
After creating new binding table entries for transform feedback, we
need to set the dirty flag BRW_NEW_SURFACES, so that a new binding
table pointer will be sent to the hardware.  Otherwise the new binding
table entries will not take effect.
---
 src/mesa/drivers/dri/i965/gen6_sol.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen6_sol.c b/src/mesa/drivers/dri/i965/gen6_sol.c
index 32f56d3..437b3ae 100644
--- a/src/mesa/drivers/dri/i965/gen6_sol.c
+++ b/src/mesa/drivers/dri/i965/gen6_sol.c
@@ -61,6 +61,8 @@ gen6_update_sol_surfaces(struct brw_context *brw)
  brw->bind.surf_offset[surf_index] = 0;
   }
}
+
+   brw->state.dirty.brw |= BRW_NEW_SURFACES;
 }
 
 const struct brw_tracked_state gen6_sol_surface = {
-- 
1.7.6.4
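
The dirty-flag mechanism this patch relies on can be pictured with a small stand-alone sketch. It is not the i965 code: `toy_context`, `TOY_NEW_SURFACES`, `toy_update_sol_surfaces()` and `toy_upload_binding_table()` are hypothetical names, and clearing the bits inside the consumer is a simplification. The point is only that a consumer re-emits its state when a producer has set a flag it listens for.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical dirty bits, standing in for the BRW_NEW_* flags. */
#define TOY_NEW_BATCH    (1u << 0)
#define TOY_NEW_SURFACES (1u << 1)

#define TOY_MAX_SURFACES 4

struct toy_context {
   uint32_t dirty;                          /* bits set by producers */
   uint32_t surf_offset[TOY_MAX_SURFACES];  /* per-surface state offsets */
   uint32_t bind[TOY_MAX_SURFACES];         /* last uploaded binding table */
   int binding_table_uploads;               /* number of re-uploads so far */
};

/* Producer: rewrite one binding table entry and flag the change. */
static void
toy_update_sol_surfaces(struct toy_context *ctx, unsigned index, uint32_t offset)
{
   assert(index < TOY_MAX_SURFACES);
   ctx->surf_offset[index] = offset;
   /* Without this line the consumer below would never notice the update. */
   ctx->dirty |= TOY_NEW_SURFACES;
}

/* Consumer: re-upload the binding table only if a bit it listens for is set. */
static void
toy_upload_binding_table(struct toy_context *ctx)
{
   const uint32_t listens_for = TOY_NEW_BATCH | TOY_NEW_SURFACES;
   unsigned i;

   if (!(ctx->dirty & listens_for))
      return;

   for (i = 0; i < TOY_MAX_SURFACES; i++)
      ctx->bind[i] = ctx->surf_offset[i];
   ctx->binding_table_uploads++;
   ctx->dirty &= ~listens_for;
}
```

If the producer forgets to set the flag, the second function returns early and the stale table stays in effect, which matches the symptom described in the commit message.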



[Mesa-dev] [PATCH 3/3] i965 Gen6+: Invalidate VF address-based cache on flush

2011-12-22 Thread Paul Berry
Although there is not much documentation of this fact, there are in
fact two separate VF caches:

- an index-based cache (described in the Sandy Bridge PRM, vol 2
  part 1, section 2.1.2 Vertex Cache).  This cache stores URB
  handles of vertex shader outputs; its purpose is to avoid redundant
  invocations of the vertex shader when drawing in random access mode
  (e.g. glDrawElements()), and the same vertex index is specified
  multiple times.  It is automatically invalidated between
  3D_PRIMITIVE commands and between instances within a single
  3D_PRIMITIVE command.

- an address-based cache (mentioned briefly in vol 2 part 1, section
  1.7.4 PIPE_CONTROL Command).  This cache stores the data read from
  vertex buffers; its purpose is to avoid redundant memory accesses
  when doing instanced drawing or when multiple 3D_PRIMITIVE commands
  access the same vertex data.  It needs to be manually invalidated
  whenever new data is written to a buffer that is used for vertex
  data.

Previous to this patch, it was not necessary for Mesa to explicitly
invalidate the address-based cache, because there were no reasonable
use cases in which the GPU would write to a vertex data buffer during
a batch, and inter-batch flushing was taken care of by the kernel.

However, with transform feedback, there is now a reasonable use case:
vertex data is written to a buffer using transform feedback, and then
that data is immediately re-used as vertex input in the next drawing
operation.  To make this use case work, we need to flush the
address-based VF cache between transform feedback and the next draw
operation.  Since we are already calling
intel_batchbuffer_emit_mi_flush() when transform feedback completes,
and intel_batchbuffer_emit_mi_flush() is intended to invalidate all
caches, it seems reasonable to add VF cache invalidation to this
function.

As with commit 63cf7fad13fc9cfdd2ae7b031426f79107000300 (i965: Flush
pipeline on EndTransformFeedback), this is not an ideal solution.  It
would be preferable to only invalidate the VF cache if the next draw
call was about to consume data generated by a previous draw call in
the same batch.  However, since we don't have the necessary dependency
tracking infrastructure to figure that out right now, we have to
overzealously invalidate the cache.

Fixes Piglit test EXT_transform_feedback/immediate-reuse.
---
 src/mesa/drivers/dri/intel/intel_batchbuffer.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_batchbuffer.c 
b/src/mesa/drivers/dri/intel/intel_batchbuffer.c
index 4ff098a..cb23dbc 100644
--- a/src/mesa/drivers/dri/intel/intel_batchbuffer.c
+++ b/src/mesa/drivers/dri/intel/intel_batchbuffer.c
@@ -460,6 +460,7 @@ intel_batchbuffer_emit_mi_flush(struct intel_context *intel)
 OUT_BATCH(PIPE_CONTROL_INSTRUCTION_FLUSH |
   PIPE_CONTROL_WRITE_FLUSH |
   PIPE_CONTROL_DEPTH_CACHE_FLUSH |
+   PIPE_CONTROL_VF_CACHE_INVALIDATE |
   PIPE_CONTROL_TC_FLUSH |
   PIPE_CONTROL_NO_WRITE |
PIPE_CONTROL_CS_STALL);
-- 
1.7.6.4
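
The "preferable" scheme described in the commit message, invalidating only when a draw is about to read a buffer written earlier in the same batch, might look roughly like the following. This is a sketch under assumptions, not proposed driver code: `toy_batch`, `toy_note_gpu_write()`, `toy_draw_needs_vf_invalidate()` and `toy_reset_writes()` are hypothetical, buffers are identified by plain integer handles, and the fixed-size list is purely illustrative.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_WRITTEN 16

/* Hypothetical per-batch record of buffers the GPU has written so far. */
struct toy_batch {
   int written[TOY_MAX_WRITTEN];
   int num_written;
};

/* Called when, e.g., transform feedback writes to a buffer in this batch. */
static void
toy_note_gpu_write(struct toy_batch *batch, int buffer_handle)
{
   assert(batch->num_written < TOY_MAX_WRITTEN);
   batch->written[batch->num_written++] = buffer_handle;
}

/* Called before a draw: true if any vertex buffer was written in this batch,
 * in which case stale data may sit in the address-based VF cache.
 */
static bool
toy_draw_needs_vf_invalidate(const struct toy_batch *batch,
                             const int *vertex_buffers, int num_vertex_buffers)
{
   int i, j;

   for (i = 0; i < num_vertex_buffers; i++) {
      for (j = 0; j < batch->num_written; j++) {
         if (vertex_buffers[i] == batch->written[j])
            return true;
      }
   }
   return false;
}

/* Called once the invalidate has been emitted, or when a new batch starts. */
static void
toy_reset_writes(struct toy_batch *batch)
{
   batch->num_written = 0;
}
```

The unconditional invalidate in the patch trades this bookkeeping for a possibly redundant cache flush.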



Re: [Mesa-dev] [PATCH 8/8] i965: increase the brw eu instruction store size dynamically

2011-12-22 Thread Kenneth Graunke
On 12/21/2011 01:33 AM, Yuanhan Liu wrote:
 Here is the final patch to enable dynamic eu instruction store size:
 increase the brw eu instruction store size dynamically instead of just
 allocating it statically with a constant limit. This would fix something
 that 'GL_MAX_PROGRAM_INSTRUCTIONS_ARB was 16384 while the driver would
 limit it to 10000'.
 
 Signed-off-by: Yuanhan Liu yuanhan@linux.intel.com
 ---
  src/mesa/drivers/dri/i965/brw_eu.c  |7 +++
  src/mesa/drivers/dri/i965/brw_eu.h  |7 ---
  src/mesa/drivers/dri/i965/brw_eu_emit.c |   12 +++-
  3 files changed, 22 insertions(+), 4 deletions(-)
 
 diff --git a/src/mesa/drivers/dri/i965/brw_eu.c 
 b/src/mesa/drivers/dri/i965/brw_eu.c
 index 9b4dde8..7d206f3 100644
 --- a/src/mesa/drivers/dri/i965/brw_eu.c
 +++ b/src/mesa/drivers/dri/i965/brw_eu.c
 @@ -174,6 +174,13 @@ void
  brw_init_compile(struct brw_context *brw, struct brw_compile *p, void 
 *mem_ctx)
  {
 p->brw = brw;
 +   /*
 +* Set the initial instruction store array size to 1024, if found that
 +* isn't enough, then it will double the store size at brw_next_insn()
 +* until it meet the BRW_EU_MAX_INSN
 +*/
 +   p->store_size = 1024;
 +   p->store = rzalloc_array(mem_ctx, struct brw_instruction, p->store_size);
 p->nr_insn = 0;
 p->current = p->stack;
 p->compressed = false;
 diff --git a/src/mesa/drivers/dri/i965/brw_eu.h 
 b/src/mesa/drivers/dri/i965/brw_eu.h
 index 9d3d7de..52567c2 100644
 --- a/src/mesa/drivers/dri/i965/brw_eu.h
 +++ b/src/mesa/drivers/dri/i965/brw_eu.h
 @@ -100,11 +100,12 @@ struct brw_glsl_call;
  
  
  
 -#define BRW_EU_MAX_INSN_STACK 5
 -#define BRW_EU_MAX_INSN 10000
 +#define BRW_EU_MAX_INSN_STACK   5
 +#define BRW_EU_MAX_INSN (1024 * 1024)

I'm actually surprised to see BRW_EU_MAX_INSN at all.  As far as I know,
there isn't an actual hardware limit on the number of instructions, so
I'm not sure why we should cap it at all.  Especially not to some
arbitrary number.  (I'm assuming that 1024 * 1024 is just something you
came up with arbitrarily...)
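
For illustration, growth without a fixed cap could look like the sketch below. It uses plain `realloc()` instead of Mesa's ralloc helpers, and `toy_insn`, `toy_compile` and `toy_next_insn()` are hypothetical stand-ins rather than the driver's types; the starting capacity of 16 is arbitrary.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct toy_insn {
   unsigned opcode;
};

struct toy_compile {
   struct toy_insn *store;   /* heap-allocated instruction array */
   size_t store_size;        /* capacity, in instructions */
   size_t nr_insn;           /* instructions emitted so far */
};

/* Returns the next free slot, doubling the store when it is full.
 * Returns NULL only if the allocation itself fails.
 */
static struct toy_insn *
toy_next_insn(struct toy_compile *p, unsigned opcode)
{
   struct toy_insn *insn;

   assert(p->nr_insn <= p->store_size);

   if (p->nr_insn == p->store_size) {
      size_t new_size = p->store_size ? p->store_size * 2 : 16;
      struct toy_insn *grown = realloc(p->store, new_size * sizeof(*grown));

      if (!grown)
         return NULL;
      /* Zero the new tail so unused slots have a defined value. */
      memset(grown + p->store_size, 0,
             (new_size - p->store_size) * sizeof(*grown));
      p->store = grown;
      p->store_size = new_size;
   }

   insn = &p->store[p->nr_insn++];
   insn->opcode = opcode;
   return insn;
}
```

Any pointer returned by an earlier call may be invalidated by a later one, which is presumably why the earlier patches in this series move to referencing instructions by index.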

  struct brw_compile {
 -   struct brw_instruction store[BRW_EU_MAX_INSN];
 +   struct brw_instruction *store;
 +   int store_size;
 GLuint nr_insn;
  
 void *mem_ctx;
 diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c 
 b/src/mesa/drivers/dri/i965/brw_eu_emit.c
 index bd5fe6a..4396a0c 100644
 --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
 +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
 @@ -691,7 +691,17 @@ brw_next_insn(struct brw_compile *p, GLuint opcode)
  {
 struct brw_instruction *insn;
  
 -   assert(p->nr_insn + 1 < BRW_EU_MAX_INSN);
 +   if (p->nr_insn + 1 > p->store_size) {
 +  if (p->nr_insn + 1 > BRW_EU_MAX_INSN) {
 + assert(!"exceed max brw allowed eu instructions");
 +  } else {
 + if (0)
 +printf("incresing the store size to %d\n", p->store_size << 1);
 + p->store_size <<= 1;
 + p->store = reralloc(p->mem_ctx, p->store,
 + struct brw_instruction, p->store_size);
 +  }
 +   }
  
 insn = &p->store[p->nr_insn++];
 memcpy(insn, p->current, sizeof(*insn));


Re: [Mesa-dev] [PATCH 0/8] i965: dynamic eu instruction store size

2011-12-22 Thread Kenneth Graunke
On 12/21/2011 01:33 AM, Yuanhan Liu wrote:
 Hi, this is a new series of patches for dynamic eu instruction store
 size. The first 4 are from Eric; I just grabbed them and rebased them onto
 the current repo. The last 4 patches are mine, some of which are based on
 those patches from Eric.
 
 Please help to review it.
 
 BTW, I checked those patches with all oglc test cases, and found
 no regression. (Sandybridge only).
 
 Thanks,
 Yuanhan Liu
 
 
 --
 Eric Anholt (4):
   i965: Drop unused do_insn argument from gen6_CONT().
   i965: Don't make consumers of brw_DO()/brw_WHILE() track loop start
   i965: Don't make consumers of brw_WHILE do pre-gen6 BREAK/CONT
 patching
   i965: Don't make consumers of brw_CONT/brw_WHILE track if depth in
 loop
 
 Yuanhan Liu (4):
   i965: let the if_stack just store the instruction index
   i965: get the jmp distance by instruction index
   i965: call next_insn() before referencing a instruction by index


Patches 1-7 (v2 of 6 and after changing to bool in 7) are:
Reviewed-by: Kenneth Graunke kenn...@whitecape.org

   i965: increase the brw eu instruction store size dynamically

Patch 8 does not get a R-b just yet.

Thanks for doing this, Yuanhan, I'm really glad to see the arbitrary
10000 limit die.  And Eric, thanks for cleaning up the rest of the
control flow stack code---it's /so/ much nicer now!


Re: [Mesa-dev] [PATCH 3/3] i965 Gen6+: Invalidate VF address-based cache on flush

2011-12-22 Thread Kenneth Graunke
On 12/22/2011 02:06 PM, Paul Berry wrote:
 Although there is not much documentation of this fact, there are in
 fact two separate VF caches:
 
 - an index-based cache (described in the Sandy Bridge PRM, vol 2
   part 1, section 2.1.2 Vertex Cache).  This cache stores URB
   handles of vertex shader outputs; its purpose is to avoid redundant
   invocations of the vertex shader when drawing in random access mode
   (e.g. glDrawElements()), and the same vertex index is specified
   multiple times.  It is automatically invalidated between
   3D_PRIMITIVE commands and between instances within a single
   3D_PRIMITIVE command.
 
 - an address-based cache (mentioned briefly in vol 2 part 1, section
   1.7.4 PIPE_CONTROL Command).  This cache stores the data read from
   vertex buffers; its purpose is to avoid redundant memory accesses
   when doing instanced drawing or when multiple 3D_PRIMITIVE commands
   access the same vertex data.  It needs to be manually invalidated
   whenever new data is written to a buffer that is used for vertex
   data.
 
 Previous to this patch, it was not necessary for Mesa to explicitly
 invalidate the address-based cache, because there were no reasonable
 use cases in which the GPU would write to a vertex data buffer during
 a batch, and inter-batch flushing was taken care of by the kernel.
 
 However, with transform feedback, there is now a reasonable use case:
 vertex data is written to a buffer using transform feedback, and then
 that data is immediately re-used as vertex input in the next drawing
 operation.  To make this use case work, we need to flush the
 address-based VF cache between transform feedback and the next draw
 operation.  Since we are already calling
 intel_batchbuffer_emit_mi_flush() when transform feedback completes,
 and intel_batchbuffer_emit_mi_flush() is intended to invalidate all
 caches, it seems reasonable to add VF cache invalidation to this
 function.
 
 As with commit 63cf7fad13fc9cfdd2ae7b031426f79107000300 (i965: Flush
 pipeline on EndTransformFeedback), this is not an ideal solution.  It
 would be preferable to only invalidate the VF cache if the next draw
 call was about to consume data generated by a previous draw call in
 the same batch.  However, since we don't have the necessary dependency
 tracking infrastructure to figure that out right now, we have to
 overzealously invalidate the cache.
 
 Fixes Piglit test EXT_transform_feedback/immediate-reuse.
 ---
  src/mesa/drivers/dri/intel/intel_batchbuffer.c |1 +
  1 files changed, 1 insertions(+), 0 deletions(-)
 
 diff --git a/src/mesa/drivers/dri/intel/intel_batchbuffer.c 
 b/src/mesa/drivers/dri/intel/intel_batchbuffer.c
 index 4ff098a..cb23dbc 100644
 --- a/src/mesa/drivers/dri/intel/intel_batchbuffer.c
 +++ b/src/mesa/drivers/dri/intel/intel_batchbuffer.c
 @@ -460,6 +460,7 @@ intel_batchbuffer_emit_mi_flush(struct intel_context 
 *intel)
OUT_BATCH(PIPE_CONTROL_INSTRUCTION_FLUSH |
  PIPE_CONTROL_WRITE_FLUSH |
  PIPE_CONTROL_DEPTH_CACHE_FLUSH |
 +   PIPE_CONTROL_VF_CACHE_INVALIDATE |
  PIPE_CONTROL_TC_FLUSH |
  PIPE_CONTROL_NO_WRITE |
 PIPE_CONTROL_CS_STALL);

I checked the workaround list, and it doesn't look like there are any
workarounds needed for VF (address based) Cache invalidation.  Plus, we
now do that in the kernel in between every batch, so I'm not concerned
about adding it here.

This series is:
Reviewed-by: Kenneth Graunke kenn...@whitecape.org


[Mesa-dev] [PATCH] mesa: Add Haiku build support

2011-12-22 Thread kallisti5
From: Alexander von Gluck IV kallis...@unixzen.com

* Add Haiku as a platform to mklib
* Fix GLU to allow building static libGLU
* Remove a few existing Haiku defines that break the build
---
 Makefile |1 +
 acinclude.m4 |2 +-
 bin/mklib|   37 ++
 src/gallium/auxiliary/os/os_thread.h |2 +-
 src/gallium/auxiliary/util/u_debug.h |2 -
 src/gallium/drivers/r300/Makefile|1 +
 src/glsl/link_uniforms.cpp   |3 +-
 src/glu/sgi/Makefile |   15 +++--
 src/mesa/main/querymatrix.c  |2 +-
 9 files changed, 51 insertions(+), 14 deletions(-)

diff --git a/Makefile b/Makefile
index cf6555c..4caa8ce 100644
--- a/Makefile
+++ b/Makefile
@@ -90,6 +90,7 @@ freebsd \
 freebsd-dri \
 freebsd-dri-amd64 \
 freebsd-dri-x86 \
+haiku \
 hpux10 \
 hpux10-gcc \
 hpux10-static \
diff --git a/acinclude.m4 b/acinclude.m4
index a5b389d..33ed8a8 100644
--- a/acinclude.m4
+++ b/acinclude.m4
@@ -34,7 +34,7 @@ if test $enable_pic != no; then
 # see if we're using GCC
 if test x$GCC = xyes; then
 case $host_os in
-aix*|beos*|cygwin*|irix5*|irix6*|osf3*|osf4*|osf5*)
+aix*|cygwin*|haiku*|irix5*|irix6*|osf3*|osf4*|osf5*)
 # PIC is the default for these OSes.
 ;;
 mingw*|os2*|pw32*)
diff --git a/bin/mklib b/bin/mklib
index 70bd1a2..ca4b62c 100755
--- a/bin/mklib
+++ b/bin/mklib
@@ -959,6 +959,43 @@ case $ARCH in
 fi
;;
 
+'Haiku')
+if [ $STATIC = 1 ] ; then
+LIBNAME=lib${LIBNAME}.a
+if [ x$LINK = x ] ; then
+# -linker was not specified so set default link command now
+if [ $CPLUSPLUS = 1 ] ; then
+LINK=g++
+else
+LINK=gcc
+fi
+fi
+
+OPTS=-ru
+if [ ${ALTOPTS} ] ; then
+OPTS=${ALTOPTS}
+fi
+
+echo mklib: Making static library for Haiku:  ${LIBNAME}
+
+# expand .a into .o files
+NEW_OBJECTS=`expand_archives ${LIBNAME}.obj $OBJECTS`
+
+# make static lib
+FINAL_LIBS=`make_ar_static_lib ${OPTS} 1 ${LIBNAME} ${NEW_OBJECTS}`
+
+# remove temporary extracted .o files
+rm -rf ${LIBNAME}.obj
+else
+LIBNAME=lib${LIBNAME}.so  # prefix with lib, suffix with .so
+OPTS=-shared
+
+echo mklib: Making shared library for Haiku:  ${LIBNAME}
+   ${LINK} ${OPTS} ${LDFLAGS} ${OBJECTS} ${DEPS} -o 
${LIBNAME}
+FINAL_LIBS=${LIBNAME}
+fi
+;;
+
 'example')
# If you're adding support for a new architecture, you can
# start with this:
diff --git a/src/gallium/auxiliary/os/os_thread.h 
b/src/gallium/auxiliary/os/os_thread.h
index d830129..3e1c273 100644
--- a/src/gallium/auxiliary/os/os_thread.h
+++ b/src/gallium/auxiliary/os/os_thread.h
@@ -314,7 +314,7 @@ typedef int64_t pipe_condvar;
  * pipe_barrier
  */
 
-#if (defined(PIPE_OS_LINUX) || defined(PIPE_OS_BSD) || 
defined(PIPE_OS_SOLARIS) || defined(PIPE_OS_HAIKU)) && !defined(PIPE_OS_ANDROID)
+#if (defined(PIPE_OS_LINUX) || defined(PIPE_OS_BSD) || 
defined(PIPE_OS_SOLARIS)) && !defined(PIPE_OS_ANDROID)
 
 typedef pthread_barrier_t pipe_barrier;
 
diff --git a/src/gallium/auxiliary/util/u_debug.h 
b/src/gallium/auxiliary/util/u_debug.h
index b5ea405..677e478 100644
--- a/src/gallium/auxiliary/util/u_debug.h
+++ b/src/gallium/auxiliary/util/u_debug.h
@@ -75,7 +75,6 @@ _debug_printf(const char *format, ...)
  * - avoid outputing large strings (512 bytes is the current maximum length
  * that is guaranteed to be printed in all platforms)
  */
-#if !defined(PIPE_OS_HAIKU)
 static INLINE void
 debug_printf(const char *format, ...) _util_printf_format(1,2);
 
@@ -92,7 +91,6 @@ debug_printf(const char *format, ...)
 #endif
 }
 
-#endif /* !PIPE_OS_HAIKU */
 
 /*
  * ... isn't portable so we need to pass arguments in parentheses.
diff --git a/src/gallium/drivers/r300/Makefile 
b/src/gallium/drivers/r300/Makefile
index 5f56fc4..3e3a765 100644
--- a/src/gallium/drivers/r300/Makefile
+++ b/src/gallium/drivers/r300/Makefile
@@ -15,6 +15,7 @@ C_SOURCES += \
 LIBRARY_INCLUDES = \
-I$(TOP)/include \
-I$(TOP)/src/mesa \
+   -I$(TOP)/src/mapi \
-I$(TOP)/src/glsl
 
 include ../../Makefile.template
diff --git a/src/glsl/link_uniforms.cpp b/src/glsl/link_uniforms.cpp
index c7de480..f2e6648 100644
--- a/src/glsl/link_uniforms.cpp
+++ b/src/glsl/link_uniforms.cpp
@@ -336,9 +336,8 @@ link_assign_uniform_locations(struct gl_shader_program 
*prog)
   rzalloc_array(prog, struct gl_uniform_storage, num_user_uniforms);
union gl_constant_value *data =
   rzalloc_array(uniforms, union gl_constant_value, num_data_slots);
-#ifndef NDEBUG
+

Re: [Mesa-dev] [PATCH] gallivm: Close a memory leak

2011-12-22 Thread Jose Fonseca
Committed. Thanks.

Jose

- Original Message -
 Hi all
 
 This fixes a memory leak of 32 bytes on exit.
 
 From 924f8fdccb41b011f372bc57252005bcdb096105 Mon Sep 17 00:00:00
 2001
 From: Lauri Kasanen cur...@operamail.com
 Date: Thu, 22 Dec 2011 21:28:33 +0200
 Subject: [PATCH] gallivm: Close a memory leak
 
 As reported by valgrind --leak-check=full glxgears.
 
 Signed-off-by: Lauri Kasanen cur...@operamail.com
 ---
  src/gallium/auxiliary/gallivm/lp_bld_init.c |1 +
  1 files changed, 1 insertions(+), 0 deletions(-)
 
 diff --git a/src/gallium/auxiliary/gallivm/lp_bld_init.c
 b/src/gallium/auxiliary/gallivm/lp_bld_init.c
 index 45addee..503c04e 100644
 --- a/src/gallium/auxiliary/gallivm/lp_bld_init.c
 +++ b/src/gallium/auxiliary/gallivm/lp_bld_init.c
 @@ -345,6 +345,7 @@
 gallivm_remove_garbage_collector_callback(garbage_collect_callback_func
 func,
if (cb->func == func && cb->cb_data == cb_data) {
   /* found, remove it */
   remove_from_list(cb);
 + FREE(cb);
   return;
}
 }
 --
 1.7.2.1
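
The shape of the fix, unlinking a node and then freeing it, can be shown in isolation. The sketch below is a stand-in built on assumptions: it uses a minimal circular doubly linked list and `free()` rather than gallivm's list helpers and `FREE()`, and all `toy_*` names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

typedef void (*toy_callback_func)(void *data);

struct toy_callback {
   toy_callback_func func;
   void *cb_data;
   struct toy_callback *prev, *next;
};

/* Circular list with a sentinel head, so insertion and removal need no
 * special cases.
 */
static void
toy_list_init(struct toy_callback *head)
{
   head->prev = head->next = head;
}

/* Returns 0 on success, -1 if the node could not be allocated. */
static int
toy_add_callback(struct toy_callback *head, toy_callback_func func,
                 void *cb_data)
{
   struct toy_callback *cb;

   assert(func != NULL);

   cb = calloc(1, sizeof(*cb));
   if (!cb)
      return -1;
   cb->func = func;
   cb->cb_data = cb_data;
   cb->next = head->next;
   cb->prev = head;
   head->next->prev = cb;
   head->next = cb;
   return 0;
}

/* Returns 1 if a matching callback was found and released, 0 otherwise. */
static int
toy_remove_callback(struct toy_callback *head, toy_callback_func func,
                    void *cb_data)
{
   struct toy_callback *cb;

   for (cb = head->next; cb != head; cb = cb->next) {
      if (cb->func == func && cb->cb_data == cb_data) {
         /* Unlink first... */
         cb->prev->next = cb->next;
         cb->next->prev = cb->prev;
         /* ...then release the node; omitting this step is the leak. */
         free(cb);
         return 1;
      }
   }
   return 0;
}
```

Returning immediately after `free(cb)` matters here, because the loop would otherwise read `cb->next` from freed memory.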
 
 


[Mesa-dev] i965/gen7 transform feedback

2011-12-22 Thread Eric Anholt
Here's today's patch series for gen7 transform feedback.  It runs on
top of a kernel patch at people.freedesktop.org:~anholt/linux on the
gen7-reset-sol branch.  I expected it to be easy, but not this easy.

Remaining test failures:

tessellation polygon flat_last        warn
tessellation quad_strip flat_last     warn
tessellation quads flat_last          warn
tessellation triangle_fan flat_first  fail

Also, it looks like none of the current piglit tests test transform
feedback across batchbuffers.  I don't expect major problems there,
given that we have a userland count of verts emitted.



[Mesa-dev] [PATCH 4/7] i965/gen7: Move SOL stage disable to gen7_sol_state.c

2011-12-22 Thread Eric Anholt
We'll be growing more code in here as we actually enable the unit.

Reviewed-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/Makefile.sources   |1 +
 src/mesa/drivers/dri/i965/brw_state_upload.c |1 +
 src/mesa/drivers/dri/i965/gen7_disable.c |7 ---
 src/mesa/drivers/dri/i965/gen7_sol_state.c   |   56 ++
 4 files changed, 58 insertions(+), 7 deletions(-)
 create mode 100644 src/mesa/drivers/dri/i965/gen7_sol_state.c

diff --git a/src/mesa/drivers/dri/i965/Makefile.sources 
b/src/mesa/drivers/dri/i965/Makefile.sources
index e50f9c3..3eeac6f 100644
--- a/src/mesa/drivers/dri/i965/Makefile.sources
+++ b/src/mesa/drivers/dri/i965/Makefile.sources
@@ -104,6 +104,7 @@ i965_C_SOURCES := \
gen7_misc_state.c \
gen7_sampler_state.c \
gen7_sf_state.c \
+   gen7_sol_state.c \
gen7_urb.c \
gen7_viewport_state.c \
gen7_vs_state.c \
diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c 
b/src/mesa/drivers/dri/i965/brw_state_upload.c
index 74d01d8..66382b7 100644
--- a/src/mesa/drivers/dri/i965/brw_state_upload.c
+++ b/src/mesa/drivers/dri/i965/brw_state_upload.c
@@ -220,6 +220,7 @@ const struct brw_tracked_state *gen7_atoms[] =
 
&gen7_disable_stages,
&gen7_vs_state,
+   &gen7_sol_state,
&gen7_clip_state,
&gen7_sbe_state,
&gen7_sf_state,
diff --git a/src/mesa/drivers/dri/i965/gen7_disable.c 
b/src/mesa/drivers/dri/i965/gen7_disable.c
index a44d315..b37aa6c 100644
--- a/src/mesa/drivers/dri/i965/gen7_disable.c
+++ b/src/mesa/drivers/dri/i965/gen7_disable.c
@@ -122,13 +122,6 @@ disable_stages(struct brw_context *brw)
OUT_BATCH(_3DSTATE_BINDING_TABLE_POINTERS_DS << 16 | (2 - 2));
OUT_BATCH(0);
ADVANCE_BATCH();
-
-   /* Disable the SOL stage */
-   BEGIN_BATCH(3);
-   OUT_BATCH(_3DSTATE_STREAMOUT << 16 | (3 - 2));
-   OUT_BATCH(0);
-   OUT_BATCH(0);
-   ADVANCE_BATCH();
 }
 
 const struct brw_tracked_state gen7_disable_stages = {
diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c 
b/src/mesa/drivers/dri/i965/gen7_sol_state.c
new file mode 100644
index 0000000..fcda08d
--- /dev/null
+++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c
@@ -0,0 +1,56 @@
+/*
+ * Copyright © 2011 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+/**
+ * @file gen7_sol_state.c
+ *
+ * Controls the stream output logic (SOL) stage of the gen7 hardware, which is
+ * used to implement GL_EXT_transform_feedback.
+ */
+
+#include "brw_context.h"
+#include "brw_state.h"
+#include "brw_defines.h"
+#include "intel_batchbuffer.h"
+
+static void
+upload_sol_state(struct brw_context *brw)
+{
+   struct intel_context *intel = &brw->intel;
+
+   /* Disable the SOL stage */
+   BEGIN_BATCH(3);
+   OUT_BATCH(_3DSTATE_STREAMOUT << 16 | (3 - 2));
+   OUT_BATCH(0);
+   OUT_BATCH(0);
+   ADVANCE_BATCH();
+}
+
+const struct brw_tracked_state gen7_sol_state = {
+   .dirty = {
+  .mesa  = 0,
+  .brw   = BRW_NEW_BATCH,
+  .cache = 0,
+   },
+   .emit = upload_sol_state,
+};
-- 
1.7.7.3



[Mesa-dev] [PATCH 2/7] i965/gen7: Make primitives_written counting work.

2011-12-22 Thread Eric Anholt
The code was relying on gs.prog_data's copy of the
number-of-verts-per-prim, which segfaulted on gen7 since it doesn't
make a GS program.  We can easily calculate that value right here.
---
 src/mesa/drivers/dri/i965/brw_draw.c |   33 +++--
 1 files changed, 27 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_draw.c 
b/src/mesa/drivers/dri/i965/brw_draw.c
index 082bb9a..c116d39 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -379,6 +379,30 @@ static void brw_postdraw_set_buffers_need_resolve(struct 
brw_context *brw)
}
 }
 
+static int
+verts_per_prim(GLenum mode)
+{
+   switch (mode) {
+   case GL_POINTS:
+  return 1;
+   case GL_LINE_STRIP:
+   case GL_LINE_LOOP:
+   case GL_LINES:
+  return 2;
+   case GL_TRIANGLE_STRIP:
+   case GL_TRIANGLE_FAN:
+   case GL_POLYGON:
+   case GL_TRIANGLES:
+   case GL_QUADS:
+   case GL_QUAD_STRIP:
+  return 3;
+   default:
+  _mesa_problem(NULL,
+   "unknown prim type in transform feedback primitive count");
+  return 0;
+   }
+}
+
 /**
  * Update internal counters based on the the drawing operation described in
  * prim.
@@ -397,14 +421,11 @@ brw_update_primitive_count(struct brw_context *brw,
* able to reload SVBI 0 with the correct value in case we have to start
* a new batch buffer.
*/
-  unsigned svbi_postincrement_value =
- brw->gs.prog_data->svbi_postincrement_value;
+  unsigned verts = verts_per_prim(prim->mode);
   uint32_t space_avail =
- (brw->sol.svbi_0_max_index - brw->sol.svbi_0_starting_index)
- / svbi_postincrement_value;
+ (brw->sol.svbi_0_max_index - brw->sol.svbi_0_starting_index) / verts;
   uint32_t primitives_written = MIN2 (space_avail, count);
-  brw->sol.svbi_0_starting_index +=
- svbi_postincrement_value * primitives_written;
+  brw->sol.svbi_0_starting_index += verts;
 
   /* And update the TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN query. */
   brw->sol.primitives_written += primitives_written;
-- 
1.7.7.3
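
As a stand-alone illustration of the bookkeeping in this patch, the sketch below mirrors `verts_per_prim()` and the space calculation. Everything in it is hypothetical scaffolding (`toy_prim`, `toy_sol`, `toy_update_primitive_count()`), and the enum replaces the GL constants. Note one deliberate difference: unlike the `+` line above, the sketch advances the starting index by `verts * written`, as the removed code did with `svbi_postincrement_value`; whether the patch intends the same is for the author to confirm.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the GL primitive modes. */
enum toy_prim { TOY_POINTS, TOY_LINES, TOY_TRIANGLES };

struct toy_sol {
   uint32_t svbi_0_starting_index;  /* next vertex slot to be written */
   uint32_t svbi_0_max_index;       /* one past the last usable slot */
   uint32_t primitives_written;     /* running query counter */
};

static uint32_t
toy_verts_per_prim(enum toy_prim mode)
{
   switch (mode) {
   case TOY_POINTS:    return 1;
   case TOY_LINES:     return 2;
   case TOY_TRIANGLES: return 3;
   }
   return 0;
}

/* Account for `count` primitives of type `mode`, clamped to buffer space. */
static void
toy_update_primitive_count(struct toy_sol *sol, enum toy_prim mode,
                           uint32_t count)
{
   uint32_t verts = toy_verts_per_prim(mode);
   uint32_t space_avail, written;

   assert(verts != 0);
   assert(sol->svbi_0_max_index >= sol->svbi_0_starting_index);

   space_avail = (sol->svbi_0_max_index - sol->svbi_0_starting_index) / verts;
   written = count < space_avail ? count : space_avail;

   sol->svbi_0_starting_index += verts * written;
   sol->primitives_written += written;
}
```

With a 10-vertex buffer, two triangles consume six slots, and a following request for five more triangles is clamped to the single triangle that still fits.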



[Mesa-dev] [PATCH 3/7] i965/gen7: Add register definitions for GL_EXT_transform_feedback.

2011-12-22 Thread Eric Anholt
Reviewed-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/brw_defines.h |   76 ++-
 src/mesa/drivers/dri/intel/intel_reg.h  |   15 ++
 2 files changed, 89 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
b/src/mesa/drivers/dri/i965/brw_defines.h
index 4edfaf7..4bb7f00 100644
--- a/src/mesa/drivers/dri/i965/brw_defines.h
+++ b/src/mesa/drivers/dri/i965/brw_defines.h
@@ -1307,6 +1307,42 @@ enum brw_wm_barycentric_interp_mode {
 #define _3DSTATE_CONSTANT_HS  0x7819 /* GEN7+ */
 #define _3DSTATE_CONSTANT_DS  0x781A /* GEN7+ */
 
+#define _3DSTATE_STREAMOUT                      0x781e /* GEN7+ */
+/* DW1 */
+# define SO_FUNCTION_ENABLE                     (1 << 31)
+# define SO_RENDERING_DISABLE                   (1 << 30)
+/* This selects which incoming rendering stream goes down the pipeline.  The
+ * rendering stream is 0 if not defined by special cases in the GS state.
+ */
+# define SO_RENDER_STREAM_SELECT_SHIFT          27
+# define SO_RENDER_STREAM_SELECT_MASK           INTEL_MASK(28, 27)
+/* Controls reordering of TRISTRIP_* elements in stream output (not rendering).
+ */
+# define SO_REORDER_TRAILING                    (1 << 26)
+/* Controls SO_NUM_PRIMS_WRITTEN_* and SO_PRIM_STORAGE_* */
+# define SO_STATISTICS_ENABLE                   (1 << 25)
+# define SO_BUFFER_ENABLE_3                     (1 << 11)
+# define SO_BUFFER_ENABLE_2                     (1 << 10)
+# define SO_BUFFER_ENABLE_1                     (1 << 9)
+# define SO_BUFFER_ENABLE_0                     (1 << 8)
+/* DW2 */
+# define SO_STREAM_3_VERTEX_READ_OFFSET_SHIFT   29
+# define SO_STREAM_3_VERTEX_READ_OFFSET_MASK    INTEL_MASK(29, 29)
+# define SO_STREAM_3_VERTEX_READ_LENGTH_SHIFT   24
+# define SO_STREAM_3_VERTEX_READ_LENGTH_MASK    INTEL_MASK(28, 24)
+# define SO_STREAM_2_VERTEX_READ_OFFSET_SHIFT   21
+# define SO_STREAM_2_VERTEX_READ_OFFSET_MASK    INTEL_MASK(21, 21)
+# define SO_STREAM_2_VERTEX_READ_LENGTH_SHIFT   16
+# define SO_STREAM_2_VERTEX_READ_LENGTH_MASK    INTEL_MASK(20, 16)
+# define SO_STREAM_1_VERTEX_READ_OFFSET_SHIFT   13
+# define SO_STREAM_1_VERTEX_READ_OFFSET_MASK    INTEL_MASK(13, 13)
+# define SO_STREAM_1_VERTEX_READ_LENGTH_SHIFT   8
+# define SO_STREAM_1_VERTEX_READ_LENGTH_MASK    INTEL_MASK(12, 8)
+# define SO_STREAM_0_VERTEX_READ_OFFSET_SHIFT   5
+# define SO_STREAM_0_VERTEX_READ_OFFSET_MASK    INTEL_MASK(5, 5)
+# define SO_STREAM_0_VERTEX_READ_LENGTH_SHIFT   0
+# define SO_STREAM_0_VERTEX_READ_LENGTH_MASK    INTEL_MASK(4, 0)
+
 /* 3DSTATE_WM for Gen7 */
 /* DW1 */
 # define GEN7_WM_STATISTICS_ENABLE (1 << 31)
@@ -1373,8 +1409,6 @@ enum brw_wm_barycentric_interp_mode {
 /* DW6: kernel 1 pointer */
 /* DW7: kernel 2 pointer */
 
-#define _3DSTATE_STREAMOUT  0x781e /* GEN7+ */
-
 #define _3DSTATE_SAMPLE_MASK   0x7818 /* GEN6+ */
 
 #define _3DSTATE_DRAWING_RECTANGLE 0x7900
@@ -1414,6 +1448,44 @@ enum brw_wm_barycentric_interp_mode {
 # define DEPTH_CLEAR_VALID (1 << 15)
 /* DW1: depth clear value */
 
+#define _3DSTATE_SO_DECL_LIST                   0x7917 /* GEN7+ */
+/* DW1 */
+# define SO_STREAM_TO_BUFFER_SELECTS_3_SHIFT    12
+# define SO_STREAM_TO_BUFFER_SELECTS_3_MASK     INTEL_MASK(15, 12)
+# define SO_STREAM_TO_BUFFER_SELECTS_2_SHIFT    8
+# define SO_STREAM_TO_BUFFER_SELECTS_2_MASK     INTEL_MASK(11, 8)
+# define SO_STREAM_TO_BUFFER_SELECTS_1_SHIFT    4
+# define SO_STREAM_TO_BUFFER_SELECTS_1_MASK     INTEL_MASK(7, 4)
+# define SO_STREAM_TO_BUFFER_SELECTS_0_SHIFT    0
+# define SO_STREAM_TO_BUFFER_SELECTS_0_MASK     INTEL_MASK(3, 0)
+/* DW2 */
+# define SO_NUM_ENTRIES_3_SHIFT                 24
+# define SO_NUM_ENTRIES_3_MASK                  INTEL_MASK(31, 24)
+# define SO_NUM_ENTRIES_2_SHIFT                 16
+# define SO_NUM_ENTRIES_2_MASK                  INTEL_MASK(23, 16)
+# define SO_NUM_ENTRIES_1_SHIFT                 8
+# define SO_NUM_ENTRIES_1_MASK                  INTEL_MASK(15, 8)
+# define SO_NUM_ENTRIES_0_SHIFT                 0
+# define SO_NUM_ENTRIES_0_MASK                  INTEL_MASK(7, 0)
+
+/* SO_DECL DW0 */
+# define SO_DECL_OUTPUT_BUFFER_SLOT_SHIFT       12
+# define SO_DECL_OUTPUT_BUFFER_SLOT_MASK        INTEL_MASK(13, 12)
+# define SO_DECL_HOLE_FLAG                      (1 << 11)
+# define SO_DECL_REGISTER_INDEX_SHIFT           4
+# define SO_DECL_REGISTER_INDEX_MASK            INTEL_MASK(9, 4)
+# define SO_DECL_COMPONENT_MASK_SHIFT           0
+# define SO_DECL_COMPONENT_MASK_MASK            INTEL_MASK(3, 0)
+
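
A short sketch of how SHIFT/MASK pairs like the ones above are typically consumed. The `TOY_MASK()` macro is a local definition written to behave like a (high, low) bit-range mask; it is assumed, not copied from `intel_reg.h`, and `toy_set_field()` and `toy_get_field()` are hypothetical helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Mask covering bits high..low inclusive (assumed semantics of INTEL_MASK). */
#define TOY_MASK(high, low) \
   ((uint32_t)((((uint64_t)1 << ((high) - (low) + 1)) - 1) << (low)))

/* Two fields laid out like the SO_DECL DW0 definitions in the patch. */
#define TOY_DECL_OUTPUT_BUFFER_SLOT_SHIFT 12
#define TOY_DECL_OUTPUT_BUFFER_SLOT_MASK  TOY_MASK(13, 12)
#define TOY_DECL_REGISTER_INDEX_SHIFT     4
#define TOY_DECL_REGISTER_INDEX_MASK      TOY_MASK(9, 4)

/* Store `value` into the field described by shift/mask, leaving the other
 * bits of `dword` untouched.  The value must fit in the field.
 */
static uint32_t
toy_set_field(uint32_t dword, uint32_t value, unsigned shift, uint32_t mask)
{
   assert(((value << shift) & ~mask) == 0);
   return (dword & ~mask) | (value << shift);
}

/* Extract the field described by shift/mask from `dword`. */
static uint32_t
toy_get_field(uint32_t dword, unsigned shift, uint32_t mask)
{
   return (dword & mask) >> shift;
}
```

Pairing each shift with a mask lets the setter reject values that would spill into a neighbouring field, which is easy to get wrong with bare shifts.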

[Mesa-dev] [PATCH 1/7] i965/gen7: Enable EXT_transform_feedback extension under 3.0 override.

2011-12-22 Thread Eric Anholt
Reviewed-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/intel/intel_extensions.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_extensions.c 
b/src/mesa/drivers/dri/intel/intel_extensions.c
index 7ab5d90..09ee9ba 100644
--- a/src/mesa/drivers/dri/intel/intel_extensions.c
+++ b/src/mesa/drivers/dri/intel/intel_extensions.c
@@ -104,7 +104,7 @@ intelInitExtensions(struct gl_context *ctx)
   ctx->Const.GLSLVersion = 120;
_mesa_override_glsl_version(ctx);
 
-   if (intel->gen == 6)
+   if (intel->gen == 6 || (intel->gen == 7 && override_version >= 30))
   ctx->Extensions.EXT_transform_feedback = true;
 
if (intel->gen >= 5)
-- 
1.7.7.3



[Mesa-dev] [PATCH 5/7] i965/gen7: Add support for rasterization discard.

2011-12-22 Thread Eric Anholt
Fixes the piglit discard-* tests.

Reviewed-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/gen7_sol_state.c |8 +++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c 
b/src/mesa/drivers/dri/i965/gen7_sol_state.c
index fcda08d..650f625 100644
--- a/src/mesa/drivers/dri/i965/gen7_sol_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c
@@ -37,6 +37,12 @@ static void
 upload_sol_state(struct brw_context *brw)
 {
struct intel_context *intel = &brw->intel;
+   struct gl_context *ctx = &intel->ctx;
+   uint32_t dw1 = 0;
+
+   /* _NEW_RASTERIZER_DISCARD */
+   if (ctx->RasterDiscard)
+  dw1 |= SO_RENDERING_DISABLE;
 
/* Disable the SOL stage */
BEGIN_BATCH(3);
@@ -48,7 +54,7 @@ upload_sol_state(struct brw_context *brw)
 
 const struct brw_tracked_state gen7_sol_state = {
.dirty = {
-  .mesa  = 0,
+  .mesa  = _NEW_RASTERIZER_DISCARD,
   .brw   = BRW_NEW_BATCH,
   .cache = 0,
},
-- 
1.7.7.3



[Mesa-dev] [PATCH 6/7] i965/gen7: Add support for transform feedback.

2011-12-22 Thread Eric Anholt
Fixes almost all of the transform feedback piglit tests.  Remaining
are a few tests related to tessellation for
quads/trifans/tristrips/polygons with flat shading.
---
 src/mesa/drivers/dri/i965/gen7_sol_state.c |  199 ++-
 1 files changed, 191 insertions(+), 8 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c 
b/src/mesa/drivers/dri/i965/gen7_sol_state.c
index 650f625..a5e28b6 100644
--- a/src/mesa/drivers/dri/i965/gen7_sol_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c
@@ -32,31 +32,214 @@
 #include "brw_state.h"
 #include "brw_defines.h"
 #include "intel_batchbuffer.h"
+#include "intel_buffer_objects.h"
 
 static void
-upload_sol_state(struct brw_context *brw)
+upload_3dstate_so_buffers(struct brw_context *brw)
+{
+   struct intel_context *intel = &brw->intel;
+   struct gl_context *ctx = &intel->ctx;
+   /* BRW_NEW_VERTEX_PROGRAM */
+   const struct gl_shader_program *vs_prog =
+      ctx->Shader.CurrentVertexProgram;
+   const struct gl_transform_feedback_info *linked_xfb_info =
+      &vs_prog->LinkedTransformFeedback;
+   struct gl_transform_feedback_object *xfb_obj =
+      ctx->TransformFeedback.CurrentObject;
+   int i;
+
+   /* Set up the up to 4 output buffers.  These are the ranges defined in the
+    * gl_transform_feedback_object.
+    */
+   for (i = 0; i < 4; i++) {
+      struct gl_buffer_object *bufferobj = xfb_obj->Buffers[i];
+      drm_intel_bo *bo;
+      uint32_t start, end;
+
+      if (!xfb_obj->Buffers[i]) {
+         /* The pitch of 0 in this command indicates that the buffer is
+          * unbound and won't be written to.
+          */
+         BEGIN_BATCH(4);
+         OUT_BATCH(_3DSTATE_SO_BUFFER << 16 | (4 - 2));
+         OUT_BATCH((i << SO_BUFFER_INDEX_SHIFT));
+         OUT_BATCH(0);
+         OUT_BATCH(0);
+         ADVANCE_BATCH();
+
+         continue;
+      }
+
+      bo = intel_buffer_object(bufferobj)->buffer;
+
+      start = xfb_obj->Offset[i];
+      assert(start % 4 == 0);
+      end = ALIGN(start + xfb_obj->Size[i], 4);
+      assert(end <= bo->size);
+
+      BEGIN_BATCH(4);
+      OUT_BATCH(_3DSTATE_SO_BUFFER << 16 | (4 - 2));
+      OUT_BATCH((i << SO_BUFFER_INDEX_SHIFT) |
+                ((linked_xfb_info->BufferStride[i] * 4) <<
+                 SO_BUFFER_PITCH_SHIFT));
+      OUT_RELOC(bo, I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER, start);
+      OUT_RELOC(bo, I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER, end);
+      ADVANCE_BATCH();
+   }
+}
+
+/**
+ * Outputs the 3DSTATE_SO_DECL_LIST command.
+ *
+ * The data output is a series of 64-bit entries containing a SO_DECL per
+ * stream.  We only have one stream of rendering coming out of the GS unit, so
+ * we only emit stream 0 (low 16 bits) SO_DECLs.
+ */
+static void
+upload_3dstate_so_decl_list(struct brw_context *brw,
+   struct brw_vue_map *vue_map)
+{
+   struct intel_context *intel = &brw->intel;
+   struct gl_context *ctx = &intel->ctx;
+   /* BRW_NEW_VERTEX_PROGRAM */
+   const struct gl_shader_program *vs_prog =
+      ctx->Shader.CurrentVertexProgram;
+   /* NEW_TRANSFORM_FEEDBACK */
+   const struct gl_transform_feedback_info *linked_xfb_info =
+      &vs_prog->LinkedTransformFeedback;
+   int i;
+   uint16_t so_decl[128];
+   int buffer_mask = 0;
+   int next_offset[4] = {0, 0, 0, 0};
+
+   /* Construct the list of SO_DECLs to be emitted.  The formatting of the
+    * command feels strange -- each dword pair contains a SO_DECL per stream.
+    */
+   for (i = 0; i < linked_xfb_info->NumOutputs; i++) {
+      int buffer = linked_xfb_info->Outputs[i].OutputBuffer;
+      uint16_t decl = 0;
+      int vert_result = linked_xfb_info->Outputs[i].OutputRegister;
+
+      buffer_mask |= 1 << buffer;
+
+      decl |= buffer << SO_DECL_OUTPUT_BUFFER_SLOT_SHIFT;
+      decl |= vue_map->vert_result_to_slot[vert_result] <<
+         SO_DECL_REGISTER_INDEX_SHIFT;
+      decl |= ((1 << linked_xfb_info->Outputs[i].NumComponents) - 1) <<
+         SO_DECL_COMPONENT_MASK_SHIFT;
+
+      /* FINISHME */
+      assert(linked_xfb_info->Outputs[i].DstOffset == next_offset[buffer]);
+
+      next_offset[buffer] += linked_xfb_info->Outputs[i].NumComponents;
+
+      so_decl[i] = decl;
+   }
+
+   BEGIN_BATCH(linked_xfb_info->NumOutputs * 2 + 3);
+   OUT_BATCH(_3DSTATE_SO_DECL_LIST << 16 |
+             (linked_xfb_info->NumOutputs * 2 + 1));
+
+   OUT_BATCH((buffer_mask << SO_STREAM_TO_BUFFER_SELECTS_0_SHIFT) |
+             (0 << SO_STREAM_TO_BUFFER_SELECTS_1_SHIFT) |
+             (0 << SO_STREAM_TO_BUFFER_SELECTS_2_SHIFT) |
+             (0 << SO_STREAM_TO_BUFFER_SELECTS_3_SHIFT));
+
+   OUT_BATCH((linked_xfb_info->NumOutputs << SO_NUM_ENTRIES_0_SHIFT) |
+             (0 << SO_NUM_ENTRIES_1_SHIFT) |
+             (0 << SO_NUM_ENTRIES_2_SHIFT) |
+             (0 << SO_NUM_ENTRIES_3_SHIFT));
+
+   for (i = 0; i < linked_xfb_info->NumOutputs; i++) {
+      OUT_BATCH(so_decl[i]);
+      OUT_BATCH(0);
+   }
+
+   ADVANCE_BATCH();
+}
+
+static void
+upload_3dstate_streamout(struct brw_context *brw, bool active,
+   

[Mesa-dev] [PATCH 7/7] i965/gen7: Fix feedback for flat-shaded tristrips versus provoking vertex.

2011-12-22 Thread Eric Anholt
Fixes piglit tessellation triangle_strip flat_last.
---
 src/mesa/drivers/dri/i965/gen7_sol_state.c |5 +
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c 
b/src/mesa/drivers/dri/i965/gen7_sol_state.c
index a5e28b6..93ca868 100644
--- a/src/mesa/drivers/dri/i965/gen7_sol_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c
@@ -182,6 +182,10 @@ upload_3dstate_streamout(struct brw_context *brw, bool active,
       dw1 |= SO_FUNCTION_ENABLE;
       dw1 |= SO_STATISTICS_ENABLE;
 
+      /* _NEW_LIGHT */
+      if (ctx->Light.ProvokingVertex != GL_FIRST_VERTEX_CONVENTION)
+         dw1 |= SO_REORDER_TRAILING;
+
       for (i = 0; i < 4; i++) {
          if (xfb_obj->Buffers[i]) {
             dw1 |= SO_BUFFER_ENABLE_0 << i;
@@ -235,6 +239,7 @@ upload_sol_state(struct brw_context *brw)
 const struct brw_tracked_state gen7_sol_state = {
.dirty = {
   .mesa  = (_NEW_RASTERIZER_DISCARD |
+   _NEW_LIGHT |
_NEW_TRANSFORM_FEEDBACK |
_NEW_TRANSFORM),
   .brw   = (BRW_NEW_BATCH |
-- 
1.7.7.3



Re: [Mesa-dev] [PATCH 0/8] i965: dynamic eu instruction store size

2011-12-22 Thread Yuanhan Liu
On Thu, Dec 22, 2011 at 02:37:58PM -0800, Kenneth Graunke wrote:
 On 12/21/2011 01:33 AM, Yuanhan Liu wrote:
  Hi, this is a new series of patches for dynamic eu instruction store
  size. The first 4 are from Eric; I just grabbed them and rebased them
  onto the current repo. The last 4 patches are mine, and some of them
  build on Eric's patches.
  
  Please help to review it.
  
  BTW, I checked those patches with all oglc test cases, and found
  no regression. (Sandybridge only).
  
  Thanks,
  Yuanhan Liu
  
  
  --
  Eric Anholt (4):
i965: Drop unused do_insn argument from gen6_CONT().
i965: Don't make consumers of brw_DO()/brw_WHILE() track loop start
i965: Don't make consumers of brw_WHILE do pre-gen6 BREAK/CONT
  patching
i965: Don't make consumers of brw_CONT/brw_WHILE track if depth in
  loop
  
  Yuanhan Liu (4):
i965: let the if_stack just store the instruction index
i965: get the jmp distance by instruction index
i965: call next_insn() before referencing a instruction by index
 
 
 Patches 1-7 (v2 of 6 and after changing to bool in 7) are:
 Reviewed-by: Kenneth Graunke kenn...@whitecape.org

Thanks.

 
i965: increase the brw eu instruction store size dynamically
 
 Patch 8 does not get a R-b just yet.
OK, will fix it.

 
 Thanks for doing this, Yuanhan, I'm really glad to see the arbitrary
 10000 limit die.

You're welcome; it's my pleasure.

 And Eric, thanks for cleaning up the rest of the
 control flow stack code---it's /so/ much nicer now!


Re: [Mesa-dev] [PATCH 2/7] i965/gen7: Make primitives_written counting work.

2011-12-22 Thread Paul Berry
On 22 December 2011 16:54, Eric Anholt e...@anholt.net wrote:

 The code was relying on gs.prog_data's copy of the
 number-of-verts-per-prim, which segfaulted on gen7 since it doesn't
 make a GS program.  We can easily calculate that value right here.
 ---
  src/mesa/drivers/dri/i965/brw_draw.c |   33
 +++--
  1 files changed, 27 insertions(+), 6 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_draw.c
 b/src/mesa/drivers/dri/i965/brw_draw.c
 index 082bb9a..c116d39 100644
 --- a/src/mesa/drivers/dri/i965/brw_draw.c
 +++ b/src/mesa/drivers/dri/i965/brw_draw.c
 @@ -379,6 +379,30 @@ static void
 brw_postdraw_set_buffers_need_resolve(struct brw_context *brw)
}
  }

 +static int
 +verts_per_prim(GLenum mode)
 +{
 +   switch (mode) {
 +   case GL_POINTS:
 +  return 1;
 +   case GL_LINE_STRIP:
 +   case GL_LINE_LOOP:
 +   case GL_LINES:
 +  return 2;
 +   case GL_TRIANGLE_STRIP:
 +   case GL_TRIANGLE_FAN:
 +   case GL_POLYGON:
 +   case GL_TRIANGLES:
 +   case GL_QUADS:
 +   case GL_QUAD_STRIP:
 +  return 3;
 +   default:
 +  _mesa_problem(NULL,
 +   unknown prim type in transform feedback primitive
 count);
 +  return 0;
 +   }
 +}
 +
  /**
  * Update internal counters based on the the drawing operation described in
  * prim.
 @@ -397,14 +421,11 @@ brw_update_primitive_count(struct brw_context *brw,
* able to reload SVBI 0 with the correct value in case we have to
 start
* a new batch buffer.
*/
 -      unsigned svbi_postincrement_value =
 -         brw->gs.prog_data->svbi_postincrement_value;
 +      unsigned verts = verts_per_prim(prim->mode);
        uint32_t space_avail =
 -         (brw->sol.svbi_0_max_index - brw->sol.svbi_0_starting_index)
 -         / svbi_postincrement_value;
 +         (brw->sol.svbi_0_max_index - brw->sol.svbi_0_starting_index) / verts;
        uint32_t primitives_written = MIN2 (space_avail, count);
 -      brw->sol.svbi_0_starting_index +=
 -         svbi_postincrement_value * primitives_written;
 +      brw->sol.svbi_0_starting_index += verts;


This should be
brw->sol.svbi_0_starting_index += verts * primitives_written.

With that change, this is
Reviewed-by: Paul Berry stereotype...@gmail.com
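
To make the requested fix concrete, here is a minimal, self-contained sketch of the bookkeeping with the correction applied (advance the starting index by verts * primitives_written, not by verts alone). The enum, struct, and function names below are hypothetical stand-ins for illustration, not the driver's actual types.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the GL primitive modes (not real GLenum values). */
enum prim_mode { PRIM_POINTS, PRIM_LINES, PRIM_TRIANGLES };

/* Hypothetical subset of the streamed-vertex-buffer bookkeeping. */
struct sol_state {
   uint32_t svbi_0_starting_index;
   uint32_t svbi_0_max_index;
   uint32_t primitives_written;
};

/* Vertices written to the feedback buffer per decomposed primitive. */
static unsigned
verts_per_prim(enum prim_mode mode)
{
   switch (mode) {
   case PRIM_POINTS:    return 1;
   case PRIM_LINES:     return 2;
   case PRIM_TRIANGLES: return 3;
   }
   return 0; /* unknown mode */
}

/* Records a draw of `count` primitives, clamped to the remaining buffer
 * space.  Returns the number of primitives actually accounted for.
 */
static uint32_t
update_primitive_count(struct sol_state *sol, enum prim_mode mode,
                       uint32_t count)
{
   unsigned verts = verts_per_prim(mode);
   uint32_t space_avail, written;

   if (verts == 0)
      return 0;

   space_avail =
      (sol->svbi_0_max_index - sol->svbi_0_starting_index) / verts;
   written = count < space_avail ? count : space_avail;

   /* The index advances by one slot per vertex of every primitive written. */
   sol->svbi_0_starting_index += verts * written;
   sol->primitives_written += written;
   return written;
}
```

With a 10-vertex buffer, two triangles move the starting index to 6; a further request for five triangles is clamped to the one that still fits.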


Re: [Mesa-dev] [PATCH 2/7] i965/gen7: Make primitives_written counting work.

2011-12-22 Thread Kenneth Graunke
On 12/22/2011 04:54 PM, Eric Anholt wrote:
 The code was relying on gs.prog_data's copy of the
 number-of-verts-per-prim, which segfaulted on gen7 since it doesn't
 make a GS program.  We can easily calculate that value right here.
 ---
  src/mesa/drivers/dri/i965/brw_draw.c |   33 +++--
  1 files changed, 27 insertions(+), 6 deletions(-)
 
 diff --git a/src/mesa/drivers/dri/i965/brw_draw.c 
 b/src/mesa/drivers/dri/i965/brw_draw.c
 index 082bb9a..c116d39 100644
 --- a/src/mesa/drivers/dri/i965/brw_draw.c
 +++ b/src/mesa/drivers/dri/i965/brw_draw.c
 @@ -379,6 +379,30 @@ static void brw_postdraw_set_buffers_need_resolve(struct 
 brw_context *brw)
 }
  }
  
 +static int
 +verts_per_prim(GLenum mode)
 +{
 +   switch (mode) {
 +   case GL_POINTS:
 +  return 1;
 +   case GL_LINE_STRIP:
 +   case GL_LINE_LOOP:
 +   case GL_LINES:
 +  return 2;
 +   case GL_TRIANGLE_STRIP:
 +   case GL_TRIANGLE_FAN:
 +   case GL_POLYGON:
 +   case GL_TRIANGLES:
 +   case GL_QUADS:
 +   case GL_QUAD_STRIP:
 +  return 3;
 +   default:
 +  _mesa_problem(NULL,
 + unknown prim type in transform feedback primitive count);
 +  return 0;
 +   }
 +}
 +
  /**
   * Update internal counters based on the the drawing operation described in
   * prim.
 @@ -397,14 +421,11 @@ brw_update_primitive_count(struct brw_context *brw,
 * able to reload SVBI 0 with the correct value in case we have to 
 start
 * a new batch buffer.
 */
 -      unsigned svbi_postincrement_value =
 -         brw->gs.prog_data->svbi_postincrement_value;
 +      unsigned verts = verts_per_prim(prim->mode);
        uint32_t space_avail =
 -         (brw->sol.svbi_0_max_index - brw->sol.svbi_0_starting_index)
 -         / svbi_postincrement_value;
 +         (brw->sol.svbi_0_max_index - brw->sol.svbi_0_starting_index) / verts;
        uint32_t primitives_written = MIN2 (space_avail, count);
 -      brw->sol.svbi_0_starting_index +=
 -         svbi_postincrement_value * primitives_written;
 +      brw->sol.svbi_0_starting_index += verts;

Don't you mean
 brw->sol.svbi_0_starting_index += verts * primitives_written;

/* And update the TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN query. */
brw->sol.primitives_written += primitives_written;



[Mesa-dev] [PATCH] ff_fragment_shader: Don't generate swizzles for scalar combiner inputs

2011-12-22 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

There are a couple of scenarios where the source could be zero while the
operand is either SRC_ALPHA or ONE_MINUS_SRC_ALPHA, for example when the
source is ZERO.  This would result in something like (0).w, which a later
call to ir_validate rejects.
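
As a rough illustration of the rule the patch applies (only take the .w component when the source really is a vector), here is a simplified model in plain C. The real code builds ir_swizzle nodes in C++; the struct and helper names here are hypothetical and exist only for this sketch.

```c
/* Hypothetical, simplified model of an IR value: a component count plus
 * up to four floats.  A scalar has components == 1.
 */
struct value {
   unsigned components;
   float data[4];
};

/* Returns the "alpha" of a source.  A scalar is passed through untouched,
 * since asking for .w of a one-component value is not meaningful.
 */
static float
alpha_of(const struct value *src)
{
   if (src->components == 1)
      return src->data[0];
   return src->data[3];
}

/* ONE_MINUS_SRC_ALPHA built on top of the same scalar-aware helper. */
static float
one_minus_alpha_of(const struct value *src)
{
   return 1.0f - alpha_of(src);
}
```

A scalar zero source yields an alpha of 0 and a one-minus-alpha of 1, without ever indexing a fourth component.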

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=42517
---
 src/mesa/main/ff_fragment_shader.cpp |   16 ++--
 1 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/src/mesa/main/ff_fragment_shader.cpp 
b/src/mesa/main/ff_fragment_shader.cpp
index 008da0d..3e736fa 100644
--- a/src/mesa/main/ff_fragment_shader.cpp
+++ b/src/mesa/main/ff_fragment_shader.cpp
@@ -632,15 +632,19 @@ emit_combine_source(struct texenv_fragment_program *p,
                                           new(p->mem_ctx) ir_constant(1.0f),
                                           src);
 
-   case OPR_SRC_ALPHA: 
-      return new(p->mem_ctx) ir_swizzle(src, 3, 3, 3, 3, 1);
+   case OPR_SRC_ALPHA:
+      return src->type->is_scalar()
+         ? src : (ir_rvalue *) new(p->mem_ctx) ir_swizzle(src, 3, 3, 3, 3, 1);
+
+   case OPR_ONE_MINUS_SRC_ALPHA: {
+      ir_rvalue *const scalar = (src->type->is_scalar())
+         ? src : (ir_rvalue *) new(p->mem_ctx) ir_swizzle(src, 3, 3, 3, 3, 1);
 
-   case OPR_ONE_MINUS_SRC_ALPHA: 
       return new(p->mem_ctx) ir_expression(ir_binop_sub,
                                           new(p->mem_ctx) ir_constant(1.0f),
-                                          new(p->mem_ctx) ir_swizzle(src,
-                                                                     3, 3,
-                                                                     3, 3, 1));
+                                          scalar);
+   }
+
    case OPR_ZERO:
       return new(p->mem_ctx) ir_constant(0.0f);
case OPR_ONE:
-- 
1.7.6.4



Re: [Mesa-dev] [PATCH 5/7] i965/gen7: Add support for rasterization discard.

2011-12-22 Thread Paul Berry
On 22 December 2011 16:54, Eric Anholt e...@anholt.net wrote:

 Fixes the piglit discard-* tests.

 Reviewed-by: Kenneth Graunke kenn...@whitecape.org
 ---
  src/mesa/drivers/dri/i965/gen7_sol_state.c |8 +++-
  1 files changed, 7 insertions(+), 1 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c
 b/src/mesa/drivers/dri/i965/gen7_sol_state.c
 index fcda08d..650f625 100644
 --- a/src/mesa/drivers/dri/i965/gen7_sol_state.c
 +++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c
 @@ -37,6 +37,12 @@ static void
  upload_sol_state(struct brw_context *brw)
  {
     struct intel_context *intel = &brw->intel;
 +   struct gl_context *ctx = &intel->ctx;
 +   uint32_t dw1 = 0;
 +
 +   /* _NEW_RASTERIZER_DISCARD */
 +   if (ctx->RasterDiscard)
 +      dw1 |= SO_RENDERING_DISABLE;


It looks like dw1 is set here but not used until patch 6/7.



/* Disable the SOL stage */
BEGIN_BATCH(3);
 @@ -48,7 +54,7 @@ upload_sol_state(struct brw_context *brw)

  const struct brw_tracked_state gen7_sol_state = {
.dirty = {
 -  .mesa  = 0,
 +  .mesa  = _NEW_RASTERIZER_DISCARD,
   .brw   = BRW_NEW_BATCH,
   .cache = 0,
},
 --
 1.7.7.3



Re: [Mesa-dev] [PATCH 8/8] i965: increase the brw eu instruction store size dynamically

2011-12-22 Thread Yuanhan Liu
On Thu, Dec 22, 2011 at 02:33:03PM -0800, Kenneth Graunke wrote:
 On 12/21/2011 01:33 AM, Yuanhan Liu wrote:
  Here is the final patch to enable dynamic eu instruction store size:
  increase the brw eu instruction store size dynamically instead of just
  allocating it statically with a constant limit. This would fix something
  that 'GL_MAX_PROGRAM_INSTRUCTIONS_ARB was 16384 while the driver would
  limit it to 1'.
  
  Signed-off-by: Yuanhan Liu yuanhan@linux.intel.com
  ---
   src/mesa/drivers/dri/i965/brw_eu.c  |7 +++
   src/mesa/drivers/dri/i965/brw_eu.h  |7 ---
   src/mesa/drivers/dri/i965/brw_eu_emit.c |   12 +++-
   3 files changed, 22 insertions(+), 4 deletions(-)
  
  diff --git a/src/mesa/drivers/dri/i965/brw_eu.c 
  b/src/mesa/drivers/dri/i965/brw_eu.c
  index 9b4dde8..7d206f3 100644
  --- a/src/mesa/drivers/dri/i965/brw_eu.c
  +++ b/src/mesa/drivers/dri/i965/brw_eu.c
  @@ -174,6 +174,13 @@ void
   brw_init_compile(struct brw_context *brw, struct brw_compile *p, void 
  *mem_ctx)
   {
  p-brw = brw;
  +   /*
  +* Set the initial instruction store array size to 1024, if found that
  +* isn't enough, then it will double the store size at brw_next_insn()
  +* until it meet the BRW_EU_MAX_INSN
  +*/
  +   p-store_size = 1024;
  +   p-store = rzalloc_array(mem_ctx, struct brw_instruction, 
  p-store_size);
  p-nr_insn = 0;
  p-current = p-stack;
  p-compressed = false;
  diff --git a/src/mesa/drivers/dri/i965/brw_eu.h 
  b/src/mesa/drivers/dri/i965/brw_eu.h
  index 9d3d7de..52567c2 100644
  --- a/src/mesa/drivers/dri/i965/brw_eu.h
  +++ b/src/mesa/drivers/dri/i965/brw_eu.h
  @@ -100,11 +100,12 @@ struct brw_glsl_call;
   
   
   
  -#define BRW_EU_MAX_INSN_STACK 5
  -#define BRW_EU_MAX_INSN 1
  +#define BRW_EU_MAX_INSN_STACK   5
  +#define BRW_EU_MAX_INSN (1024 * 1024)
 
 I'm actually surprised to see BRW_EU_MAX_INSN at all.  As far as I know,
 there isn't an actual hardware limit on the number of instructions,

Glad to know that. Thanks.

 so
 I'm not sure why we should cap it at all.  Especially not to some
 arbitrary number.  (I'm assuming that 1024 * 1024 is just something you
 came up with arbitrarily...)

Aha, yes, you are right, I made it up. :)
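
For readers following the thread, the scheme under discussion (start with a fixed number of entries, double on demand, no hard cap other than available memory) can be sketched as below. This uses plain calloc/realloc rather than Mesa's ralloc helpers, and the struct and function names are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical instruction type; the real struct brw_instruction is larger. */
struct insn {
   unsigned opcode;
};

struct insn_store {
   struct insn *data;
   size_t size;      /* capacity, in instructions */
   size_t nr_insn;   /* instructions emitted so far */
};

/* Allocates a zeroed store with `initial` entries.  Returns 0 on success. */
static int
store_init(struct insn_store *s, size_t initial)
{
   s->data = calloc(initial, sizeof(*s->data));
   s->size = s->data ? initial : 0;
   s->nr_insn = 0;
   return s->data ? 0 : -1;
}

/* Appends an instruction, doubling the store when it is full.  Returns the
 * index of the new instruction, or -1 if the allocation failed.
 */
static long
store_next_insn(struct insn_store *s, unsigned opcode)
{
   if (s->nr_insn + 1 > s->size) {
      size_t new_size = s->size ? s->size * 2 : 1;
      struct insn *grown = realloc(s->data, new_size * sizeof(*grown));

      if (!grown)
         return -1;

      /* Keep the "zero-initialized" property for the newly added tail. */
      memset(grown + s->size, 0, (new_size - s->size) * sizeof(*grown));
      s->data = grown;
      s->size = new_size;
   }

   s->data[s->nr_insn].opcode = opcode;
   return (long)s->nr_insn++;
}
```

Returning an index rather than a pointer is deliberate in this sketch: a pointer into the store can go stale the next time the store grows.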



Here is the fixed patch; please review it:

From 66c30acdeae88cdba07ed85443b04d4bc6c56792 Mon Sep 17 00:00:00 2001
From: Yuanhan Liu yuanhan@linux.intel.com
Date: Wed, 21 Dec 2011 15:38:44 +0800
Subject: [PATCH] i965: increase the brw eu instruction store size dynamically

Here is the final patch to enable a dynamic eu instruction store size:
grow the brw eu instruction store on demand instead of allocating it
statically with a constant limit. This fixes the mismatch where
GL_MAX_PROGRAM_INSTRUCTIONS_ARB was 16384 while the driver limited the
store to 10000 instructions.

v2: comments from ken, do not hardcode the eu limit to (1024 * 1024)

Signed-off-by: Yuanhan Liu yuanhan@linux.intel.com
---
 src/mesa/drivers/dri/i965/brw_eu.c  |7 +++
 src/mesa/drivers/dri/i965/brw_eu.h  |4 ++--
 src/mesa/drivers/dri/i965/brw_eu_emit.c |   10 +-
 3 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_eu.c 
b/src/mesa/drivers/dri/i965/brw_eu.c
index 9b4dde8..2b0593a 100644
--- a/src/mesa/drivers/dri/i965/brw_eu.c
+++ b/src/mesa/drivers/dri/i965/brw_eu.c
@@ -174,6 +174,13 @@ void
 brw_init_compile(struct brw_context *brw, struct brw_compile *p, void *mem_ctx)
 {
    p->brw = brw;
+   /*
+    * Set the initial instruction store array size to 1024.  If that turns
+    * out not to be enough, brw_next_insn() doubles the store size until we
+    * run out of memory.
+    */
+   p->store_size = 1024;
+   p->store = rzalloc_array(mem_ctx, struct brw_instruction, p->store_size);
    p->nr_insn = 0;
    p->current = p->stack;
    p->compressed = false;
diff --git a/src/mesa/drivers/dri/i965/brw_eu.h 
b/src/mesa/drivers/dri/i965/brw_eu.h
index cc2f618..a41e988 100644
--- a/src/mesa/drivers/dri/i965/brw_eu.h
+++ b/src/mesa/drivers/dri/i965/brw_eu.h
@@ -101,10 +101,10 @@ struct brw_glsl_call;
 
 
 #define BRW_EU_MAX_INSN_STACK 5
-#define BRW_EU_MAX_INSN 10000
 
 struct brw_compile {
-   struct brw_instruction store[BRW_EU_MAX_INSN];
+   struct brw_instruction *store;
+   int store_size;
GLuint nr_insn;
 
void *mem_ctx;
diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c 
b/src/mesa/drivers/dri/i965/brw_eu_emit.c
index 829d92c..9288f9b 100644
--- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
@@ -691,7 +691,15 @@ brw_next_insn(struct brw_compile *p, GLuint opcode)
 {
struct brw_instruction *insn;
 
-   assert(p->nr_insn + 1 < BRW_EU_MAX_INSN);
+   if (p->nr_insn + 1 > p->store_size) {
+      if (0)
+         printf("increasing the store size to %d\n", p->store_size << 1);
+      p->store_size <<= 1;
+      p->store = reralloc(p->mem_ctx, p->store,
+  struct 

Re: [Mesa-dev] [PATCH 7/8] i965: call next_insn() before referencing a instruction by index

2011-12-22 Thread Yuanhan Liu
On Thu, Dec 22, 2011 at 11:09:12AM -0800, Kenneth Graunke wrote:
 On 12/21/2011 01:33 AM, Yuanhan Liu wrote:
 [snip]
  +   int emit_endif = 1;
 
 Please use bool and true/false rather than int.

Yes, right. Will fix it.

 
  /* In single program flow mode, we can express IF and ELSE instructions
   * equivalently as ADD instructions that operate on IP.  On platforms 
  prior
  @@ -1219,14 +1211,32 @@ brw_ENDIF(struct brw_compile *p)
   * instructions to conditional ADDs.  So we only do this trick on Gen4 
  and
   * Gen5.
   */
  -   if (intel->gen < 6 && p->single_program_flow) {
  +   if (intel->gen < 6 && p->single_program_flow)
  +      emit_endif = 0;
 
 You could actually just do this:
 
 /* In single program flow mode, we can express IF and ELSE ...
  */
 bool emit_endif = !(intel->gen < 6 && p->single_program_flow);
 
 But I'm fine with bool emit_endif = true and emit_endif = false if
 you prefer that.

Yes, I prefer that. In my view, together with the comment, it makes
clear why we don't emit the ENDIF in this case.
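
The ordering issue this patch addresses (appending an instruction may move the whole store, so older instructions must be looked up by index only after the append) can be shown with a small standalone sketch. It uses plain realloc instead of reralloc, and every name in it is a hypothetical stand-in for the driver's code.

```c
#include <stdlib.h>

enum { OP_IF = 1, OP_ENDIF = 2 };   /* hypothetical opcodes */

struct insn {
   int opcode;
   int jump;   /* distance to the matching instruction, in instructions */
};

struct prog {
   struct insn *store;
   size_t size;   /* capacity, in instructions */
   size_t nr;     /* instructions emitted so far */
};

static int
prog_init(struct prog *p, size_t initial)
{
   p->store = calloc(initial, sizeof(*p->store));
   p->size = p->store ? initial : 0;
   p->nr = 0;
   return p->store ? 0 : -1;
}

/* Appends an instruction.  This may realloc() the store, which invalidates
 * every pointer previously obtained from it.
 */
static struct insn *
next_insn(struct prog *p, int opcode)
{
   if (p->nr + 1 > p->size) {
      size_t new_size = p->size ? p->size * 2 : 1;
      struct insn *grown = realloc(p->store, new_size * sizeof(*grown));

      if (!grown)
         return NULL;
      p->store = grown;
      p->size = new_size;
   }

   p->store[p->nr].opcode = opcode;
   p->store[p->nr].jump = 0;
   return &p->store[p->nr++];
}

/* Emits ENDIF and patches the IF stored at `if_index` to jump to it.
 * Returns the jump distance, or -1 on allocation failure.
 */
static int
emit_endif(struct prog *p, size_t if_index)
{
   /* Append first: this is the call that may move p->store. */
   struct insn *endif = next_insn(p, OP_ENDIF);
   struct insn *if_insn;

   if (!endif)
      return -1;

   /* Only now turn the saved index into a pointer. */
   if_insn = &p->store[if_index];
   if_insn->jump = (int)(endif - if_insn);
   return if_insn->jump;
}
```

Had the IF pointer been fetched before next_insn(), a reallocation in between would leave it dangling; resolving the index afterwards avoids that.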

Here is the fixed patch:

From 7c8b8bc87846df9513a0c32cc8a388fb62f5476a Mon Sep 17 00:00:00 2001
From: Yuanhan Liu yuanhan@linux.intel.com
Date: Wed, 21 Dec 2011 15:32:02 +0800
Subject: [PATCH] i965: call next_insn() before referencing a instruction by
 index

A single next_insn() call may change the base address of the instruction
store memory (p->store), so call it first, before deriving an instruction
pointer from an index.

This is the final preparation for enabling the dynamic store size.

v2: comments from Ken, define emit_endif as bool type

Signed-off-by: Yuanhan Liu yuanhan@linux.intel.com
Reviewed-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/brw_eu_emit.c |   40 ---
 1 files changed, 26 insertions(+), 14 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c 
b/src/mesa/drivers/dri/i965/brw_eu_emit.c
index b2ab013..843d12f 100644
--- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
@@ -1197,15 +1197,7 @@ brw_ENDIF(struct brw_compile *p)
struct brw_instruction *else_inst = NULL;
struct brw_instruction *if_inst = NULL;
struct brw_instruction *tmp;
-
-   /* Pop the IF and (optional) ELSE instructions from the stack */
-   p->if_depth_in_loop[p->loop_stack_depth]--;
-   tmp = pop_if_stack(p);
-   if (tmp->header.opcode == BRW_OPCODE_ELSE) {
-  else_inst = tmp;
-  tmp = pop_if_stack(p);
-   }
-   if_inst = tmp;
+   bool emit_endif = true;
 
/* In single program flow mode, we can express IF and ELSE instructions
 * equivalently as ADD instructions that operate on IP.  On platforms prior
@@ -1219,14 +1211,32 @@ brw_ENDIF(struct brw_compile *p)
 * instructions to conditional ADDs.  So we only do this trick on Gen4 and
 * Gen5.
 */
-   if (intel->gen < 6 && p->single_program_flow) {
+   if (intel->gen < 6 && p->single_program_flow)
+      emit_endif = false;
+
+   /*
+    * A single next_insn() may change the base address of instruction store
+    * memory(p->store), so call it first before referencing the instruction
+    * store pointer from an index
+    */
+   if (emit_endif)
+      insn = next_insn(p, BRW_OPCODE_ENDIF);
+
+   /* Pop the IF and (optional) ELSE instructions from the stack */
+   p->if_depth_in_loop[p->loop_stack_depth]--;
+   tmp = pop_if_stack(p);
+   if (tmp->header.opcode == BRW_OPCODE_ELSE) {
+  else_inst = tmp;
+  tmp = pop_if_stack(p);
+   }
+   if_inst = tmp;
+
+   if (!emit_endif) {
   /* ENDIF is useless; don't bother emitting it. */
   convert_IF_ELSE_to_ADD(p, if_inst, else_inst);
   return;
}
 
-   insn = next_insn(p, BRW_OPCODE_ENDIF);
-
    if (intel->gen < 6) {
   brw_set_dest(p, insn, retype(brw_vec4_grf(0,0), BRW_REGISTER_TYPE_UD));
   brw_set_src0(p, insn, retype(brw_vec4_grf(0,0), BRW_REGISTER_TYPE_UD));
@@ -1393,13 +1403,12 @@ struct brw_instruction *brw_WHILE(struct brw_compile *p)
struct brw_instruction *insn, *do_insn;
GLuint br = 1;
 
-   do_insn = get_inner_do_insn(p);
-
    if (intel->gen >= 5)
       br = 2;
 
    if (intel->gen >= 7) {
       insn = next_insn(p, BRW_OPCODE_WHILE);
+      do_insn = get_inner_do_insn(p);
 
       brw_set_dest(p, insn, retype(brw_null_reg(), BRW_REGISTER_TYPE_D));
       brw_set_src0(p, insn, retype(brw_null_reg(), BRW_REGISTER_TYPE_D));
@@ -1409,6 +1418,7 @@ struct brw_instruction *brw_WHILE(struct brw_compile *p)
       insn->header.execution_size = BRW_EXECUTE_8;
    } else if (intel->gen == 6) {
       insn = next_insn(p, BRW_OPCODE_WHILE);
+      do_insn = get_inner_do_insn(p);
 
       brw_set_dest(p, insn, brw_imm_w(0));
       insn->bits1.branch_gen6.jump_count = br * (do_insn - insn);
@@ -1419,6 +1429,7 @@ struct brw_instruction *brw_WHILE(struct brw_compile *p)
    } else {
       if (p->single_program_flow) {
 insn = next_insn(p, BRW_OPCODE_ADD);
+ do_insn = get_inner_do_insn(p);
 
 brw_set_dest(p, 

Re: [Mesa-dev] [PATCH 3/7] i965/gen7: Add register definitions for GL_EXT_transform_feedback.

2011-12-22 Thread Paul Berry
On 22 December 2011 16:54, Eric Anholt e...@anholt.net wrote:

 Reviewed-by: Kenneth Graunke kenn...@whitecape.org
 ---
  src/mesa/drivers/dri/i965/brw_defines.h |   76
 ++-
  src/mesa/drivers/dri/intel/intel_reg.h  |   15 ++
  2 files changed, 89 insertions(+), 2 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_defines.h
 b/src/mesa/drivers/dri/i965/brw_defines.h
 index 4edfaf7..4bb7f00 100644
 --- a/src/mesa/drivers/dri/i965/brw_defines.h
 +++ b/src/mesa/drivers/dri/i965/brw_defines.h
 @@ -1307,6 +1307,42 @@ enum brw_wm_barycentric_interp_mode {
  #define _3DSTATE_CONSTANT_HS  0x7819 /* GEN7+ */
  #define _3DSTATE_CONSTANT_DS  0x781A /* GEN7+ */

 +#define _3DSTATE_STREAMOUT                    0x781e /* GEN7+ */
 +/* DW1 */
 +# define SO_FUNCTION_ENABLE                   (1 << 31)
 +# define SO_RENDERING_DISABLE                 (1 << 30)
 +/* This selects which incoming rendering stream goes down the pipeline.  The
 + * rendering stream is 0 if not defined by special cases in the GS state.
 + */
 +# define SO_RENDER_STREAM_SELECT_SHIFT        27
 +# define SO_RENDER_STREAM_SELECT_MASK         INTEL_MASK(28, 27)
 +/* Controls reordering of TRISTRIP_* elements in stream output (not rendering).
 + */
 +# define SO_REORDER_TRAILING                  (1 << 26)
 +/* Controls SO_NUM_PRIMS_WRITTEN_* and SO_PRIM_STORAGE_* */
 +# define SO_STATISTICS_ENABLE                 (1 << 25)
 +# define SO_BUFFER_ENABLE_3                   (1 << 11)
 +# define SO_BUFFER_ENABLE_2                   (1 << 10)
 +# define SO_BUFFER_ENABLE_1                   (1 << 9)
 +# define SO_BUFFER_ENABLE_0                   (1 << 8)


Considering how these are used in patch 6/7, I'd prefer if we did this:

#define SO_BUFFER_ENABLE(n)  (1 << (8 + (n)))

Then in patch 6/7 we could do

dw1 |= SO_BUFFER_ENABLE(i);

instead of

dw1 |= SO_BUFFER_ENABLE_0 << i;
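
A small standalone check of the suggestion, assuming the enable bits sit at positions 8 through 11 as in the defines quoted above. The helper function is hypothetical, and the sketch uses an unsigned literal (1u) where the patch uses a plain 1.

```c
#include <stdint.h>

/* Bit positions taken from the quoted patch; 1u instead of 1 is this
 * sketch's choice, to keep the shifts unsigned.
 */
#define SO_BUFFER_ENABLE_0   (1u << 8)
#define SO_BUFFER_ENABLE(n)  (1u << (8 + (n)))

/* Hypothetical helper: builds the buffer-enable bits of DW1 from a table
 * saying which of the four stream output buffers are bound.
 */
static uint32_t
so_buffer_enable_bits(const int bound[4])
{
   uint32_t dw1 = 0;
   int i;

   for (i = 0; i < 4; i++) {
      if (bound[i])
         dw1 |= SO_BUFFER_ENABLE(i);
   }
   return dw1;
}
```

Both spellings produce the same bits; the parameterized macro just keeps the shift arithmetic in one place.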


 +/* DW2 */
 +# define SO_STREAM_3_VERTEX_READ_OFFSET_SHIFT  29
 +# define SO_STREAM_3_VERTEX_READ_OFFSET_MASK   INTEL_MASK(29, 29)
 +# define SO_STREAM_3_VERTEX_READ_LENGTH_SHIFT  24
 +# define SO_STREAM_3_VERTEX_READ_LENGTH_MASK   INTEL_MASK(28, 24)
 +# define SO_STREAM_2_VERTEX_READ_OFFSET_SHIFT  21
 +# define SO_STREAM_2_VERTEX_READ_OFFSET_MASK   INTEL_MASK(21, 21)
 +# define SO_STREAM_2_VERTEX_READ_LENGTH_SHIFT  16
 +# define SO_STREAM_2_VERTEX_READ_LENGTH_MASK   INTEL_MASK(20, 16)
 +# define SO_STREAM_1_VERTEX_READ_OFFSET_SHIFT  13
 +# define SO_STREAM_1_VERTEX_READ_OFFSET_MASK   INTEL_MASK(13, 13)
 +# define SO_STREAM_1_VERTEX_READ_LENGTH_SHIFT  8
 +# define SO_STREAM_1_VERTEX_READ_LENGTH_MASK   INTEL_MASK(12, 8)
 +# define SO_STREAM_0_VERTEX_READ_OFFSET_SHIFT  5
 +# define SO_STREAM_0_VERTEX_READ_OFFSET_MASK   INTEL_MASK(5, 5)
 +# define SO_STREAM_0_VERTEX_READ_LENGTH_SHIFT  0
 +# define SO_STREAM_0_VERTEX_READ_LENGTH_MASK   INTEL_MASK(4, 0)
 +
  /* 3DSTATE_WM for Gen7 */
  /* DW1 */
  # define GEN7_WM_STATISTICS_ENABLE (1 << 31)
 @@ -1373,8 +1409,6 @@ enum brw_wm_barycentric_interp_mode {
  /* DW6: kernel 1 pointer */
  /* DW7: kernel 2 pointer */

 -#define _3DSTATE_STREAMOUT  0x781e /* GEN7+ */
 -
  #define _3DSTATE_SAMPLE_MASK   0x7818 /* GEN6+ */

  #define _3DSTATE_DRAWING_RECTANGLE 0x7900
 @@ -1414,6 +1448,44 @@ enum brw_wm_barycentric_interp_mode {
  # define DEPTH_CLEAR_VALID (1 << 15)
  /* DW1: depth clear value */

 +#define _3DSTATE_SO_DECL_LIST  0x7917 /* GEN7+ */
 +/* DW1 */
 +# define SO_STREAM_TO_BUFFER_SELECTS_3_SHIFT  12
 +# define SO_STREAM_TO_BUFFER_SELECTS_3_MASK   INTEL_MASK(15, 12)
 +# define SO_STREAM_TO_BUFFER_SELECTS_2_SHIFT  8
 +# define SO_STREAM_TO_BUFFER_SELECTS_2_MASK   INTEL_MASK(11, 8)
 +# define SO_STREAM_TO_BUFFER_SELECTS_1_SHIFT  4
 +# define SO_STREAM_TO_BUFFER_SELECTS_1_MASK   INTEL_MASK(7, 4)
 +# define SO_STREAM_TO_BUFFER_SELECTS_0_SHIFT  0
 +# define SO_STREAM_TO_BUFFER_SELECTS_0_MASK   INTEL_MASK(3, 0)
 +/* DW2 */
 +# define SO_NUM_ENTRIES_3_SHIFT               24
 +# define SO_NUM_ENTRIES_3_MASK                INTEL_MASK(31, 24)
 +# define SO_NUM_ENTRIES_2_SHIFT               16
 +# define SO_NUM_ENTRIES_2_MASK                INTEL_MASK(23, 16)
 +# define SO_NUM_ENTRIES_1_SHIFT               8
 +# define SO_NUM_ENTRIES_1_MASK                INTEL_MASK(15, 8)
 +# define SO_NUM_ENTRIES_0_SHIFT               0
 +# define SO_NUM_ENTRIES_0_MASK                INTEL_MASK(7, 0)
 +
 +/* SO_DECL DW0 */
 +# define SO_DECL_OUTPUT_BUFFER_SLOT_SHIFT  12
 +# define 

Re: [Mesa-dev] [PATCH 6/7] i965/gen7: Add support for transform feedback.

2011-12-22 Thread Paul Berry
On 22 December 2011 16:54, Eric Anholt e...@anholt.net wrote:

 Fixes almost all of the transform feedback piglit tests.  Remaining
 are a few tests related to tesselation for
 quads/trifans/tristrips/polygons with flat shading.
 ---
  src/mesa/drivers/dri/i965/gen7_sol_state.c |  199
 ++-
  1 files changed, 191 insertions(+), 8 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c
 b/src/mesa/drivers/dri/i965/gen7_sol_state.c
 index 650f625..a5e28b6 100644
 --- a/src/mesa/drivers/dri/i965/gen7_sol_state.c
 +++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c
 @@ -32,31 +32,214 @@
  #include brw_state.h
  #include brw_defines.h
  #include intel_batchbuffer.h
 +#include intel_buffer_objects.h

  static void
 -upload_sol_state(struct brw_context *brw)
 +upload_3dstate_so_buffers(struct brw_context *brw)
 +{
 +   struct intel_context *intel = brw-intel;
 +   struct gl_context *ctx = intel-ctx;
 +   /* BRW_NEW_VERTEX_PROGRAM */
 +   const struct gl_shader_program *vs_prog =
 +  ctx-Shader.CurrentVertexProgram;
 +   const struct gl_transform_feedback_info *linked_xfb_info =
 +  vs_prog-LinkedTransformFeedback;
 +   struct gl_transform_feedback_object *xfb_obj =
 +  ctx-TransformFeedback.CurrentObject;


Can we have a /* NEW_TRANSFORM_FEEDBACK */ comment here?


 +   int i;
 +
 +   /* Set up the up to 4 output buffers.  These are the ranges defined in
 the
 +* gl_transform_feedback_object.
 +*/
 +   for (i = 0; i  4; i++) {
 +  struct gl_buffer_object *bufferobj = xfb_obj-Buffers[i];
 +  drm_intel_bo *bo;
 +  uint32_t start, end;
 +
 +  if (!xfb_obj-Buffers[i]) {
 +/* The pitch of 0 in this command indicates that the buffer is
 + * unbound and won't be written to.
 + */
 +BEGIN_BATCH(4);
 +OUT_BATCH(_3DSTATE_SO_BUFFER  16 | (4 - 2));
 +OUT_BATCH((i  SO_BUFFER_INDEX_SHIFT));
 +OUT_BATCH(0);
 +OUT_BATCH(0);
 +ADVANCE_BATCH();
 +
 +continue;
 +  }
 +
 +  bo = intel_buffer_object(bufferobj)-buffer;
 +
 +  start = xfb_obj-Offset[i];
 +  assert(start % 4 == 0);
 +  end = ALIGN(start + xfb_obj-Size[i], 4);
 +  assert(end = bo-size);
 +
 +  BEGIN_BATCH(4);
 +  OUT_BATCH(_3DSTATE_SO_BUFFER  16 | (4 - 2));
 +  OUT_BATCH((i  SO_BUFFER_INDEX_SHIFT) |
 +   ((linked_xfb_info-BufferStride[i] * 4) 
 +SO_BUFFER_PITCH_SHIFT));


It looks like we're not setting SO Buffer Object Control State.  Is that
ok?  I'm not too familiar with memory object control states so I'm not
sure, but it seemed to me that it might be sensible to mark the stream
output as L3 cacheable.


 +  OUT_RELOC(bo, I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER,
 start);
 +  OUT_RELOC(bo, I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER, end);
 +  ADVANCE_BATCH();
 +   }
 +}
 +
 +/**
 + * Outputs the 3DSTATE_SO_DECL_LIST command.
 + *
 + * The data output is a series of 64-bit entries containing a SO_DECL per
 + * stream.  We only have one stream of rendering coming out of the GS unit, so
 + * we only emit stream 0 (low 16 bits) SO_DECLs.
 + */
 +static void
 +upload_3dstate_so_decl_list(struct brw_context *brw,
 +                            struct brw_vue_map *vue_map)
 +{
 +   struct intel_context *intel = &brw->intel;
 +   struct gl_context *ctx = &intel->ctx;
 +   /* BRW_NEW_VERTEX_PROGRAM */
 +   const struct gl_shader_program *vs_prog =
 +      ctx->Shader.CurrentVertexProgram;
 +   /* NEW_TRANSFORM_FEEDBACK */
 +   const struct gl_transform_feedback_info *linked_xfb_info =
 +      &vs_prog->LinkedTransformFeedback;
 +   int i;
 +   uint16_t so_decl[128];


Can we add an assertion to verify that there is no danger of overflowing
this array?  I think STATIC_ASSERT(ARRAY_SIZE(so_decl) >=
MAX_PROGRAM_OUTPUTS) ought to do the trick.
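
To illustrate the kind of check I mean, here is a minimal standalone
sketch.  It uses C11's _Static_assert in place of Mesa's STATIC_ASSERT
macro, and the MAX_PROGRAM_OUTPUTS value and the so_decl_capacity() helper
are made up for the example; only the so_decl array size comes from the
patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical value for illustration; the real limit comes from Mesa's
 * headers.
 */
#define MAX_PROGRAM_OUTPUTS 64

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical helper: returns the number of entries in so_decl.  The
 * build fails if the array cannot hold one entry per program output.
 */
static size_t
so_decl_capacity(void)
{
   uint16_t so_decl[128];

   _Static_assert(ARRAY_SIZE(so_decl) >= MAX_PROGRAM_OUTPUTS,
                  "so_decl must hold one entry per program output");

   return ARRAY_SIZE(so_decl);
}
```

The point of doing this at compile time is that a later bump of the output
limit breaks the build instead of silently overflowing the array.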


 +   int buffer_mask = 0;
 +   int next_offset[4] = {0, 0, 0, 0};
 +
 +   /* Construct the list of SO_DECLs to be emitted.  The formatting of the
 +    * command feels strange -- each dword pair contains a SO_DECL per stream.
 +    */
 +   for (i = 0; i < linked_xfb_info->NumOutputs; i++) {
 +      int buffer = linked_xfb_info->Outputs[i].OutputBuffer;
 +      uint16_t decl = 0;
 +      int vert_result = linked_xfb_info->Outputs[i].OutputRegister;
 +
 +      buffer_mask |= 1 << buffer;
 +
 +      decl |= buffer << SO_DECL_OUTPUT_BUFFER_SLOT_SHIFT;
 +      decl |= vue_map->vert_result_to_slot[vert_result] <<
 +              SO_DECL_REGISTER_INDEX_SHIFT;
 +      decl |= ((1 << linked_xfb_info->Outputs[i].NumComponents) - 1) <<
 +              SO_DECL_COMPONENT_MASK_SHIFT;
 +
 +      /* FINISHME */
 +      assert(linked_xfb_info->Outputs[i].DstOffset == next_offset[buffer]);


FYI, this assertion should hold true until we implement
ARB_transform_feedback3 (which allows holes in the transform feedback
structure).  I think Marek has some plans to implement that for Gallium
(not sure of his timeframe though), so we may want to keep an eye out.
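
For reference, the "no holes" property that the assertion relies on can be
sketched as a standalone check.  The struct below is a hypothetical,
cut-down stand-in for the per-output info, and I'm assuming DstOffset and
NumComponents are both counted in dwords; xfb_outputs_are_packed() is not a
real Mesa function.

```c
#include <stdbool.h>

/* Hypothetical stand-in for one transform feedback output record. */
struct xfb_output {
   int OutputBuffer;   /* which of the 4 buffers this output goes to */
   int NumComponents;  /* number of dwords written */
   int DstOffset;      /* offset within the buffer, in dwords (assumed) */
};

/* Returns true if every output starts exactly where the previous output
 * in the same buffer ended, i.e. there are no holes.
 */
static bool
xfb_outputs_are_packed(const struct xfb_output *outputs, int num_outputs)
{
   int next_offset[4] = {0, 0, 0, 0};

   for (int i = 0; i < num_outputs; i++) {
      int buffer = outputs[i].OutputBuffer;

      if (outputs[i].DstOffset != next_offset[buffer])
         return false;

      next_offset[buffer] += outputs[i].NumComponents;
   }

   return true;
}
```

Once holes are allowed, a layout such as a 4-component output at offset 0
followed by one at offset 8 in the same buffer fails this check, which is
the case the FINISHME would then have to handle.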

Re: [Mesa-dev] i965/gen7 transform feedback

2011-12-22 Thread Paul Berry
On 22 December 2011 16:54, Eric Anholt e...@anholt.net wrote:

 Here's today's patch series for gen7 transform feedback.  It runs on
 top of a kernel patch at people.freedesktop.org:~anholt/linux on the
 gen7-reset-sol branch.  I expected it to be easy, but not this easy.


This is fantastic, Eric.  I'm really pleased how quickly this is coming
together.

Other than a few minor comments that I've already sent out, the series is:
Reviewed-by: Paul Berry stereotype...@gmail.com



 Remaining test failures:

     tessellation polygon flat_last        warn
     tessellation quad_strip flat_last     warn
     tessellation quads flat_last          warn
     tessellation triangle_fan flat_first  fail


I'm sorry to hear that triangle_fan flat_first fails--both AMD and nVidia
pass that test, so it's a bit embarrassing for Intel to fail it.  But it's
an obscure case (who flatshades trifans anyhow, especially when using
transform feedback?) and I can't see any way of fixing it on Gen7 without
firing up the GS, which seems like *way* overkill.  I'm far less bothered
by the 3 warnings, because nVidia gets those exact same warnings too.

So IMHO, it's ok to leave these 4 tests failing.

Incidentally, Gen6 gets the exact same 4 failures, plus 4 additional
failures for tessellation triangle_strip.  I'll try to fix tessellation
triangle_strip tomorrow.



 Also, it looks like none of the current piglit tests test transform
 feedback across batchbuffers.  I don't expect major problems there,
 given that we have a userland count of verts emitted.


I have 3 more work days left in the year, so I'll try to implement those
tests before you need them.


Re: [Mesa-dev] [PATCH 5/7] i965/gen7: Add support for rasterization discard.

2011-12-22 Thread Kenneth Graunke
On 12/22/2011 06:22 PM, Paul Berry wrote:
 On 22 December 2011 16:54, Eric Anholt e...@anholt.net wrote:
 
 Fixes the piglit discard-* tests.
 
 Reviewed-by: Kenneth Graunke kenn...@whitecape.org
 ---
  src/mesa/drivers/dri/i965/gen7_sol_state.c |8 +++-
  1 files changed, 7 insertions(+), 1 deletions(-)
 
 diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c
 b/src/mesa/drivers/dri/i965/gen7_sol_state.c
 index fcda08d..650f625 100644
 --- a/src/mesa/drivers/dri/i965/gen7_sol_state.c
 +++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c
 @@ -37,6 +37,12 @@ static void
  upload_sol_state(struct brw_context *brw)
  {
    struct intel_context *intel = &brw->intel;
 +   struct gl_context *ctx = &intel->ctx;
 +   uint32_t dw1 = 0;
 +
 +   /* _NEW_RASTERIZER_DISCARD */
 +   if (ctx->RasterDiscard)
 +      dw1 |= SO_RENDERING_DISABLE;
 
 
 It looks like dw1 is set here but not used until patch 6/7.

Oops.  Yeah, good catch.  Eric, perhaps just squash 5 and 6?



Re: [Mesa-dev] [PATCH 6/7] i965/gen7: Add support for transform feedback.

2011-12-22 Thread Marek Olšák
On Fri, Dec 23, 2011 at 4:22 AM, Paul Berry stereotype...@gmail.com wrote:
 FYI, this assertion should hold true until we implement
 ARB_transform_feedback3 (which allows holes in the transform feedback
 structure).  I think Marek has some plans to implement that for Gallium (not
 sure of his timeframe though), so we may want to keep an eye out.

Already done:

http://cgit.freedesktop.org/~mareko/mesa/log/?h=transform-feedback3-instanced

It's completely untested though. The next step is to write some tests
and make any necessary modifications to pass them. I usually don't
send untested stuff immediately for review. Any code worth publishing
before being ready is available in my private repository.

Marek


Re: [Mesa-dev] [PATCH 6/7] i965/gen7: Add support for transform feedback.

2011-12-22 Thread Kenneth Graunke
On 12/22/2011 04:54 PM, Eric Anholt wrote:
 Fixes almost all of the transform feedback piglit tests.  Remaining
 are a few tests related to tessellation for
 quads/trifans/tristrips/polygons with flat shading.
 ---
  src/mesa/drivers/dri/i965/gen7_sol_state.c |  199 ++-
  1 files changed, 191 insertions(+), 8 deletions(-)

The whole series is:
Reviewed-by: Kenneth Graunke kenn...@whitecape.org

Really nice work.



Re: [Mesa-dev] [PATCH 8/8] i965: increase the brw eu instruction store size dynamically

2011-12-22 Thread Kenneth Graunke
On 12/22/2011 07:04 PM, Yuanhan Liu wrote:
 On Thu, Dec 22, 2011 at 02:33:03PM -0800, Kenneth Graunke wrote:
 On 12/21/2011 01:33 AM, Yuanhan Liu wrote:
[snip]
 -#define BRW_EU_MAX_INSN_STACK 5
 -#define BRW_EU_MAX_INSN 1
 +#define BRW_EU_MAX_INSN_STACK   5
 +#define BRW_EU_MAX_INSN (1024 * 1024)

 I'm actually surprised to see BRW_EU_MAX_INSN at all.  As far as I know,
 there isn't an actual hardware limit on the number of instructions,
 
 Glad to know that. Thanks.
 
 so
 I'm not sure why we should cap it at all.  Especially not to some
 arbitrary number.  (I'm assuming that 1024 * 1024 is just something you
 came up with arbitrarily...)
 
 Aha, yes, you are right, I made it. :)
 
 Here is the fixed patch, please help to review it:

Reviewed-by: Kenneth Graunke kenn...@whitecape.org

I'd wait for an ack from Eric before pushing, though.

Thanks again!
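
For anyone reading along without the patch in front of them: the general
idea of growing the store on demand rather than capping it can be sketched
roughly as below.  This is not the code from the patch; the struct layout,
the field names, the initial size of 1024 and the next_insn() helper are
all assumptions made for illustration.

```c
#include <stdlib.h>

/* Hypothetical stand-ins; not the real brw_compile/brw_instruction. */
struct insn {
   unsigned bits[4];
};

struct insn_store {
   struct insn *store;
   unsigned nr_insn;     /* instructions emitted so far */
   unsigned store_size;  /* allocated capacity, in instructions */
};

/* Returns a pointer to the next free slot, doubling the store when it is
 * full.  Returns NULL if allocation fails; existing contents stay valid.
 */
static struct insn *
next_insn(struct insn_store *p)
{
   if (p->nr_insn == p->store_size) {
      unsigned new_size = p->store_size ? p->store_size * 2 : 1024;
      struct insn *grown = realloc(p->store, new_size * sizeof(*grown));

      if (!grown)
         return NULL;

      p->store = grown;
      p->store_size = new_size;
   }

   return &p->store[p->nr_insn++];
}
```

One thing to watch with any scheme like this is that realloc may move the
block, so pointers to earlier instructions cannot be held across a call
that might grow the store; indices are safer.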


Re: [Mesa-dev] [PATCH 8/8] i965: increase the brw eu instruction store size dynamically

2011-12-22 Thread Yuanhan Liu
On Thu, Dec 22, 2011 at 07:51:46PM -0800, Kenneth Graunke wrote:
 On 12/22/2011 07:04 PM, Yuanhan Liu wrote:
  On Thu, Dec 22, 2011 at 02:33:03PM -0800, Kenneth Graunke wrote:
  On 12/21/2011 01:33 AM, Yuanhan Liu wrote:
 [snip]
  -#define BRW_EU_MAX_INSN_STACK 5
  -#define BRW_EU_MAX_INSN 1
  +#define BRW_EU_MAX_INSN_STACK   5
  +#define BRW_EU_MAX_INSN (1024 * 1024)
 
  I'm actually surprised to see BRW_EU_MAX_INSN at all.  As far as I know,
  there isn't an actual hardware limit on the number of instructions,
  
  Glad to know that. Thanks.
  
  so
  I'm not sure why we should cap it at all.  Especially not to some
  arbitrary number.  (I'm assuming that 1024 * 1024 is just something you
  came up with arbitrarily...)
  
  Aha, yes, you are right, I made it. :)
  
  Here is the fixed patch, please help to review it:
 
 Reviewed-by: Kenneth Graunke kenn...@whitecape.org
 
 I'd wait for an ack from Eric before pushing, though.

That's fine with me. Eric, any comments? Or can I get your Reviewed-by for
this series?

Thanks,
Yuanhan Liu


Re: [Mesa-dev] [PATCH] vbo: signal _NEW_ARRAY when transitioning between glBegin/End, glDrawArrays

2011-12-22 Thread Mathias Fröhlich

Hi,

On Thursday, December 22, 2011 18:30:44 Brian Paul wrote:
 I'm not sure if playback_vertex_list is more like DRAW_BEGIN_END or
 DRAW_ARRAYS.
 Maybe add a DRAW_DISPLAY_LIST enum value?

It's more like begin/end, I think.
The begin/end code just sets the array state below the state tracking of the
API functions, and it happens in very much the same way for save and draw
when you look at the code that binds the VBOs.

To me it makes perfect sense that vbo_{save,exec}_draw just disturbs the
vbo_array_draw path. So a simple flag that records whether the last draw
went through vbo_{save,exec}_draw.c would probably do the job as well.

But if you think you want to distinguish the two, go ahead...
I also believe that the difference we are talking about will only matter in
very few atypical cases.
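
To make the flag idea concrete, here is a rough standalone sketch. All the
names are hypothetical: NEW_ARRAY_FLAG stands in for _NEW_ARRAY, and
draw_state is a cut-down stand-in for the context, not the real Mesa
structures.

```c
#include <stdbool.h>

#define NEW_ARRAY_FLAG 0x1  /* hypothetical stand-in for _NEW_ARRAY */

/* Hypothetical, cut-down context: the dirty bits plus the proposed flag. */
struct draw_state {
   unsigned new_state;
   bool last_draw_was_immediate;  /* set by the vbo_{save,exec}_draw path */
};

/* Call at the start of each draw.  'immediate' is true for begin/end and
 * display list playback, false for the array draw path.
 */
static void
note_draw_path(struct draw_state *st, bool immediate)
{
   if (st->last_draw_was_immediate != immediate) {
      st->new_state |= NEW_ARRAY_FLAG;
      st->last_draw_was_immediate = immediate;
   }
}
```

With this, repeated draws through the same path leave the state untouched,
and only a switch between the two paths signals the array state as dirty.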

Greetings

Mathias