Re: [Mesa-dev] [PATCH] Add initial Haiku build support
Hey Alexander,

On 12/21/2011 07:16 PM, Alexander von Gluck wrote:
> * Doesn't reintroduce legacy drivers
> * Adds Haiku mklib code
> * Removes some broken PIPE_OS_HAIKU defines
> * Removes an NDEBUG ifdef in link_uniforms.cpp, there is an item
>   that uses the union without checking NDEBUG below.
> * Haiku has an opengl kit that will wrap all of these build
>   binaries (pretty much an external beos mesa driver)

Smells like this patch should be split up to address each point separately..

~Maarten
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] vbo: count min/max_index before vbo->draw_prims
For the case that index data is stored in element array buffer object,
and user called glMultiDrawElements, count the min/max_index before
calling vbo->draw_prims.

vbo_get_minmax_index() isn't friendly to this case. So do it while
building the prim info.

Signed-off-by: Yuanhan Liu yuanhan@linux.intel.com
---
 src/mesa/vbo/vbo_exec_array.c |   14 +-
 1 files changed, 13 insertions(+), 1 deletions(-)

diff --git a/src/mesa/vbo/vbo_exec_array.c b/src/mesa/vbo/vbo_exec_array.c
index a6e41e9..70efd3f 100644
--- a/src/mesa/vbo/vbo_exec_array.c
+++ b/src/mesa/vbo/vbo_exec_array.c
@@ -1147,11 +1147,18 @@ vbo_validated_multidrawelements(struct gl_context *ctx, GLenum mode,
       fallback = GL_TRUE;

    if (!fallback) {
+      struct _mesa_index_buffer tmp_ib;
+      GLuint min_index = ~0;
+      GLuint max_index = 0;
+      GLuint tmp_min, tmp_max;
+
       ib.count = (max_index_ptr - min_index_ptr) / index_type_size;
       ib.type = type;
       ib.obj = ctx->Array.ArrayObj->ElementArrayBufferObj;
       ib.ptr = (void *)min_index_ptr;

+      tmp_ib = ib;
+
       for (i = 0; i < primcount; i++) {
          prim[i].begin = (i == 0);
          prim[i].end = (i == primcount - 1);
@@ -1166,11 +1173,16 @@ vbo_validated_multidrawelements(struct gl_context *ctx, GLenum mode,
             prim[i].basevertex = basevertex[i];
          else
             prim[i].basevertex = 0;
+
+         tmp_ib.ptr = indices[i];
+         vbo_get_minmax_index(ctx, &prim[i], &tmp_ib, &tmp_min, &tmp_max);
+         min_index = MIN2(min_index, tmp_min);
+         max_index = MAX2(max_index, tmp_max);
       }

       check_buffers_are_unmapped(exec->array.inputs);
       vbo->draw_prims(ctx, exec->array.inputs, prim, primcount, &ib,
-                      GL_FALSE, ~0, ~0, NULL);
+                      GL_TRUE, min_index, max_index, NULL);
    } else {
       /* render one prim at a time */
       for (i = 0; i < primcount; i++) {
--
1.7.4.4
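The approach described in this patch (scan each sub-draw's indices, then fold the results into one overall range) can be sketched in plain C. This is only an illustration under stated assumptions: `struct index_range`, `range_minmax()` and `multi_minmax()` are hypothetical names, indices are assumed to be 32-bit and already in CPU-visible memory, and none of this is Mesa's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one sub-draw of glMultiDrawElements:
 * a pointer to 32-bit indices plus a count.  Not a Mesa type. */
struct index_range {
   const uint32_t *indices;
   size_t count;
};

/* Scan a single range, playing the role vbo_get_minmax_index() has in
 * the patch.  An empty range yields min = UINT32_MAX, max = 0. */
static void
range_minmax(const struct index_range *r, uint32_t *min_out, uint32_t *max_out)
{
   uint32_t lo = UINT32_MAX, hi = 0;
   size_t i;

   assert(r != NULL);
   for (i = 0; i < r->count; i++) {
      if (r->indices[i] < lo)
         lo = r->indices[i];
      if (r->indices[i] > hi)
         hi = r->indices[i];
   }
   *min_out = lo;
   *max_out = hi;
}

/* Fold the per-range results into one overall [min, max], the way the
 * patch accumulates min_index/max_index inside the prim loop. */
static void
multi_minmax(const struct index_range *ranges, size_t n,
             uint32_t *min_out, uint32_t *max_out)
{
   uint32_t lo = UINT32_MAX, hi = 0;
   size_t i;

   for (i = 0; i < n; i++) {
      uint32_t tmp_min, tmp_max;

      range_minmax(&ranges[i], &tmp_min, &tmp_max);
      if (tmp_min < lo)
         lo = tmp_min;
      if (tmp_max > hi)
         hi = tmp_max;
   }
   *min_out = lo;
   *max_out = hi;
}
```

Starting the accumulator at (UINT32_MAX, 0) mirrors the patch's `min_index = ~0` and `max_index = 0`, so the first real range always replaces both values.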
Re: [Mesa-dev] [PATCH 1/3] vl: Only initialize vlc once
Hi Maarten,

On 21.12.2011 21:52, Maarten Lankhorst wrote:
> It would be nice if you inlined patches for easier reviewing. :)

Well I can try, but I can't promise that Thunderbird isn't badly fucking up all whitespaces, newest version of the patch is in-lined below.

> I'm spotting an overflow that could be triggered with 64 single-byte
> unaligned buffers, maybe this is better:

Should be fixed.

> With all the pointer math, maybe change the type for 'end' and 'data'
> to uint8_t? Then you would only need that single cast in fillbits
> (which I did above) and you can kill all the casts everywhere.

Yeah, we now use it as int8_t more often, that saves at least some casts.

> Another thought, this would prevent the need to read past end of file
> which could show up erroneously in valgrind, with something like this:
>
> if (end-data >= 4) {
>    // Nom nom 4 bytes
> } else {
>    // Read at most 3 bytes
> }

Ok, take a look below. I now use an if (bytes_left == 0) {...} else if (bytes_left >= 4) {...} else {...} construct, should this work or do you see any more corner cases I missed?

I also inverted the valid_bits handling to invalid_bits and moved the range invalid_bits moves within to between -32 and 32, that should save a few more CPU cycles.

Last but not least I added at least one sentence of documentation to each function.

Any more thoughts or should we commit that now?

Only initialize vlc in MPEG2 decoding once for all slices, add more sanity checks to vlc decoding functions, support multiple vlc input buffer, improve documentation of the vlc functions.
v2: also implement multiple inputs for the vlc functions
v3: some bug fixes for buffer size and alignment corner cases
v4: rework of the patch, add some more improvements and documentation

Signed-off-by: Maarten Lankhorst m.b.lankho...@gmail.com
Signed-off-by: Christian König deathsim...@vodafone.de
---
 src/gallium/auxiliary/vl/vl_mpeg12_bitstream.c |   46 +++
 src/gallium/auxiliary/vl/vl_vlc.h              |  169 +++-
 2 files changed, 156 insertions(+), 59 deletions(-)

diff --git a/src/gallium/auxiliary/vl/vl_mpeg12_bitstream.c b/src/gallium/auxiliary/vl/vl_mpeg12_bitstream.c
index 936cf2c..7e20d71 100644
--- a/src/gallium/auxiliary/vl/vl_mpeg12_bitstream.c
+++ b/src/gallium/auxiliary/vl/vl_mpeg12_bitstream.c
@@ -786,7 +786,7 @@ entry:
    }
 }

-static INLINE bool
+static INLINE void
 decode_slice(struct vl_mpg12_bs *bs)
 {
    struct pipe_mpeg12_macroblock mb;
@@ -800,6 +800,7 @@ decode_slice(struct vl_mpg12_bs *bs)
    mb.blocks = dct_blocks;

    reset_predictor(bs);
+   vl_vlc_fillbits(&bs->vlc);
    dct_scale = quant_scale[bs->desc.q_scale_type][vl_vlc_get_uimsbf(&bs->vlc, 5)];

    if (vl_vlc_get_uimsbf(&bs->vlc, 1))
@@ -807,13 +808,15 @@ decode_slice(struct vl_mpg12_bs *bs)
          vl_vlc_fillbits(&bs->vlc);

    vl_vlc_fillbits(&bs->vlc);
+   assert(vl_vlc_bits_left(&bs->vlc) > 23 && vl_vlc_peekbits(&bs->vlc, 23));
    do {
       int inc = 0;

-      while (vl_vlc_peekbits(&bs->vlc, 11) == 15) {
-         vl_vlc_eatbits(&bs->vlc, 11);
-         vl_vlc_fillbits(&bs->vlc);
-      }
+      if (bs->decoder->profile == PIPE_VIDEO_PROFILE_MPEG1)
+         while (vl_vlc_peekbits(&bs->vlc, 11) == 15) {
+            vl_vlc_eatbits(&bs->vlc, 11);
+            vl_vlc_fillbits(&bs->vlc);
+         }

       while (vl_vlc_peekbits(&bs->vlc, 11) == 8) {
          vl_vlc_eatbits(&bs->vlc, 11);
@@ -928,7 +931,6 @@ decode_slice(struct vl_mpg12_bs *bs)

    mb.num_skipped_macroblocks = 0;
    bs->decoder->decode_macroblock(bs->decoder, &mb.base, 1);
-   return true;
 }

 void
@@ -959,32 +961,22 @@ void
 vl_mpg12_bs_decode(struct vl_mpg12_bs *bs, unsigned num_bytes, const uint8_t *buffer)
 {
    assert(bs);
-   assert(buffer && num_bytes);

-   while(num_bytes > 2) {
-      if (buffer[0] == 0x00 && buffer[1] == 0x00 && buffer[2] == 0x01 &&
-          buffer[3] >= 0x01 && buffer[3] < 0xAF) {
-         unsigned consumed;
+   vl_vlc_init(&bs->vlc, 1, (const void * const *)&buffer, &num_bytes);
+   while (vl_vlc_bits_left(&bs->vlc) > 32) {
+      uint32_t code = vl_vlc_peekbits(&bs->vlc, 32);

-         buffer += 3;
-         num_bytes -= 3;
+      if (code >= 0x101 && code <= 0x1AF) {
+         vl_vlc_eatbits(&bs->vlc, 24);
+         decode_slice(bs);

-         vl_vlc_init(&bs->vlc, buffer, num_bytes);
-
-         if (!decode_slice(bs))
-            return;
-
-         consumed = num_bytes - vl_vlc_bits_left(&bs->vlc) / 8;
-
-         /* crap, this is a bug we have consumed more bytes than left in the buffer */
-         assert(consumed <= num_bytes);
-
-         num_bytes -= consumed;
-         buffer += consumed;
+         /* align to a byte again */
+         vl_vlc_eatbits(&bs->vlc, vl_vlc_valid_bits(&bs->vlc) & 7);

       } else {
-         ++buffer;
-         --num_bytes;
+         vl_vlc_eatbits(&bs->vlc, 8);
       }
+
+      vl_vlc_fillbits(&bs->vlc);
    }
 }

diff --git a/src/gallium/auxiliary/vl/vl_vlc.h b/src/gallium/auxiliary/vl/vl_vlc.h
index dc4faed..5e5e64c 100644
--- a/src/gallium/auxiliary/vl/vl_vlc.h
+++
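The three-way refill discussed in this thread (nothing left, at least four bytes left, one to three bytes left) can be sketched as a small standalone bit reader. This is a sketch under assumptions, not the vl_vlc.h code from the patch: `struct bitreader`, `br_fill()`, `br_peek()` and `br_eat()` are hypothetical names, it tracks `valid_bits` rather than the patch's `invalid_bits`, and it handles a single input buffer only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical MSB-first bit reader; not the vl_vlc struct from the patch. */
struct bitreader {
   uint64_t buffer;       /* valid bits sit at the most significant end */
   unsigned valid_bits;   /* how many bits of buffer are valid */
   const uint8_t *data;   /* next unread input byte */
   const uint8_t *end;    /* one past the last input byte */
};

/* Top up the buffer without ever reading at or beyond `end`. */
static void
br_fill(struct bitreader *br)
{
   ptrdiff_t bytes_left = br->end - br->data;

   if (br->valid_bits >= 32)
      return;                          /* enough bits buffered already */

   if (bytes_left == 0) {
      return;                          /* input exhausted */
   } else if (bytes_left >= 4) {
      /* a whole big-endian dword is available */
      uint32_t word = ((uint32_t)br->data[0] << 24) |
                      ((uint32_t)br->data[1] << 16) |
                      ((uint32_t)br->data[2] << 8) |
                      (uint32_t)br->data[3];

      br->buffer |= (uint64_t)word << (32 - br->valid_bits);
      br->data += 4;
      br->valid_bits += 32;
   } else {
      /* 1 to 3 trailing bytes: take them one at a time */
      while (br->data < br->end) {
         br->buffer |= (uint64_t)*br->data++ << (56 - br->valid_bits);
         br->valid_bits += 8;
      }
   }
}

/* Look at the next n bits (1..32) without consuming them. */
static uint32_t
br_peek(const struct bitreader *br, unsigned n)
{
   assert(n >= 1 && n <= 32);
   return (uint32_t)(br->buffer >> (64 - n));
}

/* Consume n bits (at most 32, and no more than are buffered). */
static void
br_eat(struct bitreader *br, unsigned n)
{
   assert(n <= 32 && n <= br->valid_bits);
   br->buffer <<= n;
   br->valid_bits -= n;
}
```

Because the tail branch only dereferences bytes strictly below `end`, a tool such as valgrind should have nothing to complain about at the end of the stream, which is the concern raised in the review.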
[Mesa-dev] [PATCH] glsl_to_tgsi: v2 Invalidate and revalidate uniform backing storage
If glUniform1i and friends are going to dump data directly in
driver-allocated storage, the pointers have to be updated when the
storage moves.  This should fix the regressions seen with commit 7199096.

I'm not sure if this is the only place that needs this treatment.  I'm
a little uncertain about the various functions in st_glsl_to_tgsi that
modify the TGSI IR and try to propagate changes about that up to the
gl_program.  That seems sketchy to me.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com

v2:
Revalidate when shader_program is not NULL.
Update the pointers for all _LinkedShaders.
Init glsl_to_tgsi_visitor::shader_program to NULL in the
get_pixel_transfer_visitor and get_bitmap_visitor.

Signed-off-by: Vadim Girlin vadimgir...@gmail.com
---

Based on the patch from Ian Romanick:
http://lists.freedesktop.org/archives/mesa-dev/2011-November/014675.html

Fixes uniform regressions with r600g (and probably other drivers)
after commit 719909698c67c287a393d2380278e7b7495ae018

Tested on evergreen with r600.tests: no regressions.

 src/mesa/state_tracker/st_glsl_to_tgsi.cpp |   25 +
 1 files changed, 25 insertions(+), 0 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 77aa0d1..fce92bb 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -3708,6 +3708,7 @@ get_pixel_transfer_visitor(struct st_fragment_program *fp,
    /* Copy attributes of the glsl_to_tgsi_visitor in the original shader. */
    v->ctx = original->ctx;
    v->prog = prog;
+   v->shader_program = NULL;
    v->glsl_version = original->glsl_version;
    v->native_integers = original->native_integers;
    v->options = original->options;
@@ -3837,6 +3838,7 @@ get_bitmap_visitor(struct st_fragment_program *fp,
    /* Copy attributes of the glsl_to_tgsi_visitor in the original shader. */
    v->ctx = original->ctx;
    v->prog = prog;
+   v->shader_program = NULL;
    v->glsl_version = original->glsl_version;
    v->native_integers = original->native_integers;
    v->options = original->options;
@@ -4550,6 +4552,15 @@ st_translate_program(
    t->pointSizeOutIndex = -1;
    t->prevInstWrotePointSize = GL_FALSE;

+   if (program->shader_program) {
+      for (i = 0; i < program->shader_program->NumUserUniformStorage; i++) {
+         struct gl_uniform_storage *const storage =
+               &program->shader_program->UniformStorage[i];
+
+         _mesa_uniform_detach_all_driver_storage(storage);
+      }
+   }
+
    /*
     * Declare input attributes.
     */
@@ -4776,6 +4787,20 @@ st_translate_program(
                        t->insn[t->labels[i].branch_target]);
    }

+   if (program->shader_program) {
+      /* This has to be done last.  Any operation that can cause
+       * prog->ParameterValues to get reallocated (e.g., anything that adds a
+       * program constant) has to happen before creating this linkage.
+       */
+      for (unsigned i = 0; i < MESA_SHADER_TYPES; i++) {
+         if (program->shader_program->_LinkedShaders[i] == NULL)
+            continue;
+
+         _mesa_associate_uniform_storage(ctx, program->shader_program,
+               program->shader_program->_LinkedShaders[i]->Program->Parameters);
+      }
+   }
+
 out:
    if (t) {
       FREE(t->insn);
--
1.7.7.4
[Mesa-dev] [PATCH] mesa: consolidate texstore functions
From: Brian Paul bri...@vmware.com The code for storing 1D, 2D and 3D tex images (whole or sub-images) was all pretty similar. This consolidates those six paths. v2: rework switch statement to catch unexpected targets --- src/mesa/main/texstore.c | 484 +++--- 1 files changed, 153 insertions(+), 331 deletions(-) diff --git a/src/mesa/main/texstore.c b/src/mesa/main/texstore.c index fb1ad04..86c35d3 100644 --- a/src/mesa/main/texstore.c +++ b/src/mesa/main/texstore.c @@ -4565,9 +4565,137 @@ get_read_write_mode(GLenum userFormat, gl_format texFormat) return GL_MAP_WRITE_BIT; } + +/** + * Helper function for storing 1D, 2D, 3D whole and subimages into texture + * memory. + * The source of the image data may be user memory or a PBO. In the later + * case, we'll map the PBO, copy from it, then unmap it. + */ +static void +store_texsubimage(struct gl_context *ctx, + struct gl_texture_image *texImage, + GLint xoffset, GLint yoffset, GLint zoffset, + GLint width, GLint height, GLint depth, + GLenum format, GLenum type, const GLvoid *pixels, + const struct gl_pixelstore_attrib *packing, + const char *caller) + +{ + const GLbitfield mapMode = get_read_write_mode(format, texImage-TexFormat); + const GLenum target = texImage-TexObject-Target; + GLboolean success = GL_FALSE; + GLuint dims, slice, numSlices = 1, sliceOffset = 0; + GLint srcImageStride = 0; + const GLubyte *src; + + assert(xoffset + width = texImage-Width); + assert(yoffset + height = texImage-Height); + assert(zoffset + depth = texImage-Depth); + + switch (target) { + case GL_TEXTURE_1D: + dims = 1; + break; + case GL_TEXTURE_2D_ARRAY: + case GL_TEXTURE_3D: + dims = 3; + break; + default: + dims = 2; + } + + /* get pointer to src pixels (may be in a pbo which we'll map here) */ + src = (const GLubyte *) + _mesa_validate_pbo_teximage(ctx, dims, width, height, depth, + format, type, pixels, packing, caller); + if (!src) + return; + + /* compute slice info (and do some sanity checks) */ + switch (target) { + case 
GL_TEXTURE_2D: + case GL_TEXTURE_RECTANGLE: + case GL_TEXTURE_CUBE_MAP: + /* one image slice, nothing special needs to be done */ + break; + case GL_TEXTURE_1D: + assert(height == 1); + assert(depth == 1); + assert(yoffset == 0); + assert(zoffset == 0); + break; + case GL_TEXTURE_1D_ARRAY: + assert(depth == 1); + assert(zoffset == 0); + numSlices = height; + sliceOffset = yoffset; + height = 1; + yoffset = 0; + srcImageStride = _mesa_image_row_stride(packing, width, format, type); + break; + case GL_TEXTURE_2D_ARRAY: + numSlices = depth; + sliceOffset = zoffset; + depth = 1; + zoffset = 0; + srcImageStride = _mesa_image_image_stride(packing, width, height, +format, type); + break; + case GL_TEXTURE_3D: + /* we'll store 3D images as a series of slices */ + numSlices = depth; + sliceOffset = zoffset; + srcImageStride = _mesa_image_image_stride(packing, width, height, +format, type); + break; + default: + _mesa_warning(ctx, Unexpected target 0x%x in store_texsubimage(), target); + return; + } + + assert(numSlices == 1 || srcImageStride != 0); + + for (slice = 0; slice numSlices; slice++) { + GLubyte *dstMap; + GLint dstRowStride; + + ctx-Driver.MapTextureImage(ctx, texImage, + slice + sliceOffset, + xoffset, yoffset, width, height, + mapMode, dstMap, dstRowStride); + if (dstMap) { + /* Note: we're only storing a 2D (or 1D) slice at a time but we need + * to pass the right 'dims' value so that GL_UNPACK_SKIP_IMAGES is + * used for 3D images. + */ + success = _mesa_texstore(ctx, dims, texImage-_BaseFormat, + texImage-TexFormat, + 0, 0, 0, /* dstX/Y/Zoffset */ + dstRowStride, + dstMap, + width, height, 1, /* w, h, d */ + format, type, src, packing); + + ctx-Driver.UnmapTextureImage(ctx, texImage, slice + sliceOffset); + } + + src += srcImageStride; + + if (!success) + break; + } + + if (!success) + _mesa_error(ctx, GL_OUT_OF_MEMORY, %s, caller); + + _mesa_unmap_teximage_pbo(ctx, packing); +} + + + /** - * This is the software fallback for Driver.TexImage1D(). 
- * \sa _mesa_store_teximage2d() + * This is the fallback for
Re: [Mesa-dev] [PATCH] mesa: consolidate texstore functions
Looks good to me.

Jose

- Original Message -
> From: Brian Paul bri...@vmware.com
>
> The code for storing 1D, 2D and 3D tex images (whole or sub-images) was
> all pretty similar. This consolidates those six paths.
>
> v2: rework switch statement to catch unexpected targets
[...]
Re: [Mesa-dev] [PATCH 02/22] swrast: do fast_copy_pixels() with Map/UnmapRenderbuffer()
On Wed, Dec 21, 2011 at 12:58 PM, Eric Anholt e...@anholt.net wrote:
>> -      temp = malloc(width * MAX_PIXEL_BYTES);
>> -      if (!temp) {
>> -         _mesa_error(ctx, GL_OUT_OF_MEMORY, "glCopyPixels");
>> -         return GL_FALSE;
>> +      /* different src/dst buffers */
>> +      ctx->Driver.MapRenderbuffer(ctx, srcRb, srcX, srcY,
>> +                                  width, height,
>> +                                  GL_MAP_READ_BIT, &srcMap, &srcRowStride);
>> +      if (!srcMap) {
>> +         _mesa_error(ctx, GL_OUT_OF_MEMORY, "glCopyPixels");
>> +         return GL_TRUE; /* don't retry with slow path */
>> +      }
>> +      ctx->Driver.MapRenderbuffer(ctx, dstRb, dstX, dstY,
>> +                                  width, height,
>> +                                  GL_MAP_WRITE_BIT, &dstMap, &dstRowStride);
>> +      if (!dstMap) {
>> +         ctx->Driver.UnmapRenderbuffer(ctx, srcRb);
>> +         _mesa_error(ctx, GL_OUT_OF_MEMORY, "glCopyPixels");
>> +         return GL_TRUE; /* don't retry with slow path */
>> +      }
>>     }
>>
>>     for (row = 0; row < height; row++) {
>> -      srcRb->GetRow(ctx, srcRb, width, srcX, srcY, temp);
>> -      dstRb->PutRow(ctx, dstRb, width, dstX, dstY, temp, NULL);
>> -      srcY += yStep;
>> -      dstY += yStep;
>> +      memcpy(dstMap, srcMap, widthInBytes);
>> +      dstMap += dstRowStride;
>> +      srcMap += srcRowStride;
>>     }
>
> So, previously we didn't have to worry about X direction for overlap
> because we used a temp between the Get and Put. Now, I think you need
> to use memmove instead of memcpy.
>
> Patch 1, and 3-7 are:
>
> Reviewed-by: Eric Anholt e...@anholt.net
>
> this one is too if memmove is the solution.

I'll fix that. Thanks.

-Brian
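The review point about overlap in the X direction can be illustrated with a small row-copy helper. This is a sketch only: `copy_rows()` is a hypothetical name, not a Mesa function, and it assumes both regions are plain byte arrays with a per-row stride. `memcpy()` has undefined behaviour when source and destination overlap, while `memmove()` is specified to behave as if the bytes went through a temporary, which is what the old GetRow/PutRow code effectively did.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy `height` rows of `row_bytes` bytes each.  Source and destination
 * rows may overlap horizontally, so memmove() is used instead of memcpy().
 * Vertical overlap is not handled here; in the original code that is the
 * job of the row ordering (yStep). */
static void
copy_rows(unsigned char *dst, ptrdiff_t dst_stride,
          const unsigned char *src, ptrdiff_t src_stride,
          size_t row_bytes, size_t height)
{
   size_t row;

   assert(dst != NULL && src != NULL);
   for (row = 0; row < height; row++) {
      memmove(dst, src, row_bytes);
      dst += dst_stride;
      src += src_stride;
   }
}
```

For example, copying four bytes from offset 0 to offset 2 of the same six-byte row turns "abcdef" into "ababcd"; with `memcpy()` that result would not be guaranteed.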
Re: [Mesa-dev] [PATCH 08/22] swrast: rewrite _swrast_read_stencil_span()
On Wed, Dec 21, 2011 at 1:01 PM, Eric Anholt e...@anholt.net wrote:
> On Sun, 18 Dec 2011 20:08:13 -0700, Brian Paul bri...@vmware.com wrote:
>> Use format pack/unpack functions instead of deprecated renderbuffer
>> GetRow/PutRow functions.
>> ---
>>  src/mesa/swrast/s_stencil.c |   31 ++-
>>  1 files changed, 26 insertions(+), 5 deletions(-)
>>
>> diff --git a/src/mesa/swrast/s_stencil.c b/src/mesa/swrast/s_stencil.c
>> index 17b3b12..1d78e97 100644
>> --- a/src/mesa/swrast/s_stencil.c
>> +++ b/src/mesa/swrast/s_stencil.c
>>
>>  /**
>> + * Return the address of a stencil value in a renderbuffer.
>> + */
>> +static inline GLubyte *
>> +get_stencil_address(struct gl_renderbuffer *rb, GLint x, GLint y)
>> +{
>> +   const GLint bpp = _mesa_get_format_bytes(rb->Format);
>> +   const GLint rowStride = rb->RowStride * bpp;
>> +   assert(rb->Data);
>> +   return (GLubyte *) rb->Data + y * rowStride + x * bpp;
>> +}
>> +
>> +
>> +
>> +/**
>>   * Apply the given stencil operator to the array of stencil values.
>>   * Don't touch stencil[i] if mask[i] is zero.
>>   * Input:  n - size of stencil array
>> @@ -1075,6 +1090,8 @@ _swrast_read_stencil_span(struct gl_context *ctx, struct gl_renderbuffer *rb,
>>                            GLint n, GLint x, GLint y, GLubyte stencil[])
>>  {
>>     GLubyte *src;
>> +   const GLuint bpp = _mesa_get_format_bytes(rb->Format);
>> +   const GLuint rowStride = rb->RowStride * bpp;
>>
>>     if (y < 0 || y >= (GLint) rb->Height ||
>>         x + n <= 0 || x >= (GLint) rb->Width) {
>> @@ -1096,7 +1113,7 @@ _swrast_read_stencil_span(struct gl_context *ctx, struct gl_renderbuffer *rb,
>>        return;
>>     }
>>
>> -   src = (GLubyte *) rb->Data + y * rb->RowStride +x;
>> +   src = (GLubyte *) rb->Data + y * rowStride + x * bpp;
>>     _mesa_unpack_ubyte_stencil_row(rb->Format, n, src, stencil);
>>  }
>
> Don't you want to just reuse get_stencil_address here?

Yup.

>> @@ -1115,9 +1132,10 @@ _swrast_write_stencil_span(struct gl_context *ctx, GLint n, GLint x, GLint y,
>>                             const GLubyte stencil[] )
>>  {
>>     struct gl_framebuffer *fb = ctx->DrawBuffer;
>> -   struct gl_renderbuffer *rb = fb->_StencilBuffer;
>> +   struct gl_renderbuffer *rb = fb->Attachment[BUFFER_STENCIL].Renderbuffer;
>>     const GLuint stencilMax = (1 << fb->Visual.stencilBits) - 1;
>>     const GLuint stencilMask = ctx->Stencil.WriteMask[0];
>> +   GLubyte *stencilBuf;
>>
>>     if (y < 0 || y >= (GLint) rb->Height ||
>>         x + n <= 0 || x >= (GLint) rb->Width) {
>> @@ -1138,19 +1156,22 @@ _swrast_write_stencil_span(struct gl_context *ctx, GLint n, GLint x, GLint y,
>>        return;
>>     }
>>
>> +   stencilBuf = get_stencil_address(rb, x, y);
>> +
>>     if ((stencilMask & stencilMax) != stencilMax) {
>>        /* need to apply writemask */
>>        GLubyte destVals[MAX_WIDTH], newVals[MAX_WIDTH];
>>        GLint i;
>> -      rb->GetRow(ctx, rb, n, x, y, destVals);
>> +
>> +      _mesa_unpack_ubyte_stencil_row(rb->Format, n, stencilBuf, destVals);
>>        for (i = 0; i < n; i++) {
>>           newVals[i]
>>              = (stencil[i] & stencilMask) | (destVals[i] & ~stencilMask);
>>        }
>> -      rb->PutRow(ctx, rb, n, x, y, newVals, NULL);
>> +      _mesa_pack_ubyte_stencil_row(rb->Format, n, destVals, stencilBuf);
>
> s/destVals/newVals/ ?

Will do.  R-b?

-Brian
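The writemask merge that the `s/destVals/newVals/` comment is about can be shown in isolation. The helper below is a hypothetical sketch (the name `apply_stencil_writemask()` and the in-place update are assumptions, not Mesa code): bits selected by the mask come from the incoming values, all other bits keep what is already stored, and it is the merged result that has to be written back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Merge incoming stencil values into the stored ones under a write mask.
 * For each element: masked bits are taken from `incoming`, unmasked bits
 * are preserved from `dest`.  The merged value is stored back to `dest`. */
static void
apply_stencil_writemask(uint8_t *dest, const uint8_t *incoming, size_t n,
                        uint8_t write_mask)
{
   size_t i;

   assert(dest != NULL && incoming != NULL);
   for (i = 0; i < n; i++) {
      dest[i] = (uint8_t)((incoming[i] & write_mask) |
                          (dest[i] & (uint8_t)~write_mask));
   }
}
```

Writing back the unmodified destination values, as the quoted hunk does with `destVals`, would make the masked write a no-op, which is why the reviewer asks for `newVals`.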
[Mesa-dev] Dropping multiple driver support in EGL?
Hi list,

Multiple driver support in EGL is hard to get right, if not impossible.

On the Linux desktop, we almost always want to use egl_dri2. It allows EGL to load DRI2 drivers, and thus to share DRI2 drivers with libGL. In the one case where the app wants to use OpenVG, libEGL needs to load egl_gallium instead.

The problem is that we cannot know that an OpenVG context is to be created until it is created. But before a context can be created, EGL needs to be initialized and an EGLConfig needs to be chosen. So when EGL is to be initialized, we need to load and initialize all EGL drivers. When an EGLConfig is to be picked, we need to pick it from all drivers. But this also introduces new problems. For example, when the vendor string or the extension string is queried, which EGL driver's string should be returned?

My proposal is to simply drop multiple driver support from EGL. Instead, we will provide four libEGL implementations:

 - libEGL_dri2: derived from egl_dri2
 - libEGL_gallium: derived from egl_gallium
 - libEGL_glx: derived from egl_glx
 - libEGL_loader: see below

All of them are conformant EGL implementations. That is, any one of them can be installed as /usr/lib/libEGL.so.

libEGL_loader is new. It is basically a wrapper that loads another implementation to do the real work. As such, the problems we face with multiple driver support will remain in libEGL_loader.

Distros may choose to install libEGL_loader as libEGL and let it pick the real implementation. Or they may choose to have the first three installed as libEGL, and package them separately.

Thoughts?

--
o...@lunarg.com
Re: [Mesa-dev] [PATCH 16/22] swrast: fast_draw_depth_stencil() for glDrawPixels(GL_DEPTH_STENCIL)
On Wed, Dec 21, 2011 at 1:30 PM, Eric Anholt e...@anholt.net wrote:
> On Sun, 18 Dec 2011 20:08:21 -0700, Brian Paul bri...@vmware.com wrote:
>> Stop using deprecated renderbuffer PutRow() function.
>> Note that we aren't using Map/UnmapRenderbuffer() yet because this call is
>> inside a swrast_render_start/finish() pair.
>> ---
>>  src/mesa/swrast/s_drawpix.c |   64 ---
>>  1 files changed, 48 insertions(+), 16 deletions(-)
>>
>> diff --git a/src/mesa/swrast/s_drawpix.c b/src/mesa/swrast/s_drawpix.c
>> index 4a661a0..19b43f6 100644
>> --- a/src/mesa/swrast/s_drawpix.c
>> +++ b/src/mesa/swrast/s_drawpix.c
>> @@ -551,6 +551,49 @@ draw_rgba_pixels( struct gl_context *ctx, GLint x, GLint y,
>>
>>
>>  /**
>> + * Draw depth+stencil values into a MESA_FORMAT_Z24_S8 or MESA_FORMAT_S8_Z24
>> + * renderbuffer.  No masking, zooming, scaling, etc.
>> + */
>> +static void
>> +fast_draw_depth_stencil(struct gl_context *ctx, GLint x, GLint y,
>> +                        GLsizei width, GLsizei height,
>> +                        const struct gl_pixelstore_attrib *unpack,
>> +                        const GLvoid *pixels)
>> +{
>> +   for (i = 0; i < height; i++) {
>> +      if (rb->Format == MESA_FORMAT_Z24_S8) {
>> +         memcpy(dst, src, width * 4);
>> +      }
>> +      else {
>> +         /* swap Z24_S8 -> S8_Z24 */
>> +         GLuint j, *dst4 = (GLuint *) dst, *src4 = (GLuint *) src;
>> +         for (j = 0; j < width; j++) {
>> +            dst4[j] = (src4[j] << 24) | (src4[j] >> 8);
>> +         }
>> +      }
>
> Reuse _mesa_pack_uint_24_8_depth_stencil_row() here?
>
> Other than that, looks good.

Yeah, I'll fix that.

-Brian
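The "swap Z24_S8 -> S8_Z24" step in the quoted hunk is a rotate of each 32-bit texel by 8 bits. A minimal sketch follows, assuming Z24_S8 means depth in the upper 24 bits with stencil in the low 8, and S8_Z24 the reverse; `z24s8_to_s8z24()` and `convert_row()` are hypothetical helper names, not the Mesa pack function mentioned in the review.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert one texel from Z24_S8 (depth in bits 31..8, stencil in bits 7..0)
 * to S8_Z24 (stencil in bits 31..24, depth in bits 23..0). */
static uint32_t
z24s8_to_s8z24(uint32_t v)
{
   return (v << 24) | (v >> 8);
}

/* Convert a row of `width` texels; src and dst must not overlap. */
static void
convert_row(uint32_t *dst, const uint32_t *src, size_t width)
{
   size_t j;

   assert(dst != NULL && src != NULL);
   for (j = 0; j < width; j++)
      dst[j] = z24s8_to_s8z24(src[j]);
}
```

With depth 0x123456 and stencil 0xAB, the input 0x123456AB becomes 0xAB123456.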
Re: [Mesa-dev] [PATCH 1/5] mesa: Save and restore GL_RASTERIZER_DISCARD state during meta ops.
On Wed, Dec 21, 2011 at 2:39 PM, Paul Berry stereotype...@gmail.com wrote:
> During meta-operations (such as _mesa_meta_GenerateMipmap()), we need
> to be able to draw even if GL_RASTERIZER_DISCARD is enabled. This
> patch causes _mesa_meta_begin() to save the state of
> GL_RASTERIZER_DISCARD and disable it (so that drawing can be done
> during the meta-op), and causes _mesa_meta_end() to restore it.
>
> Fixes piglit test "EXT_transform_feedback/generatemipmap discard" on
> i965 Gen6.

Reviewed-by: Brian Paul bri...@vmare.com
Re: [Mesa-dev] [PATCH 2/5] mesa: Ensure that Paused is reset to false on EndTransformFeedback.
On Wed, Dec 21, 2011 at 2:39 PM, Paul Berry stereotype...@gmail.com wrote:
> If a client calls BeginTransformFeedback(), then
> PauseTransformFeedback(), then EndTransformFeedback(), we need to make
> sure that the transform feedback object is not left in a paused state,
> otherwise the next call to BeginTransformFeedback() will leave
> transform feedback paused.
> ---
>  src/mesa/main/transformfeedback.c |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/src/mesa/main/transformfeedback.c b/src/mesa/main/transformfeedback.c
> index 53c09e2..fea711a 100644
> --- a/src/mesa/main/transformfeedback.c
> +++ b/src/mesa/main/transformfeedback.c
> @@ -387,6 +387,7 @@ _mesa_EndTransformFeedback(void)
>
>     FLUSH_VERTICES(ctx, _NEW_TRANSFORM_FEEDBACK);
>     ctx->TransformFeedback.CurrentObject->Active = GL_FALSE;
> +   ctx->TransformFeedback.CurrentObject->Paused = GL_FALSE;
>     ctx->TransformFeedback.CurrentObject->EndedAnytime = GL_TRUE;
>
>     assert(ctx->Driver.EndTransformFeedback);

Reviewed-by: Brian Paul bri...@vmare.com
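The Begin, Pause, End sequence described in this patch can be modelled with two flags. The sketch below is hypothetical (`struct xfb_object` and the `xfb_*` functions are made-up names, and it assumes, as the commit message implies, that Begin does not touch the paused flag); it only shows why End has to clear `paused` as well as `active`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a transform feedback object's state flags. */
struct xfb_object {
   bool active;
   bool paused;
};

/* Begin marks the object active and leaves `paused` as it finds it. */
static void
xfb_begin(struct xfb_object *obj)
{
   assert(!obj->active);
   obj->active = true;
}

/* Pause is only meaningful while active. */
static void
xfb_pause(struct xfb_object *obj)
{
   assert(obj->active && !obj->paused);
   obj->paused = true;
}

/* End must reset both flags; clearing only `active` would let a stale
 * `paused` leak into the next Begin. */
static void
xfb_end(struct xfb_object *obj)
{
   assert(obj->active);
   obj->active = false;
   obj->paused = false;
}
```

After begin, pause, end, begin, the object is active and not paused, which is the behaviour the one-line fix is after.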
Re: [Mesa-dev] [PATCH] vbo: signal _NEW_ARRAY when transitioning between glBegin/End, glDrawArrays
On Wed, Dec 21, 2011 at 10:36 AM, Marek Olšák mar...@gmail.com wrote:
> Hi Brian,
>
> Is there a reason to set _NEW_ARRAY when transitioning between
> DrawArrays and DrawElements?

Probably not, actually. I was just being overly paranoid. I can fix that.

-Brian
Re: [Mesa-dev] [PATCH 08/22] swrast: rewrite _swrast_read_stencil_span()
On Thu, 22 Dec 2011 09:31:59 -0700, Brian Paul brian.e.p...@gmail.com wrote:
> On Wed, Dec 21, 2011 at 1:01 PM, Eric Anholt e...@anholt.net wrote:
>> On Sun, 18 Dec 2011 20:08:13 -0700, Brian Paul bri...@vmware.com wrote:
>>>     if ((stencilMask & stencilMax) != stencilMax) {
>>>        /* need to apply writemask */
>>>        GLubyte destVals[MAX_WIDTH], newVals[MAX_WIDTH];
>>>        GLint i;
>>> -      rb->GetRow(ctx, rb, n, x, y, destVals);
>>> +
>>> +      _mesa_unpack_ubyte_stencil_row(rb->Format, n, stencilBuf, destVals);
>>>        for (i = 0; i < n; i++) {
>>>           newVals[i]
>>>              = (stencil[i] & stencilMask) | (destVals[i] & ~stencilMask);
>>>        }
>>> -      rb->PutRow(ctx, rb, n, x, y, newVals, NULL);
>>> +      _mesa_pack_ubyte_stencil_row(rb->Format, n, destVals, stencilBuf);
>>
>> s/destVals/newVals/ ?
>
> Will do.  R-b?

Yeah.
Re: [Mesa-dev] [PATCH 14/22] swrast: stop using depth/stencil wrappers in CopyPixels code
On Thu, 22 Dec 2011 09:49:33 -0700, Brian Paul brian.e.p...@gmail.com wrote:
> On Wed, Dec 21, 2011 at 1:20 PM, Eric Anholt e...@anholt.net wrote:
>> On Sun, 18 Dec 2011 20:08:19 -0700, Brian Paul bri...@vmware.com wrote:
>>> The functions that read depth/stencil values understand all (packed)
>>> depth/stencil buffer formats now so there's no reason to use the wrappers.
>>>
>>> Also, improve the format checks in fast_copy_pixels() to catch
>>> mismatched depth/stencil cases.
>>
>>> +   if (type == GL_STENCIL || type == GL_DEPTH_COMPONENT) {
>>> +      /* can't handle packed depth+stencil here */
>>> +      if (_mesa_is_format_packed_depth_stencil(srcRb->Format) ||
>>> +          _mesa_is_format_packed_depth_stencil(dstRb->Format))
>>> +         return GL_FALSE;
>>> +   }
>>> +   else if (type == GL_DEPTH_STENCIL) {
>>> +      /* can't handle separate depth/stencil buffers */
>>> +      if (!_mesa_is_format_packed_depth_stencil(srcRb->Format) ||
>>> +          !_mesa_is_format_packed_depth_stencil(dstRb->Format))
>>> +         return GL_FALSE;
>>> +   }
>>
>> I think the GL_DEPTH_STENCIL test here wants
>> srcRb != srcFb->Attachment[BUFFER_STENCIL].Renderbuffer and same for
>> dst.
>>
>> Other than that, looks good.
>
> And remove the _mesa_is_format_packed_depth_stencil() calls, right?
> If Att[BUFFER_DEPTH] == Att[BUFFER_STENCIL] we clearly have a combined
> depth+stencil buffer.

Yeah.
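The test agreed on in this exchange (a framebuffer has a combined depth+stencil buffer when both attachment points refer to the same renderbuffer) can be sketched with stand-in types. Everything here is hypothetical: `struct renderbuffer`, `struct framebuffer`, the `ATT_*` indices and `has_combined_depth_stencil()` are illustrative names, not Mesa's structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a renderbuffer and its framebuffer. */
struct renderbuffer {
   int format;
};

enum { ATT_DEPTH, ATT_STENCIL, ATT_COUNT };

struct framebuffer {
   struct renderbuffer *attachment[ATT_COUNT];
};

/* True when depth and stencil are backed by one and the same renderbuffer,
 * i.e. a packed depth+stencil buffer.  Separate or missing buffers fail. */
static bool
has_combined_depth_stencil(const struct framebuffer *fb)
{
   assert(fb != NULL);
   return fb->attachment[ATT_DEPTH] != NULL &&
          fb->attachment[ATT_DEPTH] == fb->attachment[ATT_STENCIL];
}
```

A fast path for GL_DEPTH_STENCIL copies would then require this to hold for both the read and the draw framebuffer, and bail out otherwise.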
Re: [Mesa-dev] [PATCH 21/22] swrast: stop using _DepthBuffer in triangle code
On Wed, Dec 21, 2011 at 2:16 PM, Eric Anholt e...@anholt.net wrote:
> On Sun, 18 Dec 2011 20:08:26 -0700, Brian Paul bri...@vmware.com wrote:
>> The only consequence is we can only use the occlusion_zless_16_triangle()
>> function with MESA_FORMAT_Z16.
>
> I'm not following that conclusion, probably due to ignorance of swrast
> spans code. I would think that Z32 would still work for that other path
> -- or is span.z stored as something other than 32 bits of Z in that
> case, too?

Z32 would still work, but in practice that format of depth buffer is seldom used with swrast. Z16 is the default. I was tempted to remove the function entirely.

-Brian
Re: [Mesa-dev] [PATCH 1/2] meta: Disable GL_TEXTURE_EXTERNAL_OES in meta_begin()
On Wed, Dec 21, 2011 at 7:34 PM, Chad Versace chad.vers...@linux.intel.com wrote:
> If the meta flag MESA_META_TEXTURE is present, then disable the texture
> target GL_TEXTURE_EXTERNAL_OES.
>
> Signed-off-by: Chad Versace chad.vers...@linux.intel.com
> ---
>  src/mesa/drivers/common/meta.c |    2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
>
> diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
> index c5c59eb..5673205 100644
> --- a/src/mesa/drivers/common/meta.c
> +++ b/src/mesa/drivers/common/meta.c
> @@ -576,6 +576,8 @@ _mesa_meta_begin(struct gl_context *ctx, GLbitfield state)
>              _mesa_set_enable(ctx, GL_TEXTURE_CUBE_MAP, GL_FALSE);
>           if (ctx->Extensions.NV_texture_rectangle)
>              _mesa_set_enable(ctx, GL_TEXTURE_RECTANGLE, GL_FALSE);
> +         if (ctx->Extensions.OES_EGL_image_external)
> +            _mesa_set_enable(ctx, GL_TEXTURE_EXTERNAL_OES, GL_FALSE);
>           _mesa_set_enable(ctx, GL_TEXTURE_GEN_S, GL_FALSE);
>           _mesa_set_enable(ctx, GL_TEXTURE_GEN_T, GL_FALSE);
>           _mesa_set_enable(ctx, GL_TEXTURE_GEN_R, GL_FALSE);

Reviewed-by: Brian Paul bri...@vmare.com
[Mesa-dev] [PATCH] swrast: stop using depth/stencil wrappers in CopyPixels code
From: Brian Paul bri...@vmware.com

The functions that read depth/stencil values understand all (packed)
depth/stencil buffer formats now so there's no reason to use the wrappers.

Also, improve the format checks in fast_copy_pixels() to catch mismatched
depth/stencil cases.

v2: fix the test for combined depth+stencil buffers, per Eric.
---
 src/mesa/swrast/s_copypix.c |   29 +++++++++++++++++++++--------
 1 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/src/mesa/swrast/s_copypix.c b/src/mesa/swrast/s_copypix.c
index 9769a47..907645e 100644
--- a/src/mesa/swrast/s_copypix.c
+++ b/src/mesa/swrast/s_copypix.c
@@ -245,7 +245,7 @@ copy_depth_pixels( struct gl_context *ctx, GLint srcx, GLint srcy,
                    GLint destx, GLint desty )
 {
    struct gl_framebuffer *fb = ctx->ReadBuffer;
-   struct gl_renderbuffer *readRb = fb->_DepthBuffer;
+   struct gl_renderbuffer *readRb = fb->Attachment[BUFFER_DEPTH].Renderbuffer;
    GLfloat *p, *tmpImage;
    GLint sy, dy, stepy;
    GLint j;
@@ -339,7 +339,7 @@ copy_stencil_pixels( struct gl_context *ctx, GLint srcx, GLint srcy,
                      GLint destx, GLint desty )
 {
    struct gl_framebuffer *fb = ctx->ReadBuffer;
-   struct gl_renderbuffer *rb = fb->_StencilBuffer;
+   struct gl_renderbuffer *rb = fb->Attachment[BUFFER_STENCIL].Renderbuffer;
    GLint sy, dy, stepy;
    GLint j;
    GLubyte *p, *tmpImage;
@@ -446,7 +446,7 @@ copy_depth_stencil_pixels(struct gl_context *ctx,
 
    depthDrawRb = ctx->DrawBuffer->_DepthBuffer;
    depthReadRb = ctx->ReadBuffer->_DepthBuffer;
-   stencilReadRb = ctx->ReadBuffer->_StencilBuffer;
+   stencilReadRb = ctx->ReadBuffer->Attachment[BUFFER_STENCIL].Renderbuffer;
 
    ASSERT(depthDrawRb);
    ASSERT(depthReadRb);
@@ -599,7 +599,7 @@ copy_depth_stencil_pixels(struct gl_context *ctx,
 
 
 /**
- * Try to do a fast copy pixels.
+ * Try to do a fast copy pixels with memcpy.
  * \return GL_TRUE if successful, GL_FALSE otherwise.
  */
 static GLboolean
@@ -630,12 +630,12 @@ fast_copy_pixels(struct gl_context *ctx,
       dstRb = dstFb->_ColorDrawBuffers[0];
    }
    else if (type == GL_STENCIL) {
-      srcRb = srcFb->_StencilBuffer;
-      dstRb = dstFb->_StencilBuffer;
+      srcRb = srcFb->Attachment[BUFFER_STENCIL].Renderbuffer;
+      dstRb = dstFb->Attachment[BUFFER_STENCIL].Renderbuffer;
    }
    else if (type == GL_DEPTH) {
-      srcRb = srcFb->_DepthBuffer;
-      dstRb = dstFb->_DepthBuffer;
+      srcRb = srcFb->Attachment[BUFFER_DEPTH].Renderbuffer;
+      dstRb = dstFb->Attachment[BUFFER_DEPTH].Renderbuffer;
    }
    else {
       ASSERT(type == GL_DEPTH_STENCIL_EXT);
@@ -649,6 +649,19 @@ fast_copy_pixels(struct gl_context *ctx,
       return GL_FALSE;
    }
 
+   if (type == GL_STENCIL || type == GL_DEPTH_COMPONENT) {
+      /* can't handle packed depth+stencil here */
+      if (_mesa_is_format_packed_depth_stencil(srcRb->Format) ||
+          _mesa_is_format_packed_depth_stencil(dstRb->Format))
+         return GL_FALSE;
+   }
+   else if (type == GL_DEPTH_STENCIL) {
+      /* can't handle separate depth/stencil buffers */
+      if (srcRb != srcFb->Attachment[BUFFER_STENCIL].Renderbuffer ||
+          dstRb != dstFb->Attachment[BUFFER_STENCIL].Renderbuffer)
+         return GL_FALSE;
+   }
+
    /* clipping not supported */
    if (srcX < 0 || srcX + width > (GLint) srcFb->Width ||
        srcY < 0 || srcY + height > (GLint) srcFb->Height ||
-- 
1.7.1
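The v2 eligibility check can be looked at in isolation. The following is only a sketch under stated assumptions: `struct renderbuffer`, `struct framebuffer`, `enum copy_type` and `can_fast_copy()` are hypothetical stand-ins invented for illustration, not Mesa's types or functions. It shows the rule the patch description and Eric's review arrive at: a depth-only or stencil-only copy must not touch a packed buffer, and a combined copy requires the depth and stencil attachments to be the same buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the renderbuffer/framebuffer types. */
enum copy_type { COPY_COLOR, COPY_DEPTH, COPY_STENCIL, COPY_DEPTH_STENCIL };

struct renderbuffer {
   bool packed_depth_stencil;     /* true for a combined Z+S format */
};

struct framebuffer {
   struct renderbuffer *depth;    /* plays the role of Attachment[BUFFER_DEPTH] */
   struct renderbuffer *stencil;  /* plays the role of Attachment[BUFFER_STENCIL] */
};

/* Return true if a plain row-by-row memcpy is plausible for this copy. */
static bool
can_fast_copy(enum copy_type type,
              const struct framebuffer *src, const struct framebuffer *dst)
{
   const struct renderbuffer *s, *d;

   assert(src != NULL && dst != NULL);

   switch (type) {
   case COPY_STENCIL:
      s = src->stencil;
      d = dst->stencil;
      break;
   case COPY_DEPTH:
   case COPY_DEPTH_STENCIL:
      s = src->depth;
      d = dst->depth;
      break;
   default:
      return false;   /* color copies are outside this sketch */
   }

   if (s == NULL || d == NULL)
      return false;

   if (type == COPY_DEPTH_STENCIL) {
      /* Combined copy: depth and stencil must live in one buffer. */
      return s == src->stencil && d == dst->stencil;
   }

   /* Single-component copy: a packed buffer interleaves both values,
    * so copying whole rows would clobber the other component. */
   return !s->packed_depth_stencil && !d->packed_depth_stencil;
}
```

Comparing attachment pointers, rather than inspecting formats, is what makes the combined case robust: two separate buffers fail the test regardless of what formats they happen to have.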
[Mesa-dev] [PATCH] swrast: new fast_draw_depth_stencil() for glDrawPixels(GL_DEPTH_STENCIL)
From: Brian Paul bri...@vmware.com Stop using deprecated renderbuffer PutRow() function. Note that we aren't using Map/UnmapRenderbuffer() yet because this call is inside a swrast_render_start/finish() pair. v2: use _mesa_pack_uint_24_8_depth_stencil_row(), per Eric. --- src/mesa/swrast/s_drawpix.c | 56 ++ 1 files changed, 40 insertions(+), 16 deletions(-) diff --git a/src/mesa/swrast/s_drawpix.c b/src/mesa/swrast/s_drawpix.c index 4a661a0..e9136d5 100644 --- a/src/mesa/swrast/s_drawpix.c +++ b/src/mesa/swrast/s_drawpix.c @@ -551,6 +551,41 @@ draw_rgba_pixels( struct gl_context *ctx, GLint x, GLint y, /** + * Draw depth+stencil values into a MESA_FORAMT_Z24_S8 or MESA_FORMAT_S8_Z24 + * renderbuffer. No masking, zooming, scaling, etc. + */ +static void +fast_draw_depth_stencil(struct gl_context *ctx, GLint x, GLint y, +GLsizei width, GLsizei height, +const struct gl_pixelstore_attrib *unpack, +const GLvoid *pixels) +{ + const GLenum format = GL_DEPTH_STENCIL_EXT; + const GLenum type = GL_UNSIGNED_INT_24_8; + struct gl_renderbuffer *rb = + ctx-DrawBuffer-Attachment[BUFFER_DEPTH].Renderbuffer; + GLubyte *src, *dst; + GLint srcRowStride, dstRowStride; + GLint i; + + src = _mesa_image_address2d(unpack, pixels, width, height, + format, type, 0, 0); + srcRowStride = _mesa_image_row_stride(unpack, width, format, type); + + dst = _swrast_pixel_address(rb, x, y); + dstRowStride = rb-RowStride * 4; + + for (i = 0; i height; i++) { + _mesa_pack_uint_24_8_depth_stencil_row(rb-Format, width, + (const GLuint *) src, dst); + dst += dstRowStride; + src += srcRowStride; + } +} + + + +/** * This is a bit different from drawing GL_DEPTH_COMPONENT pixels. 
* The only per-pixel operations that apply are depth scale/bias, * stencil offset/shift, GL_DEPTH_WRITEMASK and GL_STENCIL_WRITEMASK, @@ -587,27 +622,16 @@ draw_depth_stencil_pixels(struct gl_context *ctx, GLint x, GLint y, ASSERT(depthRb); ASSERT(stencilRb); - if (depthRb-_BaseFormat == GL_DEPTH_STENCIL_EXT - depthRb-Format == MESA_FORMAT_Z24_S8 + if (depthRb == stencilRb + (depthRb-Format == MESA_FORMAT_Z24_S8 || +depthRb-Format == MESA_FORMAT_S8_Z24) type == GL_UNSIGNED_INT_24_8 - depthRb == stencilRb - depthRb-GetRow /* May be null if depthRb is a wrapper around - * separate depth and stencil buffers. */ !scaleOrBias !zoom ctx-Depth.Mask (stencilMask 0xff) == 0xff) { - /* This is the ideal case. - * Drawing GL_DEPTH_STENCIL pixels into a combined depth/stencil buffer. - * Plus, no pixel transfer ops, zooming, or masking needed. - */ - GLint i; - for (i = 0; i height; i++) { - const GLuint *src = (const GLuint *) -_mesa_image_address2d(clippedUnpack, pixels, width, height, - GL_DEPTH_STENCIL_EXT, type, i, 0); - depthRb-PutRow(ctx, depthRb, width, x, y + i, src, NULL); - } + fast_draw_depth_stencil(ctx, x, y, width, height, + clippedUnpack, pixels); } else { /* sub-optimal cases: -- 1.7.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
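As a rough illustration of what the fast path above does per row, here is a self-contained sketch. It is not Mesa's `_mesa_pack_uint_24_8_depth_stencil_row()`; the names `enum zs_layout`, `pack_24_8_row()` and `draw_depth_stencil_rows()` are hypothetical. It assumes that GL_UNSIGNED_INT_24_8 input carries depth in the upper 24 bits and stencil in the lower 8, that a Z24_S8-style buffer uses the same layout, and that an S8_Z24-style buffer keeps stencil in the upper 8 bits instead. Strides are counted in 32-bit pixels rather than bytes to keep the arithmetic simple.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout tags; the bit layouts are assumptions of this sketch. */
enum zs_layout {
   LAYOUT_Z24_S8,   /* depth in bits 31..8, stencil in bits 7..0 */
   LAYOUT_S8_Z24    /* stencil in bits 31..24, depth in bits 23..0 */
};

/* Pack one row of 24_8 depth+stencil values into the destination layout. */
static void
pack_24_8_row(enum zs_layout layout, size_t n,
              const uint32_t *src, uint32_t *dst)
{
   assert(src != NULL && dst != NULL);

   if (layout == LAYOUT_Z24_S8) {
      /* Same layout as the input: a straight copy. */
      memcpy(dst, src, n * sizeof(*dst));
   } else {
      /* Move the stencil byte from the bottom to the top of each pixel. */
      for (size_t i = 0; i < n; i++)
         dst[i] = (src[i] >> 8) | (src[i] << 24);
   }
}

/* Row loop in the spirit of the fast path: no masking, zoom, or scale. */
static void
draw_depth_stencil_rows(enum zs_layout layout, size_t width, size_t height,
                        const uint32_t *src, size_t src_stride,
                        uint32_t *dst, size_t dst_stride)
{
   for (size_t row = 0; row < height; row++) {
      pack_24_8_row(layout, width, src, dst);
      src += src_stride;
      dst += dst_stride;
   }
}
```

Handling both layouts in the row packer is what lets the caller accept either packed format, instead of only the one that matches the input bit for bit.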
Re: [Mesa-dev] [PATCH 6/8] i965: get the jmp distance by instruction index
On Thu, 22 Dec 2011 10:18:05 +0800, Yuanhan Liu yuanhan@linux.intel.com wrote:
> On Wed, Dec 21, 2011 at 05:57:35AM -0800, Eric Anholt wrote:
> > On Wed, 21 Dec 2011 17:33:41 +0800, Yuanhan Liu yuanhan@linux.intel.com wrote:
> > > If dynamic instruction store size is enabled, the eu instruction
> > > store base address (p->store) may change after brw_JMPI() and before
> > > brw_land_fwd_jump(). Thus, the safe way to reference the jmp
> > > instruction is by index instead of by the instruction address.
> >
> > Our other instructions return the instruction pointer, I don't think
> > jmpi should be special in that respect.
>
> Right. Fixed and how about the following patch?

Reviewed-by: Eric Anholt e...@anholt.net
Re: [Mesa-dev] [PATCH] i965: Don't use BRW_DEPTHFORMAT_D24_UNORM_X8_UINT on Gen4.
On Wed, 21 Dec 2011 16:36:45 -0800, Kenneth Graunke kenn...@whitecape.org wrote:
> X8 depth formats weren't supported until Ironlake (Gen 5). Fixes GPU
> hangs introduced in d84a180417d1eabd680554970f1eaaa93abcd41e.
>
> One example test case was fbo-missing-attachment-blit from.
>
> Signed-off-by: Kenneth Graunke kenn...@whitecape.org

Reviewed-by: Eric Anholt e...@anholt.net
Re: [Mesa-dev] [PATCH 0/5] i965 gen6: Fix interactions between transform feedback and meta-ops.
On Wed, 21 Dec 2011 13:39:05 -0800, Paul Berry stereotype...@gmail.com wrote:
> This patch series ensures that meta-ops (such as glClear or
> glGenerateMipmapEXT) function properly when transform feedback or
> rasterizer discard is enabled.
>
> Most of the code changes necessary to make this work are in core mesa:
> patches 1/5 and 5/5 ensure that meta ops properly pause transform
> feedback and disable rasterizer discard (and restore the state
> properly when the meta op is over). Patch 2/5 ensures that
> PauseTransformFeedback interacts properly with BeginTransformFeedback
> and EndTransformFeedback (so that there is no danger of transform
> feedback being in a paused state after a call to
> BeginTransformFeedback). Patch 3/5 ensures that while transform
> feedback is paused, it's possible to switch programs and do drawing
> that isn't compatible with the transform feedback mode.
>
> Patch 4/5 implements transform feedback pause/resume functionality in
> the i965 driver. We don't expose this functionality to the user yet,
> but we need it for meta ops to work correctly.

The series is
Reviewed-by: Eric Anholt e...@anholt.net
Re: [Mesa-dev] Dropping multiple driver support in EGL?
On 12/22/2011 09:01 AM, Chia-I Wu wrote:
> Hi list,
>
> Multiple driver support in EGL is hard to get right, if not impossible.
>
> On Linux desktop, we almost always want to use egl_dri2. It allows EGL
> to load DRI2 drivers, thus allowing it to share DRI2 drivers with
> libGL. In one case where the app wants to use OpenVG, libEGL needs to
> load egl_gallium instead.
>
> The problem is that we cannot know that an OpenVG context is to be
> created until it is created. But before a context can be created, EGL
> needs to be initialized and an EGLConfig needs to be chosen. So when
> EGL is to be initialized, we need to load and initialize all EGL
> drivers. When an EGLConfig is to be picked, we need to pick it from
> all drivers.
>
> But this also introduces new problems. For example, when the vendor
> string or the extension string is queried, whose string of all EGL
> drivers should be returned?
>
> My proposal is to simply drop multiple driver support from EGL.
> Instead, we will provide four libEGL implementations:
>
>  - libEGL_dri2: derived from egl_dri2
>  - libEGL_gallium: derived from egl_gallium
>  - libEGL_glx: derived from egl_glx
>  - libEGL_loader: see below

Somewhat tangentially...what is the advantage of egl_glx? Does anybody use it? Why? Is it being tested? I'm mostly curious, as I've always used egl_dri2.

> All of them are conformant EGL implementations. That is, any one of
> them can be installed as /usr/lib/libEGL.so.
>
> libEGL_loader is new. It is basically a wrapper that loads another
> implementation to do the real work. As such, the problems we face with
> multiple driver support will remain in libEGL_loader.
>
> Distros may choose to install libEGL_loader as libEGL and let it pick
> the real implementation. Or they may choose to have the first three
> installed as libEGL, and package them separately.
>
> Thoughts?
Re: [Mesa-dev] [PATCH 0/5] i965 gen6: Fix interactions between transform feedback and meta-ops.
On 12/21/2011 01:39 PM, Paul Berry wrote:
> This patch series ensures that meta-ops (such as glClear or
> glGenerateMipmapEXT) function properly when transform feedback or
> rasterizer discard is enabled.
>
> Most of the code changes necessary to make this work are in core mesa:
> patches 1/5 and 5/5 ensure that meta ops properly pause transform
> feedback and disable rasterizer discard (and restore the state
> properly when the meta op is over). Patch 2/5 ensures that
> PauseTransformFeedback interacts properly with BeginTransformFeedback
> and EndTransformFeedback (so that there is no danger of transform
> feedback being in a paused state after a call to
> BeginTransformFeedback). Patch 3/5 ensures that while transform
> feedback is paused, it's possible to switch programs and do drawing
> that isn't compatible with the transform feedback mode.
>
> Patch 4/5 implements transform feedback pause/resume functionality in
> the i965 driver. We don't expose this functionality to the user yet,
> but we need it for meta ops to work correctly.
>
> [PATCH 1/5] mesa: Save and restore GL_RASTERIZER_DISCARD state during meta ops.
> [PATCH 2/5] mesa: Ensure that Paused is reset to false on EndTransformFeedback.
> [PATCH 3/5] mesa: Disable certain error checks when transform feedback is paused
> [PATCH 4/5] i965 gen6: Implement transform feedback pause/resume functionality.
> [PATCH 5/5] mesa: Pause transform feedback during meta ops.

Patches 1-3 and 5 are
Reviewed-by: Kenneth Graunke kenn...@whitecape.org

I have questions about patch 4.
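The save/pause/restore behaviour described in the cover letter can be sketched with a pared-down state model. Everything here is hypothetical and for illustration only: `struct context`, `struct meta_save`, `meta_begin()`, `meta_end()` and `end_transform_feedback()` are not Mesa's `gl_context` or meta-op entry points. The sketch assumes that "recording" means active and not paused, which matches the condition the i965 patch in this series tests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, pared-down state; not Mesa's gl_context. */
struct xfb_state {
   bool active;
   bool paused;
};

struct context {
   bool rasterizer_discard;
   struct xfb_state xfb;
};

struct meta_save {
   bool rasterizer_discard;   /* value to restore afterwards */
   bool xfb_needs_resume;     /* true only if the meta op paused it */
};

/* Transform feedback captures output only when active and not paused. */
static bool
xfb_is_recording(const struct context *ctx)
{
   return ctx->xfb.active && !ctx->xfb.paused;
}

static void
meta_begin(struct context *ctx, struct meta_save *save)
{
   assert(ctx != NULL && save != NULL);

   /* The internal draw must reach the framebuffer. */
   save->rasterizer_discard = ctx->rasterizer_discard;
   ctx->rasterizer_discard = false;

   /* The internal draw must not be captured by the application's
    * transform feedback buffers. */
   save->xfb_needs_resume = xfb_is_recording(ctx);
   if (save->xfb_needs_resume)
      ctx->xfb.paused = true;
}

static void
meta_end(struct context *ctx, const struct meta_save *save)
{
   assert(ctx != NULL && save != NULL);

   ctx->rasterizer_discard = save->rasterizer_discard;
   if (save->xfb_needs_resume)
      ctx->xfb.paused = false;
}

/* Ending transform feedback also clears any pause, so a later begin
 * can never start out paused. */
static void
end_transform_feedback(struct context *ctx)
{
   ctx->xfb.active = false;
   ctx->xfb.paused = false;
}
```

Recording whether the meta op itself did the pausing matters: if the application had already paused transform feedback, the meta op should leave it paused on exit rather than resume it.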
Re: [Mesa-dev] [PATCH 4/5] i965 gen6: Implement transform feedback pause/resume functionality.
On 12/21/2011 01:39 PM, Paul Berry wrote: Although i965 gen6 does not yet support ARB_transform_feedback2 or NV_transform_feedback2, it needs to support pause/resume functionality so that meta-ops will work correctly. --- src/mesa/drivers/dri/i965/brw_draw.c |3 ++- src/mesa/drivers/dri/i965/brw_gs.c |3 ++- src/mesa/drivers/dri/i965/gen6_sol.c |3 ++- 3 files changed, 6 insertions(+), 3 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c index 082bb9a..93f27d7 100644 --- a/src/mesa/drivers/dri/i965/brw_draw.c +++ b/src/mesa/drivers/dri/i965/brw_draw.c @@ -389,7 +389,8 @@ brw_update_primitive_count(struct brw_context *brw, { uint32_t count = count_tessellated_primitives(prim); brw-sol.primitives_generated += count; - if (brw-intel.ctx.TransformFeedback.CurrentObject-Active) { + if (brw-intel.ctx.TransformFeedback.CurrentObject-Active + !brw-intel.ctx.TransformFeedback.CurrentObject-Paused) { /* Update brw-sol.svbi_0_max_index to reflect the amount by which the * hardware is going to increment SVBI 0 when this drawing operation * occurs. This is necessary because the kernel does not (yet) save and diff --git a/src/mesa/drivers/dri/i965/brw_gs.c b/src/mesa/drivers/dri/i965/brw_gs.c index 886bf98..850d7b4 100644 --- a/src/mesa/drivers/dri/i965/brw_gs.c +++ b/src/mesa/drivers/dri/i965/brw_gs.c @@ -183,7 +183,8 @@ static void populate_key( struct brw_context *brw, } else if (intel-gen == 6) { /* On Gen6, GS is used for transform feedback. */ /* _NEW_TRANSFORM_FEEDBACK */ - if (ctx-TransformFeedback.CurrentObject-Active) { + if (ctx-TransformFeedback.CurrentObject-Active + !ctx-TransformFeedback.CurrentObject-Paused) { const struct gl_shader_program *shaderprog = ctx-Shader.CurrentVertexProgram; const struct gl_transform_feedback_info *linked_xfb_info = Nevermind, I answered my own question. I was wondering if Paused needed to be in the key, and how you got updates about it changing. 
But no, putting it in the key would be bizarre, and the _NEW_TRANSFORM_FEEDBACK dirty bit covers this. For the whole series: Reviewed-by: Kenneth Graunke kenn...@whitecape.org diff --git a/src/mesa/drivers/dri/i965/gen6_sol.c b/src/mesa/drivers/dri/i965/gen6_sol.c index 5d11481..32f56d3 100644 --- a/src/mesa/drivers/dri/i965/gen6_sol.c +++ b/src/mesa/drivers/dri/i965/gen6_sol.c @@ -47,7 +47,8 @@ gen6_update_sol_surfaces(struct brw_context *brw) for (i = 0; i BRW_MAX_SOL_BINDINGS; ++i) { const int surf_index = SURF_INDEX_SOL_BINDING(i); - if (xfb_obj-Active i linked_xfb_info-NumOutputs) { + if (xfb_obj-Active !xfb_obj-Paused + i linked_xfb_info-NumOutputs) { unsigned buffer = linked_xfb_info-Outputs[i].OutputBuffer; unsigned buffer_offset = xfb_obj-Offset[buffer] / 4 + ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 7/8] i965: call next_insn() before referencing a instruction by index
On 12/21/2011 01:33 AM, Yuanhan Liu wrote: [snip] + int emit_endif = 1; Please use bool and true/false rather than int. /* In single program flow mode, we can express IF and ELSE instructions * equivalently as ADD instructions that operate on IP. On platforms prior @@ -1219,14 +1211,32 @@ brw_ENDIF(struct brw_compile *p) * instructions to conditional ADDs. So we only do this trick on Gen4 and * Gen5. */ - if (intel-gen 6 p-single_program_flow) { + if (intel-gen 6 p-single_program_flow) + emit_endif = 0; You could actually just do this: /* In single program flow mode, we can express IF and ELSE ... */ bool emit_endif = !(intel-gen 6 p-single_program_flow); But I'm fine with bool emit_endif = true and emit_endif = false if you prefer that. Assuming you make one of those changes, this patch is Reviewed-by: Kenneth Graunke kenn...@whitecape.org + /* +* A single next_insn() may change the base adress of instruction store +* memory(p-store), so call it first before referencing the instruction +* store pointer from an index +*/ + if (emit_endif) + insn = next_insn(p, BRW_OPCODE_ENDIF); + + /* Pop the IF and (optional) ELSE instructions from the stack */ + p-if_depth_in_loop[p-loop_stack_depth]--; + tmp = pop_if_stack(p); + if (tmp-header.opcode == BRW_OPCODE_ELSE) { + else_inst = tmp; + tmp = pop_if_stack(p); + } + if_inst = tmp; + + if (!emit_endif) { /* ENDIF is useless; don't bother emitting it. 
*/ convert_IF_ELSE_to_ADD(p, if_inst, else_inst); return; } - insn = next_insn(p, BRW_OPCODE_ENDIF); - if (intel-gen 6) { brw_set_dest(p, insn, retype(brw_vec4_grf(0,0), BRW_REGISTER_TYPE_UD)); brw_set_src0(p, insn, retype(brw_vec4_grf(0,0), BRW_REGISTER_TYPE_UD)); @@ -1393,13 +1403,12 @@ struct brw_instruction *brw_WHILE(struct brw_compile *p) struct brw_instruction *insn, *do_insn; GLuint br = 1; - do_insn = get_inner_do_insn(p); - if (intel-gen = 5) br = 2; if (intel-gen = 7) { insn = next_insn(p, BRW_OPCODE_WHILE); + do_insn = get_inner_do_insn(p); brw_set_dest(p, insn, retype(brw_null_reg(), BRW_REGISTER_TYPE_D)); brw_set_src0(p, insn, retype(brw_null_reg(), BRW_REGISTER_TYPE_D)); @@ -1409,6 +1418,7 @@ struct brw_instruction *brw_WHILE(struct brw_compile *p) insn-header.execution_size = BRW_EXECUTE_8; } else if (intel-gen == 6) { insn = next_insn(p, BRW_OPCODE_WHILE); + do_insn = get_inner_do_insn(p); brw_set_dest(p, insn, brw_imm_w(0)); insn-bits1.branch_gen6.jump_count = br * (do_insn - insn); @@ -1419,6 +1429,7 @@ struct brw_instruction *brw_WHILE(struct brw_compile *p) } else { if (p-single_program_flow) { insn = next_insn(p, BRW_OPCODE_ADD); + do_insn = get_inner_do_insn(p); brw_set_dest(p, insn, brw_ip_reg()); brw_set_src0(p, insn, brw_ip_reg()); @@ -1426,6 +1437,7 @@ struct brw_instruction *brw_WHILE(struct brw_compile *p) insn-header.execution_size = BRW_EXECUTE_1; } else { insn = next_insn(p, BRW_OPCODE_WHILE); + do_insn = get_inner_do_insn(p); assert(do_insn-header.opcode == BRW_OPCODE_DO); ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
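The ordering constraint discussed in this review (append the new instruction first, then turn stored indices into pointers) is easy to reproduce with any growable array. The sketch below is hypothetical: `struct store`, `next_insn()`, `emit_while()` and the `OP_*` values are made up for illustration and are not the i965 assembler's API. The store grows with `realloc()`, so any pointer taken before the append may dangle afterwards, while an index stays valid.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical opcodes and instruction record. */
enum { OP_DO, OP_ADD, OP_WHILE };

struct insn {
   int opcode;
   int jump_count;
};

struct store {
   struct insn *base;    /* may move whenever the store grows */
   size_t count;
   size_t capacity;
};

#define NO_INSN ((size_t)-1)

/* Append an instruction and return its index, or NO_INSN on failure.
 * Growing the store may change s->base. */
static size_t
next_insn(struct store *s, int opcode)
{
   if (s->count == s->capacity) {
      size_t cap = s->capacity ? s->capacity * 2 : 1;
      struct insn *p = realloc(s->base, cap * sizeof(*p));
      if (p == NULL)
         return NO_INSN;
      s->base = p;
      s->capacity = cap;
   }
   s->base[s->count].opcode = opcode;
   s->base[s->count].jump_count = 0;
   return s->count++;
}

/* Emit a backward branch to the instruction at do_index. */
static size_t
emit_while(struct store *s, size_t do_index)
{
   /* Append first: this is the only step that can move s->base. */
   size_t index = next_insn(s, OP_WHILE);
   if (index == NO_INSN)
      return NO_INSN;
   assert(do_index < index);

   /* Only now resolve indices to pointers; both are valid until the
    * next append. */
   struct insn *insn = &s->base[index];
   struct insn *do_insn = &s->base[do_index];
   insn->jump_count = (int)(do_insn - insn);
   return index;
}
```

Had `do_insn` been looked up before the call to `next_insn()`, the subtraction would mix a pointer into the old allocation with one into the new allocation whenever the append triggered a grow.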
[Mesa-dev] [PATCH] gallivm: Close a memory leak
Hi all

This fixes a memory leak of 32 bytes on exit.

From 924f8fdccb41b011f372bc57252005bcdb096105 Mon Sep 17 00:00:00 2001
From: Lauri Kasanen cur...@operamail.com
Date: Thu, 22 Dec 2011 21:28:33 +0200
Subject: [PATCH] gallivm: Close a memory leak

As reported by valgrind --leak-check=full glxgears.

Signed-off-by: Lauri Kasanen cur...@operamail.com
---
 src/gallium/auxiliary/gallivm/lp_bld_init.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_init.c b/src/gallium/auxiliary/gallivm/lp_bld_init.c
index 45addee..503c04e 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_init.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_init.c
@@ -345,6 +345,7 @@ gallivm_remove_garbage_collector_callback(garbage_collect_callback_func func,
       if (cb->func == func && cb->cb_data == cb_data) {
          /* found, remove it */
          remove_from_list(cb);
+         FREE(cb);
          return;
       }
    }
-- 
1.7.2.1
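The pattern behind this one-line fix, unlinking a heap-allocated list node and then releasing it, can be shown with a small standalone list. This is a sketch with hypothetical names (`struct callback`, `add_callback()`, `remove_callback()`, `noop_callback()`); it does not use gallivm's list macros or its `FREE()` wrapper, and it assumes a circular doubly linked list with a sentinel head.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

typedef void (*callback_func)(void *data);

/* Node of a circular doubly linked list with a sentinel head. */
struct callback {
   callback_func func;
   void *data;
   struct callback *prev, *next;
};

static void
list_init(struct callback *head)
{
   head->prev = head->next = head;
}

/* Placeholder callback used only to have something to register. */
static void
noop_callback(void *data)
{
   (void)data;
}

static bool
add_callback(struct callback *head, callback_func func, void *data)
{
   struct callback *cb = malloc(sizeof(*cb));
   if (cb == NULL)
      return false;
   cb->func = func;
   cb->data = data;
   cb->next = head->next;
   cb->prev = head;
   head->next->prev = cb;
   head->next = cb;
   return true;
}

/* Remove the first matching entry. Unlinking alone would leak the node,
 * so it is freed once nothing points at it any more. */
static bool
remove_callback(struct callback *head, callback_func func, void *data)
{
   assert(head != NULL);

   for (struct callback *cb = head->next; cb != head; cb = cb->next) {
      if (cb->func == func && cb->data == data) {
         cb->prev->next = cb->next;
         cb->next->prev = cb->prev;
         free(cb);
         return true;
      }
   }
   return false;
}
```

Returning immediately after the `free()` is what keeps the loop safe: the iteration never reads `cb->next` from a node that has already been released.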
[Mesa-dev] [PATCH] mesa: add back glGetnUniformfv() overflow error reporting
The error was erroneously removed in this commit: 719909698c67c287a393d2380278e7b7495ae018 mesa: Rewrite the way uniforms are tracked and handled You also aren't even supposed to truncate the output to 'bufSize', so just return like before. Also fixup an old comment and add an assert. --- (This function has a random mixture of tabs+spaces and pure spaces for indentation, so I had no idea which style to use...) src/mesa/main/uniform_query.cpp | 16 src/mesa/main/uniforms.c|2 +- 2 files changed, 13 insertions(+), 5 deletions(-) diff --git a/src/mesa/main/uniform_query.cpp b/src/mesa/main/uniform_query.cpp index 33ba53c..8e58fc0 100644 --- a/src/mesa/main/uniform_query.cpp +++ b/src/mesa/main/uniform_query.cpp @@ -203,10 +203,18 @@ _mesa_get_uniform(struct gl_context *ctx, GLuint program, GLint location, const union gl_constant_value *const src = uni-storage[offset * elements]; - unsigned bytes = sizeof(uni-storage[0]) * elements; - if (bytes (unsigned) bufSize) { -elements = bufSize / sizeof(uni-storage[0]); -bytes = bufSize; + assert(returnType == GLSL_TYPE_FLOAT || returnType == GLSL_TYPE_INT || + returnType == GLSL_TYPE_UINT); + /* The three (currently) supported types all have the same size, + * which is of course the same as their union. That'll change + * with glGetUniformdv()... 
+ */ + unsigned bytes = sizeof(src[0]) * elements; + if (bufSize 0 || bytes (unsigned) bufSize) { +_mesa_error( ctx, GL_INVALID_OPERATION, +glGetnUniformfvARB(out of bounds: bufSize is %d, + but %u bytes are required), bufSize, bytes ); +return; } /* If the return type and the uniform's native type are compatible, diff --git a/src/mesa/main/uniforms.c b/src/mesa/main/uniforms.c index 685c0f1..981874e 100644 --- a/src/mesa/main/uniforms.c +++ b/src/mesa/main/uniforms.c @@ -478,7 +478,7 @@ _mesa_GetnUniformdvARB(GLhandleARB program, GLint location, (void) params; /* - _mesa_get_uniform(ctx, program, location, bufSize, GL_DOUBLE, params); + _mesa_get_uniform(ctx, program, location, bufSize, GLSL_TYPE_DOUBLE, params); */ _mesa_error(ctx, GL_INVALID_OPERATION, glGetUniformdvARB (GL_ARB_gpu_shader_fp64 not implemented)); -- 1.7.4.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
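The behaviour this patch restores, failing the query outright instead of truncating the result to bufSize, can be sketched as a small bounds-checked copy. The names below (`union constant_value`, `get_uniform_checked()`) are hypothetical and are not the functions in uniform_query.cpp; the boolean return stands in for raising GL_INVALID_OPERATION.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the uniform storage union. */
union constant_value {
   float f;
   int i;
   unsigned u;
};

/* Copy `elements` values to `out`, or fail without writing anything.
 * A false return stands in for reporting an error to the caller. */
static bool
get_uniform_checked(const union constant_value *src, unsigned elements,
                    int buf_size, void *out)
{
   assert(src != NULL && out != NULL);

   size_t bytes = sizeof(src[0]) * elements;

   /* Reject negative sizes before the cast, then undersized buffers. */
   if (buf_size < 0 || bytes > (size_t)buf_size)
      return false;

   memcpy(out, src, bytes);
   return true;
}
```

Checking for a negative size first matters because the comparison is done on unsigned values; without it, a negative bufSize would convert to a very large number and pass the check.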
[Mesa-dev] [PATCH 1/3] i965: Rename BRW_NEW_WM_SURFACES to BRW_NEW_SURFACES.
The surface states tracked by BRW_NEW_WM_SURFACES are no longer used for just WM. They are also used for vertex texturing and transform feedback. To avoid confusion, this patch renames BRW_NEW_WM_SURFACES to BRW_NEW_SURFACES. --- src/mesa/drivers/dri/i965/brw_context.h |4 ++-- src/mesa/drivers/dri/i965/brw_state_upload.c |2 +- src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 12 ++-- 3 files changed, 9 insertions(+), 9 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h index 15a781b..fb41fd1 100644 --- a/src/mesa/drivers/dri/i965/brw_context.h +++ b/src/mesa/drivers/dri/i965/brw_context.h @@ -131,7 +131,7 @@ enum brw_state_id { BRW_STATE_CONTEXT, BRW_STATE_WM_INPUT_DIMENSIONS, BRW_STATE_PSP, - BRW_STATE_WM_SURFACES, + BRW_STATE_SURFACES, BRW_STATE_VS_BINDING_TABLE, BRW_STATE_GS_BINDING_TABLE, BRW_STATE_PS_BINDING_TABLE, @@ -158,7 +158,7 @@ enum brw_state_id { #define BRW_NEW_CONTEXT (1 BRW_STATE_CONTEXT) #define BRW_NEW_WM_INPUT_DIMENSIONS (1 BRW_STATE_WM_INPUT_DIMENSIONS) #define BRW_NEW_PSP (1 BRW_STATE_PSP) -#define BRW_NEW_WM_SURFACES(1 BRW_STATE_WM_SURFACES) +#define BRW_NEW_SURFACES (1 BRW_STATE_SURFACES) #define BRW_NEW_VS_BINDING_TABLE (1 BRW_STATE_VS_BINDING_TABLE) #define BRW_NEW_GS_BINDING_TABLE (1 BRW_STATE_GS_BINDING_TABLE) #define BRW_NEW_PS_BINDING_TABLE (1 BRW_STATE_PS_BINDING_TABLE) diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c b/src/mesa/drivers/dri/i965/brw_state_upload.c index 74d01d8..a8bda5a 100644 --- a/src/mesa/drivers/dri/i965/brw_state_upload.c +++ b/src/mesa/drivers/dri/i965/brw_state_upload.c @@ -360,7 +360,7 @@ static struct dirty_bit_map brw_bits[] = { DEFINE_BIT(BRW_NEW_WM_INPUT_DIMENSIONS), DEFINE_BIT(BRW_NEW_PROGRAM_CACHE), DEFINE_BIT(BRW_NEW_PSP), - DEFINE_BIT(BRW_NEW_WM_SURFACES), + DEFINE_BIT(BRW_NEW_SURFACES), DEFINE_BIT(BRW_NEW_INDICES), DEFINE_BIT(BRW_NEW_INDEX_BUFFER), DEFINE_BIT(BRW_NEW_VERTICES), diff --git 
a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c index 3801c09..e908430 100644 --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c @@ -828,7 +828,7 @@ brw_upload_wm_pull_constants(struct brw_context *brw) drm_intel_bo_unreference(brw-wm.const_bo); brw-wm.const_bo = NULL; brw-bind.surf_offset[surf_index] = 0; -brw-state.dirty.brw |= BRW_NEW_WM_SURFACES; +brw-state.dirty.brw |= BRW_NEW_SURFACES; } return; } @@ -850,7 +850,7 @@ brw_upload_wm_pull_constants(struct brw_context *brw) params-NumParameters, brw-bind.surf_offset[surf_index]); - brw-state.dirty.brw |= BRW_NEW_WM_SURFACES; + brw-state.dirty.brw |= BRW_NEW_SURFACES; } const struct brw_tracked_state brw_wm_pull_constants = { @@ -1004,7 +1004,7 @@ brw_update_renderbuffer_surfaces(struct brw_context *brw) } else { intel-vtbl.update_null_renderbuffer_surface(brw, 0); } - brw-state.dirty.brw |= BRW_NEW_WM_SURFACES; + brw-state.dirty.brw |= BRW_NEW_SURFACES; } const struct brw_tracked_state brw_renderbuffer_surfaces = { @@ -1046,7 +1046,7 @@ brw_update_texture_surfaces(struct brw_context *brw) } } - brw-state.dirty.brw |= BRW_NEW_WM_SURFACES; + brw-state.dirty.brw |= BRW_NEW_SURFACES; } const struct brw_tracked_state brw_texture_surfaces = { @@ -1075,7 +1075,7 @@ brw_upload_binding_table(struct brw_context *brw) sizeof(uint32_t) * BRW_MAX_SURFACES, 32, brw-bind.bo_offset); - /* BRW_NEW_WM_SURFACES and BRW_NEW_VS_CONSTBUF */ + /* BRW_NEW_SURFACES and BRW_NEW_VS_CONSTBUF */ for (i = 0; i BRW_MAX_SURFACES; i++) { bind[i] = brw-bind.surf_offset[i]; } @@ -1089,7 +1089,7 @@ const struct brw_tracked_state brw_binding_table = { .mesa = 0, .brw = (BRW_NEW_BATCH | BRW_NEW_VS_CONSTBUF | - BRW_NEW_WM_SURFACES), + BRW_NEW_SURFACES), .cache = 0 }, .emit = brw_upload_binding_table, -- 1.7.6.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/3] i965 gen6: Resend binding table pointer after updating SOL bindings.
After creating new binding table entries for transform feedback, we need to set the dirty flag BRW_NEW_SURFACES, so that a new binding table pointer will be sent to the hardware. Otherwise the new binding table entries will not take effect.
---
 src/mesa/drivers/dri/i965/gen6_sol.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen6_sol.c b/src/mesa/drivers/dri/i965/gen6_sol.c
index 32f56d3..437b3ae 100644
--- a/src/mesa/drivers/dri/i965/gen6_sol.c
+++ b/src/mesa/drivers/dri/i965/gen6_sol.c
@@ -61,6 +61,8 @@ gen6_update_sol_surfaces(struct brw_context *brw)
          brw->bind.surf_offset[surf_index] = 0;
       }
    }
+
+   brw->state.dirty.brw |= BRW_NEW_SURFACES;
 }
 
 const struct brw_tracked_state gen6_sol_surface = {
-- 
1.7.6.4
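Why setting a dirty flag inside one state atom causes a later atom to run can be sketched with a toy version of dirty-bit tracking. This is an assumption-laden model, not the i965 state upload code: `struct context`, `struct tracked_state`, the `NEW_*` flags and `upload_state()` are hypothetical, and the model assumes atoms are visited in a fixed order with the dirty mask re-read for each one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical dirty bits, one per piece of tracked state. */
enum state_id {
   STATE_BATCH,
   STATE_VS_CONSTBUF,
   STATE_TRANSFORM_FEEDBACK,
   STATE_SURFACES
};

#define NEW_BATCH              (1u << STATE_BATCH)
#define NEW_VS_CONSTBUF        (1u << STATE_VS_CONSTBUF)
#define NEW_TRANSFORM_FEEDBACK (1u << STATE_TRANSFORM_FEEDBACK)
#define NEW_SURFACES           (1u << STATE_SURFACES)

struct context {
   uint32_t dirty;
   unsigned binding_table_uploads;
};

/* An atom runs when any bit it depends on is dirty. */
struct tracked_state {
   uint32_t dirty_mask;
   void (*emit)(struct context *ctx);
};

static void
update_sol_surfaces(struct context *ctx)
{
   /* ... surface entries would be rewritten here ... */

   /* Tell later atoms that the surface entries changed. */
   ctx->dirty |= NEW_SURFACES;
}

static void
upload_binding_table(struct context *ctx)
{
   ctx->binding_table_uploads++;
}

/* Order matters: producers of NEW_SURFACES come before its consumer. */
static const struct tracked_state atoms[] = {
   { NEW_BATCH | NEW_TRANSFORM_FEEDBACK, update_sol_surfaces },
   { NEW_BATCH | NEW_VS_CONSTBUF | NEW_SURFACES, upload_binding_table },
};

static void
upload_state(struct context *ctx)
{
   assert(ctx != NULL);

   for (size_t i = 0; i < sizeof(atoms) / sizeof(atoms[0]); i++) {
      /* Re-read ctx->dirty each time so flags set by earlier atoms
       * are seen by later ones. */
      if (ctx->dirty & atoms[i].dirty_mask)
         atoms[i].emit(ctx);
   }
   ctx->dirty = 0;
}
```

In this model, removing the `ctx->dirty |= NEW_SURFACES` line leaves the binding table untouched when only transform feedback state changed, which is the failure mode the commit message describes.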
[Mesa-dev] [PATCH 3/3] i965 Gen6+: Invalidate VF address-based cache on flush
Although there is not much documentation of this fact, there are in fact two separate VF caches:

- an index-based cache (described in the Sandy Bridge PRM, vol 2 part 1, section 2.1.2 "Vertex Cache"). This cache stores URB handles of vertex shader outputs; its purpose is to avoid redundant invocations of the vertex shader when drawing in random access mode (e.g. glDrawElements()), and the same vertex index is specified multiple times. It is automatically invalidated between 3D_PRIMITIVE commands and between instances within a single 3D_PRIMITIVE command.

- an address-based cache (mentioned briefly in vol 2 part 1, section 1.7.4 "PIPE_CONTROL Command"). This cache stores the data read from vertex buffers; its purpose is to avoid redundant memory accesses when doing instanced drawing or when multiple 3D_PRIMITIVE commands access the same vertex data. It needs to be manually invalidated whenever new data is written to a buffer that is used for vertex data.

Previous to this patch, it was not necessary for Mesa to explicitly invalidate the address-based cache, because there were no reasonable use cases in which the GPU would write to a vertex data buffer during a batch, and inter-batch flushing was taken care of by the kernel. However, with transform feedback, there is now a reasonable use case: vertex data is written to a buffer using transform feedback, and then that data is immediately re-used as vertex input in the next drawing operation. To make this use case work, we need to flush the address-based VF cache between transform feedback and the next draw operation.

Since we are already calling intel_batchbuffer_emit_mi_flush() when transform feedback completes, and intel_batchbuffer_emit_mi_flush() is intended to invalidate all caches, it seems reasonable to add VF cache invalidation to this function.

As with commit 63cf7fad13fc9cfdd2ae7b031426f79107000300 (i965: Flush pipeline on EndTransformFeedback), this is not an ideal solution. It would be preferable to only invalidate the VF cache if the next draw call was about to consume data generated by a previous draw call in the same batch. However, since we don't have the necessary dependency tracking infrastructure to figure that out right now, we have to overzealously invalidate the cache.

Fixes Piglit test "EXT_transform_feedback/immediate-reuse".
---
 src/mesa/drivers/dri/intel/intel_batchbuffer.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_batchbuffer.c b/src/mesa/drivers/dri/intel/intel_batchbuffer.c
index 4ff098a..cb23dbc 100644
--- a/src/mesa/drivers/dri/intel/intel_batchbuffer.c
+++ b/src/mesa/drivers/dri/intel/intel_batchbuffer.c
@@ -460,6 +460,7 @@ intel_batchbuffer_emit_mi_flush(struct intel_context *intel)
          OUT_BATCH(PIPE_CONTROL_INSTRUCTION_FLUSH |
                    PIPE_CONTROL_WRITE_FLUSH |
                    PIPE_CONTROL_DEPTH_CACHE_FLUSH |
+                   PIPE_CONTROL_VF_CACHE_INVALIDATE |
                    PIPE_CONTROL_TC_FLUSH |
                    PIPE_CONTROL_NO_WRITE |
                    PIPE_CONTROL_CS_STALL);
-- 
1.7.6.4
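The stale-read hazard described in this commit message can be demonstrated with a toy model of an address-keyed cache. This is purely illustrative and makes no claim about how the hardware cache is organised: `struct vf_cache`, `fetch_vertex()` and `invalidate_vf_cache()` are hypothetical, and the "cache" holds a single entry so the effect is easy to follow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy one-entry cache keyed by address; not a model of real hardware. */
struct vf_cache {
   bool valid;
   size_t addr;
   uint32_t data;
};

/* Read a vertex word, serving it from the cache when the address hits. */
static uint32_t
fetch_vertex(struct vf_cache *cache, const uint32_t *memory, size_t addr)
{
   assert(cache != NULL && memory != NULL);

   if (cache->valid && cache->addr == addr)
      return cache->data;   /* hit: memory is not consulted at all */

   cache->valid = true;
   cache->addr = addr;
   cache->data = memory[addr];
   return cache->data;
}

/* Stand-in for the invalidation the flush now requests. */
static void
invalidate_vf_cache(struct vf_cache *cache)
{
   cache->valid = false;
}
```

If a first draw reads a buffer, something then overwrites that buffer (as transform feedback would), and a second draw reads the same address, the second read returns the old value unless `invalidate_vf_cache()` ran in between. The cache cannot notice the write on its own because it is keyed only by address.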
Re: [Mesa-dev] [PATCH 8/8] i965: increase the brw eu instruction store size dynamically
On 12/21/2011 01:33 AM, Yuanhan Liu wrote:
> Here is the final patch to enable dynamic eu instruction store size:
> increase the brw eu instruction store size dynamically instead of just
> allocating it statically with a constant limit. This would fix something
> that 'GL_MAX_PROGRAM_INSTRUCTIONS_ARB was 16384 while the driver would
> limit it to 10000'.
>
> Signed-off-by: Yuanhan Liu <yuanhan@linux.intel.com>
> ---
>  src/mesa/drivers/dri/i965/brw_eu.c      |    7 +++
>  src/mesa/drivers/dri/i965/brw_eu.h      |    7 ---
>  src/mesa/drivers/dri/i965/brw_eu_emit.c |   12 +++-
>  3 files changed, 22 insertions(+), 4 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_eu.c b/src/mesa/drivers/dri/i965/brw_eu.c
> index 9b4dde8..7d206f3 100644
> --- a/src/mesa/drivers/dri/i965/brw_eu.c
> +++ b/src/mesa/drivers/dri/i965/brw_eu.c
> @@ -174,6 +174,13 @@ void
>  brw_init_compile(struct brw_context *brw, struct brw_compile *p, void *mem_ctx)
>  {
>     p->brw = brw;
> +   /*
> +    * Set the initial instruction store array size to 1024, if found that
> +    * isn't enough, then it will double the store size at brw_next_insn()
> +    * until it meet the BRW_EU_MAX_INSN
> +    */
> +   p->store_size = 1024;
> +   p->store = rzalloc_array(mem_ctx, struct brw_instruction, p->store_size);
>     p->nr_insn = 0;
>     p->current = p->stack;
>     p->compressed = false;
> diff --git a/src/mesa/drivers/dri/i965/brw_eu.h b/src/mesa/drivers/dri/i965/brw_eu.h
> index 9d3d7de..52567c2 100644
> --- a/src/mesa/drivers/dri/i965/brw_eu.h
> +++ b/src/mesa/drivers/dri/i965/brw_eu.h
> @@ -100,11 +100,12 @@ struct brw_glsl_call;
>
>
> -#define BRW_EU_MAX_INSN_STACK 5
> -#define BRW_EU_MAX_INSN 10000
> +#define BRW_EU_MAX_INSN_STACK 5
> +#define BRW_EU_MAX_INSN       (1024 * 1024)

I'm actually surprised to see BRW_EU_MAX_INSN at all. As far as I know,
there isn't an actual hardware limit on the number of instructions, so
I'm not sure why we should cap it at all. Especially not to some
arbitrary number. (I'm assuming that 1024 * 1024 is just something you
came up with arbitrarily...)

>  struct brw_compile {
> -   struct brw_instruction store[BRW_EU_MAX_INSN];
> +   struct brw_instruction *store;
> +   int store_size;
>     GLuint nr_insn;
>
>     void *mem_ctx;
> diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> index bd5fe6a..4396a0c 100644
> --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> @@ -691,7 +691,17 @@ brw_next_insn(struct brw_compile *p, GLuint opcode)
>  {
>     struct brw_instruction *insn;
>
> -   assert(p->nr_insn + 1 < BRW_EU_MAX_INSN);
> +   if (p->nr_insn + 1 > p->store_size) {
> +      if (p->nr_insn + 1 > BRW_EU_MAX_INSN) {
> +         assert(!"exceed max brw allowed eu instructions");
> +      } else {
> +         if (0)
> +            printf("incresing the store size to %d\n", p->store_size << 1);
> +         p->store_size <<= 1;
> +         p->store = reralloc(p->mem_ctx, p->store,
> +                             struct brw_instruction, p->store_size);
> +      }
> +   }
>
>     insn = &p->store[p->nr_insn++];
>     memcpy(insn, p->current, sizeof(*insn));
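For readers following the thread, here is a minimal standalone sketch of the grow-by-doubling idea the patch describes, written without any upper cap, as the review suggests. It is an illustration only: it uses plain calloc/realloc instead of the driver's rzalloc_array/reralloc, and struct insn, struct insn_store and the insn_store_* functions are hypothetical names that do not exist in Mesa.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an instruction; not the driver's real type. */
struct insn {
   unsigned char bytes[16];
};

/* Hypothetical growable store, loosely modelled on the store/store_size/
 * nr_insn fields discussed in the patch. */
struct insn_store {
   struct insn *data;   /* heap-allocated array, grown on demand */
   size_t size;         /* allocated capacity, in instructions */
   size_t count;        /* instructions handed out so far */
};

/* Allocates a zeroed store; returns 1 on success, 0 on failure. */
static int
insn_store_init(struct insn_store *s, size_t initial_size)
{
   assert(initial_size > 0);
   s->data = calloc(initial_size, sizeof(*s->data));
   s->size = s->data ? initial_size : 0;
   s->count = 0;
   return s->data != NULL;
}

/* Returns a zeroed slot for the next instruction, doubling the array when
 * it is full. Returns NULL if the reallocation fails. */
static struct insn *
insn_store_next(struct insn_store *s)
{
   assert(s->size > 0);
   if (s->count + 1 > s->size) {
      size_t new_size = s->size * 2;
      struct insn *grown = realloc(s->data, new_size * sizeof(*grown));
      if (!grown)
         return NULL;
      /* realloc does not zero the new tail, so clear it by hand. */
      memset(grown + s->size, 0, (new_size - s->size) * sizeof(*grown));
      s->data = grown;
      s->size = new_size;
   }
   return &s->data[s->count++];
}

static void
insn_store_free(struct insn_store *s)
{
   free(s->data);
   s->data = NULL;
   s->size = s->count = 0;
}
```

One consequence worth noting: any pointer into the array may become stale after a growth step, which is presumably why the earlier patches in this series switch callers to instruction indices rather than pointers.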
Re: [Mesa-dev] [PATCH 0/8] i965: dynamic eu instruction store size
On 12/21/2011 01:33 AM, Yuanhan Liu wrote:
> Hi,
>
> this is a new series of patches for dynamic eu instruction store size.
>
> The first 4 is from Eric. I just grabed it to make it rebase to current
> repo. The last 4 patch is from mine which some are based on those patches
> from Eric.
>
> Please help to review it.
>
> BTW, I checked those patches with all oglc test cases, and found no
> regression. (Sandybridge only).
>
> Thanks,
> Yuanhan Liu
>
> --
> Eric Anholt (4):
>   i965: Drop unused do_insn argument from gen6_CONT().
>   i965: Don't make consumers of brw_DO()/brw_WHILE() track loop start
>   i965: Don't make consumers of brw_WHILE do pre-gen6 BREAK/CONT patching
>   i965: Don't make consumers of brw_CONT/brw_WHILE track if depth in loop
>
> Yuanhan Liu (4):
>   i965: let the if_stack just store the instruction index
>   i965: get the jmp distance by instruction index
>   i965: call next_insn() before referencing a instruction by index

Patches 1-7 (v2 of 6 and after changing to bool in 7) are:
Reviewed-by: Kenneth Graunke <kenn...@whitecape.org>

>   i965: increase the brw eu instruction store size dynamically

Patch 8 does not get a R-b just yet.

Thanks for doing this, Yuanhan, I'm really glad to see the arbitrary 10000
limit die.

And Eric, thanks for cleaning up the rest of the control flow stack
code---it's /so/ much nicer now!
Re: [Mesa-dev] [PATCH 3/3] i965 Gen6+: Invalidate VF address-based cache on flush
On 12/22/2011 02:06 PM, Paul Berry wrote:
> Although there is not much documentation of this fact, there are in
> fact two separate VF caches:
>
> - an index-based cache (described in the Sandy Bridge PRM, vol 2 part
>   1, section 2.1.2 "Vertex Cache"). This cache stores URB handles of
>   vertex shader outputs; its purpose is to avoid redundant invocations
>   of the vertex shader when drawing in random access mode
>   (e.g. glDrawElements()), and the same vertex index is specified
>   multiple times. It is automatically invalidated between
>   3D_PRIMITIVE commands and between instances within a single
>   3D_PRIMITIVE command.
>
> - an address-based cache (mentioned briefly in vol 2 part 1, section
>   1.7.4 "PIPE_CONTROL Command"). This cache stores the data read from
>   vertex buffers; its purpose is to avoid redundant memory accesses
>   when doing instanced drawing or when multiple 3D_PRIMITIVE commands
>   access the same vertex data. It needs to be manually invalidated
>   whenever new data is written to a buffer that is used for vertex
>   data.
>
> Previous to this patch, it was not necessary for Mesa to explicitly
> invalidate the address-based cache, because there were no reasonable
> use cases in which the GPU would write to a vertex data buffer during
> a batch, and inter-batch flushing was taken care of by the kernel.
> However, with transform feedback, there is now a reasonable use case:
> vertex data is written to a buffer using transform feedback, and then
> that data is immediately re-used as vertex input in the next drawing
> operation.
>
> To make this use case work, we need to flush the address-based VF
> cache between transform feedback and the next draw operation. Since
> we are already calling intel_batchbuffer_emit_mi_flush() when
> transform feedback completes, and intel_batchbuffer_emit_mi_flush() is
> intended to invalidate all caches, it seems reasonable to add VF cache
> invalidation to this function.
>
> As with commit 63cf7fad13fc9cfdd2ae7b031426f79107000300 (i965: Flush
> pipeline on EndTransformFeedback), this is not an ideal solution. It
> would be preferable to only invalidate the VF cache if the next draw
> call was about to consume data generated by a previous draw call in
> the same batch. However, since we don't have the necessary dependency
> tracking infrastructure to figure that out right now, we have to
> overzealously invalidate the cache.
>
> Fixes Piglit test "EXT_transform_feedback/immediate-reuse".
> ---
>  src/mesa/drivers/dri/intel/intel_batchbuffer.c |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/intel/intel_batchbuffer.c b/src/mesa/drivers/dri/intel/intel_batchbuffer.c
> index 4ff098a..cb23dbc 100644
> --- a/src/mesa/drivers/dri/intel/intel_batchbuffer.c
> +++ b/src/mesa/drivers/dri/intel/intel_batchbuffer.c
> @@ -460,6 +460,7 @@ intel_batchbuffer_emit_mi_flush(struct intel_context *intel)
>           OUT_BATCH(PIPE_CONTROL_INSTRUCTION_FLUSH |
>                     PIPE_CONTROL_WRITE_FLUSH |
>                     PIPE_CONTROL_DEPTH_CACHE_FLUSH |
> +                   PIPE_CONTROL_VF_CACHE_INVALIDATE |
>                     PIPE_CONTROL_TC_FLUSH |
>                     PIPE_CONTROL_NO_WRITE |
>                     PIPE_CONTROL_CS_STALL);

I checked the workaround list, and it doesn't look like there are any
workarounds needed for VF (address based) cache invalidation. Plus, we
now do that in the kernel in between every batch, so I'm not concerned
about adding it here.

This series is:
Reviewed-by: Kenneth Graunke <kenn...@whitecape.org>
[Mesa-dev] [PATCH] mesa: Add Haiku build support
From: Alexander von Gluck IV kallis...@unixzen.com * Add Haiku as a platform to mklib * Fix GLU to allow building static libGLU * Remove a few existing Haiku defines that break the build --- Makefile |1 + acinclude.m4 |2 +- bin/mklib| 37 ++ src/gallium/auxiliary/os/os_thread.h |2 +- src/gallium/auxiliary/util/u_debug.h |2 - src/gallium/drivers/r300/Makefile|1 + src/glsl/link_uniforms.cpp |3 +- src/glu/sgi/Makefile | 15 +++-- src/mesa/main/querymatrix.c |2 +- 9 files changed, 51 insertions(+), 14 deletions(-) diff --git a/Makefile b/Makefile index cf6555c..4caa8ce 100644 --- a/Makefile +++ b/Makefile @@ -90,6 +90,7 @@ freebsd \ freebsd-dri \ freebsd-dri-amd64 \ freebsd-dri-x86 \ +haiku \ hpux10 \ hpux10-gcc \ hpux10-static \ diff --git a/acinclude.m4 b/acinclude.m4 index a5b389d..33ed8a8 100644 --- a/acinclude.m4 +++ b/acinclude.m4 @@ -34,7 +34,7 @@ if test $enable_pic != no; then # see if we're using GCC if test x$GCC = xyes; then case $host_os in -aix*|beos*|cygwin*|irix5*|irix6*|osf3*|osf4*|osf5*) +aix*|cygwin*|haiku*|irix5*|irix6*|osf3*|osf4*|osf5*) # PIC is the default for these OSes. 
;; mingw*|os2*|pw32*) diff --git a/bin/mklib b/bin/mklib index 70bd1a2..ca4b62c 100755 --- a/bin/mklib +++ b/bin/mklib @@ -959,6 +959,43 @@ case $ARCH in fi ;; +'Haiku') +if [ $STATIC = 1 ] ; then +LIBNAME=lib${LIBNAME}.a +if [ x$LINK = x ] ; then +# -linker was not specified so set default link command now +if [ $CPLUSPLUS = 1 ] ; then +LINK=g++ +else +LINK=gcc +fi +fi + +OPTS=-ru +if [ ${ALTOPTS} ] ; then +OPTS=${ALTOPTS} +fi + +echo mklib: Making static library for Haiku: ${LIBNAME} + +# expand .a into .o files +NEW_OBJECTS=`expand_archives ${LIBNAME}.obj $OBJECTS` + +# make static lib +FINAL_LIBS=`make_ar_static_lib ${OPTS} 1 ${LIBNAME} ${NEW_OBJECTS}` + +# remove temporary extracted .o files +rm -rf ${LIBNAME}.obj +else +LIBNAME=lib${LIBNAME}.so # prefix with lib, suffix with .so +OPTS=-shared + +echo mklib: Making shared library for Haiku: ${LIBNAME} + ${LINK} ${OPTS} ${LDFLAGS} ${OBJECTS} ${DEPS} -o ${LIBNAME} +FINAL_LIBS=${LIBNAME} +fi +;; + 'example') # If you're adding support for a new architecture, you can # start with this: diff --git a/src/gallium/auxiliary/os/os_thread.h b/src/gallium/auxiliary/os/os_thread.h index d830129..3e1c273 100644 --- a/src/gallium/auxiliary/os/os_thread.h +++ b/src/gallium/auxiliary/os/os_thread.h @@ -314,7 +314,7 @@ typedef int64_t pipe_condvar; * pipe_barrier */ -#if (defined(PIPE_OS_LINUX) || defined(PIPE_OS_BSD) || defined(PIPE_OS_SOLARIS) || defined(PIPE_OS_HAIKU)) !defined(PIPE_OS_ANDROID) +#if (defined(PIPE_OS_LINUX) || defined(PIPE_OS_BSD) || defined(PIPE_OS_SOLARIS)) !defined(PIPE_OS_ANDROID) typedef pthread_barrier_t pipe_barrier; diff --git a/src/gallium/auxiliary/util/u_debug.h b/src/gallium/auxiliary/util/u_debug.h index b5ea405..677e478 100644 --- a/src/gallium/auxiliary/util/u_debug.h +++ b/src/gallium/auxiliary/util/u_debug.h @@ -75,7 +75,6 @@ _debug_printf(const char *format, ...) 
* - avoid outputing large strings (512 bytes is the current maximum length * that is guaranteed to be printed in all platforms) */ -#if !defined(PIPE_OS_HAIKU) static INLINE void debug_printf(const char *format, ...) _util_printf_format(1,2); @@ -92,7 +91,6 @@ debug_printf(const char *format, ...) #endif } -#endif /* !PIPE_OS_HAIKU */ /* * ... isn't portable so we need to pass arguments in parentheses. diff --git a/src/gallium/drivers/r300/Makefile b/src/gallium/drivers/r300/Makefile index 5f56fc4..3e3a765 100644 --- a/src/gallium/drivers/r300/Makefile +++ b/src/gallium/drivers/r300/Makefile @@ -15,6 +15,7 @@ C_SOURCES += \ LIBRARY_INCLUDES = \ -I$(TOP)/include \ -I$(TOP)/src/mesa \ + -I$(TOP)/src/mapi \ -I$(TOP)/src/glsl include ../../Makefile.template diff --git a/src/glsl/link_uniforms.cpp b/src/glsl/link_uniforms.cpp index c7de480..f2e6648 100644 --- a/src/glsl/link_uniforms.cpp +++ b/src/glsl/link_uniforms.cpp @@ -336,9 +336,8 @@ link_assign_uniform_locations(struct gl_shader_program *prog) rzalloc_array(prog, struct gl_uniform_storage, num_user_uniforms); union gl_constant_value *data = rzalloc_array(uniforms, union gl_constant_value, num_data_slots); -#ifndef NDEBUG +
Re: [Mesa-dev] [PATCH] gallivm: Close a memory leak
Committed. Thanks.

Jose

----- Original Message -----
> Hi all
>
> This fixes a memory leak of 32 bytes on exit.
>
> From 924f8fdccb41b011f372bc57252005bcdb096105 Mon Sep 17 00:00:00 2001
> From: Lauri Kasanen <cur...@operamail.com>
> Date: Thu, 22 Dec 2011 21:28:33 +0200
> Subject: [PATCH] gallivm: Close a memory leak
>
> As reported by valgrind --leak-check=full glxgears.
>
> Signed-off-by: Lauri Kasanen <cur...@operamail.com>
> ---
>  src/gallium/auxiliary/gallivm/lp_bld_init.c |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_init.c b/src/gallium/auxiliary/gallivm/lp_bld_init.c
> index 45addee..503c04e 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_init.c
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_init.c
> @@ -345,6 +345,7 @@ gallivm_remove_garbage_collector_callback(garbage_collect_callback_func func,
>        if (cb->func == func && cb->cb_data == cb_data) {
>           /* found, remove it */
>           remove_from_list(cb);
> +         FREE(cb);
>           return;
>        }
>     }
> --
> 1.7.2.1
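The pattern behind this fix (unlinking a heap-allocated list node is not enough, it also has to be freed) can be sketched in isolation. This is only an illustration under assumed names: struct callback, add_callback, remove_callback and count_callbacks are hypothetical and are not the gallivm code, which uses its own list macros and FREE().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef void (*callback_func)(void *data);

/* Hypothetical callback record on a circular doubly-linked list. */
struct callback {
   callback_func func;
   void *data;
   struct callback *prev, *next;
};

/* Sentinel head: an empty list points at itself. */
static struct callback head = { NULL, NULL, &head, &head };

/* Example callback used only so there is something to register. */
static void
noop_callback(void *data)
{
   (void) data;
}

/* Registers a callback; returns 1 on success, 0 if allocation fails. */
static int
add_callback(callback_func func, void *data)
{
   struct callback *cb = malloc(sizeof(*cb));
   if (!cb)
      return 0;
   cb->func = func;
   cb->data = data;
   cb->next = head.next;
   cb->prev = &head;
   head.next->prev = cb;
   head.next = cb;
   return 1;
}

/* Removes the first matching callback; returns 1 if one was found. */
static int
remove_callback(callback_func func, void *data)
{
   struct callback *cb;
   for (cb = head.next; cb != &head; cb = cb->next) {
      if (cb->func == func && cb->data == data) {
         /* Unlink the node from the list... */
         cb->prev->next = cb->next;
         cb->next->prev = cb->prev;
         /* ...and release it. Without this the node leaks. */
         free(cb);
         return 1;
      }
   }
   return 0;
}

/* Counts the registered callbacks. */
static size_t
count_callbacks(void)
{
   size_t n = 0;
   const struct callback *cb;
   for (cb = head.next; cb != &head; cb = cb->next)
      n++;
   return n;
}
```

Returning immediately after free() matters here: the loop would otherwise read cb->next from freed memory.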
[Mesa-dev] i965/gen7 transform feedback
Here's today's patch series for gen7 transform feedback. It runs on
top of a kernel patch at people.freedesktop.org:~anholt/linux on the
gen7-reset-sol branch. I expected it to be easy, but not this easy.

Remaining test failures:

tessellation polygon flat_last          warn
tessellation quad_strip flat_last       warn
tessellation quads flat_last            warn
tessellation triangle_fan flat_first    fail

Also, it looks like none of the current piglit tests test transform
feedback across batchbuffers. I don't expect major problems there,
given that we have a userland count of verts emitted.
[Mesa-dev] [PATCH 4/7] i965/gen7: Move SOL stage disable to gen7_sol_state.c
We'll be growing more code in here as we actually enable the unit. Reviewed-by: Kenneth Graunke kenn...@whitecape.org --- src/mesa/drivers/dri/i965/Makefile.sources |1 + src/mesa/drivers/dri/i965/brw_state_upload.c |1 + src/mesa/drivers/dri/i965/gen7_disable.c |7 --- src/mesa/drivers/dri/i965/gen7_sol_state.c | 56 ++ 4 files changed, 58 insertions(+), 7 deletions(-) create mode 100644 src/mesa/drivers/dri/i965/gen7_sol_state.c diff --git a/src/mesa/drivers/dri/i965/Makefile.sources b/src/mesa/drivers/dri/i965/Makefile.sources index e50f9c3..3eeac6f 100644 --- a/src/mesa/drivers/dri/i965/Makefile.sources +++ b/src/mesa/drivers/dri/i965/Makefile.sources @@ -104,6 +104,7 @@ i965_C_SOURCES := \ gen7_misc_state.c \ gen7_sampler_state.c \ gen7_sf_state.c \ + gen7_sol_state.c \ gen7_urb.c \ gen7_viewport_state.c \ gen7_vs_state.c \ diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c b/src/mesa/drivers/dri/i965/brw_state_upload.c index 74d01d8..66382b7 100644 --- a/src/mesa/drivers/dri/i965/brw_state_upload.c +++ b/src/mesa/drivers/dri/i965/brw_state_upload.c @@ -220,6 +220,7 @@ const struct brw_tracked_state *gen7_atoms[] = gen7_disable_stages, gen7_vs_state, + gen7_sol_state, gen7_clip_state, gen7_sbe_state, gen7_sf_state, diff --git a/src/mesa/drivers/dri/i965/gen7_disable.c b/src/mesa/drivers/dri/i965/gen7_disable.c index a44d315..b37aa6c 100644 --- a/src/mesa/drivers/dri/i965/gen7_disable.c +++ b/src/mesa/drivers/dri/i965/gen7_disable.c @@ -122,13 +122,6 @@ disable_stages(struct brw_context *brw) OUT_BATCH(_3DSTATE_BINDING_TABLE_POINTERS_DS 16 | (2 - 2)); OUT_BATCH(0); ADVANCE_BATCH(); - - /* Disable the SOL stage */ - BEGIN_BATCH(3); - OUT_BATCH(_3DSTATE_STREAMOUT 16 | (3 - 2)); - OUT_BATCH(0); - OUT_BATCH(0); - ADVANCE_BATCH(); } const struct brw_tracked_state gen7_disable_stages = { diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c b/src/mesa/drivers/dri/i965/gen7_sol_state.c new file mode 100644 index 000..fcda08d --- /dev/null +++ 
b/src/mesa/drivers/dri/i965/gen7_sol_state.c @@ -0,0 +1,56 @@ +/* + * Copyright © 2011 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + */ + +/** + * @file gen7_sol_state.c + * + * Controls the stream output logic (SOL) stage of the gen7 hardware, which is + * used to implement GL_EXT_transform_feedback. 
+ */ + +#include brw_context.h +#include brw_state.h +#include brw_defines.h +#include intel_batchbuffer.h + +static void +upload_sol_state(struct brw_context *brw) +{ + struct intel_context *intel = brw-intel; + + /* Disable the SOL stage */ + BEGIN_BATCH(3); + OUT_BATCH(_3DSTATE_STREAMOUT 16 | (3 - 2)); + OUT_BATCH(0); + OUT_BATCH(0); + ADVANCE_BATCH(); +} + +const struct brw_tracked_state gen7_sol_state = { + .dirty = { + .mesa = 0, + .brw = BRW_NEW_BATCH, + .cache = 0, + }, + .emit = upload_sol_state, +}; -- 1.7.7.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/7] i965/gen7: Make primitives_written counting work.
The code was relying on gs.prog_data's copy of the
number-of-verts-per-prim, which segfaulted on gen7 since it doesn't
make a GS program. We can easily calculate that value right here.
---
 src/mesa/drivers/dri/i965/brw_draw.c |   33 +++--
 1 files changed, 27 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c
index 082bb9a..c116d39 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -379,6 +379,30 @@ static void brw_postdraw_set_buffers_need_resolve(struct brw_context *brw)
    }
 }
 
+static int
+verts_per_prim(GLenum mode)
+{
+   switch (mode) {
+   case GL_POINTS:
+      return 1;
+   case GL_LINE_STRIP:
+   case GL_LINE_LOOP:
+   case GL_LINES:
+      return 2;
+   case GL_TRIANGLE_STRIP:
+   case GL_TRIANGLE_FAN:
+   case GL_POLYGON:
+   case GL_TRIANGLES:
+   case GL_QUADS:
+   case GL_QUAD_STRIP:
+      return 3;
+   default:
+      _mesa_problem(NULL,
+                    "unknown prim type in transform feedback primitive count");
+      return 0;
+   }
+}
+
 /**
  * Update internal counters based on the the drawing operation described in
  * prim.
@@ -397,14 +421,11 @@ brw_update_primitive_count(struct brw_context *brw,
     * able to reload SVBI 0 with the correct value in case we have to start
     * a new batch buffer.
     */
-   unsigned svbi_postincrement_value =
-      brw->gs.prog_data->svbi_postincrement_value;
+   unsigned verts = verts_per_prim(prim->mode);
    uint32_t space_avail =
-      (brw->sol.svbi_0_max_index - brw->sol.svbi_0_starting_index)
-      / svbi_postincrement_value;
+      (brw->sol.svbi_0_max_index - brw->sol.svbi_0_starting_index) / verts;
    uint32_t primitives_written = MIN2 (space_avail, count);
-   brw->sol.svbi_0_starting_index +=
-      svbi_postincrement_value * primitives_written;
+   brw->sol.svbi_0_starting_index += verts;
 
    /* And update the TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN query. */
    brw->sol.primitives_written += primitives_written;
-- 
1.7.7.3
[Mesa-dev] [PATCH 3/7] i965/gen7: Add register definitions for GL_EXT_transform_feedback.
Reviewed-by: Kenneth Graunke kenn...@whitecape.org --- src/mesa/drivers/dri/i965/brw_defines.h | 76 ++- src/mesa/drivers/dri/intel/intel_reg.h | 15 ++ 2 files changed, 89 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h index 4edfaf7..4bb7f00 100644 --- a/src/mesa/drivers/dri/i965/brw_defines.h +++ b/src/mesa/drivers/dri/i965/brw_defines.h @@ -1307,6 +1307,42 @@ enum brw_wm_barycentric_interp_mode { #define _3DSTATE_CONSTANT_HS 0x7819 /* GEN7+ */ #define _3DSTATE_CONSTANT_DS 0x781A /* GEN7+ */ +#define _3DSTATE_STREAMOUT0x781e /* GEN7+ */ +/* DW1 */ +# define SO_FUNCTION_ENABLE(1 31) +# define SO_RENDERING_DISABLE (1 30) +/* This selects which incoming rendering stream goes down the pipeline. The + * rendering stream is 0 if not defined by special cases in the GS state. + */ +# define SO_RENDER_STREAM_SELECT_SHIFT 27 +# define SO_RENDER_STREAM_SELECT_MASK INTEL_MASK(28, 27) +/* Controls reordering of TRISTRIP_* elements in stream output (not rendering). 
+ */ +# define SO_REORDER_TRAILING (1 26) +/* Controls SO_NUM_PRIMS_WRITTEN_* and SO_PRIM_STORAGE_* */ +# define SO_STATISTICS_ENABLE (1 25) +# define SO_BUFFER_ENABLE_3(1 11) +# define SO_BUFFER_ENABLE_2(1 10) +# define SO_BUFFER_ENABLE_1(1 9) +# define SO_BUFFER_ENABLE_0(1 8) +/* DW2 */ +# define SO_STREAM_3_VERTEX_READ_OFFSET_SHIFT 29 +# define SO_STREAM_3_VERTEX_READ_OFFSET_MASK INTEL_MASK(29, 29) +# define SO_STREAM_3_VERTEX_READ_LENGTH_SHIFT 24 +# define SO_STREAM_3_VERTEX_READ_LENGTH_MASK INTEL_MASK(28, 24) +# define SO_STREAM_2_VERTEX_READ_OFFSET_SHIFT 21 +# define SO_STREAM_2_VERTEX_READ_OFFSET_MASK INTEL_MASK(21, 21) +# define SO_STREAM_2_VERTEX_READ_LENGTH_SHIFT 16 +# define SO_STREAM_2_VERTEX_READ_LENGTH_MASK INTEL_MASK(20, 16) +# define SO_STREAM_1_VERTEX_READ_OFFSET_SHIFT 13 +# define SO_STREAM_1_VERTEX_READ_OFFSET_MASK INTEL_MASK(13, 13) +# define SO_STREAM_1_VERTEX_READ_LENGTH_SHIFT 8 +# define SO_STREAM_1_VERTEX_READ_LENGTH_MASK INTEL_MASK(12, 8) +# define SO_STREAM_0_VERTEX_READ_OFFSET_SHIFT 5 +# define SO_STREAM_0_VERTEX_READ_OFFSET_MASK INTEL_MASK(5, 5) +# define SO_STREAM_0_VERTEX_READ_LENGTH_SHIFT 0 +# define SO_STREAM_0_VERTEX_READ_LENGTH_MASK INTEL_MASK(4, 0) + /* 3DSTATE_WM for Gen7 */ /* DW1 */ # define GEN7_WM_STATISTICS_ENABLE (1 31) @@ -1373,8 +1409,6 @@ enum brw_wm_barycentric_interp_mode { /* DW6: kernel 1 pointer */ /* DW7: kernel 2 pointer */ -#define _3DSTATE_STREAMOUT 0x781e /* GEN7+ */ - #define _3DSTATE_SAMPLE_MASK 0x7818 /* GEN6+ */ #define _3DSTATE_DRAWING_RECTANGLE 0x7900 @@ -1414,6 +1448,44 @@ enum brw_wm_barycentric_interp_mode { # define DEPTH_CLEAR_VALID (1 15) /* DW1: depth clear value */ +#define _3DSTATE_SO_DECL_LIST 0x7917 /* GEN7+ */ +/* DW1 */ +# define SO_STREAM_TO_BUFFER_SELECTS_3_SHIFT 12 +# define SO_STREAM_TO_BUFFER_SELECTS_3_MASKINTEL_MASK(15, 12) +# define SO_STREAM_TO_BUFFER_SELECTS_2_SHIFT 8 +# define SO_STREAM_TO_BUFFER_SELECTS_2_MASKINTEL_MASK(11, 8) +# define SO_STREAM_TO_BUFFER_SELECTS_1_SHIFT 4 +# 
define SO_STREAM_TO_BUFFER_SELECTS_1_MASKINTEL_MASK(7, 4) +# define SO_STREAM_TO_BUFFER_SELECTS_0_SHIFT 0 +# define SO_STREAM_TO_BUFFER_SELECTS_0_MASKINTEL_MASK(3, 0) +/* DW2 */ +# define SO_NUM_ENTRIES_3_SHIFT24 +# define SO_NUM_ENTRIES_3_MASK INTEL_MASK(31, 24) +# define SO_NUM_ENTRIES_2_SHIFT16 +# define SO_NUM_ENTRIES_2_MASK INTEL_MASK(23, 16) +# define SO_NUM_ENTRIES_1_SHIFT8 +# define SO_NUM_ENTRIES_1_MASK INTEL_MASK(15, 8) +# define SO_NUM_ENTRIES_0_SHIFT0 +# define SO_NUM_ENTRIES_0_MASK INTEL_MASK(7, 0) + +/* SO_DECL DW0 */ +# define SO_DECL_OUTPUT_BUFFER_SLOT_SHIFT 12 +# define SO_DECL_OUTPUT_BUFFER_SLOT_MASK INTEL_MASK(13, 12) +# define SO_DECL_HOLE_FLAG (1 11) +# define SO_DECL_REGISTER_INDEX_SHIFT 4 +# define SO_DECL_REGISTER_INDEX_MASK INTEL_MASK(9, 4) +# define SO_DECL_COMPONENT_MASK_SHIFT 0 +# define SO_DECL_COMPONENT_MASK_MASK INTEL_MASK(3, 0) +
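As an aside, shift/mask pairs like the SO_DECL definitions in this patch are typically used to pack several small fields into one word. The sketch below shows that for the SO_DECL DW0 layout (buffer slot in bits 13:12, hole flag in bit 11, register index in bits 9:4, component mask in bits 3:0). FIELD_MASK, the DECL_* names and pack_so_decl are hypothetical and written for this example; in particular FIELD_MASK is only assumed to behave like INTEL_MASK(high, low), whose definition is not part of this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: a mask covering bits high..low inclusive.
 * Only valid for fields narrower than 32 bits. */
#define FIELD_MASK(high, low) \
   (((1u << ((high) - (low) + 1)) - 1u) << (low))

/* Field positions copied from the SO_DECL DW0 definitions in the patch;
 * the names are shortened to mark them as example-only. */
#define DECL_BUFFER_SLOT_SHIFT     12
#define DECL_BUFFER_SLOT_MASK      FIELD_MASK(13, 12)
#define DECL_HOLE_FLAG             (1u << 11)
#define DECL_REGISTER_INDEX_SHIFT  4
#define DECL_REGISTER_INDEX_MASK   FIELD_MASK(9, 4)
#define DECL_COMPONENT_MASK_SHIFT  0
#define DECL_COMPONENT_MASK_MASK   FIELD_MASK(3, 0)

/* Packs one 16-bit SO_DECL: which buffer to write, which VUE register to
 * read, and a bitmask enabling the first num_components components. */
static uint16_t
pack_so_decl(unsigned buffer, unsigned reg_index, unsigned num_components)
{
   uint32_t decl = 0;

   assert(buffer < 4);
   assert(reg_index < 64);
   assert(num_components >= 1 && num_components <= 4);

   decl |= (buffer << DECL_BUFFER_SLOT_SHIFT) & DECL_BUFFER_SLOT_MASK;
   decl |= (reg_index << DECL_REGISTER_INDEX_SHIFT) & DECL_REGISTER_INDEX_MASK;
   decl |= (((1u << num_components) - 1u) << DECL_COMPONENT_MASK_SHIFT) &
           DECL_COMPONENT_MASK_MASK;

   return (uint16_t) decl;
}

/* Extracts the register index field again, to show the reverse direction. */
static unsigned
so_decl_register_index(uint16_t decl)
{
   return (decl & DECL_REGISTER_INDEX_MASK) >> DECL_REGISTER_INDEX_SHIFT;
}
```

Masking after shifting is redundant when the asserts hold, but it keeps an out-of-range value from spilling into a neighbouring field in builds where asserts are compiled out.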
[Mesa-dev] [PATCH 1/7] i965/gen7: Enable EXT_transform_feedback extension under 3.0 override.
Reviewed-by: Kenneth Graunke <kenn...@whitecape.org>
---
 src/mesa/drivers/dri/intel/intel_extensions.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_extensions.c b/src/mesa/drivers/dri/intel/intel_extensions.c
index 7ab5d90..09ee9ba 100644
--- a/src/mesa/drivers/dri/intel/intel_extensions.c
+++ b/src/mesa/drivers/dri/intel/intel_extensions.c
@@ -104,7 +104,7 @@ intelInitExtensions(struct gl_context *ctx)
       ctx->Const.GLSLVersion = 120;
    _mesa_override_glsl_version(ctx);
 
-   if (intel->gen == 6)
+   if (intel->gen == 6 || (intel->gen == 7 && override_version >= 30))
       ctx->Extensions.EXT_transform_feedback = true;
 
    if (intel->gen >= 5)
-- 
1.7.7.3
[Mesa-dev] [PATCH 5/7] i965/gen7: Add support for rasterization discard.
Fixes the piglit discard-* tests.

Reviewed-by: Kenneth Graunke <kenn...@whitecape.org>
---
 src/mesa/drivers/dri/i965/gen7_sol_state.c |    8 +++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c b/src/mesa/drivers/dri/i965/gen7_sol_state.c
index fcda08d..650f625 100644
--- a/src/mesa/drivers/dri/i965/gen7_sol_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c
@@ -37,6 +37,12 @@ static void
 upload_sol_state(struct brw_context *brw)
 {
    struct intel_context *intel = &brw->intel;
+   struct gl_context *ctx = &intel->ctx;
+   uint32_t dw1 = 0;
+
+   /* _NEW_RASTERIZER_DISCARD */
+   if (ctx->RasterDiscard)
+      dw1 |= SO_RENDERING_DISABLE;
 
    /* Disable the SOL stage */
    BEGIN_BATCH(3);
@@ -48,7 +54,7 @@ upload_sol_state(struct brw_context *brw)
 
 const struct brw_tracked_state gen7_sol_state = {
    .dirty = {
-      .mesa  = 0,
+      .mesa  = _NEW_RASTERIZER_DISCARD,
       .brw   = BRW_NEW_BATCH,
       .cache = 0,
    },
-- 
1.7.7.3
[Mesa-dev] [PATCH 6/7] i965/gen7: Add support for transform feedback.
Fixes almost all of the transform feedback piglit tests. Remaining are a few tests related to tesselation for quads/trifans/tristrips/polygons with flat shading. --- src/mesa/drivers/dri/i965/gen7_sol_state.c | 199 ++- 1 files changed, 191 insertions(+), 8 deletions(-) diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c b/src/mesa/drivers/dri/i965/gen7_sol_state.c index 650f625..a5e28b6 100644 --- a/src/mesa/drivers/dri/i965/gen7_sol_state.c +++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c @@ -32,31 +32,214 @@ #include brw_state.h #include brw_defines.h #include intel_batchbuffer.h +#include intel_buffer_objects.h static void -upload_sol_state(struct brw_context *brw) +upload_3dstate_so_buffers(struct brw_context *brw) +{ + struct intel_context *intel = brw-intel; + struct gl_context *ctx = intel-ctx; + /* BRW_NEW_VERTEX_PROGRAM */ + const struct gl_shader_program *vs_prog = + ctx-Shader.CurrentVertexProgram; + const struct gl_transform_feedback_info *linked_xfb_info = + vs_prog-LinkedTransformFeedback; + struct gl_transform_feedback_object *xfb_obj = + ctx-TransformFeedback.CurrentObject; + int i; + + /* Set up the up to 4 output buffers. These are the ranges defined in the +* gl_transform_feedback_object. +*/ + for (i = 0; i 4; i++) { + struct gl_buffer_object *bufferobj = xfb_obj-Buffers[i]; + drm_intel_bo *bo; + uint32_t start, end; + + if (!xfb_obj-Buffers[i]) { +/* The pitch of 0 in this command indicates that the buffer is + * unbound and won't be written to. 
+ */ +BEGIN_BATCH(4); +OUT_BATCH(_3DSTATE_SO_BUFFER 16 | (4 - 2)); +OUT_BATCH((i SO_BUFFER_INDEX_SHIFT)); +OUT_BATCH(0); +OUT_BATCH(0); +ADVANCE_BATCH(); + +continue; + } + + bo = intel_buffer_object(bufferobj)-buffer; + + start = xfb_obj-Offset[i]; + assert(start % 4 == 0); + end = ALIGN(start + xfb_obj-Size[i], 4); + assert(end = bo-size); + + BEGIN_BATCH(4); + OUT_BATCH(_3DSTATE_SO_BUFFER 16 | (4 - 2)); + OUT_BATCH((i SO_BUFFER_INDEX_SHIFT) | + ((linked_xfb_info-BufferStride[i] * 4) +SO_BUFFER_PITCH_SHIFT)); + OUT_RELOC(bo, I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER, start); + OUT_RELOC(bo, I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER, end); + ADVANCE_BATCH(); + } +} + +/** + * Outputs the 3DSTATE_SO_DECL_LIST command. + * + * The data output is a series of 64-bit entries containing a SO_DECL per + * stream. We only have one stream of rendering coming out of the GS unit, so + * we only emit stream 0 (low 16 bits) SO_DECLs. + */ +static void +upload_3dstate_so_decl_list(struct brw_context *brw, + struct brw_vue_map *vue_map) +{ + struct intel_context *intel = brw-intel; + struct gl_context *ctx = intel-ctx; + /* BRW_NEW_VERTEX_PROGRAM */ + const struct gl_shader_program *vs_prog = + ctx-Shader.CurrentVertexProgram; + /* NEW_TRANSFORM_FEEDBACK */ + const struct gl_transform_feedback_info *linked_xfb_info = + vs_prog-LinkedTransformFeedback; + int i; + uint16_t so_decl[128]; + int buffer_mask = 0; + int next_offset[4] = {0, 0, 0, 0}; + + /* Construct the list of SO_DECLs to be emitted. The formatting of the +* command is feels strange -- each dword pair contains a SO_DECL per stream. 
+*/ + for (i = 0; i linked_xfb_info-NumOutputs; i++) { + int buffer = linked_xfb_info-Outputs[i].OutputBuffer; + uint16_t decl = 0; + int vert_result = linked_xfb_info-Outputs[i].OutputRegister; + + buffer_mask |= 1 buffer; + + decl |= buffer SO_DECL_OUTPUT_BUFFER_SLOT_SHIFT; + decl |= vue_map-vert_result_to_slot[vert_result] +SO_DECL_REGISTER_INDEX_SHIFT; + decl |= ((1 linked_xfb_info-Outputs[i].NumComponents) - 1) +SO_DECL_COMPONENT_MASK_SHIFT; + + /* FINISHME */ + assert(linked_xfb_info-Outputs[i].DstOffset == next_offset[buffer]); + + next_offset[buffer] += linked_xfb_info-Outputs[i].NumComponents; + + so_decl[i] = decl; + } + + BEGIN_BATCH(linked_xfb_info-NumOutputs * 2 + 3); + OUT_BATCH(_3DSTATE_SO_DECL_LIST 16 | +(linked_xfb_info-NumOutputs * 2 + 1)); + + OUT_BATCH((buffer_mask SO_STREAM_TO_BUFFER_SELECTS_0_SHIFT) | +(0 SO_STREAM_TO_BUFFER_SELECTS_1_SHIFT) | +(0 SO_STREAM_TO_BUFFER_SELECTS_2_SHIFT) | +(0 SO_STREAM_TO_BUFFER_SELECTS_3_SHIFT)); + + OUT_BATCH((linked_xfb_info-NumOutputs SO_NUM_ENTRIES_0_SHIFT) | +(0 SO_NUM_ENTRIES_1_SHIFT) | +(0 SO_NUM_ENTRIES_2_SHIFT) | +(0 SO_NUM_ENTRIES_3_SHIFT)); + + for (i = 0; i linked_xfb_info-NumOutputs; i++) { + OUT_BATCH(so_decl[i]); + OUT_BATCH(0); + } + + ADVANCE_BATCH(); +} + +static void +upload_3dstate_streamout(struct brw_context *brw, bool active, +
[Mesa-dev] [PATCH 7/7] i965/gen7: Fix feedback for flat-shaded tristrips versus provoking vertex.
Fixes piglit tessellation triangle_strip flat_last.
---
 src/mesa/drivers/dri/i965/gen7_sol_state.c |    5 +
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c b/src/mesa/drivers/dri/i965/gen7_sol_state.c
index a5e28b6..93ca868 100644
--- a/src/mesa/drivers/dri/i965/gen7_sol_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c
@@ -182,6 +182,10 @@ upload_3dstate_streamout(struct brw_context *brw, bool active,
       dw1 |= SO_FUNCTION_ENABLE;
       dw1 |= SO_STATISTICS_ENABLE;
 
+      /* _NEW_LIGHT */
+      if (ctx->Light.ProvokingVertex != GL_FIRST_VERTEX_CONVENTION)
+         dw1 |= SO_REORDER_TRAILING;
+
       for (i = 0; i < 4; i++) {
          if (xfb_obj->Buffers[i]) {
             dw1 |= SO_BUFFER_ENABLE_0 << i;
@@ -235,6 +239,7 @@ upload_sol_state(struct brw_context *brw)
 const struct brw_tracked_state gen7_sol_state = {
    .dirty = {
       .mesa  = (_NEW_RASTERIZER_DISCARD |
+                _NEW_LIGHT |
                 _NEW_TRANSFORM_FEEDBACK |
                 _NEW_TRANSFORM),
       .brw   = (BRW_NEW_BATCH |
-- 
1.7.7.3
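Why the patch has to add _NEW_LIGHT to the .dirty mask as well as reading ctx->Light.ProvokingVertex: a state atom is only re-emitted when one of the flags it lists is dirty, so state it reads without listing would go stale. The sketch below illustrates that idea under heavy simplification. It uses a single dirty word rather than the separate mesa/brw/cache words, and struct state_atom, upload_atoms, count_emit and the DIRTY_* flags are hypothetical names, not the driver's.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical dirty flags standing in for the _NEW_* / BRW_NEW_* bits. */
#define DIRTY_RASTERIZER_DISCARD (1u << 0)
#define DIRTY_LIGHT              (1u << 1)
#define DIRTY_BATCH              (1u << 2)

/* Hypothetical, simplified state atom: one mask plus an emit callback. */
struct state_atom {
   uint32_t dirty;              /* flags this atom's emit function reads */
   void (*emit)(void *ctx);     /* re-uploads the state */
};

/* Example emit function: just counts how often it is called. */
static void
count_emit(void *ctx)
{
   ++*(unsigned *) ctx;
}

/* Emits every atom whose mask intersects the current dirty bits and
 * returns how many atoms were emitted. */
static unsigned
upload_atoms(const struct state_atom *atoms, unsigned num_atoms,
             uint32_t dirty, void *ctx)
{
   unsigned emitted = 0;
   unsigned i;

   for (i = 0; i < num_atoms; i++) {
      if (atoms[i].dirty & dirty) {
         atoms[i].emit(ctx);
         emitted++;
      }
   }
   return emitted;
}
```

In this model, an atom that reads the provoking-vertex setting but omits DIRTY_LIGHT from its mask would simply be skipped when only the light state changes.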
Re: [Mesa-dev] [PATCH 0/8] i965: dynamic eu instruction store size
On Thu, Dec 22, 2011 at 02:37:58PM -0800, Kenneth Graunke wrote:
> On 12/21/2011 01:33 AM, Yuanhan Liu wrote:
> > Hi,
> >
> > this is a new series of patches for dynamic eu instruction store size.
> >
> > The first 4 is from Eric. I just grabed it to make it rebase to current
> > repo. The last 4 patch is from mine which some are based on those
> > patches from Eric.
> >
> > Please help to review it.
> >
> > BTW, I checked those patches with all oglc test cases, and found no
> > regression. (Sandybridge only).
> >
> > Thanks,
> > Yuanhan Liu
> >
> > --
> > Eric Anholt (4):
> >   i965: Drop unused do_insn argument from gen6_CONT().
> >   i965: Don't make consumers of brw_DO()/brw_WHILE() track loop start
> >   i965: Don't make consumers of brw_WHILE do pre-gen6 BREAK/CONT patching
> >   i965: Don't make consumers of brw_CONT/brw_WHILE track if depth in loop
> >
> > Yuanhan Liu (4):
> >   i965: let the if_stack just store the instruction index
> >   i965: get the jmp distance by instruction index
> >   i965: call next_insn() before referencing a instruction by index
>
> Patches 1-7 (v2 of 6 and after changing to bool in 7) are:
> Reviewed-by: Kenneth Graunke <kenn...@whitecape.org>

Thanks.

> >   i965: increase the brw eu instruction store size dynamically
>
> Patch 8 does not get a R-b just yet.

OK, will fix it.

> Thanks for doing this, Yuanhan, I'm really glad to see the arbitrary 10000
> limit die.

You're welcome, and it's my pleasure.

> And Eric, thanks for cleaning up the rest of the control flow stack
> code---it's /so/ much nicer now!
Re: [Mesa-dev] [PATCH 2/7] i965/gen7: Make primitives_written counting work.
On 22 December 2011 16:54, Eric Anholt e...@anholt.net wrote:
> The code was relying on gs.prog_data's copy of the
> number-of-verts-per-prim, which segfaulted on gen7 since it doesn't
> make a GS program. We can easily calculate that value right here.
> ---
>  src/mesa/drivers/dri/i965/brw_draw.c | 33 +++--
>  1 files changed, 27 insertions(+), 6 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c
> index 082bb9a..c116d39 100644
> --- a/src/mesa/drivers/dri/i965/brw_draw.c
> +++ b/src/mesa/drivers/dri/i965/brw_draw.c
> @@ -379,6 +379,30 @@ static void brw_postdraw_set_buffers_need_resolve(struct brw_context *brw)
>     }
>  }
>
> +static int
> +verts_per_prim(GLenum mode)
> +{
> +   switch (mode) {
> +   case GL_POINTS:
> +      return 1;
> +   case GL_LINE_STRIP:
> +   case GL_LINE_LOOP:
> +   case GL_LINES:
> +      return 2;
> +   case GL_TRIANGLE_STRIP:
> +   case GL_TRIANGLE_FAN:
> +   case GL_POLYGON:
> +   case GL_TRIANGLES:
> +   case GL_QUADS:
> +   case GL_QUAD_STRIP:
> +      return 3;
> +   default:
> +      _mesa_problem(NULL,
> +                    "unknown prim type in transform feedback primitive count");
> +      return 0;
> +   }
> +}
> +
>  /**
>   * Update internal counters based on the the drawing operation described in
>   * prim.
> @@ -397,14 +421,11 @@ brw_update_primitive_count(struct brw_context *brw,
>      * able to reload SVBI 0 with the correct value in case we have to start
>      * a new batch buffer.
>      */
> -   unsigned svbi_postincrement_value =
> -      brw->gs.prog_data->svbi_postincrement_value;
> +   unsigned verts = verts_per_prim(prim->mode);
>     uint32_t space_avail =
> -      (brw->sol.svbi_0_max_index - brw->sol.svbi_0_starting_index)
> -      / svbi_postincrement_value;
> +      (brw->sol.svbi_0_max_index - brw->sol.svbi_0_starting_index) / verts;
>     uint32_t primitives_written = MIN2 (space_avail, count);
> -   brw->sol.svbi_0_starting_index +=
> -      svbi_postincrement_value * primitives_written;
> +   brw->sol.svbi_0_starting_index += verts;

This should be
brw->sol.svbi_0_starting_index += verts * primitives_written.

With that change, this is
Reviewed-by: Paul Berry stereotype...@gmail.com
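The effect of the requested fix can be seen in a small standalone sketch. This is not the driver code: `struct sol_state`, `enum prim_mode` and `update_primitive_count()` are hypothetical stand-ins for the i965 structures, and only three primitive modes are modelled. It illustrates why the starting index has to advance by `verts * primitives_written` rather than by `verts` alone.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the GL primitive modes (not the real GLenum values). */
enum prim_mode { PRIM_POINTS, PRIM_LINES, PRIM_TRIANGLES };

/* Hypothetical subset of the stream-output state touched by the patch. */
struct sol_state {
   uint32_t svbi_0_starting_index;
   uint32_t svbi_0_max_index;
   uint32_t primitives_written;
};

/* Number of vertices that transform feedback records per primitive. */
static unsigned
verts_per_prim(enum prim_mode mode)
{
   switch (mode) {
   case PRIM_POINTS:    return 1;
   case PRIM_LINES:     return 2;
   case PRIM_TRIANGLES: return 3;
   }
   return 0;
}

/* Advance the counters for a draw that tries to record `count` primitives. */
static void
update_primitive_count(struct sol_state *sol, enum prim_mode mode,
                       uint32_t count)
{
   unsigned verts = verts_per_prim(mode);
   uint32_t space_avail, written;

   if (verts == 0)
      return; /* unknown mode: leave the counters untouched */

   space_avail = (sol->svbi_0_max_index - sol->svbi_0_starting_index) / verts;
   written = count < space_avail ? count : space_avail;

   /* The point raised in review: advance by every vertex actually written. */
   sol->svbi_0_starting_index += verts * written;
   sol->primitives_written += written;
}
```

With a buffer of 10 vertices, two triangles move the starting index to 6; a further request for five triangles only has room for one, so the index ends at 9 with three primitives written.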
Re: [Mesa-dev] [PATCH 2/7] i965/gen7: Make primitives_written counting work.
On 12/22/2011 04:54 PM, Eric Anholt wrote:
> The code was relying on gs.prog_data's copy of the
> number-of-verts-per-prim, which segfaulted on gen7 since it doesn't
> make a GS program. We can easily calculate that value right here.
> ---
>  src/mesa/drivers/dri/i965/brw_draw.c | 33 +++--
>  1 files changed, 27 insertions(+), 6 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c
> index 082bb9a..c116d39 100644
> --- a/src/mesa/drivers/dri/i965/brw_draw.c
> +++ b/src/mesa/drivers/dri/i965/brw_draw.c
> @@ -379,6 +379,30 @@ static void brw_postdraw_set_buffers_need_resolve(struct brw_context *brw)
>     }
>  }
>
> +static int
> +verts_per_prim(GLenum mode)
> +{
> +   switch (mode) {
> +   case GL_POINTS:
> +      return 1;
> +   case GL_LINE_STRIP:
> +   case GL_LINE_LOOP:
> +   case GL_LINES:
> +      return 2;
> +   case GL_TRIANGLE_STRIP:
> +   case GL_TRIANGLE_FAN:
> +   case GL_POLYGON:
> +   case GL_TRIANGLES:
> +   case GL_QUADS:
> +   case GL_QUAD_STRIP:
> +      return 3;
> +   default:
> +      _mesa_problem(NULL,
> +                    "unknown prim type in transform feedback primitive count");
> +      return 0;
> +   }
> +}
> +
>  /**
>   * Update internal counters based on the the drawing operation described in
>   * prim.
> @@ -397,14 +421,11 @@ brw_update_primitive_count(struct brw_context *brw,
>      * able to reload SVBI 0 with the correct value in case we have to start
>      * a new batch buffer.
>      */
> -   unsigned svbi_postincrement_value =
> -      brw->gs.prog_data->svbi_postincrement_value;
> +   unsigned verts = verts_per_prim(prim->mode);
>     uint32_t space_avail =
> -      (brw->sol.svbi_0_max_index - brw->sol.svbi_0_starting_index)
> -      / svbi_postincrement_value;
> +      (brw->sol.svbi_0_max_index - brw->sol.svbi_0_starting_index) / verts;
>     uint32_t primitives_written = MIN2 (space_avail, count);
> -   brw->sol.svbi_0_starting_index +=
> -      svbi_postincrement_value * primitives_written;
> +   brw->sol.svbi_0_starting_index += verts;

Don't you mean
brw->sol.svbi_0_starting_index += verts * primitives_written;

>     /* And update the TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN query. */
>     brw->sol.primitives_written += primitives_written;
[Mesa-dev] [PATCH] ff_fragment_shader: Don't generate swizzles for scalar combiner inputs
From: Ian Romanick ian.d.roman...@intel.com

There are a couple scenarios where the source could be zero and the
operand could be either SRC_ALPHA or ONE_MINUS_SRC_ALPHA. For example,
if the source was ZERO. This would result in something like (0).w, and
a later call to ir_validate would get angry.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=42517
---
 src/mesa/main/ff_fragment_shader.cpp | 16 ++--
 1 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/src/mesa/main/ff_fragment_shader.cpp b/src/mesa/main/ff_fragment_shader.cpp
index 008da0d..3e736fa 100644
--- a/src/mesa/main/ff_fragment_shader.cpp
+++ b/src/mesa/main/ff_fragment_shader.cpp
@@ -632,15 +632,19 @@ emit_combine_source(struct texenv_fragment_program *p,
                                            new(p->mem_ctx) ir_constant(1.0f),
                                            src);

-   case OPR_SRC_ALPHA:
-      return new(p->mem_ctx) ir_swizzle(src, 3, 3, 3, 3, 1);
+   case OPR_SRC_ALPHA:
+      return src->type->is_scalar()
+         ? src : (ir_rvalue *) new(p->mem_ctx) ir_swizzle(src, 3, 3, 3, 3, 1);
+
+   case OPR_ONE_MINUS_SRC_ALPHA: {
+      ir_rvalue *const scalar = (src->type->is_scalar())
+         ? src : (ir_rvalue *) new(p->mem_ctx) ir_swizzle(src, 3, 3, 3, 3, 1);

-   case OPR_ONE_MINUS_SRC_ALPHA:
       return new(p->mem_ctx) ir_expression(ir_binop_sub,
                                            new(p->mem_ctx) ir_constant(1.0f),
-                                           new(p->mem_ctx) ir_swizzle(src,
-                                                                      3, 3,
-                                                                      3, 3, 1));
+                                           scalar);
+   }
+
    case OPR_ZERO:
       return new(p->mem_ctx) ir_constant(0.0f);
    case OPR_ONE:
--
1.7.6.4
Re: [Mesa-dev] [PATCH 5/7] i965/gen7: Add support for rasterization discard.
On 22 December 2011 16:54, Eric Anholt e...@anholt.net wrote:
> Fixes the piglit discard-* tests.
>
> Reviewed-by: Kenneth Graunke kenn...@whitecape.org
> ---
>  src/mesa/drivers/dri/i965/gen7_sol_state.c |8 +++-
>  1 files changed, 7 insertions(+), 1 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c b/src/mesa/drivers/dri/i965/gen7_sol_state.c
> index fcda08d..650f625 100644
> --- a/src/mesa/drivers/dri/i965/gen7_sol_state.c
> +++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c
> @@ -37,6 +37,12 @@ static void
>  upload_sol_state(struct brw_context *brw)
>  {
>     struct intel_context *intel = &brw->intel;
> +   struct gl_context *ctx = &intel->ctx;
> +   uint32_t dw1 = 0;
> +
> +   /* _NEW_RASTERIZER_DISCARD */
> +   if (ctx->RasterDiscard)
> +      dw1 |= SO_RENDERING_DISABLE;

It looks like dw1 is set here but not used until patch 6/7.

>     /* Disable the SOL stage */
>     BEGIN_BATCH(3);
> @@ -48,7 +54,7 @@ upload_sol_state(struct brw_context *brw)
>  const struct brw_tracked_state gen7_sol_state = {
>     .dirty = {
> -      .mesa = 0,
> +      .mesa = _NEW_RASTERIZER_DISCARD,
>        .brw = BRW_NEW_BATCH,
>        .cache = 0,
>     },
> --
> 1.7.7.3
Re: [Mesa-dev] [PATCH 8/8] i965: increase the brw eu instruction store size dynamically
On Thu, Dec 22, 2011 at 02:33:03PM -0800, Kenneth Graunke wrote:
> On 12/21/2011 01:33 AM, Yuanhan Liu wrote:
> > Here is the final patch to enable dynamic eu instruction store size:
> > increase the brw eu instruction store size dynamically instead of just
> > allocating it statically with a constant limit. This would fix something
> > that 'GL_MAX_PROGRAM_INSTRUCTIONS_ARB was 16384 while the driver would
> > limit it to 1'.
> >
> > Signed-off-by: Yuanhan Liu yuanhan@linux.intel.com
> > ---
> >  src/mesa/drivers/dri/i965/brw_eu.c |7 +++
> >  src/mesa/drivers/dri/i965/brw_eu.h |7 ---
> >  src/mesa/drivers/dri/i965/brw_eu_emit.c | 12 +++-
> >  3 files changed, 22 insertions(+), 4 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_eu.c b/src/mesa/drivers/dri/i965/brw_eu.c
> > index 9b4dde8..7d206f3 100644
> > --- a/src/mesa/drivers/dri/i965/brw_eu.c
> > +++ b/src/mesa/drivers/dri/i965/brw_eu.c
> > @@ -174,6 +174,13 @@ void
> >  brw_init_compile(struct brw_context *brw, struct brw_compile *p, void *mem_ctx)
> >  {
> >     p->brw = brw;
> > +   /*
> > +    * Set the initial instruction store array size to 1024, if found that
> > +    * isn't enough, then it will double the store size at brw_next_insn()
> > +    * until it meet the BRW_EU_MAX_INSN
> > +    */
> > +   p->store_size = 1024;
> > +   p->store = rzalloc_array(mem_ctx, struct brw_instruction, p->store_size);
> >     p->nr_insn = 0;
> >     p->current = p->stack;
> >     p->compressed = false;
> > diff --git a/src/mesa/drivers/dri/i965/brw_eu.h b/src/mesa/drivers/dri/i965/brw_eu.h
> > index 9d3d7de..52567c2 100644
> > --- a/src/mesa/drivers/dri/i965/brw_eu.h
> > +++ b/src/mesa/drivers/dri/i965/brw_eu.h
> > @@ -100,11 +100,12 @@ struct brw_glsl_call;
> >
> > -#define BRW_EU_MAX_INSN_STACK 5
> > -#define BRW_EU_MAX_INSN 1
> > +#define BRW_EU_MAX_INSN_STACK 5
> > +#define BRW_EU_MAX_INSN (1024 * 1024)
>
> I'm actually surprised to see BRW_EU_MAX_INSN at all. As far as I know,
> there isn't an actual hardware limit on the number of instructions,

Glad to know that. Thanks.

> so I'm not sure why we should cap it at all. Especially not to some
> arbitrary number. (I'm assuming that 1024 * 1024 is just something you
> came up with arbitrarily...)

Aha, yes, you are right, I made it up. :)

Here is the fixed patch, please help to review it:

From 66c30acdeae88cdba07ed85443b04d4bc6c56792 Mon Sep 17 00:00:00 2001
From: Yuanhan Liu yuanhan@linux.intel.com
Date: Wed, 21 Dec 2011 15:38:44 +0800
Subject: [PATCH] i965: increase the brw eu instruction store size dynamically

Here is the final patch to enable dynamic eu instruction store size:
increase the brw eu instruction store size dynamically instead of just
allocating it statically with a constant limit. This would fix something
that 'GL_MAX_PROGRAM_INSTRUCTIONS_ARB was 16384 while the driver would
limit it to 1'.

v2: comments from ken, do not hardcode the eu limit to (1024 * 1024)

Signed-off-by: Yuanhan Liu yuanhan@linux.intel.com
---
 src/mesa/drivers/dri/i965/brw_eu.c |7 +++
 src/mesa/drivers/dri/i965/brw_eu.h |4 ++--
 src/mesa/drivers/dri/i965/brw_eu_emit.c | 10 +-
 3 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_eu.c b/src/mesa/drivers/dri/i965/brw_eu.c
index 9b4dde8..2b0593a 100644
--- a/src/mesa/drivers/dri/i965/brw_eu.c
+++ b/src/mesa/drivers/dri/i965/brw_eu.c
@@ -174,6 +174,13 @@ void
 brw_init_compile(struct brw_context *brw, struct brw_compile *p, void *mem_ctx)
 {
    p->brw = brw;
+   /*
+    * Set the initial instruction store array size to 1024, if found that
+    * isn't enough, then it will double the store size at brw_next_insn()
+    * until out of memory.
+    */
+   p->store_size = 1024;
+   p->store = rzalloc_array(mem_ctx, struct brw_instruction, p->store_size);
    p->nr_insn = 0;
    p->current = p->stack;
    p->compressed = false;
diff --git a/src/mesa/drivers/dri/i965/brw_eu.h b/src/mesa/drivers/dri/i965/brw_eu.h
index cc2f618..a41e988 100644
--- a/src/mesa/drivers/dri/i965/brw_eu.h
+++ b/src/mesa/drivers/dri/i965/brw_eu.h
@@ -101,10 +101,10 @@ struct brw_glsl_call;

 #define BRW_EU_MAX_INSN_STACK 5
-#define BRW_EU_MAX_INSN 1

 struct brw_compile {
-   struct brw_instruction store[BRW_EU_MAX_INSN];
+   struct brw_instruction *store;
+   int store_size;
    GLuint nr_insn;

    void *mem_ctx;
diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c
index 829d92c..9288f9b 100644
--- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
@@ -691,7 +691,15 @@ brw_next_insn(struct brw_compile *p, GLuint opcode)
 {
    struct brw_instruction *insn;

-   assert(p->nr_insn + 1 < BRW_EU_MAX_INSN);
+   if (p->nr_insn + 1 > p->store_size) {
+      if (0)
+         printf("incresing the store size to %d\n", p->store_size << 1);
+      p->store_size <<= 1;
+      p->store = reralloc(p->mem_ctx, p->store,
+                          struct
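The grow-by-doubling scheme described in this patch can be sketched outside of Mesa as follows. This is only an illustration under stated assumptions: it uses plain `calloc`/`realloc` instead of Mesa's ralloc helpers, and `struct insn`, `struct compile`, `compile_init()` and `next_insn()` are hypothetical names, not the driver's API. The initial size is 4 rather than 1024 so the growth is easy to observe.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, minimal instruction record. */
struct insn {
   unsigned opcode;
};

/* Hypothetical compile context holding a heap-allocated instruction store. */
struct compile {
   struct insn *store;
   size_t store_size; /* capacity, in instructions */
   size_t nr_insn;    /* instructions emitted so far */
};

/* Start with a small zeroed store; returns 0 on success, -1 on failure. */
static int
compile_init(struct compile *p)
{
   p->store_size = 4;
   p->nr_insn = 0;
   p->store = calloc(p->store_size, sizeof(*p->store));
   return p->store ? 0 : -1;
}

/* Hand out the next instruction slot, doubling the store when it is full.
 * Returns NULL if the store cannot be grown. */
static struct insn *
next_insn(struct compile *p, unsigned opcode)
{
   struct insn *insn;

   if (p->nr_insn + 1 > p->store_size) {
      size_t new_size = p->store_size * 2;
      struct insn *grown = realloc(p->store, new_size * sizeof(*grown));

      if (!grown)
         return NULL;

      /* Zero the newly added half so fresh slots start out cleared. */
      memset(grown + p->store_size, 0,
             (new_size - p->store_size) * sizeof(*grown));
      p->store = grown;
      p->store_size = new_size;
   }

   insn = &p->store[p->nr_insn++];
   insn->opcode = opcode;
   return insn;
}
```

Emitting ten instructions into this store grows it from 4 to 8 to 16 entries, so the only limit left is available memory, which is the behaviour the v2 patch aims for.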
Re: [Mesa-dev] [PATCH 7/8] i965: call next_insn() before referencing a instruction by index
On Thu, Dec 22, 2011 at 11:09:12AM -0800, Kenneth Graunke wrote:
> On 12/21/2011 01:33 AM, Yuanhan Liu wrote:
> [snip]
> > +   int emit_endif = 1;
>
> Please use bool and true/false rather than int.

Yes, right. Will fix it.

> >     /* In single program flow mode, we can express IF and ELSE instructions
> >      * equivalently as ADD instructions that operate on IP. On platforms prior
> > @@ -1219,14 +1211,32 @@ brw_ENDIF(struct brw_compile *p)
> >      * instructions to conditional ADDs. So we only do this trick on Gen4 and
> >      * Gen5.
> >      */
> > -   if (intel->gen < 6 && p->single_program_flow) {
> > +   if (intel->gen < 6 && p->single_program_flow)
> > +      emit_endif = 0;
>
> You could actually just do this:
>
>    /* In single program flow mode, we can express IF and ELSE ... */
>    bool emit_endif = !(intel->gen < 6 && p->single_program_flow);
>
> But I'm fine with bool emit_endif = true and emit_endif = false if you
> prefer that.

Yes, I prefer that. From my point of view, in this case, together with
the comments, it tells us clearly why we can't emit the ENDIF.

Here is the fixed patch:

From 7c8b8bc87846df9513a0c32cc8a388fb62f5476a Mon Sep 17 00:00:00 2001
From: Yuanhan Liu yuanhan@linux.intel.com
Date: Wed, 21 Dec 2011 15:32:02 +0800
Subject: [PATCH] i965: call next_insn() before referencing a instruction by index

A single next_insn may change the base address of instruction store
memory(p->store), so call it first before referencing the instruction
store pointer from an index.

This is the final preparation work to enable the dynamic store size.

v2: comments from Ken, define emit_endif as bool type

Signed-off-by: Yuanhan Liu yuanhan@linux.intel.com
Reviewed-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/brw_eu_emit.c | 40 ---
 1 files changed, 26 insertions(+), 14 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c
index b2ab013..843d12f 100644
--- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
@@ -1197,15 +1197,7 @@ brw_ENDIF(struct brw_compile *p)
    struct brw_instruction *else_inst = NULL;
    struct brw_instruction *if_inst = NULL;
    struct brw_instruction *tmp;
-
-   /* Pop the IF and (optional) ELSE instructions from the stack */
-   p->if_depth_in_loop[p->loop_stack_depth]--;
-   tmp = pop_if_stack(p);
-   if (tmp->header.opcode == BRW_OPCODE_ELSE) {
-      else_inst = tmp;
-      tmp = pop_if_stack(p);
-   }
-   if_inst = tmp;
+   bool emit_endif = true;

    /* In single program flow mode, we can express IF and ELSE instructions
     * equivalently as ADD instructions that operate on IP. On platforms prior
@@ -1219,14 +1211,32 @@ brw_ENDIF(struct brw_compile *p)
     * instructions to conditional ADDs. So we only do this trick on Gen4 and
     * Gen5.
     */
-   if (intel->gen < 6 && p->single_program_flow) {
+   if (intel->gen < 6 && p->single_program_flow)
+      emit_endif = false;
+
+   /*
+    * A single next_insn() may change the base adress of instruction store
+    * memory(p->store), so call it first before referencing the instruction
+    * store pointer from an index
+    */
+   if (emit_endif)
+      insn = next_insn(p, BRW_OPCODE_ENDIF);
+
+   /* Pop the IF and (optional) ELSE instructions from the stack */
+   p->if_depth_in_loop[p->loop_stack_depth]--;
+   tmp = pop_if_stack(p);
+   if (tmp->header.opcode == BRW_OPCODE_ELSE) {
+      else_inst = tmp;
+      tmp = pop_if_stack(p);
+   }
+   if_inst = tmp;
+
+   if (!emit_endif) {
       /* ENDIF is useless; don't bother emitting it. */
       convert_IF_ELSE_to_ADD(p, if_inst, else_inst);
       return;
    }

-   insn = next_insn(p, BRW_OPCODE_ENDIF);
-
    if (intel->gen < 6) {
       brw_set_dest(p, insn, retype(brw_vec4_grf(0,0), BRW_REGISTER_TYPE_UD));
       brw_set_src0(p, insn, retype(brw_vec4_grf(0,0), BRW_REGISTER_TYPE_UD));
@@ -1393,13 +1403,12 @@ struct brw_instruction *brw_WHILE(struct brw_compile *p)
    struct brw_instruction *insn, *do_insn;
    GLuint br = 1;

-   do_insn = get_inner_do_insn(p);
-
    if (intel->gen >= 5)
       br = 2;

    if (intel->gen >= 7) {
       insn = next_insn(p, BRW_OPCODE_WHILE);
+      do_insn = get_inner_do_insn(p);

       brw_set_dest(p, insn, retype(brw_null_reg(), BRW_REGISTER_TYPE_D));
       brw_set_src0(p, insn, retype(brw_null_reg(), BRW_REGISTER_TYPE_D));
@@ -1409,6 +1418,7 @@ struct brw_instruction *brw_WHILE(struct brw_compile *p)
       insn->header.execution_size = BRW_EXECUTE_8;
    } else if (intel->gen == 6) {
       insn = next_insn(p, BRW_OPCODE_WHILE);
+      do_insn = get_inner_do_insn(p);

       brw_set_dest(p, insn, brw_imm_w(0));
       insn->bits1.branch_gen6.jump_count = br * (do_insn - insn);
@@ -1419,6 +1429,7 @@ struct brw_instruction *brw_WHILE(struct brw_compile *p)
    } else {
       if (p->single_program_flow) {
          insn = next_insn(p, BRW_OPCODE_ADD);
+         do_insn = get_inner_do_insn(p);

          brw_set_dest(p,
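The ordering rule this patch enforces (allocate the new instruction first, then turn saved indices into pointers) can be illustrated with a small standalone sketch. The names `struct program`, `emit()` and `close_block()` are hypothetical and only model the idea; they are not the i965 functions, and plain `realloc` stands in for the driver's allocator.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical instruction with a relative jump field. */
struct insn {
   int opcode;
   int jump;
};

/* Hypothetical program whose store may move whenever it grows. */
struct program {
   struct insn *store;
   int size;  /* capacity */
   int count; /* instructions emitted */
};

/* Append an instruction and return its index, or -1 if growing fails.
 * Any pointer into p->store obtained before this call may be stale after it. */
static int
emit(struct program *p, int opcode)
{
   if (p->count == p->size) {
      int new_size = p->size ? p->size * 2 : 1;
      struct insn *grown =
         realloc(p->store, (size_t)new_size * sizeof(*grown));

      if (!grown)
         return -1;
      p->store = grown;
      p->size = new_size;
   }
   p->store[p->count].opcode = opcode;
   p->store[p->count].jump = 0;
   return p->count++;
}

/* Close a block: emit the closing instruction FIRST, and only then convert
 * the saved index of the opening instruction into a pointer. */
static int
close_block(struct program *p, int open_index, int close_opcode)
{
   struct insn *open;
   int close_index = emit(p, close_opcode); /* may move p->store */

   if (close_index < 0)
      return -1;

   open = &p->store[open_index]; /* safe: looked up after emit() */
   open->jump = close_index - open_index;
   return close_index;
}
```

Had `open` been looked up before `emit()`, a reallocation inside `emit()` could leave it pointing at freed memory; keeping only the index until after the allocation avoids that.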
Re: [Mesa-dev] [PATCH 3/7] i965/gen7: Add register definitions for GL_EXT_transform_feedback.
On 22 December 2011 16:54, Eric Anholt e...@anholt.net wrote:
> Reviewed-by: Kenneth Graunke kenn...@whitecape.org
> ---
>  src/mesa/drivers/dri/i965/brw_defines.h | 76 ++-
>  src/mesa/drivers/dri/intel/intel_reg.h | 15 ++
>  2 files changed, 89 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h
> index 4edfaf7..4bb7f00 100644
> --- a/src/mesa/drivers/dri/i965/brw_defines.h
> +++ b/src/mesa/drivers/dri/i965/brw_defines.h
> @@ -1307,6 +1307,42 @@ enum brw_wm_barycentric_interp_mode {
>  #define _3DSTATE_CONSTANT_HS 0x7819 /* GEN7+ */
>  #define _3DSTATE_CONSTANT_DS 0x781A /* GEN7+ */
>
> +#define _3DSTATE_STREAMOUT 0x781e /* GEN7+ */
> +/* DW1 */
> +# define SO_FUNCTION_ENABLE (1 << 31)
> +# define SO_RENDERING_DISABLE (1 << 30)
> +/* This selects which incoming rendering stream goes down the pipeline. The
> + * rendering stream is 0 if not defined by special cases in the GS state.
> + */
> +# define SO_RENDER_STREAM_SELECT_SHIFT 27
> +# define SO_RENDER_STREAM_SELECT_MASK INTEL_MASK(28, 27)
> +/* Controls reordering of TRISTRIP_* elements in stream output (not rendering).
> + */
> +# define SO_REORDER_TRAILING (1 << 26)
> +/* Controls SO_NUM_PRIMS_WRITTEN_* and SO_PRIM_STORAGE_* */
> +# define SO_STATISTICS_ENABLE (1 << 25)
> +# define SO_BUFFER_ENABLE_3 (1 << 11)
> +# define SO_BUFFER_ENABLE_2 (1 << 10)
> +# define SO_BUFFER_ENABLE_1 (1 << 9)
> +# define SO_BUFFER_ENABLE_0 (1 << 8)

Considering how these are used in patch 6/7, I'd prefer if we did this:

#define SO_BUFFER_ENABLE(n) (1 << (8 + (n)))

Then in patch 6/7 we could do dw1 |= SO_BUFFER_ENABLE(i); instead of
dw1 |= SO_BUFFER_ENABLE_0 << i;

> +/* DW2 */
> +# define SO_STREAM_3_VERTEX_READ_OFFSET_SHIFT 29
> +# define SO_STREAM_3_VERTEX_READ_OFFSET_MASK INTEL_MASK(29, 29)
> +# define SO_STREAM_3_VERTEX_READ_LENGTH_SHIFT 24
> +# define SO_STREAM_3_VERTEX_READ_LENGTH_MASK INTEL_MASK(28, 24)
> +# define SO_STREAM_2_VERTEX_READ_OFFSET_SHIFT 21
> +# define SO_STREAM_2_VERTEX_READ_OFFSET_MASK INTEL_MASK(21, 21)
> +# define SO_STREAM_2_VERTEX_READ_LENGTH_SHIFT 16
> +# define SO_STREAM_2_VERTEX_READ_LENGTH_MASK INTEL_MASK(20, 16)
> +# define SO_STREAM_1_VERTEX_READ_OFFSET_SHIFT 13
> +# define SO_STREAM_1_VERTEX_READ_OFFSET_MASK INTEL_MASK(13, 13)
> +# define SO_STREAM_1_VERTEX_READ_LENGTH_SHIFT 8
> +# define SO_STREAM_1_VERTEX_READ_LENGTH_MASK INTEL_MASK(12, 8)
> +# define SO_STREAM_0_VERTEX_READ_OFFSET_SHIFT 5
> +# define SO_STREAM_0_VERTEX_READ_OFFSET_MASK INTEL_MASK(5, 5)
> +# define SO_STREAM_0_VERTEX_READ_LENGTH_SHIFT 0
> +# define SO_STREAM_0_VERTEX_READ_LENGTH_MASK INTEL_MASK(4, 0)
> +
>  /* 3DSTATE_WM for Gen7 */
>  /* DW1 */
>  # define GEN7_WM_STATISTICS_ENABLE (1 << 31)
> @@ -1373,8 +1409,6 @@ enum brw_wm_barycentric_interp_mode {
>  /* DW6: kernel 1 pointer */
>  /* DW7: kernel 2 pointer */
>
> -#define _3DSTATE_STREAMOUT 0x781e /* GEN7+ */
> -
>  #define _3DSTATE_SAMPLE_MASK 0x7818 /* GEN6+ */
>
>  #define _3DSTATE_DRAWING_RECTANGLE 0x7900
> @@ -1414,6 +1448,44 @@ enum brw_wm_barycentric_interp_mode {
>  # define DEPTH_CLEAR_VALID (1 << 15)
>  /* DW1: depth clear value */
>
> +#define _3DSTATE_SO_DECL_LIST 0x7917 /* GEN7+ */
> +/* DW1 */
> +# define SO_STREAM_TO_BUFFER_SELECTS_3_SHIFT 12
> +# define SO_STREAM_TO_BUFFER_SELECTS_3_MASK INTEL_MASK(15, 12)
> +# define SO_STREAM_TO_BUFFER_SELECTS_2_SHIFT 8
> +# define SO_STREAM_TO_BUFFER_SELECTS_2_MASK INTEL_MASK(11, 8)
> +# define SO_STREAM_TO_BUFFER_SELECTS_1_SHIFT 4
> +# define SO_STREAM_TO_BUFFER_SELECTS_1_MASK INTEL_MASK(7, 4)
> +# define SO_STREAM_TO_BUFFER_SELECTS_0_SHIFT 0
> +# define SO_STREAM_TO_BUFFER_SELECTS_0_MASK INTEL_MASK(3, 0)
> +/* DW2 */
> +# define SO_NUM_ENTRIES_3_SHIFT 24
> +# define SO_NUM_ENTRIES_3_MASK INTEL_MASK(31, 24)
> +# define SO_NUM_ENTRIES_2_SHIFT 16
> +# define SO_NUM_ENTRIES_2_MASK INTEL_MASK(23, 16)
> +# define SO_NUM_ENTRIES_1_SHIFT 8
> +# define SO_NUM_ENTRIES_1_MASK INTEL_MASK(15, 8)
> +# define SO_NUM_ENTRIES_0_SHIFT 0
> +# define SO_NUM_ENTRIES_0_MASK INTEL_MASK(7, 0)
> +
> +/* SO_DECL DW0 */
> +# define SO_DECL_OUTPUT_BUFFER_SLOT_SHIFT 12
> +# define
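The parameterised macro suggested in this review can be tried in isolation. The sketch below is an assumption-laden illustration, not the driver code: it uses `1u` instead of `1` so the shift is done on an unsigned value, and `so_buffer_enable_bits()` is a hypothetical helper that shows how a per-buffer loop would use the macro.

```c
#include <assert.h>
#include <stdint.h>

/* Buffer n's enable bit lives at bit (8 + n) of DW1, per the defines above. */
#define SO_BUFFER_ENABLE(n) (1u << (8 + (n)))

/* Hypothetical helper: translate a 4-bit "buffers in use" mask into the
 * corresponding DW1 enable bits. */
static uint32_t
so_buffer_enable_bits(unsigned buffer_mask)
{
   uint32_t dw1 = 0;
   unsigned i;

   for (i = 0; i < 4; i++) {
      if (buffer_mask & (1u << i))
         dw1 |= SO_BUFFER_ENABLE(i);
   }
   return dw1;
}
```

`SO_BUFFER_ENABLE(0)` through `SO_BUFFER_ENABLE(3)` produce the same values as the four fixed `SO_BUFFER_ENABLE_n` defines, so the macro expresses the intent directly without relying on shifting `SO_BUFFER_ENABLE_0`.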
Re: [Mesa-dev] [PATCH 6/7] i965/gen7: Add support for transform feedback.
On 22 December 2011 16:54, Eric Anholt e...@anholt.net wrote:
> Fixes almost all of the transform feedback piglit tests. Remaining
> are a few tests related to tesselation for
> quads/trifans/tristrips/polygons with flat shading.
> ---
>  src/mesa/drivers/dri/i965/gen7_sol_state.c | 199 ++-
>  1 files changed, 191 insertions(+), 8 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c b/src/mesa/drivers/dri/i965/gen7_sol_state.c
> index 650f625..a5e28b6 100644
> --- a/src/mesa/drivers/dri/i965/gen7_sol_state.c
> +++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c
> @@ -32,31 +32,214 @@
>  #include "brw_state.h"
>  #include "brw_defines.h"
>  #include "intel_batchbuffer.h"
> +#include "intel_buffer_objects.h"
>
>  static void
> -upload_sol_state(struct brw_context *brw)
> +upload_3dstate_so_buffers(struct brw_context *brw)
> +{
> +   struct intel_context *intel = &brw->intel;
> +   struct gl_context *ctx = &intel->ctx;
> +   /* BRW_NEW_VERTEX_PROGRAM */
> +   const struct gl_shader_program *vs_prog =
> +      ctx->Shader.CurrentVertexProgram;
> +   const struct gl_transform_feedback_info *linked_xfb_info =
> +      &vs_prog->LinkedTransformFeedback;
> +   struct gl_transform_feedback_object *xfb_obj =
> +      ctx->TransformFeedback.CurrentObject;

Can we have a /* NEW_TRANSFORM_FEEDBACK */ comment here?

> +   int i;
> +
> +   /* Set up the up to 4 output buffers. These are the ranges defined in the
> +    * gl_transform_feedback_object.
> +    */
> +   for (i = 0; i < 4; i++) {
> +      struct gl_buffer_object *bufferobj = xfb_obj->Buffers[i];
> +      drm_intel_bo *bo;
> +      uint32_t start, end;
> +
> +      if (!xfb_obj->Buffers[i]) {
> +         /* The pitch of 0 in this command indicates that the buffer is
> +          * unbound and won't be written to.
> +          */
> +         BEGIN_BATCH(4);
> +         OUT_BATCH(_3DSTATE_SO_BUFFER << 16 | (4 - 2));
> +         OUT_BATCH((i << SO_BUFFER_INDEX_SHIFT));
> +         OUT_BATCH(0);
> +         OUT_BATCH(0);
> +         ADVANCE_BATCH();
> +
> +         continue;
> +      }
> +
> +      bo = intel_buffer_object(bufferobj)->buffer;
> +
> +      start = xfb_obj->Offset[i];
> +      assert(start % 4 == 0);
> +      end = ALIGN(start + xfb_obj->Size[i], 4);
> +      assert(end <= bo->size);
> +
> +      BEGIN_BATCH(4);
> +      OUT_BATCH(_3DSTATE_SO_BUFFER << 16 | (4 - 2));
> +      OUT_BATCH((i << SO_BUFFER_INDEX_SHIFT) |
> +                ((linked_xfb_info->BufferStride[i] * 4) <<
> +                 SO_BUFFER_PITCH_SHIFT));

It looks like we're not setting SO Buffer Object Control State. Is that
ok? I'm not too familiar with memory object control states so I'm not
sure, but it seemed to me that it might be sensible to mark the stream
output as L3 cacheable.

> +      OUT_RELOC(bo, I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER, start);
> +      OUT_RELOC(bo, I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER, end);
> +      ADVANCE_BATCH();
> +   }
> +}
> +
> +/**
> + * Outputs the 3DSTATE_SO_DECL_LIST command.
> + *
> + * The data output is a series of 64-bit entries containing a SO_DECL per
> + * stream. We only have one stream of rendering coming out of the GS unit, so
> + * we only emit stream 0 (low 16 bits) SO_DECLs.
> + */
> +static void
> +upload_3dstate_so_decl_list(struct brw_context *brw,
> +                            struct brw_vue_map *vue_map)
> +{
> +   struct intel_context *intel = &brw->intel;
> +   struct gl_context *ctx = &intel->ctx;
> +   /* BRW_NEW_VERTEX_PROGRAM */
> +   const struct gl_shader_program *vs_prog =
> +      ctx->Shader.CurrentVertexProgram;
> +   /* NEW_TRANSFORM_FEEDBACK */
> +   const struct gl_transform_feedback_info *linked_xfb_info =
> +      &vs_prog->LinkedTransformFeedback;
> +   int i;
> +   uint16_t so_decl[128];

Can we add an assertion to verify that there is no danger of overflowing
this array? I think
STATIC_ASSERT(ARRAY_SIZE(so_decl) >= MAX_PROGRAM_OUTPUTS)
ought to do the trick.

> +   int buffer_mask = 0;
> +   int next_offset[4] = {0, 0, 0, 0};
> +
> +   /* Construct the list of SO_DECLs to be emitted. The formatting of the
> +    * command is feels strange -- each dword pair contains a SO_DECL per stream.
> +    */
> +   for (i = 0; i < linked_xfb_info->NumOutputs; i++) {
> +      int buffer = linked_xfb_info->Outputs[i].OutputBuffer;
> +      uint16_t decl = 0;
> +      int vert_result = linked_xfb_info->Outputs[i].OutputRegister;
> +
> +      buffer_mask |= 1 << buffer;
> +
> +      decl |= buffer << SO_DECL_OUTPUT_BUFFER_SLOT_SHIFT;
> +      decl |= vue_map->vert_result_to_slot[vert_result] <<
> +         SO_DECL_REGISTER_INDEX_SHIFT;
> +      decl |= ((1 << linked_xfb_info->Outputs[i].NumComponents) - 1) <<
> +         SO_DECL_COMPONENT_MASK_SHIFT;
> +
> +      /* FINISHME */
> +      assert(linked_xfb_info->Outputs[i].DstOffset == next_offset[buffer]);

FYI, this assertion should hold true until we implement
ARB_transform_feedback3 (which allows holes in the transform feedback
structure). I think Marek has some plans to implement that for Gallium
(not sure of his timeframe though), so we may want to keep an eye out.
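The SO_DECL packing in the loop above can be sketched on its own. Only `SO_DECL_OUTPUT_BUFFER_SLOT_SHIFT` (12) is visible in the quoted patch 3/7; the other two shift values below are assumptions chosen for illustration, and `pack_so_decl()` is a hypothetical helper rather than a function from the driver.

```c
#include <assert.h>
#include <stdint.h>

#define SO_DECL_OUTPUT_BUFFER_SLOT_SHIFT 12 /* value quoted in patch 3/7 */
#define SO_DECL_REGISTER_INDEX_SHIFT      4 /* assumed for this sketch */
#define SO_DECL_COMPONENT_MASK_SHIFT      0 /* assumed for this sketch */

/* Pack one 16-bit SO_DECL: destination buffer, source VUE slot, and a mask
 * with one bit set per written component (1 component -> 0x1, 4 -> 0xf). */
static uint16_t
pack_so_decl(unsigned buffer, unsigned slot, unsigned num_components)
{
   unsigned component_mask = (1u << num_components) - 1;

   return (uint16_t)((buffer << SO_DECL_OUTPUT_BUFFER_SLOT_SHIFT) |
                     (slot << SO_DECL_REGISTER_INDEX_SHIFT) |
                     (component_mask << SO_DECL_COMPONENT_MASK_SHIFT));
}
```

The expression `(1 << NumComponents) - 1` is the usual way to build a mask of the low N bits, which is why a three-component output yields a component mask of 0x7.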
Re: [Mesa-dev] i965/gen7 transform feedback
On 22 December 2011 16:54, Eric Anholt e...@anholt.net wrote:
> Here's today's patch series for gen7 transform feedback. It runs on top
> of a kernel patch at people.freedesktop.org:~anholt/linux on the
> gen7-reset-sol branch. I expected it to be easy, but not this easy.

This is fantastic, Eric. I'm really pleased how quickly this is coming
together. Other than a few minor comments that I've already sent out,
the series is:

Reviewed-by: Paul Berry stereotype...@gmail.com

> Remaining test failures:
> tessellation polygon flat_last warn
> tessellation quad_strip flat_last warn
> tessellation quads flat_last warn
> tessellation triangle_fan flat_first fail

I'm sorry to hear that triangle_fan flat_first fails--both AMD and
nVidia pass that test, so it's a bit embarrassing for Intel to fail it.
But it's an obscure case (who flatshades trifans anyhow, especially when
using transform feedback?) and I can't see any way of fixing it on Gen7
without firing up the GS, which seems like *way* overkill. I'm far less
bothered by the 3 warnings, because nVidia gets those exact same
warnings too. So IMHO, it's ok to leave these 4 tests failing.

Incidentally, Gen6 gets the exact same 4 failures, plus 4 additional
failures for tessellation triangle_strip. I'll try to fix tessellation
triangle_strip tomorrow.

> Also, it looks like none of the current piglit tests test transform
> feedback across batchbuffers. I don't expect major problems there,
> given that we have a userland count of verts emitted.

I have 3 more work days left in the year, so I'll try to implement those
tests before you need them.
Re: [Mesa-dev] [PATCH 5/7] i965/gen7: Add support for rasterization discard.
On 12/22/2011 06:22 PM, Paul Berry wrote:
> On 22 December 2011 16:54, Eric Anholt e...@anholt.net wrote:
> > Fixes the piglit discard-* tests.
> >
> > Reviewed-by: Kenneth Graunke kenn...@whitecape.org
> > ---
> >  src/mesa/drivers/dri/i965/gen7_sol_state.c |8 +++-
> >  1 files changed, 7 insertions(+), 1 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c b/src/mesa/drivers/dri/i965/gen7_sol_state.c
> > index fcda08d..650f625 100644
> > --- a/src/mesa/drivers/dri/i965/gen7_sol_state.c
> > +++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c
> > @@ -37,6 +37,12 @@ static void
> >  upload_sol_state(struct brw_context *brw)
> >  {
> >     struct intel_context *intel = &brw->intel;
> > +   struct gl_context *ctx = &intel->ctx;
> > +   uint32_t dw1 = 0;
> > +
> > +   /* _NEW_RASTERIZER_DISCARD */
> > +   if (ctx->RasterDiscard)
> > +      dw1 |= SO_RENDERING_DISABLE;
>
> It looks like dw1 is set here but not used until patch 6/7.

Oops. Yeah, good catch. Eric, perhaps just squash 5 and 6?
Re: [Mesa-dev] [PATCH 6/7] i965/gen7: Add support for transform feedback.
On Fri, Dec 23, 2011 at 4:22 AM, Paul Berry stereotype...@gmail.com wrote:
> FYI, this assertion should hold true until we implement
> ARB_transform_feedback3 (which allows holes in the transform feedback
> structure). I think Marek has some plans to implement that for Gallium
> (not sure of his timeframe though), so we may want to keep an eye out.

Already done:
http://cgit.freedesktop.org/~mareko/mesa/log/?h=transform-feedback3-instanced

It's completely untested though. The next step is to write some tests
and make any necessary modifications to pass them.

I usually don't send untested stuff for review right away. Any code
worth publishing before it is ready is available in my private
repository.

Marek
Re: [Mesa-dev] [PATCH 6/7] i965/gen7: Add support for transform feedback.
On 12/22/2011 04:54 PM, Eric Anholt wrote:
> Fixes almost all of the transform feedback piglit tests. Remaining
> are a few tests related to tesselation for
> quads/trifans/tristrips/polygons with flat shading.
> ---
>  src/mesa/drivers/dri/i965/gen7_sol_state.c | 199 ++-
>  1 files changed, 191 insertions(+), 8 deletions(-)

The whole series is:
Reviewed-by: Kenneth Graunke kenn...@whitecape.org

Really nice work.
Re: [Mesa-dev] [PATCH 8/8] i965: increase the brw eu instruction store size dynamically
On 12/22/2011 07:04 PM, Yuanhan Liu wrote:
> On Thu, Dec 22, 2011 at 02:33:03PM -0800, Kenneth Graunke wrote:
> > On 12/21/2011 01:33 AM, Yuanhan Liu wrote:
[snip]
> > > -#define BRW_EU_MAX_INSN_STACK 5
> > > -#define BRW_EU_MAX_INSN 1
> > > +#define BRW_EU_MAX_INSN_STACK 5
> > > +#define BRW_EU_MAX_INSN (1024 * 1024)
> >
> > I'm actually surprised to see BRW_EU_MAX_INSN at all. As far as I know,
> > there isn't an actual hardware limit on the number of instructions,
>
> Glad to know that. Thanks.
>
> > so I'm not sure why we should cap it at all. Especially not to some
> > arbitrary number. (I'm assuming that 1024 * 1024 is just something you
> > came up with arbitrarily...)
>
> Aha, yes, you are right, I made it. :)
>
> Here is the fixed patch, please help to review it:

Reviewed-by: Kenneth Graunke kenn...@whitecape.org

I'd wait for an ack from Eric before pushing, though.

Thanks again!
Re: [Mesa-dev] [PATCH 8/8] i965: increase the brw eu instruction store size dynamically
On Thu, Dec 22, 2011 at 07:51:46PM -0800, Kenneth Graunke wrote:
> On 12/22/2011 07:04 PM, Yuanhan Liu wrote:
> > On Thu, Dec 22, 2011 at 02:33:03PM -0800, Kenneth Graunke wrote:
> > > On 12/21/2011 01:33 AM, Yuanhan Liu wrote:
> [snip]
> > > > -#define BRW_EU_MAX_INSN_STACK 5
> > > > -#define BRW_EU_MAX_INSN 1
> > > > +#define BRW_EU_MAX_INSN_STACK 5
> > > > +#define BRW_EU_MAX_INSN (1024 * 1024)
> > >
> > > I'm actually surprised to see BRW_EU_MAX_INSN at all. As far as I know,
> > > there isn't an actual hardware limit on the number of instructions,
> >
> > Glad to know that. Thanks.
> >
> > > so I'm not sure why we should cap it at all. Especially not to some
> > > arbitrary number. (I'm assuming that 1024 * 1024 is just something you
> > > came up with arbitrarily...)
> >
> > Aha, yes, you are right, I made it. :)
> >
> > Here is the fixed patch, please help to review it:
>
> Reviewed-by: Kenneth Graunke kenn...@whitecape.org
>
> I'd wait for an ack from Eric before pushing, though.

That's OK with me. Eric, comments? Or, can I get your Reviewed-by for
this series?

Thanks,
Yuanhan Liu
Re: [Mesa-dev] [PATCH] vbo: signal _NEW_ARRAY when transitioning between glBegin/End, glDrawArrays
Hi,

On Thursday, December 22, 2011 18:30:44 Brian Paul wrote:
> I'm not sure if playback_vertex_list is more like DRAW_BEGIN_END or
> DRAW_ARRAYS. Maybe add a DRAW_DISPLAY_LIST enum value?

It's more like begin/end, I think. The begin/end code just sets the
array state below the state tracking of the API function, and the way
this happens is very much the same for save and draw when you look at
the code that binds the vbos. To me it makes perfect sense that
vbo_{save,exec}_draw just disturbs the vbo_array_draw path.

So a simple flag that marks whether we were drawing via
vbo_{save,exec}_draw.c the last time would probably also do the job. But
if you think you want to distinguish this, go ahead...

I also believe that the difference we are talking about will only
trigger in very few atypical cases.

Greetings
Mathias