[Mesa-dev] [PATCH] i965: Fix assignment instead of comparison in asserts.
Fixes side effect in assertion defects reported by Coverity.

Signed-off-by: Vinson Lee
---
 src/mesa/drivers/dri/i965/brw_fs_emit.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_emit.cpp b/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
index d9ed27c..45072da 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
@@ -951,8 +951,8 @@ fs_generator::generate_pack_half_2x16_split(fs_inst *inst,
 {
    assert(intel->gen >= 7);
    assert(dst.type == BRW_REGISTER_TYPE_UD);
-   assert(x.type = BRW_REGISTER_TYPE_F);
-   assert(y.type = BRW_REGISTER_TYPE_F);
+   assert(x.type == BRW_REGISTER_TYPE_F);
+   assert(y.type == BRW_REGISTER_TYPE_F);

    /* From the Ivybridge PRM, Vol4, Part3, Section 6.27 f32to16:
     *
--
1.8.1
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
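For readers wondering why the one-character difference matters, here is a minimal sketch of the defect class the patch addresses. The enum values and the names buggy_check and fixed_check are hypothetical and are not taken from the i965 driver; they only model what an assert on an assignment does in a debug build.

```c
#include <assert.h>

/* Hypothetical register types; the real BRW_REGISTER_TYPE_* values are not assumed here. */
enum reg_type { REG_TYPE_UD = 0, REG_TYPE_F = 7 };

struct reg { enum reg_type type; };

/* Models assert(x.type = REG_TYPE_F): the assignment overwrites the field,
 * and the expression is true whenever the assigned constant is non-zero. */
static int buggy_check(struct reg *r)
{
    return (r->type = REG_TYPE_F) != 0;
}

/* Models assert(x.type == REG_TYPE_F): a pure comparison with no side effect. */
static int fixed_check(const struct reg *r)
{
    return r->type == REG_TYPE_F;
}
```

With a register whose type is REG_TYPE_UD, buggy_check reports success and silently changes the type, while fixed_check reports the mismatch and leaves the register alone. When NDEBUG is defined the assert disappears entirely, so the buggy form also makes debug and release builds behave differently.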
Re: [Mesa-dev] [PATCH] i965: Compile the driver with -march=core2.
On 01/25/2013 03:13 PM, Roland Scheidegger wrote:
> I'm quite sure there are g965 boards around which indeed support Pentium 4
> (and P4-based Celerons) (but yes I guess cmov and at least sse2 are safe -
> not that the p4 had a usable cmov implementation as it was incredibly slow
> IIRC but it should at least work).
>
> Roland

Sadly I think Roland is right here: the Intel 946GZ is a Gen4 chip that appears on motherboards which claim to support Pentium 4s.

That's crazy...I've never heard of such a machine.
Re: [Mesa-dev] [PATCH] i965: Compile the driver with -march=core2.
Roland Scheidegger writes:
> I'm quite sure there are g965 boards around which indeed support Pentium
> 4 (and P4-based Celerons) (but yes I guess cmov and at least sse2 are
> safe - not that the p4 had a usable cmov implementation as it was
> incredibly slow IIRC but it should at least work).

It looks like everything you could put in a g965 had SSE3, but that *S*SSE3 is not covered. Sigh.

-march=nocona -mtune=core2 I think should work.
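One way to check what a given -march setting actually enables is to look at the feature macros GCC predefines. The helper below is a sketch written for this purpose and is not part of Mesa; the function name is made up. Built with -march=nocona it should report SSE3, and built with -march=core2 it should report SSSE3, since -mtune alone does not change the instruction set the compiler may emit.

```c
/* Returns the highest SSE level the compiler was told it may use,
 * judged only by GCC/Clang predefined macros. Hypothetical helper. */
static const char *highest_sse_level(void)
{
#if defined(__SSSE3__)
    return "SSSE3";
#elif defined(__SSE3__)
    return "SSE3";
#elif defined(__SSE2__)
    return "SSE2";
#else
    return "none";
#endif
}
```

The result depends entirely on the flags used for the translation unit, so it describes the build, not the CPU the binary later runs on.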
[Mesa-dev] [PATCH] intel: Un-hardcode lengths from blitter commands.
The packet length may change at some point in the future. Specifying it explicitly (rather than hardcoding it in the command #define) allows us to change it much more easily in the future.

Signed-off-by: Kenneth Graunke
---
 src/mesa/drivers/dri/intel/intel_blit.c | 8 ++++----
 src/mesa/drivers/dri/intel/intel_reg.h  | 6 +++---
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_blit.c b/src/mesa/drivers/dri/intel/intel_blit.c
index 4b86f0e..0946972 100644
--- a/src/mesa/drivers/dri/intel/intel_blit.c
+++ b/src/mesa/drivers/dri/intel/intel_blit.c
@@ -194,7 +194,7 @@ intelEmitCopyBlit(struct intel_context *intel,
    assert(dst_y < dst_y2);

    BEGIN_BATCH_BLT(8);
-   OUT_BATCH(CMD);
+   OUT_BATCH(CMD | (8 - 2));
    OUT_BATCH(BR13 | (uint16_t)dst_pitch);
    OUT_BATCH((dst_y << 16) | dst_x);
    OUT_BATCH((dst_y2 << 16) | dst_x2);
@@ -368,7 +368,7 @@ intelClearWithBlit(struct gl_context *ctx, GLbitfield mask)
    }

    BEGIN_BATCH_BLT(6);
-   OUT_BATCH(CMD);
+   OUT_BATCH(CMD | (6 - 2));
    OUT_BATCH(BR13);
    OUT_BATCH((y1 << 16) | x1);
    OUT_BATCH((y2 << 16) | x2);
@@ -445,7 +445,7 @@ intelEmitImmediateColorExpandBlit(struct intel_context *intel,
    blit_cmd |= XY_DST_TILED;

    BEGIN_BATCH_BLT(8 + 3);
-   OUT_BATCH(opcode);
+   OUT_BATCH(opcode | (8 - 2));
    OUT_BATCH(br13);
    OUT_BATCH((0 << 16) | 0); /* clip x1, y1 */
    OUT_BATCH((100 << 16) | 100); /* clip x2, y2 */
@@ -587,7 +587,7 @@ intel_set_teximage_alpha_to_one(struct gl_context *ctx,
    }

    BEGIN_BATCH_BLT(6);
-   OUT_BATCH(CMD);
+   OUT_BATCH(CMD | (6 - 2));
    OUT_BATCH(BR13);
    OUT_BATCH((y1 << 16) | x1);
    OUT_BATCH((y2 << 16) | x2);
diff --git a/src/mesa/drivers/dri/intel/intel_reg.h b/src/mesa/drivers/dri/intel/intel_reg.h
index 53b1cb9..e4871eb 100644
--- a/src/mesa/drivers/dri/intel/intel_reg.h
+++ b/src/mesa/drivers/dri/intel/intel_reg.h
@@ -240,11 +240,11 @@
 #define PRIM3D_DIB (0x9<<18)
 #define PRIM3D_MASK (0x1f<<18)

-#define XY_SETUP_BLT_CMD (CMD_2D | (0x01 << 22) | 6)
+#define XY_SETUP_BLT_CMD (CMD_2D | (0x01 << 22))

-#define XY_COLOR_BLT_CMD (CMD_2D | (0x50 << 22) | 4)
+#define XY_COLOR_BLT_CMD (CMD_2D | (0x50 << 22))

-#define XY_SRC_COPY_BLT_CMD (CMD_2D | (0x53 << 22) | 6)
+#define XY_SRC_COPY_BLT_CMD (CMD_2D | (0x53 << 22))

 #define XY_TEXT_IMMEDIATE_BLIT_CMD (CMD_2D | (0x31 << 22))
 # define XY_TEXT_BYTE_PACKED (1 << 16)
--
1.8.1.1
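The pattern in the patch, where the header carries the packet's total dword count minus two, can be sketched as a small helper. This is an illustration only: blt_header is a hypothetical name that does not exist in the driver, and the CMD_2D value below is an assumption made so the example is self-contained.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value for illustration; check intel_reg.h for the real definition. */
#define CMD_2D              (0x2u << 29)
/* Opcodes as they read after the patch: no length baked in. */
#define XY_COLOR_BLT_CMD    (CMD_2D | (0x50u << 22))
#define XY_SRC_COPY_BLT_CMD (CMD_2D | (0x53u << 22))

/* Build the first dword of a blitter packet from the opcode and the
 * total number of dwords the packet occupies in the batch. */
static uint32_t blt_header(uint32_t opcode, unsigned total_dwords)
{
    assert(total_dwords >= 2);          /* the length field is "dwords - 2" */
    return opcode | (total_dwords - 2);
}
```

With this shape, the number passed to BEGIN_BATCH_BLT and the number encoded in the header come from the same place at the call site, which is the property the commit message is after: blt_header(XY_COLOR_BLT_CMD, 6) yields the old hardcoded "| 4" and blt_header(XY_SRC_COPY_BLT_CMD, 8) yields the old "| 6".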
[Mesa-dev] [Bug 59831] undefined symbol _ZN4llvm19createGlobalDCEPassEv in r600g
https://bugs.freedesktop.org/show_bug.cgi?id=59831

Alex Deucher changed:

           What    |Removed                     |Added
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #8 from Alex Deucher ---
Fixed by:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=264e6dad28e64755dc1580abdbb4e339c3439883

--
You are receiving this mail because:
You are the QA Contact for the bug.
Re: [Mesa-dev] [PATCH] docs: List new extensions added in Mesa 9.1
On Fri, Jan 25, 2013 at 6:02 PM, Marek Olšák wrote:
> These extensions are not new in Mesa:
> - ARB_base_instance (since 9.0)
> - ARB_vertex_type_2_10_10_10_rev (since 8.0)
> - OES_standard_derivatives (since 7.10, I think)

Ah you're right. It was just i965 that added these. I'll drop them from the list.

> Also, we don't have ARB_ES3_compatibility yet.

We do (on i965), since today: e4f661afc89e6e7608edceb73528a5e54a147a85.
Re: [Mesa-dev] [PATCH] r600g: fix up CP DMA for VM on cayman and TN
Reviewed-by: Marek Olšák

Marek

On Sat, Jan 26, 2013 at 12:49 AM, wrote:
> From: Alex Deucher
>
> Need to add the virtual address.
>
> Signed-off-by: Alex Deucher
> ---
>  src/gallium/drivers/r600/r600.h            |    4 ++--
>  src/gallium/drivers/r600/r600_hw_context.c |   11 +++++++----
>  2 files changed, 9 insertions(+), 6 deletions(-)
>
> diff --git a/src/gallium/drivers/r600/r600.h b/src/gallium/drivers/r600/r600.h
> index 93604fb..06e914f 100644
> --- a/src/gallium/drivers/r600/r600.h
> +++ b/src/gallium/drivers/r600/r600.h
> @@ -172,8 +172,8 @@ void r600_context_streamout_end(struct r600_context *ctx);
>  void r600_need_cs_space(struct r600_context *ctx, unsigned num_dw, boolean count_draw_in);
>  void r600_context_block_emit_dirty(struct r600_context *ctx, struct r600_block *block, unsigned pkt_flags);
>  void r600_cp_dma_copy_buffer(struct r600_context *rctx,
> -                             struct pipe_resource *dst, unsigned dst_offset,
> -                             struct pipe_resource *src, unsigned src_offset,
> +                             struct pipe_resource *dst, unsigned long dst_offset,
> +                             struct pipe_resource *src, unsigned long src_offset,
>                               unsigned size);
>
>  int evergreen_context_init(struct r600_context *ctx);
> diff --git a/src/gallium/drivers/r600/r600_hw_context.c b/src/gallium/drivers/r600/r600_hw_context.c
> index caebf5c..e13b502 100644
> --- a/src/gallium/drivers/r600/r600_hw_context.c
> +++ b/src/gallium/drivers/r600/r600_hw_context.c
> @@ -1065,8 +1065,8 @@ void r600_context_streamout_end(struct r600_context *ctx)
>  #define CP_DMA_MAX_BYTE_COUNT ((1 << 21) - 8)
>
>  void r600_cp_dma_copy_buffer(struct r600_context *rctx,
> -                             struct pipe_resource *dst, unsigned dst_offset,
> -                             struct pipe_resource *src, unsigned src_offset,
> +                             struct pipe_resource *dst, unsigned long dst_offset,
> +                             struct pipe_resource *src, unsigned long src_offset,
>                               unsigned size)
>  {
>         struct radeon_winsys_cs *cs = rctx->cs;
> @@ -1079,6 +1079,9 @@ void r600_cp_dma_copy_buffer(struct r600_context *rctx,
>                 return;
>         }
>
> +       dst_offset += r600_resource_va(&rctx->screen->screen, dst);
> +       src_offset += r600_resource_va(&rctx->screen->screen, src);
> +
>         /* We flush the caches, because we might read from or write
>          * to resources which are bound right now. */
>         rctx->flags |= R600_CONTEXT_INVAL_READ_CACHES |
> @@ -1112,9 +1115,9 @@ void r600_cp_dma_copy_buffer(struct r600_context *rctx,
>
>                 r600_write_value(cs, PKT3(PKT3_CP_DMA, 4, 0));
>                 r600_write_value(cs, src_offset); /* SRC_ADDR_LO [31:0] */
> -               r600_write_value(cs, sync); /* CP_SYNC [31] | SRC_ADDR_HI [7:0] */
> +               r600_write_value(cs, sync | ((src_offset >> 32) & 0xff)); /* CP_SYNC [31] | SRC_ADDR_HI [7:0] */
>                 r600_write_value(cs, dst_offset); /* DST_ADDR_LO [31:0] */
> -               r600_write_value(cs, 0); /* DST_ADDR_HI [7:0] */
> +               r600_write_value(cs, (dst_offset >> 32) & 0xff); /* DST_ADDR_HI [7:0] */
>                 r600_write_value(cs, byte_count); /* COMMAND [29:22] | BYTE_COUNT [20:0] */
>
>                 r600_write_value(cs, PKT3(PKT3_NOP, 0, 0));
> --
> 1.7.7.5
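The address packing in the last hunk can be sketched in isolation. The helper names below are hypothetical and this is not the driver's code; the one deliberate difference from the quoted patch is that the sketch uses uint64_t for the address, on the assumption that the virtual address can exceed 32 bits. With unsigned long, as in the patch, a 32-bit ABI makes the type 32 bits wide and a shift by 32 is undefined in C.

```c
#include <stdint.h>

/* Bit 31 of the SRC_ADDR_HI dword, per the "CP_SYNC [31]" comment in the patch. */
#define CP_SYNC_BIT (1u << 31)

/* Low 32 bits of a GPU virtual address: the *_ADDR_LO dword. */
static uint32_t addr_lo(uint64_t va)
{
    return (uint32_t)va;
}

/* Bits 39:32 of the address: the 8-bit *_ADDR_HI field. */
static uint32_t addr_hi8(uint64_t va)
{
    return (uint32_t)(va >> 32) & 0xffu;
}

/* The source-side high dword carries the sync flag alongside the high address bits. */
static uint32_t src_hi_dword(uint64_t va, uint32_t sync)
{
    return sync | addr_hi8(va);
}
```

For an address such as 0x1234567890, the low dword is 0x34567890 and the high field is 0x12; with the sync bit set, the source high dword becomes 0x80000012.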
Re: [Mesa-dev] [PATCH] docs: List new extensions added in Mesa 9.1
These extensions are not new in Mesa:
- ARB_base_instance (since 9.0)
- ARB_vertex_type_2_10_10_10_rev (since 8.0)
- OES_standard_derivatives (since 7.10, I think)

Also, we don't have ARB_ES3_compatibility yet.

Marek

On Sat, Jan 26, 2013 at 12:08 AM, Matt Turner wrote:
> I did not list the *_get_program_binary extensions since they're not
> useful to anyone with their current implementation (that supports 0
> binary formats).
> ---
> We should also write something about ES3 and the float-texture & S3TC
> changes.
>
>  docs/relnotes-9.1.html |   12 +++++++++++-
>  1 files changed, 11 insertions(+), 1 deletions(-)
>
> diff --git a/docs/relnotes-9.1.html b/docs/relnotes-9.1.html
> index ffca275..14e6c02 100644
> --- a/docs/relnotes-9.1.html
> +++ b/docs/relnotes-9.1.html
> @@ -44,9 +44,19 @@ Note: some of the new features are only available with certain drivers.
>
>
>
> +GL_ANGLE_texture_compression_dxt3
> +GL_ANGLE_texture_compression_dxt5
> +GL_ARB_base_instance
> +GL_ARB_ES3_compatibility
> +GL_ARB_internalformat_query
>  GL_ARB_map_buffer_alignment
> -GL_ARB_texture_cube_map_array
> +GL_ARB_shading_language_packing
>  GL_ARB_texture_buffer_object_rgb32
> +GL_ARB_texture_cube_map_array
> +GL_ARB_vertex_type_2_10_10_10_rev
> +GL_EXT_color_buffer_float
> +GL_OES_depth_texture_cube_map
> +GL_OES_standard_derivatives
>
>
>
> --
> 1.7.8.6
[Mesa-dev] [Bug 59877] Build fail since r600g: Don't build llvm_wrapper.cpp when we aren't using LLVM
https://bugs.freedesktop.org/show_bug.cgi?id=59877

--- Comment #1 from Tom Stellard ---
Created attachment 73664
  --> https://bugs.freedesktop.org/attachment.cgi?id=73664&action=edit
Possible fix

Does this patch help?

--
You are receiving this mail because:
You are the assignee for the bug.
Re: [Mesa-dev] [PATCH 4/4] r600g: only emit gfx cmd when there is actual work in it
You forgot about fences and queries other than timestamp. All queries must be emitted even if there is no rendering between them (the GL spec says that if a query is busy, any later query must be busy too, and empty queries are allowed - we have piglit tests for all that). Anyway, I think this is not needed and it's also prone to errors as your patch shows. The current mechanism that prevents an empty CS from being emitted is sufficient. The CS flushing is skipped if: - cs->cdw == ctx->start_cs_cmd.num_dw in r600_context_flush, or - cs->cdw == 0 in radeon_drm_cs_flush Marek On Fri, Jan 25, 2013 at 6:50 PM, wrote: > From: Jerome Glisse > > Signed-off-by: Jerome Glisse > --- > src/gallium/drivers/r600/evergreen_compute.c | 2 ++ > src/gallium/drivers/r600/r600_hw_context.c | 1 + > src/gallium/drivers/r600/r600_pipe.c | 6 ++ > src/gallium/drivers/r600/r600_pipe.h | 1 + > src/gallium/drivers/r600/r600_query.c| 2 ++ > src/gallium/drivers/r600/r600_state_common.c | 1 + > 6 files changed, 13 insertions(+) > > diff --git a/src/gallium/drivers/r600/evergreen_compute.c > b/src/gallium/drivers/r600/evergreen_compute.c > index f4a7905..977595e 100644 > --- a/src/gallium/drivers/r600/evergreen_compute.c > +++ b/src/gallium/drivers/r600/evergreen_compute.c > @@ -308,6 +308,8 @@ static void evergreen_emit_direct_dispatch( > r600_write_value(cs, grid_layout[2]); > /* VGT_DISPATCH_INITIATOR = COMPUTE_SHADER_EN */ > r600_write_value(cs, 1); > + > + rctx->rings.gfx.cdraw++; > } > > static void compute_emit_cs(struct r600_context *ctx, const uint > *block_layout, > diff --git a/src/gallium/drivers/r600/r600_hw_context.c > b/src/gallium/drivers/r600/r600_hw_context.c > index d7518a5..511a276 100644 > --- a/src/gallium/drivers/r600/r600_hw_context.c > +++ b/src/gallium/drivers/r600/r600_hw_context.c > @@ -1122,6 +1122,7 @@ void r600_cp_dma_copy_buffer(struct r600_context *rctx, > size -= byte_count; > src_offset += byte_count; > dst_offset += byte_count; > + rctx->rings.gfx.cdraw++; > } 
> } > > diff --git a/src/gallium/drivers/r600/r600_pipe.c > b/src/gallium/drivers/r600/r600_pipe.c > index 6767412..af08cff 100644 > --- a/src/gallium/drivers/r600/r600_pipe.c > +++ b/src/gallium/drivers/r600/r600_pipe.c > @@ -120,6 +120,10 @@ static void r600_flush(struct pipe_context *ctx, > unsigned flags) > struct pipe_query *render_cond = NULL; > unsigned render_cond_mode = 0; > > + if (!rctx->rings.gfx.cdraw) { > + return; > + } > + > rctx->rings.gfx.flushing = true; > /* Disable render condition. */ > if (rctx->current_render_cond) { > @@ -130,6 +134,7 @@ static void r600_flush(struct pipe_context *ctx, unsigned > flags) > > r600_context_flush(rctx, flags); > rctx->rings.gfx.flushing = false; > + rctx->rings.gfx.cdraw = 0; > r600_begin_new_cs(rctx); > > /* Re-enable render condition. */ > @@ -387,6 +392,7 @@ static struct pipe_context *r600_create_context(struct > pipe_screen *screen, void > goto fail; > } > > + rctx->rings.gfx.cdraw = 0; > rctx->rings.gfx.cs = rctx->ws->cs_create(rctx->ws, RING_GFX); > rctx->rings.gfx.flush = r600_flush_gfx_ring; > rctx->ws->cs_set_flush_callback(rctx->rings.gfx.cs, > r600_flush_from_winsys, rctx); > diff --git a/src/gallium/drivers/r600/r600_pipe.h > b/src/gallium/drivers/r600/r600_pipe.h > index 31dcd05..5c72756 100644 > --- a/src/gallium/drivers/r600/r600_pipe.h > +++ b/src/gallium/drivers/r600/r600_pipe.h > @@ -418,6 +418,7 @@ struct r600_fetch_shader { > struct r600_ring { > struct radeon_winsys_cs *cs; > boolflushing; > + unsignedcdraw; > void (*flush)(void *ctx, unsigned flags); > }; > > diff --git a/src/gallium/drivers/r600/r600_query.c > b/src/gallium/drivers/r600/r600_query.c > index 0335189..7916f2d 100644 > --- a/src/gallium/drivers/r600/r600_query.c > +++ b/src/gallium/drivers/r600/r600_query.c > @@ -149,6 +149,7 @@ static void r600_emit_query_begin(struct r600_context > *ctx, struct r600_query *q > cs->buf[cs->cdw++] = (3 << 29) | ((va >> 32UL) & 0xFF); > cs->buf[cs->cdw++] = 0; > cs->buf[cs->cdw++] = 0; > + 
ctx->rings.gfx.cdraw++; > break; > default: > assert(0); > @@ -201,6 +202,7 @@ static void r600_emit_query_end(struct r600_context *ctx, > struct r600_query *que > cs->buf[cs->cdw++] = (3 << 29) | ((va >> 32UL) & 0xFF); > cs->buf[cs->cdw++] = 0; > cs->buf[cs->cdw++] = 0; > + ctx->rings.gfx.cdraw++; > break; > default: > assert(0); > diff --git a/src/gallium/drivers/r600/r600_state_commo
[Mesa-dev] [Bug 59880] New: piglit arb_uniform_buffer_object-dlist regression
https://bugs.freedesktop.org/show_bug.cgi?id=59880

          Priority: medium
            Bug ID: 59880
          Keywords: regression
                CC: i...@freedesktop.org
          Assignee: mesa-dev@lists.freedesktop.org
           Summary: piglit arb_uniform_buffer_object-dlist regression
          Severity: normal
    Classification: Unclassified
                OS: Linux (All)
          Reporter: v...@freedesktop.org
          Hardware: x86-64 (AMD64)
            Status: NEW
           Version: git
         Component: Mesa core
           Product: Mesa

mesa: c1d35aece0afc2822d6d9f6c22664c04e6fcbba3 (master)

$ ./bin/arb_uniform_buffer_object-dlist -auto
Mesa: User error: GL_INVALID_VALUE in glUniformBlockBinding(block index 1 >= 1)
Mesa: User error: GL_INVALID_VALUE in glGetActiveUniformBlockiv(block index 1 >= 1)
piglit/tests/spec/arb_uniform_buffer_object/dlist.c:129: Binding 1 should be 3, was 2
Mesa: User error: GL_INVALID_VALUE in glUniformBlockBinding(block index 1 >= 1)
Mesa: User error: GL_INVALID_VALUE in glGetActiveUniformBlockiv(block index 1 >= 1)
piglit/tests/spec/arb_uniform_buffer_object/dlist.c:137: Binding 1 should be 3, was 2
Unexpected GL error: GL_INVALID_VALUE 0x501
(Error at piglit/tests/spec/arb_uniform_buffer_object/dlist.c:148)
PIGLIT: {'result': 'fail' }

There are only 'skip'ped commits left to test.
The first bad commit could be any of:
32f322925592e9eeda6a5624c7320232fc170c03
514f8c7ec7cc1ab18be93cebb5b9bf970b1955a9
f09d77b2af0e6e7553a1e2efca2f12fe2e4dcea8
22233da1ee4b59663966169759960c00c033d0e9
We cannot bisect more!
bisect run cannot continue any more

--
You are receiving this mail because:
You are the assignee for the bug.
[Mesa-dev] [Bug 59879] New: reducing symbol visibility of shared objects / static libstdc++
https://bugs.freedesktop.org/show_bug.cgi?id=59879

          Priority: medium
            Bug ID: 59879
          Assignee: mesa-dev@lists.freedesktop.org
           Summary: reducing symbol visibility of shared objects / static libstdc++
          Severity: normal
    Classification: Unclassified
                OS: Linux (All)
          Reporter: liquid.a...@gmx.net
          Hardware: x86-64 (AMD64)
            Status: NEW
           Version: unspecified
         Component: Mesa core
           Product: Mesa

Hello,

this is a sort of cleaned-up report of bug #37637.

To quickly summarize what happens there:

Build r600g with the llvm compiler backend and try starting ut2003. A segfault happens since apparently ut's engine has a version of libstdc++ built in, which now clashes with the libstdc++ shared lib which either r600_dri.so or LLVM (when built as shared) loads.

This is independent of preloading order. When the symbols from the system libstdc++ take preference, the game engine crashes. When the game engine symbols take preference, the r600g driver initialization crashes.

The fix for the problem:

Since we can't modify the ut2003 binary, we have to hide the "duplicate" symbols somehow. This means:
- build r600g with static llvm
- build r600 with static libstdc++
- only make those symbols in r600_dri.so visible which are necessary

Building r600g with static llvm is trivial.

The symbol visibility can be properly handled by ld:
http://sourceware.org/binutils/docs-2.21/ld/VERSION.html#VERSION

My current dri-symbols.map:

{
    global:
        __dri*;
        dri*;
        _glapi*;
    local:
        *;
};

This hides everything except for the symbols matching __dri*, dri* and _glapi*. This can potentially be reduced even further. However, it's not clear to me what the loader code in libGL really needs.

This version-script'ing can be properly put into autotools language:
http://www.gnu.org/software/gnulib/manual/html_node/LD-Version-Scripts.html

What I'm struggling with is properly telling autotools to build a shared lib with static libstdc++. The gcc manpage mentions an option called "-static-libstdc++", but it doesn't seem to have any effect.

Let's look at the critical calls (I've shortened them somewhat):

OK, we're in src/gallium/targets/dri-r600. The last libtool call that the Makefile executes is the following one:

bin/sh ../../../../libtool --tag=CXX --mode=link g++ -g -O2 -Wall -fno-strict-aliasing -fno-builtin-memcmp -module -avoid-version -shared -no-undefined -Wl,--version-script=../../../../src/gallium/targets/dri-symbols.map -L/usr/lib64/llvm -lpthread -lffi -ldl -lm -o r600_dri.la -rpath /usr/local/lib/dri target.lo utils.lo dri_util.lo xmlconfig.lo -ldrm -lexpat -lm -lrt -lpthread -ldl -ldrm -ldrm_radeon

libtool itself produces this call from it:

g++ -fPIC -DPIC -shared -nostdlib -Wl,--whole-archive -Wl,--no-whole-archive -L/usr/lib64/llvm -lexpat -lrt -lpthread -ldl -ldrm -ldrm_radeon -L/usr/lib/gcc/x86_64-pc-linux-gnu/4.7.2 -L/usr/lib/gcc/x86_64-pc-linux-gnu/4.7.2/../../../../lib64 -L/lib/../lib64 -L/usr/lib/../lib64 -L/usr/lib/gcc/x86_64-pc-linux-gnu/4.7.2/../../../../x86_64-pc-linux-gnu/lib -L/usr/lib/gcc/x86_64-pc-linux-gnu/4.7.2/../../.. -lstdc++ -lm -lc -lgcc_s -O2 -Wl,--version-script=../../../../src/gallium/targets/dri-symbols.map -Wl,-soname -Wl,r600_dri.so -o .libs/r600_dri.so

Notice the "-lstdc++", i.e. dynamic linking to libstdc++.

What I'd like libtool (and eventually autotools) to produce is the following:

g++ -fPIC -DPIC -shared -nostdlib -Wl,-Bstatic -lstdc++ -Wl,-Bdynamic -Wl,--whole-archive -Wl,--no-whole-archive -L/usr/lib64/llvm -lexpat -lrt -lpthread -ldl -ldrm -ldrm_radeon -L/usr/lib/gcc/x86_64-pc-linux-gnu/4.7.2 -L/usr/lib/gcc/x86_64-pc-linux-gnu/4.7.2/../../../../lib64 -L/lib/../lib64 -L/usr/lib/../lib64 -L/usr/lib/gcc/x86_64-pc-linux-gnu/4.7.2/../../../../x86_64-pc-linux-gnu/lib -L/usr/lib/gcc/x86_64-pc-linux-gnu/4.7.2/../../.. -lm -lc -lgcc_s -O2 -Wl,--version-script=../../../../src/gallium/targets/dri-symbols.map -Wl,-soname -Wl,r600_dri.so -o .libs/r600_dri.so

This is just removing "-lstdc++" and replacing it by "-Wl,-Bstatic -lstdc++ -Wl,-Bdynamic" at a different position (!). Putting it in front of the archive assembly seems to be critical.

This links fine (no warnings, etc.) and produces a .so that is loaded properly by libGL's loader and (more importantly) works fine with ut2003.

However, it is still a mystery to me how to make this clear to either libtool or autotools.

Greets,
Tobias

--
You are receiving this mail because:
You are the assignee for the bug.
[Mesa-dev] [Bug 59877] New: Build fail since r600g: Don't build llvm_wrapper.cpp when we aren't using LLVM
https://bugs.freedesktop.org/show_bug.cgi?id=59877

          Priority: medium
            Bug ID: 59877
          Assignee: mesa-dev@lists.freedesktop.org
           Summary: Build fail since r600g: Don't build llvm_wrapper.cpp when we aren't using LLVM
          Severity: normal
    Classification: Unclassified
                OS: Linux (All)
          Reporter: li...@andyfurniss.entadsl.com
          Hardware: x86-64 (AMD64)
            Status: NEW
           Version: git
         Component: Other
           Product: Mesa

make -k distclean
git clean -dfx
./autogen.sh --prefix=/usr --disable-egl --enable-texture-float --enable-gallium-g3dvl --enable-r600-llvm-compiler --with-gallium-drivers=r600,swrast --with-dri-drivers= && make -j5

Making all in r600
make[4]: Entering directory `/mnt/sdb1/Src64/Mesa-git/mesa/src/gallium/drivers/r600'
  CC     r600_asm.lo
In file included from r600_pipe.h:33:0,
                 from r600_formats.h:5,
                 from r600_asm.c:25:
r600_llvm.h:7:25: fatal error: radeon_llvm.h: No such file or directory
compilation terminated.
make[4]: *** [r600_asm.lo] Error 1
make[4]: Leaving directory `/mnt/sdb1/Src64/Mesa-git/mesa/src/gallium/drivers/r600'
make[3]: *** [all-recursive] Error 1
make[3]: Leaving directory `/mnt/sdb1/Src64/Mesa-git/mesa/src/gallium/drivers'
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory `/mnt/sdb1/Src64/Mesa-git/mesa/src/gallium'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/mnt/sdb1/Src64/Mesa-git/mesa/src'
make: *** [all-recursive] Error 1

andy [ /mnt/sdb1/Src64/Mesa-git/mesa ]$ find ./ -name radeon_llvm.h
./src/gallium/drivers/radeon/radeon_llvm.h

Reverting

264e6dad28e64755dc1580abdbb4e339c3439883
r600g: Don't build llvm_wrapper.cpp when we aren't using LLVM

will build OK (but not work due to undefined symbol)

--
You are receiving this mail because:
You are the assignee for the bug.
[Mesa-dev] [Bug 59876] New: glGetTexLevelParameteriv broken for indirect rendering
https://bugs.freedesktop.org/show_bug.cgi?id=59876

          Priority: medium
            Bug ID: 59876
          Assignee: mesa-dev@lists.freedesktop.org
           Summary: glGetTexLevelParameteriv broken for indirect rendering
          Severity: normal
    Classification: Unclassified
                OS: All
          Reporter: gl...@gclements.plus.com
          Hardware: Other
            Status: NEW
           Version: 9.0
         Component: GLX
           Product: Mesa

With indirect rendering, glGetTexLevelParameteriv() is returning garbage, at least for GL_TEXTURE_WIDTH and GL_TEXTURE_HEIGHT.

I don't know whether this is in libGL, XCB or the X server. I've tried it with several X servers (including Xorg, Xvnc, Xvfb, Cygwin's XWin.exe and Xming), but they're all based on the same underlying code base so that doesn't mean much.

I managed to track it down as far as the USE_XCB branch of __indirect_glGetTexLevelParameteriv() in src/glx/indirect.c. The reply contains the following:

> print *reply
$7 = {
  response_type = 1 '\001',
  pad0 = 0 '\000',
  sequence = 65,
  length = 0,
  pad1 = "\000\000\000",
  n = 1,
  datum = 256,
  pad2 = "\230y'a\000\000\000\000\000\000\000"
}

As xcb_glx_get_tex_level_parameteriv_data_length(reply) (i.e. reply->n) is non-zero, it expects to find the value at xcb_glx_get_tex_level_parameteriv_data() (i.e. following the structure), but the correct value (256) is actually in reply->datum (which would have been used if reply->n was zero).

--
You are receiving this mail because:
You are the assignee for the bug.
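The mismatch described above can be modelled with a small stand-alone sketch. Everything here is hypothetical: struct tex_param_reply is a simplified stand-in and not the real xcb reply type, and read_tex_param is an invented name. The sketch assumes, as the dump suggests, that a single value travels in the fixed header (datum) with n == 1, and that trailing data is only present for larger counts. Where a real fix belongs (libGL, XCB or the server) is left open, as in the report.

```c
#include <stdint.h>
#include <string.h>

/* Simplified, hypothetical model of the reply shown in the gdb dump. */
struct tex_param_reply {
    uint32_t       n;      /* number of values returned */
    int32_t        datum;  /* the value itself when exactly one is returned */
    const int32_t *data;   /* values that follow the header when n > 1 */
};

/* Copy the returned values into params, taking the single-value case
 * from the header field instead of from the trailing data. */
static void read_tex_param(const struct tex_param_reply *r, int32_t *params)
{
    if (r->n == 1)
        params[0] = r->datum;
    else if (r->n > 1)
        memcpy(params, r->data, r->n * sizeof(int32_t));
}
```

Under this model the reply from the report (n = 1, datum = 256) yields 256, whereas code that reads the trailing area whenever n is non-zero picks up whatever bytes happen to follow the header.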
[Mesa-dev] [Bug 59873] New: [swrast] piglit ext_framebuffer_multisample-interpolation 0 centroid-edges regression
https://bugs.freedesktop.org/show_bug.cgi?id=59873

          Priority: medium
            Bug ID: 59873
          Keywords: regression
                CC: bri...@vmware.com
          Assignee: mesa-dev@lists.freedesktop.org
           Summary: [swrast] piglit ext_framebuffer_multisample-interpolation 0 centroid-edges regression
          Severity: normal
    Classification: Unclassified
                OS: Linux (All)
          Reporter: v...@freedesktop.org
          Hardware: x86-64 (AMD64)
            Status: NEW
           Version: git
         Component: Other
           Product: Mesa

mesa: c1d35aece0afc2822d6d9f6c22664c04e6fcbba3 (master)

$ ./bin/ext_framebuffer_multisample-interpolation 0 centroid-edges -auto
Probe at (70,86)
  Left: 1.00 1.00 1.00 1.00
  Right: 0.00 0.00 1.00 1.00
PIGLIT: {'result': 'fail' }

728bf86a23f6de137c0871ea87b09e75e55468a9 is the first bad commit
commit 728bf86a23f6de137c0871ea87b09e75e55468a9
Author: Brian Paul
Date:   Mon Jan 21 08:59:25 2013 -0700

    swrast: move resampleRow setup code in blit_nearest()

    The resampleRow setup depends on pixelSize. For color buffers,
    we don't know the pixelSize until we're in the buffer loop. Move
    that code inside the loop.

    Fixes: http://bugs.freedesktop.org/show_bug.cgi?id=59541

    Reviewed-by: José Fonseca

:040000 040000 ab31ec7b1a500e0d1a18fed21d9aa50e1161e548 bbb140aa5922081d252b2cc4eea6f8ec113c4652 M src
bisect run success

--
You are receiving this mail because:
You are the assignee for the bug.
Re: [Mesa-dev] [PATCH] gles3: Update gl3.h
Reviewed-by: Jordan Justen

On Fri, Jan 25, 2013 at 4:07 PM, Matt Turner wrote:
> Contains a fix for Khronos bug 9557.
> ---
>  include/GLES3/gl3.h |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/include/GLES3/gl3.h b/include/GLES3/gl3.h
> index b9399e9..09f2b53 100644
> --- a/include/GLES3/gl3.h
> +++ b/include/GLES3/gl3.h
> @@ -2,7 +2,7 @@
>  #define __gl3_h_
>
>  /*
> - * gl3.h last updated on $Date: 2012-09-12 10:13:02 -0700 (Wed, 12 Sep 2012) $
> + * gl3.h last updated on $Date: 2012-10-03 07:52:40 -0700 (Wed, 03 Oct 2012) $
>   */
>
>  #include
> @@ -796,7 +796,7 @@ typedef struct __GLsync *GLsync;
>  #define GL_TEXTURE_IMMUTABLE_FORMAT 0x912F
>  #define GL_MAX_ELEMENT_INDEX 0x8D6B
>  #define GL_NUM_SAMPLE_COUNTS 0x9380
> -#define GL_TEXTURE_IMMUTABLE_LEVELS 0x8D63
> +#define GL_TEXTURE_IMMUTABLE_LEVELS 0x82DF
>
>  /*-
>   * Entrypoint definitions
> --
> 1.7.8.6
[Mesa-dev] [PATCH 2/2] gallivm, draw, llvmpipe: mass rename of unit->texture_unit/sampler_unit
From: Roland Scheidegger Make it obvious what "unit" this is (no change in functionality). draw still uses "unit" in places where it changes the shader by adding texture sampling itself - it seems like this can't work with shaders using dx10-style sample opcodes (can't mix gl-style and dx10-style sample instructions in a shader). --- src/gallium/auxiliary/draw/draw_llvm_sample.c | 32 - src/gallium/auxiliary/gallivm/lp_bld_sample.c | 18 +++--- src/gallium/auxiliary/gallivm/lp_bld_sample.h | 32 - src/gallium/auxiliary/gallivm/lp_bld_sample_aos.c |2 +- src/gallium/auxiliary/gallivm/lp_bld_sample_aos.h |2 +- src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c | 72 ++--- src/gallium/drivers/llvmpipe/lp_tex_sample.c | 32 - 7 files changed, 95 insertions(+), 95 deletions(-) diff --git a/src/gallium/auxiliary/draw/draw_llvm_sample.c b/src/gallium/auxiliary/draw/draw_llvm_sample.c index ac1c031..3f866d4 100644 --- a/src/gallium/auxiliary/draw/draw_llvm_sample.c +++ b/src/gallium/auxiliary/draw/draw_llvm_sample.c @@ -86,7 +86,7 @@ struct draw_llvm_sampler_soa static LLVMValueRef draw_llvm_texture_member(const struct lp_sampler_dynamic_state *base, struct gallivm_state *gallivm, - unsigned unit, + unsigned texture_unit, unsigned member_index, const char *member_name, boolean emit_load) @@ -98,14 +98,14 @@ draw_llvm_texture_member(const struct lp_sampler_dynamic_state *base, LLVMValueRef ptr; LLVMValueRef res; - debug_assert(unit < PIPE_MAX_SHADER_SAMPLER_VIEWS); + debug_assert(texture_unit < PIPE_MAX_SHADER_SAMPLER_VIEWS); /* context[0] */ indices[0] = lp_build_const_int32(gallivm, 0); /* context[0].textures */ indices[1] = lp_build_const_int32(gallivm, DRAW_JIT_CTX_TEXTURES); /* context[0].textures[unit] */ - indices[2] = lp_build_const_int32(gallivm, unit); + indices[2] = lp_build_const_int32(gallivm, texture_unit); /* context[0].textures[unit].member */ indices[3] = lp_build_const_int32(gallivm, member_index); @@ -116,7 +116,7 @@ draw_llvm_texture_member(const struct 
lp_sampler_dynamic_state *base, else res = ptr; - lp_build_name(res, "context.texture%u.%s", unit, member_name); + lp_build_name(res, "context.texture%u.%s", texture_unit, member_name); return res; } @@ -133,7 +133,7 @@ draw_llvm_texture_member(const struct lp_sampler_dynamic_state *base, static LLVMValueRef draw_llvm_sampler_member(const struct lp_sampler_dynamic_state *base, struct gallivm_state *gallivm, - unsigned unit, + unsigned sampler_unit, unsigned member_index, const char *member_name, boolean emit_load) @@ -145,14 +145,14 @@ draw_llvm_sampler_member(const struct lp_sampler_dynamic_state *base, LLVMValueRef ptr; LLVMValueRef res; - debug_assert(unit < PIPE_MAX_SAMPLERS); + debug_assert(sampler_unit < PIPE_MAX_SAMPLERS); /* context[0] */ indices[0] = lp_build_const_int32(gallivm, 0); /* context[0].samplers */ indices[1] = lp_build_const_int32(gallivm, DRAW_JIT_CTX_SAMPLERS); /* context[0].samplers[unit] */ - indices[2] = lp_build_const_int32(gallivm, unit); + indices[2] = lp_build_const_int32(gallivm, sampler_unit); /* context[0].samplers[unit].member */ indices[3] = lp_build_const_int32(gallivm, member_index); @@ -163,7 +163,7 @@ draw_llvm_sampler_member(const struct lp_sampler_dynamic_state *base, else res = ptr; - lp_build_name(res, "context.sampler%u.%s", unit, member_name); + lp_build_name(res, "context.sampler%u.%s", sampler_unit, member_name); return res; } @@ -182,9 +182,9 @@ draw_llvm_sampler_member(const struct lp_sampler_dynamic_state *base, static LLVMValueRef \ draw_llvm_texture_##_name( const struct lp_sampler_dynamic_state *base, \ struct gallivm_state *gallivm, \ - unsigned unit)\ + unsigned texture_unit) \ { \ - return draw_llvm_texture_member(base, gallivm, unit, _index, #_name, _emit_load ); \ + return draw_llvm_texture_member(base, gallivm, texture_unit, _index, #_name, _emit_load ); \ } @@ -203,9 +203,9 @@ DRAW_LLVM_TEXTURE_MEMBER(mip_offsets, DRAW_JIT_TEXTURE_MIP_OFFSETS, FALSE) static LLVMValueRef \ draw_llvm_sampler_##_name( const 
struct lp_sampler_dynamic_state *base, \ struct gallivm_state *gallivm, \ - unsigned unit)\ + unsigned sampler_unit)
[Mesa-dev] [Bug 59872] New: [swrast] piglit depth_texture_mode_and_swizzle regression
https://bugs.freedesktop.org/show_bug.cgi?id=59872

          Priority: medium
            Bug ID: 59872
          Keywords: regression
                CC: cwo...@cworth.org
          Assignee: mesa-dev@lists.freedesktop.org
           Summary: [swrast] piglit depth_texture_mode_and_swizzle regression
          Severity: normal
    Classification: Unclassified
                OS: Linux (All)
          Reporter: v...@freedesktop.org
          Hardware: x86-64 (AMD64)
            Status: NEW
           Version: git
         Component: Mesa core
           Product: Mesa

mesa: c1d35aece0afc2822d6d9f6c22664c04e6fcbba3 (master)

$ ./bin/depth_texture_mode_and_swizzle -auto
Probe at (10,10)
  Expected: 0.50 0.50 0.50 0.50
  Observed: 0.501961 0.501961 0.501961 1.00
Probe at (30,10)
  Expected: 1.00 0.50 0.50 0.50
  Observed: 1.00 0.501961 0.501961 1.00
Probe at (130,10)
  Expected: 0.00 0.00 0.00 0.50
  Observed: 0.00 0.00 0.00 1.00
Probe at (150,10)
  Expected: 1.00 0.00 0.50 0.00
  Observed: 1.00 0.00 0.501961 1.00
PIGLIT: {'result': 'fail' }

570ed2be7d776211e1ca2a7a4c44ee6a1d141714 is the first bad commit
commit 570ed2be7d776211e1ca2a7a4c44ee6a1d141714
Author: Carl Worth
Date:   Mon Jan 21 12:16:27 2013 -0800

    ReadPixels: Force ALPHA to 1 while rebasing RGBA values for GL_RGB format

    When performing a ReadPixels operation, we may be reading from a buffer
    that stores alpha values, but that is actually representing a buffer
    with no alpha channel. In this case, while rebasing the values, touch
    up all alpha values read to 1.0.

    This commit fixes the following piglit (sub) tests:

        ARB_texture_float/fbo-colormask-formats
            GL_RGB16F_ARB
        EXT_texture_snorm/fbo-colormask-formats
            GL_RGB16_SNORM
            GL_RGB8_SNORM
            GL_RGB_SNORM

    It likely improves the results of other tests as well, but a PASS
    remains elusive due to additional bugs.

    Reviewed-by: Brian Paul
    Reviewed-by: Anuj Phogat

:04 04 144369a7d3779929bad84beca8f3a5b2ccf90640 c25eb37e73f6f6e5435230fe8a799b1b62ed347b M src
bisect run success
Re: [Mesa-dev] [PATCH] i965: Compile the driver with -march=core2.
On Thu, Jan 24, 2013 at 7:33 PM, Eric Anholt wrote:
> While most of our development and testing is on x86-64, some of our
> major consumers of the driver are on i386 still. This means they aren't
> taking advantage of SSE for floating point math or cmov instructions,
> unless the user went out of their way to choose a -march flag
> (unlikely). Given that the driver can only get probed on i965 and newer
> chipsets, which only support core2 and above CPUs, this is safe.
>
> Improves (32-bit) GLbenchmark 2.1 offscreen performance by .76 +/- 0.35%
> (n=19)
> ---
>  configure.ac                          | 17 +
>  src/mesa/drivers/dri/i965/Makefile.am |  3 ++-
>  2 files changed, 19 insertions(+), 1 deletion(-)
>
> diff --git a/configure.ac b/configure.ac
> index e769eda..0af3176 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -492,6 +492,23 @@ if test "x$enable_asm" = xyes; then
>  fi
>  AC_SUBST([MESA_ASM_FILES])
>
> +# If the user hasn't set an explicit -march flag, then autodetect a few for
> +# use by the i965 driver.
> +if echo $CFLAGS | grep -v march > /dev/null; then
> +    case "$host_cpu" in
> +    i?86 | x86_64)
> +        save_CFLAGS="$CFLAGS"
> +        AC_MSG_CHECKING([whether $CC supports -march=core2])
> +        CFLAGS="$save_CFLAGS -march=core2"
> +        AC_COMPILE_IFELSE([AC_LANG_PROGRAM([], [[]])],
> +                          [AC_MSG_RESULT([yes]); MARCH_CORE2="-march=core2"],
> +                          [AC_MSG_RESULT([no]); MARCH_CORE2=""])
> +        CFLAGS="$save_CFLAGS"
> +        ;;
> +    esac
> +fi
> +AC_SUBST([MARCH_CORE2])
> +
>  dnl Check to see if dlopen is in default libraries (like Solaris, which
>  dnl has it in libc), or if libdl is needed to get it.
>  AC_CHECK_FUNC([dlopen], [DEFINES="$DEFINES -DHAVE_DLOPEN"],
> diff --git a/src/mesa/drivers/dri/i965/Makefile.am b/src/mesa/drivers/dri/i965/Makefile.am
> index dc140df..d5d0631 100644
> --- a/src/mesa/drivers/dri/i965/Makefile.am
> +++ b/src/mesa/drivers/dri/i965/Makefile.am
> @@ -38,7 +38,8 @@ AM_CFLAGS = \
>  	$(DEFINES) \
>  	$(API_DEFINES) \
>  	$(VISIBILITY_CFLAGS) \
> -	$(INTEL_CFLAGS)
> +	$(INTEL_CFLAGS) \
> +	$(MARCH_CORE2)
>
>  AM_CXXFLAGS = $(AM_CFLAGS)
>
> --
> 1.7.10.4

Nice.

Reviewed-by: Matt Turner
[Mesa-dev] [PATCH] gles3: Update gl3.h
Contains a fix for Khronos bug 9557.
---
 include/GLES3/gl3.h | 4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/GLES3/gl3.h b/include/GLES3/gl3.h
index b9399e9..09f2b53 100644
--- a/include/GLES3/gl3.h
+++ b/include/GLES3/gl3.h
@@ -2,7 +2,7 @@
 #define __gl3_h_
 
 /*
- * gl3.h last updated on $Date: 2012-09-12 10:13:02 -0700 (Wed, 12 Sep 2012) $
+ * gl3.h last updated on $Date: 2012-10-03 07:52:40 -0700 (Wed, 03 Oct 2012) $
  */
 
 #include
@@ -796,7 +796,7 @@ typedef struct __GLsync *GLsync;
 #define GL_TEXTURE_IMMUTABLE_FORMAT 0x912F
 #define GL_MAX_ELEMENT_INDEX 0x8D6B
 #define GL_NUM_SAMPLE_COUNTS 0x9380
-#define GL_TEXTURE_IMMUTABLE_LEVELS 0x8D63
+#define GL_TEXTURE_IMMUTABLE_LEVELS 0x82DF
 
 /*-
  * Entrypoint definitions
-- 
1.7.8.6
[Mesa-dev] [PATCH] r600g: fix up CP DMA for VM on cayman and TN
From: Alex Deucher Need to add the virtual address. Signed-off-by: Alex Deucher --- src/gallium/drivers/r600/r600.h|4 ++-- src/gallium/drivers/r600/r600_hw_context.c | 11 +++ 2 files changed, 9 insertions(+), 6 deletions(-) diff --git a/src/gallium/drivers/r600/r600.h b/src/gallium/drivers/r600/r600.h index 93604fb..06e914f 100644 --- a/src/gallium/drivers/r600/r600.h +++ b/src/gallium/drivers/r600/r600.h @@ -172,8 +172,8 @@ void r600_context_streamout_end(struct r600_context *ctx); void r600_need_cs_space(struct r600_context *ctx, unsigned num_dw, boolean count_draw_in); void r600_context_block_emit_dirty(struct r600_context *ctx, struct r600_block *block, unsigned pkt_flags); void r600_cp_dma_copy_buffer(struct r600_context *rctx, -struct pipe_resource *dst, unsigned dst_offset, -struct pipe_resource *src, unsigned src_offset, +struct pipe_resource *dst, unsigned long dst_offset, +struct pipe_resource *src, unsigned long src_offset, unsigned size); int evergreen_context_init(struct r600_context *ctx); diff --git a/src/gallium/drivers/r600/r600_hw_context.c b/src/gallium/drivers/r600/r600_hw_context.c index caebf5c..e13b502 100644 --- a/src/gallium/drivers/r600/r600_hw_context.c +++ b/src/gallium/drivers/r600/r600_hw_context.c @@ -1065,8 +1065,8 @@ void r600_context_streamout_end(struct r600_context *ctx) #define CP_DMA_MAX_BYTE_COUNT ((1 << 21) - 8) void r600_cp_dma_copy_buffer(struct r600_context *rctx, -struct pipe_resource *dst, unsigned dst_offset, -struct pipe_resource *src, unsigned src_offset, +struct pipe_resource *dst, unsigned long dst_offset, +struct pipe_resource *src, unsigned long src_offset, unsigned size) { struct radeon_winsys_cs *cs = rctx->cs; @@ -1079,6 +1079,9 @@ void r600_cp_dma_copy_buffer(struct r600_context *rctx, return; } + dst_offset += r600_resource_va(&rctx->screen->screen, dst); + src_offset += r600_resource_va(&rctx->screen->screen, src); + /* We flush the caches, because we might read from or write * to resources which are bound 
right now. */ rctx->flags |= R600_CONTEXT_INVAL_READ_CACHES | @@ -1112,9 +1115,9 @@ void r600_cp_dma_copy_buffer(struct r600_context *rctx, r600_write_value(cs, PKT3(PKT3_CP_DMA, 4, 0)); r600_write_value(cs, src_offset); /* SRC_ADDR_LO [31:0] */ - r600_write_value(cs, sync); /* CP_SYNC [31] | SRC_ADDR_HI [7:0] */ + r600_write_value(cs, sync | ((src_offset >> 32) & 0xff)); /* CP_SYNC [31] | SRC_ADDR_HI [7:0] */ r600_write_value(cs, dst_offset); /* DST_ADDR_LO [31:0] */ - r600_write_value(cs, 0);/* DST_ADDR_HI [7:0] */ + r600_write_value(cs, (dst_offset >> 32) & 0xff); /* DST_ADDR_HI [7:0] */ r600_write_value(cs, byte_count); /* COMMAND [29:22] | BYTE_COUNT [20:0] */ r600_write_value(cs, PKT3(PKT3_NOP, 0, 0)); -- 1.7.7.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
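The address handling in the patch above can be sketched in isolation. This is a minimal illustration, not the r600g code: `struct addr_dwords`, `split_va` and `src_hi_dword` are hypothetical names, and the sketch assumes a 40-bit virtual address as implied by the `ADDR_HI [7:0]` comments. It uses `uint64_t` rather than `unsigned long` because `unsigned long` may be only 32 bits wide on some ABIs, where a shift by 32 is not well defined; that choice is an assumption of this sketch, not something the patch states.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the two address dwords of one CP DMA operand. */
struct addr_dwords {
    uint32_t lo;  /* ADDR_LO [31:0] */
    uint32_t hi;  /* ADDR_HI [7:0], upper bits left clear */
};

/* Split a virtual address into the low dword and the 8 high bits. */
static struct addr_dwords split_va(uint64_t va)
{
    struct addr_dwords d;

    assert((va >> 40) == 0);              /* the HI field is only 8 bits wide */
    d.lo = (uint32_t)(va & 0xffffffffu);
    d.hi = (uint32_t)((va >> 32) & 0xff);
    return d;
}

/* The source high dword also carries the CP_SYNC flag in bit 31. */
static uint32_t src_hi_dword(uint64_t src_va, int sync)
{
    uint32_t cp_sync = sync ? (1u << 31) : 0;

    return cp_sync | split_va(src_va).hi;
}
```

With the resource's virtual address already added to the offset, the four address dwords of the packet would then be `split_va(src).lo`, `src_hi_dword(src, sync)`, `split_va(dst).lo` and `split_va(dst).hi`, in that order.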
[Mesa-dev] Trying MSAAx2 (r300g) on RS690/AMD Radeon X1200 128MB
When trying glxgears the screen locks up, and SSH eventually stops
responding as well, but I was able to get these messages from kern.log:

[ 790.516059] radeon :01:05.0: >GPU lockup CP stall for more than 1msec
[ 790.516076] radeon :01:05.0: >GPU lockup (waiting for 0x215b last fence id 0x2157)
[ 790.664495] radeon: wait for empty RBBM fifo failed ! Bad things might happen.
[ 790.793829] Failed to wait GUI idle while programming pipes. Bad things might happen.
[ 790.794831] radeon :01:05.0: >(rs600_asic_reset:357) RBBM_STATUS=0x9411C100
[ 791.292885] radeon :01:05.0: >(rs600_asic_reset:377) RBBM_STATUS=0x9401C100
[ 791.789934] radeon :01:05.0: >(rs600_asic_reset:385) RBBM_STATUS=0x9400C100

Testing as requested in commit 8ed6b1400.

Using the oibaf PPA on a Quantal Lubuntu LiveUSB. I just upgraded mesa
from the oibaf repo this second time, but it still crashed when I did
everything (Xorg/libdrm).

Unfortunately, I can only do live testing on this machine, and using the
persistence file didn't seem to work with the big Xorg changes required,
so upgrading mesa will be done fresh for every test.

Thanks,
Bryan
[Mesa-dev] [PATCH] R600: Fold remaining CONST_COPY after expand pseudo inst
--- lib/Target/R600/AMDGPUTargetMachine.cpp | 2 +- lib/Target/R600/R600LowerConstCopy.cpp | 170 +--- 2 files changed, 160 insertions(+), 12 deletions(-) diff --git a/lib/Target/R600/AMDGPUTargetMachine.cpp b/lib/Target/R600/AMDGPUTargetMachine.cpp index 7b069e7..2185be3 100644 --- a/lib/Target/R600/AMDGPUTargetMachine.cpp +++ b/lib/Target/R600/AMDGPUTargetMachine.cpp @@ -136,8 +136,8 @@ bool AMDGPUPassConfig::addPreEmitPass() { addPass(createAMDGPUCFGPreparationPass(*TM)); addPass(createAMDGPUCFGStructurizerPass(*TM)); addPass(createR600ExpandSpecialInstrsPass(*TM)); -addPass(createR600LowerConstCopy(*TM)); addPass(&FinalizeMachineBundlesID); +addPass(createR600LowerConstCopy(*TM)); } else { addPass(createSILowerLiteralConstantsPass(*TM)); addPass(createSILowerControlFlowPass(*TM)); diff --git a/lib/Target/R600/R600LowerConstCopy.cpp b/lib/Target/R600/R600LowerConstCopy.cpp index d14ae20..2557e8f 100644 --- a/lib/Target/R600/R600LowerConstCopy.cpp +++ b/lib/Target/R600/R600LowerConstCopy.cpp @@ -13,7 +13,6 @@ /// fold them inside vector instruction, like DOT4 or Cube ; ISel emits /// ConstCopy instead. This pass (executed after ExpandingSpecialInstr) will try /// to fold them if possible or replace them by MOV otherwise. -/// TODO : Implement the folding part, using Copy Propagation algorithm. 
// //===--===// @@ -30,6 +29,13 @@ class R600LowerConstCopy : public MachineFunctionPass { private: static char ID; const R600InstrInfo *TII; + + struct ConstPairs { +unsigned XYPair; +unsigned ZWPair; + }; + + bool canFoldInBundle(ConstPairs &UsedConst, unsigned ReadConst) const; public: R600LowerConstCopy(TargetMachine &tm); virtual bool runOnMachineFunction(MachineFunction &MF); @@ -39,27 +45,169 @@ public: char R600LowerConstCopy::ID = 0; - R600LowerConstCopy::R600LowerConstCopy(TargetMachine &tm) : MachineFunctionPass(ID), TII (static_cast(tm.getInstrInfo())) { } +bool R600LowerConstCopy::canFoldInBundle(ConstPairs &UsedConst, +unsigned ReadConst) const { + unsigned ReadConstChan = ReadConst & 3; + unsigned ReadConstIndex = ReadConst & (~3); + if (ReadConstChan < 2) { +if (!UsedConst.XYPair) { + UsedConst.XYPair = ReadConstIndex; +} +return UsedConst.XYPair == ReadConstIndex; + } else { +if (!UsedConst.ZWPair) { + UsedConst.ZWPair = ReadConstIndex; +} +return UsedConst.ZWPair == ReadConstIndex; + } +} + +static bool isControlFlow(const MachineInstr &MI) { + return (MI.getOpcode() == AMDGPU::IF_PREDICATE_SET) || + (MI.getOpcode() == AMDGPU::ENDIF) || + (MI.getOpcode() == AMDGPU::ELSE) || + (MI.getOpcode() == AMDGPU::WHILELOOP) || + (MI.getOpcode() == AMDGPU::BREAK); +} + bool R600LowerConstCopy::runOnMachineFunction(MachineFunction &MF) { + for (MachineFunction::iterator BB = MF.begin(), BB_E = MF.end(); BB != BB_E; ++BB) { MachineBasicBlock &MBB = *BB; -for (MachineBasicBlock::iterator I = MBB.begin(), E = MBB.end(); - I != E;) { - MachineInstr &MI = *I; - I = llvm::next(I); - if (MI.getOpcode() != AMDGPU::CONST_COPY) +DenseMap RegToConstIndex; +for (MachineBasicBlock::instr_iterator I = MBB.instr_begin(), +E = MBB.instr_end(); I != E;) { + + if (I->getOpcode() == AMDGPU::CONST_COPY) { +MachineInstr &MI = *I; +I = llvm::next(I); +unsigned DstReg = MI.getOperand(0).getReg(); +DenseMap::iterator SrcMI = +RegToConstIndex.find(DstReg); +if (SrcMI != 
RegToConstIndex.end()) { + SrcMI->second->eraseFromParent(); + RegToConstIndex.erase(SrcMI); +} +MachineInstr *NewMI = +TII->buildDefaultInstruction(MBB, &MI, AMDGPU::MOV, +MI.getOperand(0).getReg(), AMDGPU::ALU_CONST); +TII->setImmOperand(NewMI, R600Operands::SRC0_SEL, +MI.getOperand(1).getImm()); +RegToConstIndex[DstReg] = NewMI; +MI.eraseFromParent(); continue; - MachineInstr *NewMI = TII->buildDefaultInstruction(MBB, I, AMDGPU::MOV, - MI.getOperand(0).getReg(), AMDGPU::ALU_CONST); - NewMI->getOperand(9).setImm(MI.getOperand(1).getImm()); - MI.eraseFromParent(); + } + + std::vector Defs; + // We consider all Instructions as bundled because algorithm that handle + // const read port limitations inside an IG is still valid with single + // instructions. + std::vector Bundle; + + if (I->isBundle()) { +unsigned BundleSize = I->getBundleSize(); +for (unsigned i = 0; i < BundleSize; i++) { + I = llvm::next(I); + Bundle.push_back(I); +} + } else if (TII->isALUInstr(I->getOpcode())){ +Bundle.push_back(I); + } el
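The read-port check that `canFoldInBundle` performs in the patch above can be tried out on its own. The following is a plain C rendering for illustration only (the pass itself is C++ inside LLVM); `struct const_pairs`, `can_fold_in_bundle` and `count_foldable` are hypothetical names. As in the patch, a stored value of 0 is treated as "no constant claimed yet".

```c
#include <stddef.h>

/* Constant indices already claimed by the current bundle; 0 means unset,
 * mirroring the patch. */
struct const_pairs {
    unsigned xy_pair;
    unsigned zw_pair;
};

/* Returns 1 if read_const can be folded into the bundle described by used. */
static int can_fold_in_bundle(struct const_pairs *used, unsigned read_const)
{
    unsigned chan  = read_const & 3;     /* 0..3 select x, y, z, w */
    unsigned index = read_const & ~3u;   /* selector with channel bits cleared */
    unsigned *slot = (chan < 2) ? &used->xy_pair : &used->zw_pair;

    if (!*slot)
        *slot = index;                   /* first read claims the pair */
    return *slot == index;               /* later reads must match it */
}

/* Counts how many of the n reads could be folded into a single bundle. */
static size_t count_foldable(const unsigned *reads, size_t n)
{
    struct const_pairs used = {0, 0};
    size_t i, ok = 0;

    for (i = 0; i < n; i++)
        ok += (size_t)can_fold_in_bundle(&used, reads[i]);
    return ok;
}
```

For the read sequence 4, 5, 8, 10, 7 this accepts 4 and 5 (same XY pair), rejects 8 (a different XY pair), accepts 10 (first ZW pair) and rejects 7 (a different ZW pair).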
Re: [Mesa-dev] [PATCH] i965: Compile the driver with -march=core2.
I'm quite sure there are g965 boards around which indeed support Pentium
4 (and P4-based Celerons) (but yes I guess cmov and at least sse2 are
safe - not that the p4 had a usable cmov implementation as it was
incredibly slow IIRC but it should at least work).

Roland

On 25.01.2013 04:33, Eric Anholt wrote:
> While most of our development and testing is on x86-64, some of our
> major consumers of the driver are on i386 still. This means they aren't
> taking advantage of SSE for floating point math or cmov instructions,
> unless the user went out of their way to choose a -march flag
> (unlikely). Given that the driver can only get probed on i965 and newer
> chipsets, which only support core2 and above CPUs, this is safe.
>
> Improves (32-bit) GLbenchmark 2.1 offscreen performance by .76 +/- 0.35%
> (n=19)
> ---
>  configure.ac                          | 17 +
>  src/mesa/drivers/dri/i965/Makefile.am |  3 ++-
>  2 files changed, 19 insertions(+), 1 deletion(-)
>
> diff --git a/configure.ac b/configure.ac
> index e769eda..0af3176 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -492,6 +492,23 @@ if test "x$enable_asm" = xyes; then
>  fi
>  AC_SUBST([MESA_ASM_FILES])
>
> +# If the user hasn't set an explicit -march flag, then autodetect a few for
> +# use by the i965 driver.
> +if echo $CFLAGS | grep -v march > /dev/null; then
> +    case "$host_cpu" in
> +    i?86 | x86_64)
> +        save_CFLAGS="$CFLAGS"
> +        AC_MSG_CHECKING([whether $CC supports -march=core2])
> +        CFLAGS="$save_CFLAGS -march=core2"
> +        AC_COMPILE_IFELSE([AC_LANG_PROGRAM([], [[]])],
> +                          [AC_MSG_RESULT([yes]); MARCH_CORE2="-march=core2"],
> +                          [AC_MSG_RESULT([no]); MARCH_CORE2=""])
> +        CFLAGS="$save_CFLAGS"
> +        ;;
> +    esac
> +fi
> +AC_SUBST([MARCH_CORE2])
> +
>  dnl Check to see if dlopen is in default libraries (like Solaris, which
>  dnl has it in libc), or if libdl is needed to get it.
>  AC_CHECK_FUNC([dlopen], [DEFINES="$DEFINES -DHAVE_DLOPEN"],
> diff --git a/src/mesa/drivers/dri/i965/Makefile.am b/src/mesa/drivers/dri/i965/Makefile.am
> index dc140df..d5d0631 100644
> --- a/src/mesa/drivers/dri/i965/Makefile.am
> +++ b/src/mesa/drivers/dri/i965/Makefile.am
> @@ -38,7 +38,8 @@ AM_CFLAGS = \
>  	$(DEFINES) \
>  	$(API_DEFINES) \
>  	$(VISIBILITY_CFLAGS) \
> -	$(INTEL_CFLAGS)
> +	$(INTEL_CFLAGS) \
> +	$(MARCH_CORE2)
>
>  AM_CXXFLAGS = $(AM_CFLAGS)
>
[Mesa-dev] [PATCH] docs: List new extensions added in Mesa 9.1
I did not list the *_get_program_binary extensions since they're not
useful to anyone with their current implementation (that supports 0
binary formats).
---
We should also write something about ES3 and the float-texture & S3TC
changes.

 docs/relnotes-9.1.html | 12 +++-
 1 files changed, 11 insertions(+), 1 deletions(-)

diff --git a/docs/relnotes-9.1.html b/docs/relnotes-9.1.html
index ffca275..14e6c02 100644
--- a/docs/relnotes-9.1.html
+++ b/docs/relnotes-9.1.html
@@ -44,9 +44,19 @@ Note: some of the new features are only available with certain drivers.
 
+GL_ANGLE_texture_compression_dxt3
+GL_ANGLE_texture_compression_dxt5
+GL_ARB_base_instance
+GL_ARB_ES3_compatibility
+GL_ARB_internalformat_query
 GL_ARB_map_buffer_alignment
-GL_ARB_texture_cube_map_array
+GL_ARB_shading_language_packing
 GL_ARB_texture_buffer_object_rgb32
+GL_ARB_texture_cube_map_array
+GL_ARB_vertex_type_2_10_10_10_rev
+GL_EXT_color_buffer_float
+GL_OES_depth_texture_cube_map
+GL_OES_standard_derivatives
 
-- 
1.7.8.6
[Mesa-dev] [Bug 59851] AC_ARG_WITH misusage leading to mesa configure failure
https://bugs.freedesktop.org/show_bug.cgi?id=59851

Matt Turner changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tstel...@gmail.com
Re: [Mesa-dev] [PATCH 0/8] ARB_shading_language_packing
On Thu, Jan 24, 2013 at 7:44 PM, Matt Turner wrote:
> Following this email are eight patches that add the 4x8 pack/unpack
> operations that are the difference between what GLSL ES 3.0 and
> ARB_shading_language_packing require.
>
> They require Chad's gles3-glsl-packing series and are available at
> http://cgit.freedesktop.org/~mattst88/mesa/log/?h=ARB_shading_language_packing
>
> I've also added testing support on top of Chad's piglit patch. The
> {vs,fs}-unpackUnorm4x8 tests currently fail, and I've been unable to
> spot why.
>
> Please give it a look. It'd be nice to get this into 9.1.
>
> Thanks,
> Matt

Thanks for all the review comments. I've fixed the problems spotted and
pushed. The piglit patch is getting a second look-over before it's pushed.
[Mesa-dev] [PATCH 3/4] r600g: add async for staging buffer upload v2
From: Jerome Glisse v2: Add virtual address to dma src/dst offset for cayman Signed-off-by: Jerome Glisse --- src/gallium/drivers/r600/evergreen_hw_context.c | 46 ++ src/gallium/drivers/r600/evergreen_state.c | 201 src/gallium/drivers/r600/evergreend.h | 15 ++ src/gallium/drivers/r600/r600.h | 27 src/gallium/drivers/r600/r600_buffer.c | 25 ++- src/gallium/drivers/r600/r600_hw_context.c | 48 +- src/gallium/drivers/r600/r600_pipe.c| 6 +- src/gallium/drivers/r600/r600_pipe.h| 9 ++ src/gallium/drivers/r600/r600_state.c | 190 ++ src/gallium/drivers/r600/r600_state_common.c| 6 +- src/gallium/drivers/r600/r600_texture.c | 24 ++- src/gallium/drivers/r600/r600d.h| 15 ++ 12 files changed, 595 insertions(+), 17 deletions(-) diff --git a/src/gallium/drivers/r600/evergreen_hw_context.c b/src/gallium/drivers/r600/evergreen_hw_context.c index fa90c9a..ca4f4b3 100644 --- a/src/gallium/drivers/r600/evergreen_hw_context.c +++ b/src/gallium/drivers/r600/evergreen_hw_context.c @@ -26,6 +26,7 @@ #include "r600_hw_context_priv.h" #include "evergreend.h" #include "util/u_memory.h" +#include "util/u_math.h" static const struct r600_reg cayman_config_reg_list[] = { {R_009100_SPI_CONFIG_CNTL, REG_FLAG_ENABLE_ALWAYS | REG_FLAG_FLUSH_CHANGE, 0}, @@ -238,3 +239,48 @@ void evergreen_set_streamout_enable(struct r600_context *ctx, unsigned buffer_en r600_write_context_reg(cs, R_028B94_VGT_STRMOUT_CONFIG, S_028B94_STREAMOUT_0_EN(0)); } } + +void evergreen_dma_copy(struct r600_context *rctx, + struct pipe_resource *dst, + struct pipe_resource *src, + unsigned long dst_offset, + unsigned long src_offset, + unsigned long size) +{ + struct radeon_winsys_cs *cs = rctx->rings.dma.cs; + unsigned i, ncopy, csize, sub_cmd, shift; + struct r600_resource *rdst = (struct r600_resource*)dst; + struct r600_resource *rsrc = (struct r600_resource*)src; + + /* make sure that the dma ring is only one active */ + rctx->rings.gfx.flush(rctx, RADEON_FLUSH_ASYNC); + dst_offset += r600_resource_va(&rctx->screen->screen, 
dst); + src_offset += r600_resource_va(&rctx->screen->screen, src); + + /* see if we use dword or byte copy */ + if (!(dst_offset & 0x3) && !(src_offset & 0x3) && !(size & 0x3)) { + size >>= 2; + sub_cmd = 0x00; + shift = 2; + } else { + sub_cmd = 0x40; + shift = 0; + } + ncopy = (size / 0x000f) + !!(size % 0x000f); + + r600_need_dma_space(rctx, ncopy * 5); + for (i = 0; i < ncopy; i++) { + csize = size < 0x000f ? size : 0x000f; + /* emit reloc before writting cs so that cs is always in consistent state */ + r600_context_bo_reloc(rctx, &rctx->rings.dma, rsrc, RADEON_USAGE_READ); + r600_context_bo_reloc(rctx, &rctx->rings.dma, rdst, RADEON_USAGE_WRITE); + cs->buf[cs->cdw++] = DMA_PACKET(DMA_PACKET_COPY, sub_cmd, csize); + cs->buf[cs->cdw++] = dst_offset & 0x; + cs->buf[cs->cdw++] = src_offset & 0x; + cs->buf[cs->cdw++] = (dst_offset >> 32UL) & 0xff; + cs->buf[cs->cdw++] = (src_offset >> 32UL) & 0xff; + dst_offset += csize << shift; + src_offset += csize << shift; + size -= csize; + } +} diff --git a/src/gallium/drivers/r600/evergreen_state.c b/src/gallium/drivers/r600/evergreen_state.c index 86e2c81..5c22e24 100644 --- a/src/gallium/drivers/r600/evergreen_state.c +++ b/src/gallium/drivers/r600/evergreen_state.c @@ -30,6 +30,20 @@ #include "util/u_framebuffer.h" #include "util/u_dual_blend.h" #include "evergreen_compute.h" +#include "util/u_math.h" + +static INLINE unsigned evergreen_array_mode(unsigned mode) +{ + switch (mode) { + case RADEON_SURF_MODE_LINEAR_ALIGNED: return V_028C70_ARRAY_LINEAR_ALIGNED; + break; + case RADEON_SURF_MODE_1D: return V_028C70_ARRAY_1D_TILED_THIN1; + break; + case RADEON_SURF_MODE_2D: return V_028C70_ARRAY_2D_TILED_THIN1; + default: + case RADEON_SURF_MODE_LINEAR: return V_028C70_ARRAY_LINEAR_GENERAL; + } +} static uint32_t eg_num_banks(uint32_t nbanks) { @@ -3445,3 +3459,190 @@ void evergreen_update_db_shader_control(struct r600_context * rctx) rctx->db_misc_state.atom.dirty = true; } } + +static void evergreen_dma_copy_tile(struct 
r600_context *rctx, + struct pipe_resource *dst, + unsigned dst_level, + unsigned dst
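The chunking arithmetic in `evergreen_dma_copy` above can be sketched on its own. This is a minimal sketch under stated assumptions: `MAX_UNITS_PER_PACKET` is a hypothetical limit (the mask constants in the archived diff are garbled, so the real value is not reproduced here), and `copy_shift`, `num_packets` and `bytes_covered` are illustrative helper names, not r600g functions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-packet limit, in copy units; not the hardware value. */
#define MAX_UNITS_PER_PACKET 0xffffu

/* Shift of 2 selects dword units when everything is 4-byte aligned, else bytes. */
static unsigned copy_shift(uint64_t dst, uint64_t src, uint64_t size)
{
    return (!(dst & 0x3) && !(src & 0x3) && !(size & 0x3)) ? 2 : 0;
}

/* Ceiling division: number of packets needed for the given copy units. */
static uint64_t num_packets(uint64_t units)
{
    return units / MAX_UNITS_PER_PACKET + !!(units % MAX_UNITS_PER_PACKET);
}

/* Walks the packets the way the loop in the patch does; returns bytes covered. */
static uint64_t bytes_covered(uint64_t dst, uint64_t src, uint64_t size)
{
    unsigned shift = copy_shift(dst, src, size);
    uint64_t units = size >> shift;
    uint64_t n = num_packets(units);
    uint64_t done = 0, i;

    for (i = 0; i < n; i++) {
        uint64_t c = units < MAX_UNITS_PER_PACKET ? units : MAX_UNITS_PER_PACKET;

        done += c << shift;   /* offsets advance in bytes, sizes count units */
        units -= c;
    }
    assert(units == 0);       /* every unit was assigned to some packet */
    return done;
}
```

The `!!(units % MAX)` term is what rounds the packet count up, so a copy that is an exact multiple of the limit does not get an extra empty packet.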
Re: [Mesa-dev] [PATCH] intel: Use a CPU map of the batch on LLC-sharing architectures.
On 01/25/2013 09:31 AM, Eric Anholt wrote:
> Kenneth Graunke writes:
>> On 01/20/2013 02:59 PM, Eric Anholt wrote:
>>> Before, we were keeping a CPU-only buffer to accumulate the batchbuffer
>>> in, which was an improvement over mapping the batch through the GTT
>>> directly (since any readback or other failure to stream through write
>>> combining correctly would hurt). However, on LLC-sharing architectures
>>> we can do better by mapping the batch directly, which reduces the cache
>>> footprint of the application since we no longer have this extra copy of
>>> a batchbuffer around.
>>>
>>> Improves performance of GLBenchmark 2.1 offscreen on IVB by 3.5% +/-
>>> 0.4% (n=21). Improves Lightsmark performance by 1.1 +/- 0.1% (n=76).
>>> Improves cairo-gl performance by 1.9% +/- 1.4% (n=57).
>>>
>>> No statistically significant difference in GLB2.1 on SNB (n=37).
>>> Improves cairo-gl performance by 2.1% +/- 0.1% (n=278).
>>
>> Looks good to me. Have you tested this on a non-LLC machine?
>
> Not in a long time. It shouldn't affect performance, since they get the
> same behavior as before.

Okay. I mashed has_llc to false and ran glxgears, which was proof enough
that the non-LLC path still works. I figured it did, but just wanted to
check.
Re: [Mesa-dev] [PATCH 0/8] ARB_shading_language_packing
On 25 January 2013 13:18, Matt Turner wrote:

> On Fri, Jan 25, 2013 at 9:55 AM, Paul Berry wrote:
> > On 25 January 2013 07:49, Paul Berry wrote:
> >>
> >> On 24 January 2013 19:44, Matt Turner wrote:
> >>>
> >>> Following this email are eight patches that add the 4x8 pack/unpack
> >>> operations that are the difference between what GLSL ES 3.0 and
> >>> ARB_shading_language_packing require.
> >>>
> >>> They require Chad's gles3-glsl-packing series and are available at
> >>>
> >>> http://cgit.freedesktop.org/~mattst88/mesa/log/?h=ARB_shading_language_packing
> >>>
> >>> I've also added testing support on top of Chad's piglit patch. The
> >>> {vs,fs}-unpackUnorm4x8 tests currently fail, and I've been unable to
> >>> spot why.
> >>
> >>
> >> I had minor comments on patches 4/8 and 5/8. The remainder is:
> >>
> >> Reviewed-by: Paul Berry
> >>
> >> I didn't spot anything that would explain the failure in unpackUnorm4x8
> >> tests. I'll go have a look at your piglit tests now, and if I don't find
> >> anything there either, I'll fire up the simulator and see if I can see
> >> what's going wrong.
> >
> >
> > I found the problem. On i965, floating point divisions are implemented as
> > multiplication by a reciprocal, whereas on the CPU there's a floating point
> > division instruction. Therefore, unpackUnorm4x8's computation of "f /
> > 255.0" doesn't yield consistent results when run on the CPU vs the
> > GPU--there is a tiny difference due to the accumulation of floating point
> > rounding errors.
> >
> > That's why the "fs" and "vs" variants of the tests failed, and the "const"
> > variant passed--because Mesa does constant folding using the CPU's floating
> > point division instruction, which matches the Python test generator
> > perfectly, whereas the "fs" and "vs" variants use the actual GPU.
> >
> > It's only by dumb luck that this rounding error issue didn't bite us until
> > now, because in principle it could equally well have occurred in the
> > unpack2x16 functions.
> >
> > I believe we should relax the test to allow for these tiny rounding errors
> > (this is what the other test generators, such as
> > gen_builtin_uniform_tests.py, do). As an experiment I modified
> > gen_builtin_packing_tests.py so that in vs_unpack_2x16_template and
> > fs_unpack_2x16_template, "actual == expect${j}" is replaced with
> > "distance(actual, expect${j}) < 0.1". With this change, the test
> > passes.
> >
> > However, that change isn't good enough to commit to piglit, for two reasons:
> >
> > (1) It should only be applied when testing the functions whose definition
> > includes a division (unpackUnorm4x8, unpackSnorm4x8, unpackUnorm2x16, and
> > unpackSnorm2x16). A properly functioning implementation ought to be able to
> > get exact answers with all the other packing functions, and we should test
> > that it does.
> >
> > (2) IMHO we should test that output values of 0.0 and +/- 1.0 are produced
> > without error, since a shader author might conceivably write code that
> > relies on these values being exact. That is, we should check that the
> > following conversions are exact, with no rounding error:
> >
> > unpackUnorm4x8(0) == vec4(0.0)
> > unpackUnorm4x8(0x) == vec4(1.0)
> > unpackSnorm4x8(0) == vec4(0.0)
> > unpackSnorm4x8(0x7f7f7f7f) == vec4(1.0)
> > unpackSnorm4x8(0x80808080) == vec4(-1.0)
> > unpackSnorm4x8(0x81818181) == vec4(-1.0)
> > unpackUnorm2x16(0) == vec2(0.0)
> > unpackUnorm2x16(0x) == vec2(1.0)
> > unpackSnorm2x16(0) == vec2(0.0)
> > unpackSnorm2x16(0x7fff7fff) == vec2(1.0)
> > unpackSnorm2x16(0x80008000) == vec2(-1.0)
> > unpackSnorm2x16(0x80018001) == vec2(-1.0)
> >
> > My recommendation: address problem (1) by modifying the templates to accept
> > a new parameter that determines whether the test needs to be precise or
> > approximate (e.g. "func.precise"). Address problem (2) by hand-coding a few
> > shader_runner tests to check the cases above. IMHO it would be ok to leave
> > the current patch as is (modulo my previous comments) and do a pair of
> > follow-on patches to address problems (1) and (2).
>
> Interesting. Thanks a lot for finding that and writing it up.
>
> Since div() is used by both the Snorm and Unorm unpacking
> functions, any idea why it only adversely affects the results of
> Unorm? Multiplication by 1/255 yields lower precision than by 1/127?
>

After messing around with numpy for a while, it looks like 1/255 expressed
as a float32 happens to fall almost exactly between two representable
float32 values:

0.0039215683937072754 (representable float32)
0.0039215686274509803 (true value of 1/255)
0.0039215688593685627 (next representable float32)

So regardless of which way the rounding goes the relative error is
approximately 5.9e-8. By luck, 1/127, 1/32767, and 1/65535 are all much
closer to representable float32's, with relative errors of 3.7e-9, 9.3e-10,
and 2.2e-10 respectively.

So yeah, the relative error int
Re: [Mesa-dev] [PATCH 4/8] glsl: Evaluate constant pack/unpack 4x8 expressions
On Fri, Jan 25, 2013 at 11:59 AM, Chad Versace wrote:
>>> + *x = unpack_1x8((uint8_t) (u & 0xff));
>>> + *y = unpack_1x8((uint8_t) (u >> 8));
>>> + *z = unpack_1x8((uint8_t) (u >> 16));
>>> + *w = unpack_1x8((uint8_t) (u >> 24));
>>> +}
>>
>> The bitmask (u & 0xff) confused me for a few moments, made me say "Why does
>> Matt need a bitmask there?". But, then I realized that I did the same for
>> unpack_2x16, and you likely just copied my pattern. Oh well. I'd prefer that
>> unpack_2x16 and unpack_4x8 follow a similar visual pattern rather than clean
>> that up now, so I'm ok with that funny looking bitmask staying in this patch.

Hah, I wondered the same thing about your patch. :)

gcc, and I assume any other compiler we could possibly care about, knows
the & 0xff is a nop.
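The point about the redundant mask can be checked with a small C sketch. The helper names below (`byte_masked`, `byte_unmasked`, `byte_n`) are hypothetical and only illustrate the byte extraction; they are not the `unpack_1x8` helper from the patch, whose definition is not shown in this thread. Converting a `uint32_t` to `uint8_t` is defined to keep the value modulo 256, so masking with `0xff` first does not change the result.

```c
#include <stdint.h>

/* Lowest byte with the explicit mask, as written in the quoted hunk. */
static uint8_t byte_masked(uint32_t u)
{
    return (uint8_t)(u & 0xff);
}

/* Lowest byte relying only on the narrowing conversion (value mod 256). */
static uint8_t byte_unmasked(uint32_t u)
{
    return (uint8_t)u;
}

/* Byte n of u, counting from the least significant byte; n must be 0..3. */
static uint8_t byte_n(uint32_t u, unsigned n)
{
    return (uint8_t)(u >> (8 * n));
}
```

For `u = 0xdeadbeef` the four extracted bytes are `0xef`, `0xbe`, `0xad` and `0xde`, and the masked and unmasked forms of the lowest byte agree.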
Re: [Mesa-dev] [PATCH 0/8] ARB_shading_language_packing
On Fri, Jan 25, 2013 at 9:55 AM, Paul Berry wrote: > On 25 January 2013 07:49, Paul Berry wrote: >> >> On 24 January 2013 19:44, Matt Turner wrote: >>> >>> Following this email are eight patches that add the 4x8 pack/unpack >>> operations that are the difference between what GLSL ES 3.0 and >>> ARB_shading_language_packing require. >>> >>> They require Chad's gles3-glsl-packing series and are available at >>> >>> http://cgit.freedesktop.org/~mattst88/mesa/log/?h=ARB_shading_language_packing >>> >>> I've also added testing support on top of Chad's piglit patch. The >>> {vs,fs}-unpackUnorm4x8 tests currently fail, and I've been unable to >>> spot why. >> >> >> I had minor comments on patches 4/8 and 5/8. The remainder is: >> >> Reviewed-by: Paul Berry >> >> I didn't spot anything that would explain the failure in unpackUnorm4x8 >> tests. I'll go have a look at your piglit tests now, and if I don't find >> anything there either, I'll fire up the simulator and see if I can see >> what's going wrong. > > > I found the problem. On i965, floating point divisions are implemented as > multiplication by a reciprocal, whereas on the CPU there's a floating point > division instruction. Therefore, unpackUnorm4x8's computation of "f / > 255.0" doesn't yield consistent results when run on the CPU vs the > GPU--there is a tiny difference due to the accumulation of floating point > rounding errors. > > That's why the "fs" and "vs" variants of the tests failed, and the "const" > variant passed--because Mesa does constant folding using the CPU's floating > point division instruction, which matches the Python test generator > perfectly, whereas the "fs" and "vs" variants use the actual GPU. > > It's only by dumb luck that this rounding error issue didn't bite us until > now, because in principle it could equally well have occurred in the > unpack2x16 functions. 
>
> I believe we should relax the test to allow for these tiny rounding errors
> (this is what the other test generators, such as
> gen_builtin_uniform_tests.py, do). As an experiment I modified
> gen_builtin_packing_tests.py so that in vs_unpack_2x16_template and
> fs_unpack_2x16_template, "actual == expect${j}" is replaced with
> "distance(actual, expect${j}) < 0.1". With this change, the test
> passes.
>
> However, that change isn't good enough to commit to piglit, for two reasons:
>
> (1) It should only be applied when testing the functions whose definition
> includes a division (unpackUnorm4x8, unpackSnorm4x8, unpackUnorm2x16, and
> unpackSnorm2x16). A properly functioning implementation ought to be able to
> get exact answers with all the other packing functions, and we should test
> that it does.
>
> (2) IMHO we should test that output values of 0.0 and +/- 1.0 are produced
> without error, since a shader author might conceivably write code that
> relies on these values being exact. That is, we should check that the
> following conversions are exact, with no rounding error:
>
>   unpackUnorm4x8(0) == vec4(0.0)
>   unpackUnorm4x8(0x) == vec4(1.0)
>   unpackSnorm4x8(0) == vec4(0.0)
>   unpackSnorm4x8(0x7f7f7f7f) == vec4(1.0)
>   unpackSnorm4x8(0x80808080) == vec4(-1.0)
>   unpackSnorm4x8(0x81818181) == vec4(-1.0)
>   unpackUnorm2x16(0) == vec2(0.0)
>   unpackUnorm2x16(0x) == vec2(1.0)
>   unpackSnorm2x16(0) == vec2(0.0)
>   unpackSnorm2x16(0x7fff7fff) == vec2(1.0)
>   unpackSnorm2x16(0x80008000) == vec2(-1.0)
>   unpackSnorm2x16(0x80018001) == vec2(-1.0)
>
> My recommendation: address problem (1) by modifying the templates to accept
> a new parameter that determines whether the test needs to be precise or
> approximate (e.g. "func.precise"). Address problem (2) by hand-coding a few
> shader_runner tests to check the cases above. IMHO it would be ok to leave
> the current patch as is (modulo my previous comments) and do a pair of
> follow-on patches to address problems (1) and (2).
Interesting. Thanks a lot for finding that and writing it up.

Since div() is used by both the Snorm and Unorm unpacking functions,
any idea why it only adversely affects the results of Unorm? Does
multiplication by 1/255 yield lower precision than by 1/127?

In investigating the Unorm unpacking failure I did notice that some
values worked (like 0.0, 1.0, and even 0.0078431377), so I don't
expect any problems with precision on the values you suggest.

I agree with your recommended solution. I'll push these patches today
for the 9.1 branch and do follow-on patches to piglit like you
suggest.
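The difference between dividing and multiplying by a reciprocal can be reproduced on the CPU with a few lines of C. This is only a sketch under stated assumptions: the function names are invented here, and the reciprocal is the CPU's correctly rounded 1.0f / 255.0f, whereas the i965 hardware may compute its reciprocal differently. It does not answer why only the Unorm tests failed; it only shows that the two evaluation orders are not bit-identical.

```c
#include <assert.h>
#include <stdint.h>

/* Reference path: a true division, as CPU constant folding would do. */
static float
unorm8_by_division(uint8_t b)
{
   return (float) b / 255.0f;
}

/* Stand-in for the GPU path: multiply by a reciprocal that has already
 * been rounded to single precision. This models the idea only; it is
 * not a description of what the hardware actually computes. */
static float
unorm8_by_reciprocal(uint8_t b)
{
   const float rcp = 1.0f / 255.0f;
   return (float) b * rcp;
}

/* Count the 8-bit inputs for which the two paths disagree. */
static int
count_mismatches(void)
{
   int mismatches = 0;

   for (int b = 0; b <= 255; b++) {
      if (unorm8_by_division((uint8_t) b) !=
          unorm8_by_reciprocal((uint8_t) b))
         mismatches++;
   }
   return mismatches;
}

/* The endpoints that matter most to shader authors still agree. */
static void
check_endpoints(void)
{
   assert(unorm8_by_reciprocal(0) == 0.0f);
   assert(unorm8_by_reciprocal(255) == 1.0f);
}
```

Assuming IEEE 754 single precision with round-to-nearest-even and no excess intermediate precision, working the rounding through by hand suggests that input 3 differs by one unit in the last place between the two paths, while 0, 2 and 255 agree. That is consistent with the observation that 0.0, 1.0 and 0.0078431377 (2/255) worked.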
Re: [Mesa-dev] [PATCH] gallivm: split sampler and texture state
On 01/24/2013 07:48 PM, srol...@vmware.com wrote: From: Roland Scheidegger Split the sampler interface to use separate sampler and texture (sampler_view) state. This is needed to support dx10-style sampling instructions. This is not quite complete since both draw/llvmpipe don't really track textures/samplers independently yet, as well as the gallivm code not quite using the right sampler or texture index respectively (but it should work for the sampling codes used by opengl). We are however losing some optimizations in the process, apply_max_lod will no longer work, and we potentially could end up with more (unnecessary) recompiles (if switching textures with/without mipmaps only so it shouldn't be too bad). v2: don't use different callback structs for sampler/sampler view functions (which just complicates things), fix up sampling code to actually use the right texture or sampler index, and similar for llvmpipe/draw actually distinguish between samplers and sampler views. Looks good AFAICT. Just a few minor comments (about comments) below. Reviewed-by: Brian Paul Nice work! 
--- src/gallium/auxiliary/draw/draw_llvm.c| 129 +- src/gallium/auxiliary/draw/draw_llvm.h| 66 ++-- src/gallium/auxiliary/draw/draw_llvm_sample.c | 88 -- src/gallium/auxiliary/draw/draw_private.h |2 +- src/gallium/auxiliary/gallivm/lp_bld_sample.c | 108 +++- src/gallium/auxiliary/gallivm/lp_bld_sample.h | 66 ++-- src/gallium/auxiliary/gallivm/lp_bld_sample_aos.c | 104 ++-- src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c | 187 - src/gallium/auxiliary/gallivm/lp_bld_tgsi.h |3 +- src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c |6 +- src/gallium/drivers/llvmpipe/lp_jit.c | 54 -- src/gallium/drivers/llvmpipe/lp_jit.h | 24 ++- src/gallium/drivers/llvmpipe/lp_setup.c | 12 +- src/gallium/drivers/llvmpipe/lp_state_fs.c| 84 ++--- src/gallium/drivers/llvmpipe/lp_state_fs.h| 17 +- src/gallium/drivers/llvmpipe/lp_state_setup.c | 16 +- src/gallium/drivers/llvmpipe/lp_tex_sample.c | 102 --- 17 files changed, 711 insertions(+), 357 deletions(-) diff --git a/src/gallium/auxiliary/draw/draw_llvm.c b/src/gallium/auxiliary/draw/draw_llvm.c index a3a3bbf..9e5ff1c 100644 --- a/src/gallium/auxiliary/draw/draw_llvm.c +++ b/src/gallium/auxiliary/draw/draw_llvm.c @@ -85,11 +85,6 @@ create_jit_texture_type(struct gallivm_state *gallivm, const char *struct_name) elem_types[DRAW_JIT_TEXTURE_IMG_STRIDE] = elem_types[DRAW_JIT_TEXTURE_MIP_OFFSETS] = LLVMArrayType(int32_type, PIPE_MAX_TEXTURE_LEVELS); - elem_types[DRAW_JIT_TEXTURE_MIN_LOD] = - elem_types[DRAW_JIT_TEXTURE_MAX_LOD] = - elem_types[DRAW_JIT_TEXTURE_LOD_BIAS] = LLVMFloatTypeInContext(gallivm->context); - elem_types[DRAW_JIT_TEXTURE_BORDER_COLOR] = - LLVMArrayType(LLVMFloatTypeInContext(gallivm->context), 4); texture_type = LLVMStructTypeInContext(gallivm->context, elem_types, Elements(elem_types), 0); @@ -130,18 +125,6 @@ create_jit_texture_type(struct gallivm_state *gallivm, const char *struct_name) LP_CHECK_MEMBER_OFFSET(struct draw_jit_texture, mip_offsets, target, texture_type, DRAW_JIT_TEXTURE_MIP_OFFSETS); - 
LP_CHECK_MEMBER_OFFSET(struct draw_jit_texture, min_lod, - target, texture_type, - DRAW_JIT_TEXTURE_MIN_LOD); - LP_CHECK_MEMBER_OFFSET(struct draw_jit_texture, max_lod, - target, texture_type, - DRAW_JIT_TEXTURE_MAX_LOD); - LP_CHECK_MEMBER_OFFSET(struct draw_jit_texture, lod_bias, - target, texture_type, - DRAW_JIT_TEXTURE_LOD_BIAS); - LP_CHECK_MEMBER_OFFSET(struct draw_jit_texture, border_color, - target, texture_type, - DRAW_JIT_TEXTURE_BORDER_COLOR); LP_CHECK_STRUCT_SIZE(struct draw_jit_texture, target, texture_type); @@ -150,15 +133,63 @@ create_jit_texture_type(struct gallivm_state *gallivm, const char *struct_name) /** + * Create LLVM type for struct draw_jit_sampler + */ +static LLVMTypeRef +create_jit_sampler_type(struct gallivm_state *gallivm, const char *struct_name) +{ + LLVMTargetDataRef target = gallivm->target; + LLVMTypeRef sampler_type; + LLVMTypeRef elem_types[DRAW_JIT_SAMPLER_NUM_FIELDS]; + + elem_types[DRAW_JIT_SAMPLER_MIN_LOD] = + elem_types[DRAW_JIT_SAMPLER_MAX_LOD] = + elem_types[DRAW_JIT_SAMPLER_LOD_BIAS] = LLVMFloatTypeInContext(gallivm->context); + elem_types[DRAW_JIT_SAMPLER_BORDER_COLOR] = + LLVMArrayType(LLVMFloatTypeInContext(gallivm->context), 4); + + sampler_type = LLVMStructTypeInContext(gallivm->context, elem_ty
Re: [Mesa-dev] [PATCH] glx: only advertise GLX_INTEL_swap_event if it's supported
On 01/24/2013 06:59 PM, Zack Rusin wrote: Only drivers supporting DRI2 version>=4 support GLX_INTEL_swap_event. So lets mark it as such otherwise applications which use this extension (i.e. everything based on Clutter, e.g. gnome-shell) break horribly on drivers supporting DRI2 versions only up to 3. Note: This is a candidate for the 9.0 branch. Signed-off-by: Zack Rusin --- src/glx/dri2_glx.c |5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/src/glx/dri2_glx.c b/src/glx/dri2_glx.c index 1b3cf2b..a51716f 100644 --- a/src/glx/dri2_glx.c +++ b/src/glx/dri2_glx.c @@ -1062,8 +1062,9 @@ dri2BindExtensions(struct dri2_screen *psc, const __DRIextension **extensions) __glXEnableDirectExtension(&psc->base, "GLX_MESA_swap_control"); __glXEnableDirectExtension(&psc->base, "GLX_SGI_make_current_read"); - /* FIXME: if DRI2 version supports it... */ - __glXEnableDirectExtension(&psc->base, "GLX_INTEL_swap_event"); + if (psc->dri2->base.version>= 4) { + __glXEnableDirectExtension(&psc->base, "GLX_INTEL_swap_event"); + } if (psc->dri2->base.version>= 3) { const unsigned mask = psc->dri2->getAPIMask(psc->driScreen); Other people are more familiar with this than me, but Reviewed-by: Brian Paul ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 59835] ir_constant_expression.cpp:156: undefined reference to `_mesa_round_to_even'
https://bugs.freedesktop.org/show_bug.cgi?id=59835

--- Comment #2 from Chad Versace ---
Sorry about that. Next time I change the Android and Autotools system,
I'll remember to change Scons too.
Re: [Mesa-dev] [PATCH 0/8] ARB_shading_language_packing
On 01/25/2013 09:55 AM, Paul Berry wrote: > On 25 January 2013 07:49, Paul Berry wrote: > >> On 24 January 2013 19:44, Matt Turner wrote: >> >>> Following this email are eight patches that add the 4x8 pack/unpack >>> operations that are the difference between what GLSL ES 3.0 and >>> ARB_shading_language_packing require. >>> >>> They require Chad's gles3-glsl-packing series and are available at >>> >>> http://cgit.freedesktop.org/~mattst88/mesa/log/?h=ARB_shading_language_packing >>> >>> I've also added testing support on top of Chad's piglit patch. The >>> {vs,fs}-unpackUnorm4x8 tests currently fail, and I've been unable to >>> spot why. >>> >> >> I had minor comments on patches 4/8 and 5/8. The remainder is: >> >> Reviewed-by: Paul Berry >> >> I didn't spot anything that would explain the failure in unpackUnorm4x8 >> tests. I'll go have a look at your piglit tests now, and if I don't find >> anything there either, I'll fire up the simulator and see if I can see >> what's going wrong. >> > > I found the problem. On i965, floating point divisions are implemented as > multiplication by a reciprocal, whereas on the CPU there's a floating point > division instruction. Therefore, unpackUnorm4x8's computation of "f / > 255.0" doesn't yield consistent results when run on the CPU vs the > GPU--there is a tiny difference due to the accumulation of floating point > rounding errors. > > That's why the "fs" and "vs" variants of the tests failed, and the "const" > variant passed--because Mesa does constant folding using the CPU's floating > point division instruction, which matches the Python test generator > perfectly, whereas the "fs" and "vs" variants use the actual GPU. > > It's only by dumb luck that this rounding error issue didn't bite us until > now, because in principle it could equally well have occurred in the > unpack2x16 functions. 
>
> I believe we should relax the test to allow for these tiny rounding errors
> (this is what the other test generators, such as
> gen_builtin_uniform_tests.py, do). As an experiment I modified
> gen_builtin_packing_tests.py so that in vs_unpack_2x16_template and
> fs_unpack_2x16_template, "actual == expect${j}" is replaced with
> "distance(actual, expect${j}) < 0.1". With this change, the test
> passes.
>
> However, that change isn't good enough to commit to piglit, for two reasons:
>
> (1) It should only be applied when testing the functions whose definition
> includes a division (unpackUnorm4x8, unpackSnorm4x8, unpackUnorm2x16, and
> unpackSnorm2x16). A properly functioning implementation ought to be able
> to get exact answers with all the other packing functions, and we should
> test that it does.
>
> (2) IMHO we should test that output values of 0.0 and +/- 1.0 are produced
> without error, since a shader author might conceivably write code that
> relies on these values being exact. That is, we should check that the
> following conversions are exact, with no rounding error:
>
>   unpackUnorm4x8(0) == vec4(0.0)
>   unpackUnorm4x8(0x) == vec4(1.0)
>   unpackSnorm4x8(0) == vec4(0.0)
>   unpackSnorm4x8(0x7f7f7f7f) == vec4(1.0)
>   unpackSnorm4x8(0x80808080) == vec4(-1.0)
>   unpackSnorm4x8(0x81818181) == vec4(-1.0)
>   unpackUnorm2x16(0) == vec2(0.0)
>   unpackUnorm2x16(0x) == vec2(1.0)
>   unpackSnorm2x16(0) == vec2(0.0)
>   unpackSnorm2x16(0x7fff7fff) == vec2(1.0)
>   unpackSnorm2x16(0x80008000) == vec2(-1.0)
>   unpackSnorm2x16(0x80018001) == vec2(-1.0)
>
> My recommendation: address problem (1) by modifying the templates to accept
> a new parameter that determines whether the test needs to be precise or
> approximate (e.g. "func.precise"). Address problem (2) by hand-coding a
> few shader_runner tests to check the cases above. IMHO it would be ok to
> leave the current patch as is (modulo my previous comments) and do a pair
> of follow-on patches to address problems (1) and (2).
>
> Chad, do you have any thoughts on this subject, since you're the original
> author of this test generator?

I don't like the kludge of having a separate shader_test for exact values.
But I've thought hard about what modifications to the Python script would be
needed to solve the problem solely within the script and its generated
shaders, and I like that solution even less.

So, Paul, I think we should go forward with your proposed solution.
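For readers following the proposal, here is a rough C sketch of what a "precise or approximate" comparison could look like. It is not piglit code: vec4_matches and its parameters are hypothetical names, and the real change would live in the Python templates and the generated GLSL. The squared distance is compared against the squared tolerance so that no square root is needed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not part of piglit: compare a 4-component result
 * against its expected value. In precise mode every component must be
 * equal; otherwise the Euclidean distance must be below the tolerance. */
static bool
vec4_matches(const float actual[4], const float expect[4],
             bool precise, float tolerance)
{
   float dist_sq = 0.0f;

   assert(tolerance > 0.0f);

   for (int i = 0; i < 4; i++) {
      const float d = actual[i] - expect[i];

      if (precise && actual[i] != expect[i])
         return false;
      dist_sq += d * d;
   }

   /* For a positive tolerance, distance < tolerance is equivalent to
    * distance squared < tolerance squared. */
   return dist_sq < tolerance * tolerance;
}
```

With this shape, the functions whose definition includes a division would be tested with precise set to false, and every other packing function with precise set to true.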
Re: [Mesa-dev] [PATCH 0/8] ARB_shading_language_packing
On 01/24/2013 07:44 PM, Matt Turner wrote:
> Following this email are eight patches that add the 4x8 pack/unpack
> operations that are the difference between what GLSL ES 3.0 and
> ARB_shading_language_packing require.
>
> They require Chad's gles3-glsl-packing series and are available at
> http://cgit.freedesktop.org/~mattst88/mesa/log/?h=ARB_shading_language_packing
>
> I've also added testing support on top of Chad's piglit patch. The
> {vs,fs}-unpackUnorm4x8 tests currently fail, and I've been unable to
> spot why.

By the way, my Mesa series is committed to the master and gles3 branches.
My Piglit patch is on master too.
Re: [Mesa-dev] [PATCH 1/8] glsl: Add infrastructure for ARB_shading_language_packing
Patches 1-3, 6-8 are
Reviewed-by: Chad Versace
Re: [Mesa-dev] [PATCH 4/8] glsl: Evaluate constant pack/unpack 4x8 expressions
On 01/25/2013 11:38 AM, Chad Versace wrote: > On 01/24/2013 07:47 PM, Matt Turner wrote: >> That is, evaluate constant expressions for the following functions: >> packSnorm4x8, unpackSnorm4x8 >> packUnorm4x8, unpackUnorm4x8 >> --- >> src/glsl/ir_constant_expression.cpp | 162 >> +++ >> 1 files changed, 162 insertions(+), 0 deletions(-) >> >> diff --git a/src/glsl/ir_constant_expression.cpp >> b/src/glsl/ir_constant_expression.cpp >> index b34c6e8..4796f6f 100644 >> --- a/src/glsl/ir_constant_expression.cpp >> +++ b/src/glsl/ir_constant_expression.cpp >> @@ -76,12 +76,24 @@ bitcast_f2u(float f) >> } >> >> /** >> + * Evaluate one component of a floating-point 4x8 unpacking function. >> + */ >> +typedef uint8_t >> +(*pack_1x8_func_t)(float); >> + >> +/** >> * Evaluate one component of a floating-point 2x16 unpacking function. >> */ >> typedef uint16_t >> (*pack_1x16_func_t)(float); >> >> /** >> + * Evaluate one component of a floating-point 4x8 unpacking function. >> + */ >> +typedef float >> +(*unpack_1x8_func_t)(uint8_t); >> + >> +/** >> * Evaluate one component of a floating-point 2x16 unpacking function. >> */ >> typedef float >> @@ -112,6 +124,32 @@ pack_2x16(pack_1x16_func_t pack_1x16, >> } >> >> /** >> + * Evaluate a 4x8 floating-point packing function. >> + */ >> +static uint32_t >> +pack_4x8(pack_1x8_func_t pack_1x8, >> + float x, float y, float z, float w) >> +{ >> + /* From section 8.4 of the GLSL 4.30 spec: >> +* >> +*packSnorm4x8 >> +* >> +*The first component of the vector will be written to the least >> +*significant bits of the output; the last component will be written >> to >> +*the most significant bits. >> +* >> +* The specifications for the other packing functions contain similar >> +* language. 
>> +*/ >> + uint32_t u = 0; >> + u |= ((uint32_t) pack_1x8(x) << 0); >> + u |= ((uint32_t) pack_1x8(y) << 8); >> + u |= ((uint32_t) pack_1x8(z) << 16); >> + u |= ((uint32_t) pack_1x8(w) << 24); >> + return u; >> +} >> + >> +/** >> * Evaluate a 2x16 floating-point unpacking function. >> */ >> static void >> @@ -135,6 +173,48 @@ unpack_2x16(unpack_1x16_func_t unpack_1x16, >> } >> >> /** >> + * Evaluate a 4x8 floating-point unpacking function. >> + */ >> +static void >> +unpack_4x8(unpack_1x8_func_t unpack_1x8, uint32_t u, >> + float *x, float *y, float *z, float *w) >> +{ >> +/* From section 8.4 of the GLSL 4.30 spec: >> + * >> + *unpackSnorm4x8 >> + *-- >> + *The first component of the returned vector will be extracted from >> + *the least significant bits of the input; the last component will >> be >> + *extracted from the most significant bits. >> + * >> + * The specifications for the other unpacking functions contain similar >> + * language. >> + */ >> + *x = unpack_1x8((uint8_t) (u & 0xff)); >> + *y = unpack_1x8((uint8_t) (u >> 8)); >> + *z = unpack_1x8((uint8_t) (u >> 16)); >> + *w = unpack_1x8((uint8_t) (u >> 24)); >> +} > > The bitmask (u & 0xff) confused me for a few moments, made me say "Why does > Matt > need a bitmask there?". But, then I realized that I did the same for > unpack_2x16, > and you likely just copied my pattern. Oh well. I'd prefer that unpack_2x16 > and unpack_4x8 follow a similar visual pattern rather than clean that up now, > so I'm ok with that funny looking bitmask staying in this patch. > >> + >> +/** >> + * Evaluate one component of packSnorm4x8. 
>> + */
>> +static uint8_t
>> +pack_snorm_1x8(float x)
>> +{
>> +    /* From section 8.4 of the GLSL 4.30 spec:
>> +     *
>> +     *    packSnorm4x8
>> +     *
>> +     *    The conversion for component c of v to fixed point is done as
>> +     *    follows:
>> +     *
>> +     *      packSnorm4x8: round(clamp(c, -1, +1) * 127.0)
>> +     */
>> +   return (uint8_t) _mesa_round_to_even(CLAMP(x, -1.0f, +1.0f) * 127.0f);
>> +}
>
> This is a conversion from a negative float to a uint, so an intermediate
> conversion to int8_t is needed here, like Paul said. With that change,
> this is
> Reviewed-by: Chad Versace
>
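A sketch of the fix requested in the review, written as standalone C so it can be tried outside Mesa. Several things here are assumptions: round_to_even is a small stand-in written for this example, not Mesa's _mesa_round_to_even(), and clampf replaces the CLAMP macro. In C, converting a negative floating-point value directly to an unsigned type is undefined, which is why the value passes through a signed 8-bit type first.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a round-to-nearest-even helper. Hypothetical and only
 * meant for the small magnitudes used here (|f| <= 127). */
static int
round_to_even(float f)
{
   int i = (int) f;                    /* truncates toward zero */
   const float frac = f - (float) i;   /* in (-1, 1), same sign as f */

   if (frac > 0.5f || (frac == 0.5f && i % 2 != 0))
      i++;
   else if (frac < -0.5f || (frac == -0.5f && i % 2 != 0))
      i--;
   return i;
}

static float
clampf(float x, float lo, float hi)
{
   return x < lo ? lo : (x > hi ? hi : x);
}

/* One component of packSnorm4x8: round(clamp(c, -1, +1) * 127.0). */
static uint8_t
pack_snorm_1x8(float x)
{
   const int rounded = round_to_even(clampf(x, -1.0f, 1.0f) * 127.0f);

   assert(rounded >= -127 && rounded <= 127);

   /* Go through int8_t so that a negative result becomes its
    * two's-complement byte, e.g. -127 becomes 0x81. */
   return (uint8_t) (int8_t) rounded;
}

/* First component in the least significant bits, last in the most. */
static uint32_t
pack_snorm_4x8(float x, float y, float z, float w)
{
   uint32_t u = 0;

   u |= ((uint32_t) pack_snorm_1x8(x) << 0);
   u |= ((uint32_t) pack_snorm_1x8(y) << 8);
   u |= ((uint32_t) pack_snorm_1x8(z) << 16);
   u |= ((uint32_t) pack_snorm_1x8(w) << 24);
   return u;
}
```

For example, packing (1.0, -1.0, 0.0, 0.5) this way should give 0x4000817f: 0x7f for 1.0, 0x81 for -1.0, 0x00 for 0.0, and 0x40 for 0.5, since 63.5 rounds to the even value 64.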
Re: [Mesa-dev] [PATCH 5/8] glsl: Add support for lowering 4x8 pack/unpack operations
On 01/25/2013 11:59 AM, Chad Versace wrote: > On 01/24/2013 07:47 PM, Matt Turner wrote: >> Lower them to arithmetic and bit manipulation expressions. >> --- >> src/glsl/ir_optimization.h |6 + >> src/glsl/lower_packing_builtins.cpp | 279 >> +++ >> 2 files changed, 285 insertions(+), 0 deletions(-) >> >> diff --git a/src/glsl/ir_optimization.h b/src/glsl/ir_optimization.h >> index ac90b87..8f33018 100644 >> --- a/src/glsl/ir_optimization.h >> +++ b/src/glsl/ir_optimization.h >> @@ -54,6 +54,12 @@ enum lower_packing_builtins_op { >> >> LOWER_PACK_HALF_2x16_TO_SPLIT= 0x0040, >> LOWER_UNPACK_HALF_2x16_TO_SPLIT = 0x0080, >> + >> + LOWER_PACK_SNORM_4x8 = 0x0100, >> + LOWER_UNPACK_SNORM_4x8 = 0x0200, >> + >> + LOWER_PACK_UNORM_4x8 = 0x0400, >> + LOWER_UNPACK_UNORM_4x8 = 0x0800, >> }; >> >> bool do_common_optimization(exec_list *ir, bool linked, >> diff --git a/src/glsl/lower_packing_builtins.cpp >> b/src/glsl/lower_packing_builtins.cpp >> index 49176cc..aa6765f 100644 >> --- a/src/glsl/lower_packing_builtins.cpp >> +++ b/src/glsl/lower_packing_builtins.cpp >> @@ -85,9 +85,15 @@ public: >>case LOWER_PACK_SNORM_2x16: >> *rvalue = lower_pack_snorm_2x16(op0); >> break; >> + case LOWER_PACK_SNORM_4x8: >> + *rvalue = lower_pack_snorm_4x8(op0); >> + break; >>case LOWER_PACK_UNORM_2x16: >> *rvalue = lower_pack_unorm_2x16(op0); >> break; >> + case LOWER_PACK_UNORM_4x8: >> + *rvalue = lower_pack_unorm_4x8(op0); >> + break; >>case LOWER_PACK_HALF_2x16: >> *rvalue = lower_pack_half_2x16(op0); >> break; >> @@ -97,9 +103,15 @@ public: >>case LOWER_UNPACK_SNORM_2x16: >> *rvalue = lower_unpack_snorm_2x16(op0); >> break; >> + case LOWER_UNPACK_SNORM_4x8: >> + *rvalue = lower_unpack_snorm_4x8(op0); >> + break; >>case LOWER_UNPACK_UNORM_2x16: >> *rvalue = lower_unpack_unorm_2x16(op0); >> break; >> + case LOWER_UNPACK_UNORM_4x8: >> + *rvalue = lower_unpack_unorm_4x8(op0); >> + break; >>case LOWER_UNPACK_HALF_2x16: >> *rvalue = lower_unpack_half_2x16(op0); >> break; >> @@ -137,18 +149,30 @@ 
private: >>case ir_unop_pack_snorm_2x16: >> result = op_mask & LOWER_PACK_SNORM_2x16; >> break; >> + case ir_unop_pack_snorm_4x8: >> + result = op_mask & LOWER_PACK_SNORM_4x8; >> + break; >>case ir_unop_pack_unorm_2x16: >> result = op_mask & LOWER_PACK_UNORM_2x16; >> break; >> + case ir_unop_pack_unorm_4x8: >> + result = op_mask & LOWER_PACK_UNORM_4x8; >> + break; >>case ir_unop_pack_half_2x16: >> result = op_mask & (LOWER_PACK_HALF_2x16 | >> LOWER_PACK_HALF_2x16_TO_SPLIT); >> break; >>case ir_unop_unpack_snorm_2x16: >> result = op_mask & LOWER_UNPACK_SNORM_2x16; >> break; >> + case ir_unop_unpack_snorm_4x8: >> + result = op_mask & LOWER_UNPACK_SNORM_4x8; >> + break; >>case ir_unop_unpack_unorm_2x16: >> result = op_mask & LOWER_UNPACK_UNORM_2x16; >> break; >> + case ir_unop_unpack_unorm_4x8: >> + result = op_mask & LOWER_UNPACK_UNORM_4x8; >> + break; >>case ir_unop_unpack_half_2x16: >> result = op_mask & (LOWER_UNPACK_HALF_2x16 | >> LOWER_UNPACK_HALF_2x16_TO_SPLIT); >> break; >> @@ -214,6 +238,30 @@ private: >> } >> >> /** >> +* \brief Pack four uint8's into a single uint32. >> +* >> +* Interpret the given uvec4 as a uint32 quad. Pack the quad into a >> uint32 >> +* where the least significant bits specify the first element of the >> quad. >> +* Return the uint32. >> +*/ > > I find the term "uint32 quad" confusing. It is too reminiscient of "quadword". > This not-so-bright reviewer thought: "A uint32 quadword? Huh? Oh! That means > a uint32 4-tuple". So, I'd like to see the phrase changed to "uint32 4-tuple" > or something similar, but this suggestion doesn't block the patch. 
> >> + ir_rvalue* >> + pack_uvec4_to_uint(ir_rvalue *uvec4_rval) >> + { >> + assert(uvec4_rval->type == glsl_type::uvec4_type); >> + >> + /* uvec4 u = UVEC4_RVAL; */ >> + ir_variable *u = factory.make_temp(glsl_type::uvec4_type, >> + "tmp_pack_uvec4_to_uint"); >> + factory.emit(assign(u, uvec4_rval)); >> + >> + /* return ((u.w 0xff) << 24) | ((u.z & 0xff) << 16) | ((u.y & 0xff) >> << 8) | (u.x & 0xff); */ > ^^^ missing & >> + return bit_or(bit_or(lshift(bit_and(swizzle_w(u), constant(0xffu)), >> constant(24u)), >> + lshift(bit_and(swizzle_z(u), constant(0xffu)), >> consta
Re: [Mesa-dev] [PATCH 24/32] glsl: Make the align function available elsewhere in the linker
On 01/25/2013 05:43 AM, Ian Romanick wrote:
On 01/24/2013 08:40 PM, Kenneth Graunke wrote:
On 01/22/2013 12:52 AM, Ian Romanick wrote:

From: Ian Romanick

Signed-off-by: Ian Romanick
---
 src/glsl/glsl_types.cpp | 12 +++-
 src/glsl/glsl_types.h| 6 ++
 src/glsl/link_uniforms.cpp | 14 --
 src/glsl/lower_ubo_reference.cpp | 19 +++
 4 files changed, 20 insertions(+), 31 deletions(-)

diff --git a/src/glsl/glsl_types.cpp b/src/glsl/glsl_types.cpp
index 0075550..ddd0148 100644
--- a/src/glsl/glsl_types.cpp
+++ b/src/glsl/glsl_types.cpp
@@ -863,12 +863,6 @@ glsl_type::std140_base_alignment(bool row_major) const
    return -1;
 }

-static unsigned
-align(unsigned val, unsigned align)
-{
-   return (val + align - 1) / align * align;
-}
-

Why not just eliminate this function altogether and use ALIGN() from
macros.h? (The implementation is slightly different, but I think it
should work.)

I thought about that. The ALIGN macro only works when align is a power
of two, and it wasn't obvious to me that all the uses of this function
met that requirement. I did this refactor right before sending this
series out, and it felt a little like the 11th hour to do something that
could have a functional change. I'd prefer to revisit this after the
release.

Sounds like a good plan.

--Ken
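The power-of-two concern can be seen with a short C sketch. The division-based function mirrors the helper removed by the patch; the mask-based one is the usual idiom for power-of-two alignments and is written here from scratch as an illustration, not copied from macros.h. Both names, align_any and align_pot, are invented for this example.

```c
#include <assert.h>

/* Division-based rounding, as in the removed helper. Works for any
 * non-zero alignment. */
static unsigned
align_any(unsigned val, unsigned alignment)
{
   assert(alignment != 0);
   return (val + alignment - 1) / alignment * alignment;
}

/* Mask-based rounding. Only correct when the alignment is a power of
 * two, which the assertion enforces in this sketch. */
static unsigned
align_pot(unsigned val, unsigned alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (val + alignment - 1) & ~(alignment - 1);
}
```

For a power of two such as 16 the two agree: 13 rounds up to 16 either way. For an alignment of 12, the division form rounds 13 up to 24, while applying the mask expression to the same inputs gives 16, which is not a multiple of 12. A caller that ever passes such an alignment would silently change behavior if the function were swapped for the mask form.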
[Mesa-dev] [PATCH] configure.ac: Don't set LLVM_LIBS when llvm is disabled
From: Tom Stellard --- configure.ac | 35 +++ 1 file changed, 19 insertions(+), 16 deletions(-) diff --git a/configure.ac b/configure.ac index ccf95c5..9cc5c4a 100644 --- a/configure.ac +++ b/configure.ac @@ -1898,21 +1898,23 @@ dnl by calling llvm-config --libs ${DRIVER_LLVM_COMPONENTS}, but dnl this was causing the same libraries to be appear multiple times dnl in LLVM_LIBS. -LLVM_LIBS="`$LLVM_CONFIG --libs ${LLVM_COMPONENTS}`" +if test "x$MESA_LLVM" != x0; then -if test "x$with_llvm_shared_libs" = xyes; then -dnl We can't use $LLVM_VERSION because it has 'svn' stripped out, -LLVM_SO_NAME=LLVM-`$LLVM_CONFIG --version` -AC_CHECK_FILE("$LLVM_LIBDIR/lib$LLVM_SO_NAME.so", llvm_have_one_so=yes,) +LLVM_LIBS="`$LLVM_CONFIG --libs ${LLVM_COMPONENTS}`" -if test "x$llvm_have_one_so" = xyes; then -dnl LLVM was built using auto*, so there is only one shared object. -LLVM_LIBS="-l$LLVM_SO_NAME" -else -dnl If LLVM was built with CMake, there will be one shared object per -dnl component. -AC_CHECK_FILE("$LLVM_LIBDIR/libLLVMTarget.so",, -AC_MSG_ERROR([Could not find llvm shared libraries: +if test "x$with_llvm_shared_libs" = xyes; then +dnl We can't use $LLVM_VERSION because it has 'svn' stripped out, +LLVM_SO_NAME=LLVM-`$LLVM_CONFIG --version` +AC_CHECK_FILE("$LLVM_LIBDIR/lib$LLVM_SO_NAME.so", llvm_have_one_so=yes,) + +if test "x$llvm_have_one_so" = xyes; then +dnl LLVM was built using auto*, so there is only one shared object. +LLVM_LIBS="-l$LLVM_SO_NAME" +else +dnl If LLVM was built with CMake, there will be one shared object per +dnl component. 
+AC_CHECK_FILE("$LLVM_LIBDIR/libLLVMTarget.so",, +AC_MSG_ERROR([Could not find llvm shared libraries: Please make sure you have built llvm with the --enable-shared option and that your llvm libraries are installed in $LLVM_LIBDIR If you have installed your llvm libraries to a different directory you @@ -1925,9 +1927,10 @@ if test "x$with_llvm_shared_libs" = xyes; then use llvm static libraries then remove these options from your configure invocation and reconfigure.])) - dnl We don't need to update LLVM_LIBS in this case because the LLVM - dnl install uses a shared object for each compoenent and we have - dnl already added all of these objects to LLVM_LIBS. + dnl We don't need to update LLVM_LIBS in this case because the LLVM + dnl install uses a shared object for each compoenent and we have + dnl already added all of these objects to LLVM_LIBS. +fi fi fi -- 1.7.11.7 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 0/8] ARB_shading_language_packing
On 25 January 2013 07:49, Paul Berry wrote: > On 24 January 2013 19:44, Matt Turner wrote: > >> Following this email are eight patches that add the 4x8 pack/unpack >> operations that are the difference between what GLSL ES 3.0 and >> ARB_shading_language_packing require. >> >> They require Chad's gles3-glsl-packing series and are available at >> >> http://cgit.freedesktop.org/~mattst88/mesa/log/?h=ARB_shading_language_packing >> >> I've also added testing support on top of Chad's piglit patch. The >> {vs,fs}-unpackUnorm4x8 tests currently fail, and I've been unable to >> spot why. >> > > I had minor comments on patches 4/8 and 5/8. The remainder is: > > Reviewed-by: Paul Berry > > I didn't spot anything that would explain the failure in unpackUnorm4x8 > tests. I'll go have a look at your piglit tests now, and if I don't find > anything there either, I'll fire up the simulator and see if I can see > what's going wrong. > I found the problem. On i965, floating point divisions are implemented as multiplication by a reciprocal, whereas on the CPU there's a floating point division instruction. Therefore, unpackUnorm4x8's computation of "f / 255.0" doesn't yield consistent results when run on the CPU vs the GPU--there is a tiny difference due to the accumulation of floating point rounding errors. That's why the "fs" and "vs" variants of the tests failed, and the "const" variant passed--because Mesa does constant folding using the CPU's floating point division instruction, which matches the Python test generator perfectly, whereas the "fs" and "vs" variants use the actual GPU. It's only by dumb luck that this rounding error issue didn't bite us until now, because in principle it could equally well have occurred in the unpack2x16 functions. I believe we should relax the test to allow for these tiny rounding errors (this is what the other test generators, such as gen_builtin_uniform_tests.py, do). 
As an experiment I modified gen_builtin_packing_tests.py so that in
vs_unpack_2x16_template and fs_unpack_2x16_template, "actual == expect${j}"
is replaced with "distance(actual, expect${j}) < 0.1". With this change, the
test passes.

However, that change isn't good enough to commit to piglit, for two reasons:

(1) It should only be applied when testing the functions whose definition
includes a division (unpackUnorm4x8, unpackSnorm4x8, unpackUnorm2x16, and
unpackSnorm2x16). A properly functioning implementation ought to be able to
get exact answers with all the other packing functions, and we should test
that it does.

(2) IMHO we should test that output values of 0.0 and +/- 1.0 are produced
without error, since a shader author might conceivably write code that
relies on these values being exact. That is, we should check that the
following conversions are exact, with no rounding error:

  unpackUnorm4x8(0) == vec4(0.0)
  unpackUnorm4x8(0x) == vec4(1.0)
  unpackSnorm4x8(0) == vec4(0.0)
  unpackSnorm4x8(0x7f7f7f7f) == vec4(1.0)
  unpackSnorm4x8(0x80808080) == vec4(-1.0)
  unpackSnorm4x8(0x81818181) == vec4(-1.0)
  unpackUnorm2x16(0) == vec2(0.0)
  unpackUnorm2x16(0x) == vec2(1.0)
  unpackSnorm2x16(0) == vec2(0.0)
  unpackSnorm2x16(0x7fff7fff) == vec2(1.0)
  unpackSnorm2x16(0x80008000) == vec2(-1.0)
  unpackSnorm2x16(0x80018001) == vec2(-1.0)

My recommendation: address problem (1) by modifying the templates to accept
a new parameter that determines whether the test needs to be precise or
approximate (e.g. "func.precise"). Address problem (2) by hand-coding a few
shader_runner tests to check the cases above. IMHO it would be ok to leave
the current patch as is (modulo my previous comments) and do a pair of
follow-on patches to address problems (1) and (2).

Chad, do you have any thoughts on this subject, since you're the original
author of this test generator?
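The exact cases listed in point (2) can be checked per component with a CPU reference written in C. This is a sketch, not the shader_runner tests being proposed: the function names are invented, the formulas are the GLSL ones (f / 255.0 and clamp(f / 127.0, -1, +1), with 65535.0 and 32767.0 for the 16-bit variants), and a true division is used, so it says nothing about what a reciprocal-based GPU path returns. It also assumes the usual two's-complement behavior when reinterpreting an unsigned byte as int8_t.

```c
#include <assert.h>
#include <stdint.h>

static float
clampf(float x, float lo, float hi)
{
   return x < lo ? lo : (x > hi ? hi : x);
}

/* One component of unpackUnorm4x8: f / 255.0 */
static float
unpack_unorm_1x8(uint8_t u)
{
   return (float) u / 255.0f;
}

/* One component of unpackSnorm4x8: clamp(f / 127.0, -1, +1) */
static float
unpack_snorm_1x8(uint8_t u)
{
   return clampf((float) (int8_t) u / 127.0f, -1.0f, 1.0f);
}

/* One component of unpackUnorm2x16: f / 65535.0 */
static float
unpack_unorm_1x16(uint16_t u)
{
   return (float) u / 65535.0f;
}

/* One component of unpackSnorm2x16: clamp(f / 32767.0, -1, +1) */
static float
unpack_snorm_1x16(uint16_t u)
{
   return clampf((float) (int16_t) u / 32767.0f, -1.0f, 1.0f);
}

/* Abort if any of the endpoint conversions is not exact. */
static void
check_exact_endpoints(void)
{
   assert(unpack_unorm_1x8(0x00) == 0.0f);
   assert(unpack_unorm_1x8(0xff) == 1.0f);
   assert(unpack_snorm_1x8(0x00) == 0.0f);
   assert(unpack_snorm_1x8(0x7f) == 1.0f);
   assert(unpack_snorm_1x8(0x80) == -1.0f);   /* -128/127 clamps to -1 */
   assert(unpack_snorm_1x8(0x81) == -1.0f);
   assert(unpack_unorm_1x16(0x0000) == 0.0f);
   assert(unpack_unorm_1x16(0xffff) == 1.0f);
   assert(unpack_snorm_1x16(0x0000) == 0.0f);
   assert(unpack_snorm_1x16(0x7fff) == 1.0f);
   assert(unpack_snorm_1x16(0x8000) == -1.0f); /* clamps to -1 */
   assert(unpack_snorm_1x16(0x8001) == -1.0f);
}
```

With a correctly rounded division every one of these is exact, because each quotient is either 0, +1, -1, or a value just below -1 that the clamp snaps to -1. The hand-written tests would confirm that the GPU path preserves the same property.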
[Mesa-dev] [PATCH 3/4] r600g: add async for staging buffer upload
From: Jerome Glisse Signed-off-by: Jerome Glisse --- src/gallium/drivers/r600/evergreen_hw_context.c | 44 ++ src/gallium/drivers/r600/evergreen_state.c | 197 src/gallium/drivers/r600/evergreend.h | 15 ++ src/gallium/drivers/r600/r600.h | 27 src/gallium/drivers/r600/r600_buffer.c | 25 ++- src/gallium/drivers/r600/r600_hw_context.c | 48 +- src/gallium/drivers/r600/r600_pipe.c| 6 +- src/gallium/drivers/r600/r600_pipe.h| 9 ++ src/gallium/drivers/r600/r600_state.c | 190 +++ src/gallium/drivers/r600/r600_state_common.c| 6 +- src/gallium/drivers/r600/r600_texture.c | 24 ++- src/gallium/drivers/r600/r600d.h| 15 ++ 12 files changed, 589 insertions(+), 17 deletions(-) diff --git a/src/gallium/drivers/r600/evergreen_hw_context.c b/src/gallium/drivers/r600/evergreen_hw_context.c index fa90c9a..1c30404 100644 --- a/src/gallium/drivers/r600/evergreen_hw_context.c +++ b/src/gallium/drivers/r600/evergreen_hw_context.c @@ -26,6 +26,7 @@ #include "r600_hw_context_priv.h" #include "evergreend.h" #include "util/u_memory.h" +#include "util/u_math.h" static const struct r600_reg cayman_config_reg_list[] = { {R_009100_SPI_CONFIG_CNTL, REG_FLAG_ENABLE_ALWAYS | REG_FLAG_FLUSH_CHANGE, 0}, @@ -238,3 +239,46 @@ void evergreen_set_streamout_enable(struct r600_context *ctx, unsigned buffer_en r600_write_context_reg(cs, R_028B94_VGT_STRMOUT_CONFIG, S_028B94_STREAMOUT_0_EN(0)); } } + +void evergreen_dma_copy(struct r600_context *rctx, + struct pipe_resource *dst, + struct pipe_resource *src, + unsigned long dst_offset, + unsigned long src_offset, + unsigned long size) +{ + struct radeon_winsys_cs *cs = rctx->rings.dma.cs; + unsigned i, ncopy, csize, sub_cmd, shift; + struct r600_resource *rdst = (struct r600_resource*)dst; + struct r600_resource *rsrc = (struct r600_resource*)src; + + /* make sure that the dma ring is only one active */ + rctx->rings.gfx.flush(rctx, RADEON_FLUSH_ASYNC); + + /* see if we use dword or byte copy */ + if (!(dst_offset & 0x3) && !(src_offset & 0x3) && !(size & 0x3)) { 
+ size >>= 2; + sub_cmd = 0x00; + shift = 2; + } else { + sub_cmd = 0x40; + shift = 0; + } + ncopy = (size / 0x000f) + !!(size % 0x000f); + + r600_need_dma_space(rctx, ncopy * 5); + for (i = 0; i < ncopy; i++) { + csize = size < 0x000f ? size : 0x000f; + /* emit reloc before writting cs so that cs is always in consistent state */ + r600_context_bo_reloc(rctx, &rctx->rings.dma, rsrc, RADEON_USAGE_READ); + r600_context_bo_reloc(rctx, &rctx->rings.dma, rdst, RADEON_USAGE_WRITE); + cs->buf[cs->cdw++] = DMA_PACKET(DMA_PACKET_COPY, sub_cmd, csize); + cs->buf[cs->cdw++] = dst_offset & 0x; + cs->buf[cs->cdw++] = src_offset & 0x; + cs->buf[cs->cdw++] = (dst_offset >> 32UL) & 0xff; + cs->buf[cs->cdw++] = (src_offset >> 32UL) & 0xff; + dst_offset += csize << shift; + src_offset += csize << shift; + size -= csize; + } +} diff --git a/src/gallium/drivers/r600/evergreen_state.c b/src/gallium/drivers/r600/evergreen_state.c index 86e2c81..f0511d8 100644 --- a/src/gallium/drivers/r600/evergreen_state.c +++ b/src/gallium/drivers/r600/evergreen_state.c @@ -30,6 +30,20 @@ #include "util/u_framebuffer.h" #include "util/u_dual_blend.h" #include "evergreen_compute.h" +#include "util/u_math.h" + +static INLINE unsigned evergreen_array_mode(unsigned mode) +{ + switch (mode) { + case RADEON_SURF_MODE_LINEAR_ALIGNED: return V_028C70_ARRAY_LINEAR_ALIGNED; + break; + case RADEON_SURF_MODE_1D: return V_028C70_ARRAY_1D_TILED_THIN1; + break; + case RADEON_SURF_MODE_2D: return V_028C70_ARRAY_2D_TILED_THIN1; + default: + case RADEON_SURF_MODE_LINEAR: return V_028C70_ARRAY_LINEAR_GENERAL; + } +} static uint32_t eg_num_banks(uint32_t nbanks) { @@ -3445,3 +3459,186 @@ void evergreen_update_db_shader_control(struct r600_context * rctx) rctx->db_misc_state.atom.dirty = true; } } + +static void evergreen_dma_copy_tile(struct r600_context *rctx, + struct pipe_resource *dst, + unsigned dst_level, + unsigned dst_x, + unsigned dst_y, + unsigned dst_z, + struct pipe_resource *src, + un
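The dword-versus-byte decision and the chunk count in evergreen_dma_copy() can be looked at in isolation. What follows is a minimal standalone sketch, not driver code: MAX_CHUNK_UNITS is a hypothetical per-packet limit picked for illustration (the constant in the archived patch is truncated), and plan_dma_copy(), bytes_covered() and struct dma_plan are names invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-packet limit, in copy units (dwords or bytes). */
#define MAX_CHUNK_UNITS 0xfffffu

struct dma_plan {
    unsigned shift;   /* 2 for dword copies, 0 for byte copies */
    uint64_t units;   /* total size expressed in copy units */
    uint64_t ncopy;   /* number of copy packets needed */
};

/* Pick dword copy when both offsets and the size are 4-byte aligned,
 * byte copy otherwise, then count the packets (rounding up). */
static struct dma_plan plan_dma_copy(uint64_t dst_offset, uint64_t src_offset,
                                     uint64_t size)
{
    struct dma_plan p;

    if (!(dst_offset & 0x3) && !(src_offset & 0x3) && !(size & 0x3)) {
        p.shift = 2;
        p.units = size >> 2;
    } else {
        p.shift = 0;
        p.units = size;
    }
    p.ncopy = p.units / MAX_CHUNK_UNITS + !!(p.units % MAX_CHUNK_UNITS);
    return p;
}

/* Walk the chunks the way the copy loop does and return how many bytes
 * the offsets would have advanced by the end. */
static uint64_t bytes_covered(struct dma_plan p)
{
    uint64_t remaining = p.units, advanced = 0, i;

    for (i = 0; i < p.ncopy; i++) {
        uint64_t csize = remaining < MAX_CHUNK_UNITS ? remaining : MAX_CHUNK_UNITS;
        advanced += csize << p.shift;
        remaining -= csize;
    }
    return advanced;
}
```

The point of the second helper is only to check that the per-chunk offset advance (csize << shift) adds up to the original byte size in both modes.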
[Mesa-dev] [PATCH 1/4] radeon/winsys: add dma ring support to winsys v3
From: Jerome Glisse Add ring support; you can create a cs for each ring. The DMA ring is a bit special regarding relocations, as you must emit as many relocations as there are uses of the buffer. v2: - Improved comment on relocation changes - Use a single thread to queue cs submission; this simplifies driver code without impacting performance. The rationale is that you have to wait for all previous submissions to complete, so there was never a case where two different threads could submit a command stream at the same time. This code just consolidates submission into a single thread per winsys. v3: - Do not use a semaphore for empty-queue signaling; use a cond var instead. This is because it's tricky to keep the number of semaphore wait and semaphore signal calls balanced (the number of cs in the stack would, for instance, make that number vary). Signed-off-by: Jerome Glisse --- src/gallium/drivers/r300/r300_context.c | 2 +- src/gallium/drivers/r600/r600_pipe.c | 2 +- src/gallium/drivers/radeonsi/radeonsi_pipe.c | 2 +- src/gallium/winsys/radeon/drm/radeon_drm_bo.c | 2 +- src/gallium/winsys/radeon/drm/radeon_drm_cs.c | 160 -- src/gallium/winsys/radeon/drm/radeon_drm_cs.h | 8 +- src/gallium/winsys/radeon/drm/radeon_drm_winsys.c | 87 src/gallium/winsys/radeon/drm/radeon_drm_winsys.h | 17 +++ src/gallium/winsys/radeon/drm/radeon_winsys.h | 20 ++- 9 files changed, 218 insertions(+), 82 deletions(-) diff --git a/src/gallium/drivers/r300/r300_context.c b/src/gallium/drivers/r300/r300_context.c index d8af13f..340a7f0 100644 --- a/src/gallium/drivers/r300/r300_context.c +++ b/src/gallium/drivers/r300/r300_context.c @@ -379,7 +379,7 @@ struct pipe_context* r300_create_context(struct pipe_screen* screen, sizeof(struct pipe_transfer), 64, UTIL_SLAB_SINGLETHREADED); -r300->cs = rws->cs_create(rws); +r300->cs = rws->cs_create(rws, RING_GFX); if (r300->cs == NULL) goto fail; diff --git a/src/gallium/drivers/r600/r600_pipe.c b/src/gallium/drivers/r600/r600_pipe.c index
fda5074..e4a35cf 100644 --- a/src/gallium/drivers/r600/r600_pipe.c +++ b/src/gallium/drivers/r600/r600_pipe.c @@ -289,7 +289,7 @@ static struct pipe_context *r600_create_context(struct pipe_screen *screen, void goto fail; } - rctx->cs = rctx->ws->cs_create(rctx->ws); + rctx->cs = rctx->ws->cs_create(rctx->ws, RING_GFX); rctx->ws->cs_set_flush_callback(rctx->cs, r600_flush_from_winsys, rctx); rctx->uploader = u_upload_create(&rctx->context, 1024 * 1024, 256, diff --git a/src/gallium/drivers/radeonsi/radeonsi_pipe.c b/src/gallium/drivers/radeonsi/radeonsi_pipe.c index cbb3bc4..5792fe2 100644 --- a/src/gallium/drivers/radeonsi/radeonsi_pipe.c +++ b/src/gallium/drivers/radeonsi/radeonsi_pipe.c @@ -222,7 +222,7 @@ static struct pipe_context *r600_create_context(struct pipe_screen *screen, void case TAHITI: si_init_state_functions(rctx); LIST_INITHEAD(&rctx->active_query_list); - rctx->cs = rctx->ws->cs_create(rctx->ws); + rctx->cs = rctx->ws->cs_create(rctx->ws, RING_GFX); rctx->max_db = 8; si_init_config(rctx); break; diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_bo.c b/src/gallium/winsys/radeon/drm/radeon_drm_bo.c index 897e962..6daafc3 100644 --- a/src/gallium/winsys/radeon/drm/radeon_drm_bo.c +++ b/src/gallium/winsys/radeon/drm/radeon_drm_bo.c @@ -453,7 +453,7 @@ static void *radeon_bo_map(struct radeon_winsys_cs_handle *buf, } else { /* Try to avoid busy-waiting in radeon_bo_wait. 
*/ if (p_atomic_read(&bo->num_active_ioctls)) -radeon_drm_cs_sync_flush(cs); +radeon_drm_cs_sync_flush(rcs); } radeon_bo_wait((struct pb_buffer*)bo, RADEON_USAGE_READWRITE); diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_cs.c b/src/gallium/winsys/radeon/drm/radeon_drm_cs.c index c5e7f1e..cab2704 100644 --- a/src/gallium/winsys/radeon/drm/radeon_drm_cs.c +++ b/src/gallium/winsys/radeon/drm/radeon_drm_cs.c @@ -90,6 +90,10 @@ #define RADEON_CS_RING_COMPUTE 1 #endif +#ifndef RADEON_CS_RING_DMA +#define RADEON_CS_RING_DMA 2 +#endif + #ifndef RADEON_CS_END_OF_FRAME #define RADEON_CS_END_OF_FRAME 0x04 #endif @@ -158,10 +162,8 @@ static void radeon_destroy_cs_context(struct radeon_cs_context *csc) FREE(csc->relocs); } -DEBUG_GET_ONCE_BOOL_OPTION(thread, "RADEON_THREAD", TRUE) -static PIPE_THREAD_ROUTINE(radeon_drm_cs_emit_ioctl, param); -static struct radeon_winsys_cs *radeon_drm_cs_create(struct radeon_winsys *rws) +static struct radeon_winsys_cs *radeon_drm_cs_create(struct radeon_winsys *rws
[Mesa-dev] r600g async dma support
So the design is mostly the same as before, with a few changes. First, I use only one thread to offload all cs submission, whether gfx or dma. The reason is that using one thread for gfx and one for dma leads to more complex synchronization with no gain, i.e. when submitting gfx you would need to make sure previous dma submissions are done, and vice versa. So in the end it's just not a good idea. Moreover, dma submission is a lot faster than gfx submission, as the dma cs are smaller and simpler for the kernel to parse. Second, I don't use a stack in r600g to keep track of cs submission ordering. Instead, any time r600g switches command stream, i.e. starts writing dma commands after writing gfx ones, we first asynchronously flush the gfx commands. This ensures that at any point in time the driver is only building commands for either the gfx or the dma ring, and everything is serialized from the driver's point of view. It simplifies the implementation, as there is no need to special-case corner cases such as query/event or streamout buffers. The last patch is a small optimization that decreases cpu overhead by not submitting gfx command streams that do not do anything. Everything has been tested on r7xx and evergreen and I witnessed no regressions. Evergreen can be improved by adding support for partial blit, but I am not sure it's worth it. Cheers, Jerome
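To make the serialization rule above concrete, here is a small model of "flush the gfx ring before writing dma commands". It is only a sketch under assumptions: struct ring_sketch, struct ctx_sketch, ring_flush() and ctx_write() are hypothetical names, and a counter stands in for the real asynchronous submission to the winsys thread.

```c
#include <assert.h>

enum ring_id_sketch { SKETCH_RING_GFX, SKETCH_RING_DMA, SKETCH_RING_COUNT };

struct ring_sketch {
    unsigned pending;   /* commands written since the last flush */
    unsigned flushes;   /* how many times this ring was flushed */
};

struct ctx_sketch {
    struct ring_sketch rings[SKETCH_RING_COUNT];
};

/* Stand-in for an asynchronous flush: hand the pending commands off
 * and start with an empty command stream again. */
static void ring_flush(struct ring_sketch *r)
{
    r->pending = 0;
    r->flushes++;
}

/* Before writing to one ring, flush every other ring, so that at any
 * time commands are being built for a single ring only. */
static void ctx_write(struct ctx_sketch *ctx, enum ring_id_sketch id,
                      unsigned ncmds)
{
    int i;

    for (i = 0; i < SKETCH_RING_COUNT; i++) {
        if (i != (int)id)
            ring_flush(&ctx->rings[i]);
    }
    ctx->rings[id].pending += ncmds;
}
```

Note that in this model every dma write flushes the gfx ring even when the gfx ring holds nothing new, which is the kind of overhead a "skip empty submissions" check is meant to avoid.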
[Mesa-dev] [PATCH] st/mesa: handle new GLSL IR enumerants in switch statements
To silence warnings about unhandled cases. --- src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 14 -- 1 files changed, 12 insertions(+), 2 deletions(-) diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp index 643a9bb..2c5ba41 100644 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp @@ -984,6 +984,7 @@ type_size(const struct glsl_type *type) * at link time. */ return 1; + case GLSL_TYPE_INTERFACE: case GLSL_TYPE_VOID: case GLSL_TYPE_ERROR: assert(!"Invalid type in type_size"); @@ -1934,10 +1935,19 @@ glsl_to_tgsi_visitor::visit(ir_expression *ir) } break; } + case ir_unop_pack_snorm_2x16: + case ir_unop_pack_unorm_2x16: + case ir_unop_pack_half_2x16: + case ir_unop_unpack_snorm_2x16: + case ir_unop_unpack_unorm_2x16: + case ir_unop_unpack_half_2x16: + case ir_unop_unpack_half_2x16_split_x: + case ir_unop_unpack_half_2x16_split_y: + case ir_binop_pack_half_2x16_split: case ir_quadop_vector: - /* This operation should have already been handled. + /* This operation is not supported, or should have already been handled. */ - assert(!"Should not get here."); + assert(!"Invalid ir opcode in glsl_to_tgsi_visitor::visit()"); break; } -- 1.7.3.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] intel: Use a CPU map of the batch on LLC-sharing architectures.
Kenneth Graunke writes: > On 01/20/2013 02:59 PM, Eric Anholt wrote: >> Before, we were keeping a CPU-only buffer to accumulate the batchbuffer in, >> which was an improvement over mapping the batch through the GTT directly >> (since any readback or other failure to stream through write combining >> correctly would hurt). However, on LLC-sharing architectures we can do >> better >> by mapping the batch directly, which reduces the cache footprint of the >> application since we no longer have this extra copy of a batchbuffer around. >> >> Improves performance of GLBenchmark 2.1 offscreen on IVB by 3.5% +/- 0.4% >> (n=21). Improves Lightsmark performance by 1.1 +/- 0.1% (n=76). Improves >> cairo-gl performance by 1.9% +/- 1.4% (n=57). >> >> No statistically significant difference in GLB2.1 on SNB (n=37). Improves >> cairo-gl performance by 2.1% +/- 0.1% (n=278). > > Looks good to me. Have you tested this on a non-LLC machine? Not in a long time. It shouldn't affect performance, since they get the same behavior as before.
Re: [Mesa-dev] [PATCH] android: fix stride to be bytes instead of pixels
Tapani Pälli writes: > commit 60894edeef973e86a73067276f658b72f84271b6 changed the way dri2 > buffer pitch is interpreted in intel driver createImageFromName > implementation, caller must set pitch in bytes, not pixels. Oops, I didn't mean to change the behavior of the interface. It looks like dri2_create_image_khr_pixmap() is also passing in a number of pixels. I can't tell about dri2_create_image_mesa_drm_buffer(). Since it's an interface breakage, I think we should fix it on the Intel driver side, unless krh agrees that this was the intended interface all along and that we don't care about new libGL vs old Intel drivers in this particular case.
[Mesa-dev] [PATCH 4/4] r600g: only emit gfx cmd when there is actual work in it
From: Jerome Glisse Signed-off-by: Jerome Glisse --- src/gallium/drivers/r600/evergreen_compute.c | 2 ++ src/gallium/drivers/r600/r600_hw_context.c | 1 + src/gallium/drivers/r600/r600_pipe.c | 6 ++ src/gallium/drivers/r600/r600_pipe.h | 1 + src/gallium/drivers/r600/r600_query.c| 2 ++ src/gallium/drivers/r600/r600_state_common.c | 1 + 6 files changed, 13 insertions(+) diff --git a/src/gallium/drivers/r600/evergreen_compute.c b/src/gallium/drivers/r600/evergreen_compute.c index f4a7905..977595e 100644 --- a/src/gallium/drivers/r600/evergreen_compute.c +++ b/src/gallium/drivers/r600/evergreen_compute.c @@ -308,6 +308,8 @@ static void evergreen_emit_direct_dispatch( r600_write_value(cs, grid_layout[2]); /* VGT_DISPATCH_INITIATOR = COMPUTE_SHADER_EN */ r600_write_value(cs, 1); + + rctx->rings.gfx.cdraw++; } static void compute_emit_cs(struct r600_context *ctx, const uint *block_layout, diff --git a/src/gallium/drivers/r600/r600_hw_context.c b/src/gallium/drivers/r600/r600_hw_context.c index d7518a5..511a276 100644 --- a/src/gallium/drivers/r600/r600_hw_context.c +++ b/src/gallium/drivers/r600/r600_hw_context.c @@ -1122,6 +1122,7 @@ void r600_cp_dma_copy_buffer(struct r600_context *rctx, size -= byte_count; src_offset += byte_count; dst_offset += byte_count; + rctx->rings.gfx.cdraw++; } } diff --git a/src/gallium/drivers/r600/r600_pipe.c b/src/gallium/drivers/r600/r600_pipe.c index 6767412..af08cff 100644 --- a/src/gallium/drivers/r600/r600_pipe.c +++ b/src/gallium/drivers/r600/r600_pipe.c @@ -120,6 +120,10 @@ static void r600_flush(struct pipe_context *ctx, unsigned flags) struct pipe_query *render_cond = NULL; unsigned render_cond_mode = 0; + if (!rctx->rings.gfx.cdraw) { + return; + } + rctx->rings.gfx.flushing = true; /* Disable render condition. 
*/ if (rctx->current_render_cond) { @@ -130,6 +134,7 @@ static void r600_flush(struct pipe_context *ctx, unsigned flags) r600_context_flush(rctx, flags); rctx->rings.gfx.flushing = false; + rctx->rings.gfx.cdraw = 0; r600_begin_new_cs(rctx); /* Re-enable render condition. */ @@ -387,6 +392,7 @@ static struct pipe_context *r600_create_context(struct pipe_screen *screen, void goto fail; } + rctx->rings.gfx.cdraw = 0; rctx->rings.gfx.cs = rctx->ws->cs_create(rctx->ws, RING_GFX); rctx->rings.gfx.flush = r600_flush_gfx_ring; rctx->ws->cs_set_flush_callback(rctx->rings.gfx.cs, r600_flush_from_winsys, rctx); diff --git a/src/gallium/drivers/r600/r600_pipe.h b/src/gallium/drivers/r600/r600_pipe.h index 31dcd05..5c72756 100644 --- a/src/gallium/drivers/r600/r600_pipe.h +++ b/src/gallium/drivers/r600/r600_pipe.h @@ -418,6 +418,7 @@ struct r600_fetch_shader { struct r600_ring { struct radeon_winsys_cs *cs; boolflushing; + unsignedcdraw; void (*flush)(void *ctx, unsigned flags); }; diff --git a/src/gallium/drivers/r600/r600_query.c b/src/gallium/drivers/r600/r600_query.c index 0335189..7916f2d 100644 --- a/src/gallium/drivers/r600/r600_query.c +++ b/src/gallium/drivers/r600/r600_query.c @@ -149,6 +149,7 @@ static void r600_emit_query_begin(struct r600_context *ctx, struct r600_query *q cs->buf[cs->cdw++] = (3 << 29) | ((va >> 32UL) & 0xFF); cs->buf[cs->cdw++] = 0; cs->buf[cs->cdw++] = 0; + ctx->rings.gfx.cdraw++; break; default: assert(0); @@ -201,6 +202,7 @@ static void r600_emit_query_end(struct r600_context *ctx, struct r600_query *que cs->buf[cs->cdw++] = (3 << 29) | ((va >> 32UL) & 0xFF); cs->buf[cs->cdw++] = 0; cs->buf[cs->cdw++] = 0; + ctx->rings.gfx.cdraw++; break; default: assert(0); diff --git a/src/gallium/drivers/r600/r600_state_common.c b/src/gallium/drivers/r600/r600_state_common.c index b547d64..d4616ce 100644 --- a/src/gallium/drivers/r600/r600_state_common.c +++ b/src/gallium/drivers/r600/r600_state_common.c @@ -1439,6 +1439,7 @@ static void 
r600_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info r600_trace_emit(rctx); } #endif + rctx->rings.gfx.cdraw++; /* Set the depth buffer as dirty. */ if (rctx->framebuffer.state.zsbuf) { -- 1.7.11.7 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
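The idea behind the cdraw counter can be modelled on its own: the command stream may already contain state-setup dwords, so its length alone does not say whether anything useful was queued, and a separate counter of "work" packets decides whether a flush is worth submitting. The following is a minimal sketch under assumptions, not the driver's code: struct gfx_ring_sketch, PREAMBLE_DW and the helper functions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical size of the state preamble written into every new cs. */
#define PREAMBLE_DW 4u

struct gfx_ring_sketch {
    unsigned cdw;        /* dwords currently in the command stream */
    unsigned cdraw;      /* packets emitted that do actual work */
    unsigned submitted;  /* number of command streams submitted */
};

/* Start a fresh command stream: preamble only, no work queued yet. */
static void begin_new_cs(struct gfx_ring_sketch *r)
{
    r->cdw = PREAMBLE_DW;
    r->cdraw = 0;
}

/* Any draw, dispatch, query or copy packet counts as work. */
static void emit_draw(struct gfx_ring_sketch *r)
{
    r->cdw += 3;
    r->cdraw++;
}

/* Submit only when the stream contains work; returns whether a
 * submission actually happened. */
static bool flush_gfx(struct gfx_ring_sketch *r)
{
    if (!r->cdraw)
        return false;
    r->submitted++;
    begin_new_cs(r);
    return true;
}
```

In this model a flush right after begin_new_cs() is a no-op even though cdw is non-zero, which is the case the counter exists to catch.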
[Mesa-dev] [PATCH] mesa: implement GL_ARB_texture_buffer_range v5
v2: Record texObj.BufferSize as -1 in TexBuffer(non-Range) instead of the buffer's current size so we know we always have to use the full size of the buffer object (i.e. even if it changes without the user calling TexBuffer again) for the texture. Clarify invalid offset alignment error message. v3: Use extra GL_CORE-only section in get_hash_params.py for TEXTURE_BUFFER_OFFSET_ALIGNMENT. v4: Remove unnecessary check for profile in _mesa_TexBufferRange. Add check for extension enable in get_tex_level_parameter_buffer. v5: Fix position in gl_API.xml. Add comment about meaning of BufferSize == -1. --- src/mapi/glapi/gen/ARB_texture_buffer_range.xml | 22 ++ src/mapi/glapi/gen/Makefile.am |1 + src/mapi/glapi/gen/gl_API.xml |4 + src/mesa/main/context.c |1 + src/mesa/main/extensions.c |1 + src/mesa/main/get.c |1 + src/mesa/main/get_hash_params.py|6 ++ src/mesa/main/mtypes.h |6 ++ src/mesa/main/teximage.c| 84 ++- src/mesa/main/teximage.h|4 + src/mesa/main/texparam.c| 12 +++ 11 files changed, 125 insertions(+), 17 deletions(-) create mode 100644 src/mapi/glapi/gen/ARB_texture_buffer_range.xml diff --git a/src/mapi/glapi/gen/ARB_texture_buffer_range.xml b/src/mapi/glapi/gen/ARB_texture_buffer_range.xml new file mode 100644 index 000..2176c08 --- /dev/null +++ b/src/mapi/glapi/gen/ARB_texture_buffer_range.xml @@ -0,0 +1,22 @@ + + + + + + + + + + + + + + + + + + + + + + diff --git a/src/mapi/glapi/gen/Makefile.am b/src/mapi/glapi/gen/Makefile.am index f869d28..4d51bbc 100644 --- a/src/mapi/glapi/gen/Makefile.am +++ b/src/mapi/glapi/gen/Makefile.am @@ -108,6 +108,7 @@ API_XML = \ ARB_seamless_cube_map.xml \ ARB_sync.xml \ ARB_texture_buffer_object.xml \ + ARB_texture_buffer_range.xml \ ARB_texture_compression_rgtc.xml \ ARB_texture_float.xml \ ARB_texture_rg.xml \ diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml index 404ccea..4cbd724 100644 --- a/src/mapi/glapi/gen/gl_API.xml +++ b/src/mapi/glapi/gen/gl_API.xml @@ -8316,6 +8316,10 @@ 
http://www.w3.org/2001/XInclude"/> + + +http://www.w3.org/2001/XInclude"/> + diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c index 5e9e539..5058c07 100644 --- a/src/mesa/main/context.c +++ b/src/mesa/main/context.c @@ -564,6 +564,7 @@ _mesa_init_constants(struct gl_context *ctx) ctx->Const.MaxTextureMaxAnisotropy = MAX_TEXTURE_MAX_ANISOTROPY; ctx->Const.MaxTextureLodBias = MAX_TEXTURE_LOD_BIAS; ctx->Const.MaxTextureBufferSize = 65536; + ctx->Const.TextureBufferOffsetAlignment = 1; ctx->Const.MaxArrayLockSize = MAX_ARRAY_LOCK_SIZE; ctx->Const.SubPixelBits = SUB_PIXEL_BITS; ctx->Const.MinPointSize = MIN_POINT_SIZE; diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c index 5d01ac8..207572f 100644 --- a/src/mesa/main/extensions.c +++ b/src/mesa/main/extensions.c @@ -130,6 +130,7 @@ static const struct extension extension_table[] = { { "GL_ARB_texture_border_clamp", o(ARB_texture_border_clamp),GLL,2000 }, { "GL_ARB_texture_buffer_object", o(ARB_texture_buffer_object), GLC,2008 }, { "GL_ARB_texture_buffer_object_rgb32", o(ARB_texture_buffer_object_rgb32), GLC,2009 }, + { "GL_ARB_texture_buffer_range", o(ARB_texture_buffer_range),GLC,2012 }, { "GL_ARB_texture_compression", o(dummy_true), GLL,2000 }, { "GL_ARB_texture_compression_rgtc", o(ARB_texture_compression_rgtc),GL, 2004 }, { "GL_ARB_texture_cube_map",o(ARB_texture_cube_map), GLL,1999 }, diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c index 5f4e2fa..da1e01c 100644 --- a/src/mesa/main/get.c +++ b/src/mesa/main/get.c @@ -353,6 +353,7 @@ EXTRA_EXT(ARB_uniform_buffer_object); EXTRA_EXT(ARB_timer_query); EXTRA_EXT(ARB_map_buffer_alignment); EXTRA_EXT(ARB_texture_cube_map_array); +EXTRA_EXT(ARB_texture_buffer_range); static const int extra_NV_primitive_restart[] = { diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py index 26a722a..b6bed80 100644 --- a/src/mesa/main/get_hash_params.py +++ b/src/mesa/main/get_hash_params.py @@ -701,6 +701,12 @@ descriptor=[ 
# GL_ARB_texture_cube_map_array [ "TEXTURE_BINDING_CUBE_MAP_ARRAY_ARB", "LOC_CUSTOM, TYPE_INT, TEXTURE_CUBE_ARRAY_INDEX, extra_ARB_texture_cube_map_array" ], +]}, + +# Enums restricted to OpenGL Core profile +{ "apis"
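The BufferSize == -1 convention from the v2 and v5 notes, together with the offset alignment check, can be sketched roughly as below. This is an illustration under assumptions rather than Mesa's implementation: struct texbuf_sketch, effective_size() and range_is_valid() are hypothetical names, and the validity rules are written as I understand the extension (non-negative offset, positive size, range inside the buffer, offset a multiple of the alignment).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct texbuf_sketch {
    int64_t offset;  /* start of the range within the buffer object */
    int64_t size;    /* -1 means "always use the buffer's current size" */
};

/* Resolve the size at use time, so a texture attached without an
 * explicit range follows later changes to the buffer object's size. */
static int64_t effective_size(const struct texbuf_sketch *t, int64_t buffer_size)
{
    return t->size == -1 ? buffer_size : t->size;
}

/* Check an explicit range against the buffer size and the required
 * offset alignment. */
static bool range_is_valid(int64_t offset, int64_t size,
                           int64_t buffer_size, int64_t alignment)
{
    if (offset < 0 || size <= 0)
        return false;
    if (offset > buffer_size - size)   /* written this way to avoid overflow */
        return false;
    if (alignment <= 0 || offset % alignment != 0)
        return false;
    return true;
}
```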
Re: [Mesa-dev] [PATCH 1/2] r600g: Don't build llvm_wrapper.cpp when we aren't using LLVM
On Fri, Jan 25, 2013 at 10:43 AM, Tom Stellard wrote: > From: Tom Stellard > > We were using the NEED_RADEON_GALLIUM conditional to decide whether or not > to build llvm_wrapper.cpp, which is required for using the LLVM backend. > llvm_wrapper.cpp needs to be linked against the LLVM IPO library > and this library is only added to LLVM_LIBS if either opencl or the > r600-llvm-compiler is enabled. > > The NEED_RADEON_GALLIUM conditional is set to true when enabling the > radeonsi driver, so if the radeonsi and r600 drivers are enabled without > also enabling opencl or r600-llvm-compiler, llvm_wrapper.cpp will be > built, but the IPO library won't be added to LLVM_LIBS. This was > causing unresolved symbol errors when building with this configuration. > > https://bugs.freedesktop.org/show_bug.cgi?id=59831 Confirmed this fixes the issue. For the series: Tested-by: Alex Deucher > --- > src/gallium/drivers/r600/Makefile.am | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/src/gallium/drivers/r600/Makefile.am > b/src/gallium/drivers/r600/Makefile.am > index 995261b..6de7e0f 100644 > --- a/src/gallium/drivers/r600/Makefile.am > +++ b/src/gallium/drivers/r600/Makefile.am > @@ -13,7 +13,8 @@ AM_CFLAGS = \ > libr600_la_SOURCES = \ > $(C_SOURCES) > > -if NEED_RADEON_GALLIUM > +if USE_R600_LLVM_COMPILER > +if HAVE_GALLIUM_COMPUTE > > libr600_la_SOURCES += \ > $(LLVM_C_SOURCES) \ > @@ -28,6 +29,7 @@ AM_CFLAGS += \ > AM_CXXFLAGS= \ > $(LLVM_CXXFLAGS) > endif > +endif > > if USE_R600_LLVM_COMPILER > AM_CFLAGS += \ > -- > 1.7.11.4
[Mesa-dev] [PATCH] r600g/llvm: Add dummy export for vs output
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=59588 --- src/gallium/drivers/r600/r600_llvm.c | 22 -- 1 file changed, 20 insertions(+), 2 deletions(-) diff --git a/src/gallium/drivers/r600/r600_llvm.c b/src/gallium/drivers/r600/r600_llvm.c index 32b8e56..913dccc 100644 --- a/src/gallium/drivers/r600/r600_llvm.c +++ b/src/gallium/drivers/r600/r600_llvm.c @@ -374,9 +374,27 @@ static void llvm_emit_epilogue(struct lp_build_tgsi_context * bld_base) } } } + // Add dummy exports + if (ctx->type == TGSI_PROCESSOR_VERTEX) { + if (!next_param) { + lp_build_intrinsic_unary(base->gallivm->builder, "llvm.R600.store.dummy", + LLVMVoidTypeInContext(base->gallivm->context), + lp_build_const_int32(base->gallivm, V_SQ_CF_ALLOC_EXPORT_WORD0_SQ_EXPORT_PARAM)); + } + if (!(next_pos-60)) { + lp_build_intrinsic_unary(base->gallivm->builder, "llvm.R600.store.dummy", + LLVMVoidTypeInContext(base->gallivm->context), + lp_build_const_int32(base->gallivm, V_SQ_CF_ALLOC_EXPORT_WORD0_SQ_EXPORT_POS)); + } + } + if (ctx->type == TGSI_PROCESSOR_FRAGMENT) { + if (!has_color) { + lp_build_intrinsic_unary(base->gallivm->builder, "llvm.R600.store.dummy", + LLVMVoidTypeInContext(base->gallivm->context), + lp_build_const_int32(base->gallivm, V_SQ_CF_ALLOC_EXPORT_WORD0_SQ_EXPORT_PIXEL)); + } + } - if (!has_color && ctx->type == TGSI_PROCESSOR_FRAGMENT) - lp_build_intrinsic(base->gallivm->builder, "llvm.R600.store.pixel.dummy", LLVMVoidTypeInContext(base->gallivm->context), 0, 0); } static void llvm_emit_tex( -- 1.8.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] R600: Make store_dummy intrinsic more general by passing export type
--- lib/Target/R600/R600Instructions.td | 9 +++-- lib/Target/R600/R600Intrinsics.td | 4 ++-- 2 files changed, 9 insertions(+), 4 deletions(-) diff --git a/lib/Target/R600/R600Instructions.td b/lib/Target/R600/R600Instructions.td index 13293b6..3537906 100644 --- a/lib/Target/R600/R600Instructions.td +++ b/lib/Target/R600/R600Instructions.td @@ -608,9 +608,14 @@ multiclass ExportPattern cf_inst> { 0, 61, 7, 0, 7, 7, cf_inst, 0) >; - def : Pat<(int_R600_store_pixel_dummy), + def : Pat<(int_R600_store_dummy (i32 imm:$type)), (ExportInst -(v4f32 (IMPLICIT_DEF)), 0, 0, 7, 7, 7, 7, cf_inst, 0) +(v4f32 (IMPLICIT_DEF)), imm:$type, 0, 7, 7, 7, 7, cf_inst, 0) + >; + + def : Pat<(int_R600_store_dummy 1), +(ExportInst +(v4f32 (IMPLICIT_DEF)), 1, 60, 7, 7, 7, 7, cf_inst, 0) >; def : Pat<(EXPORT (v4f32 R600_Reg128:$src), (i32 imm:$base), (i32 imm:$type), diff --git a/lib/Target/R600/R600Intrinsics.td b/lib/Target/R600/R600Intrinsics.td index 4c652a6..b5e4f1e 100644 --- a/lib/Target/R600/R600Intrinsics.td +++ b/lib/Target/R600/R600Intrinsics.td @@ -24,6 +24,6 @@ let TargetPrefix = "R600", isTarget = 1 in { Intrinsic<[], [llvm_float_ty], []>; def int_R600_store_pixel_stencil : Intrinsic<[], [llvm_float_ty], []>; - def int_R600_store_pixel_dummy : - Intrinsic<[], [], []>; + def int_R600_store_dummy : + Intrinsic<[], [llvm_i32_ty], []>; } -- 1.8.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 59831] undefined symbol _ZN4llvm19createGlobalDCEPassEv in r600g
https://bugs.freedesktop.org/show_bug.cgi?id=59831 --- Comment #7 from Tom Stellard --- (In reply to comment #5) > It was false to remove libr600_la_LDFLAGS in this patch: > http://cgit.freedesktop.org/mesa/mesa/commit/ > ?id=69d639ba8b3cfd95cfbb12b861dbe2eda53f2e25 > > And please change all Makefile.am to generate LLVM related LIBADDs this way > to avoid stupid dependencies if LLVM was compiled with the better cmake > build system which creates shared instead of static libs / one big shared > lib and can save memory this way. Generating different shared libraries depending on the build system used is a bug in LLVM. However, until it is fixed we need to support both build systems even if one is better. Adding llvm libraries in makefiles using llvm-config will not work when we are linking against shared libraries generated by an autotools build of LLVM, because then we will be linking against shared and static libraries at the same time. -- You are receiving this mail because: You are the QA Contact for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 0/8] ARB_shading_language_packing
On 24 January 2013 19:44, Matt Turner wrote: > Following this email are eight patches that add the 4x8 pack/unpack > operations that are the difference between what GLSL ES 3.0 and > ARB_shading_language_packing require. > > They require Chad's gles3-glsl-packing series and are available at > > http://cgit.freedesktop.org/~mattst88/mesa/log/?h=ARB_shading_language_packing > > I've also added testing support on top of Chad's piglit patch. The > {vs,fs}-unpackUnorm4x8 tests currently fail, and I've been unable to > spot why. > I had minor comments on patches 4/8 and 5/8. The remainder is: Reviewed-by: Paul Berry I didn't spot anything that would explain the failure in unpackUnorm4x8 tests. I'll go have a look at your piglit tests now, and if I don't find anything there either, I'll fire up the simulator and see if I can see what's going wrong. > > Please give it a look. It'd be nice to get this into 9.1. > > Thanks, > Matt
[Mesa-dev] [Bug 59831] undefined symbol _ZN4llvm19createGlobalDCEPassEv in r600g
https://bugs.freedesktop.org/show_bug.cgi?id=59831 --- Comment #6 from Tom Stellard --- This should be fixed by this patch: http://lists.freedesktop.org/archives/mesa-dev/2013-January/033482.html -- You are receiving this mail because: You are the QA Contact for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 5/8] glsl: Add support for lowering 4x8 pack/unpack operations
On 24 January 2013 19:47, Matt Turner wrote: > Lower them to arithmetic and bit manipulation expressions. > --- > src/glsl/ir_optimization.h |6 + > src/glsl/lower_packing_builtins.cpp | 279 > +++ > 2 files changed, 285 insertions(+), 0 deletions(-) > > diff --git a/src/glsl/ir_optimization.h b/src/glsl/ir_optimization.h > index ac90b87..8f33018 100644 > --- a/src/glsl/ir_optimization.h > +++ b/src/glsl/ir_optimization.h > @@ -54,6 +54,12 @@ enum lower_packing_builtins_op { > > LOWER_PACK_HALF_2x16_TO_SPLIT= 0x0040, > LOWER_UNPACK_HALF_2x16_TO_SPLIT = 0x0080, > + > + LOWER_PACK_SNORM_4x8 = 0x0100, > + LOWER_UNPACK_SNORM_4x8 = 0x0200, > + > + LOWER_PACK_UNORM_4x8 = 0x0400, > + LOWER_UNPACK_UNORM_4x8 = 0x0800, > }; > > bool do_common_optimization(exec_list *ir, bool linked, > diff --git a/src/glsl/lower_packing_builtins.cpp > b/src/glsl/lower_packing_builtins.cpp > index 49176cc..aa6765f 100644 > --- a/src/glsl/lower_packing_builtins.cpp > +++ b/src/glsl/lower_packing_builtins.cpp > @@ -85,9 +85,15 @@ public: >case LOWER_PACK_SNORM_2x16: > *rvalue = lower_pack_snorm_2x16(op0); > break; > + case LOWER_PACK_SNORM_4x8: > + *rvalue = lower_pack_snorm_4x8(op0); > + break; >case LOWER_PACK_UNORM_2x16: > *rvalue = lower_pack_unorm_2x16(op0); > break; > + case LOWER_PACK_UNORM_4x8: > + *rvalue = lower_pack_unorm_4x8(op0); > + break; >case LOWER_PACK_HALF_2x16: > *rvalue = lower_pack_half_2x16(op0); > break; > @@ -97,9 +103,15 @@ public: >case LOWER_UNPACK_SNORM_2x16: > *rvalue = lower_unpack_snorm_2x16(op0); > break; > + case LOWER_UNPACK_SNORM_4x8: > + *rvalue = lower_unpack_snorm_4x8(op0); > + break; >case LOWER_UNPACK_UNORM_2x16: > *rvalue = lower_unpack_unorm_2x16(op0); > break; > + case LOWER_UNPACK_UNORM_4x8: > + *rvalue = lower_unpack_unorm_4x8(op0); > + break; >case LOWER_UNPACK_HALF_2x16: > *rvalue = lower_unpack_half_2x16(op0); > break; > @@ -137,18 +149,30 @@ private: >case ir_unop_pack_snorm_2x16: > result = op_mask & LOWER_PACK_SNORM_2x16; > break; > + case 
ir_unop_pack_snorm_4x8: > + result = op_mask & LOWER_PACK_SNORM_4x8; > + break; >case ir_unop_pack_unorm_2x16: > result = op_mask & LOWER_PACK_UNORM_2x16; > break; > + case ir_unop_pack_unorm_4x8: > + result = op_mask & LOWER_PACK_UNORM_4x8; > + break; >case ir_unop_pack_half_2x16: > result = op_mask & (LOWER_PACK_HALF_2x16 | > LOWER_PACK_HALF_2x16_TO_SPLIT); > break; >case ir_unop_unpack_snorm_2x16: > result = op_mask & LOWER_UNPACK_SNORM_2x16; > break; > + case ir_unop_unpack_snorm_4x8: > + result = op_mask & LOWER_UNPACK_SNORM_4x8; > + break; >case ir_unop_unpack_unorm_2x16: > result = op_mask & LOWER_UNPACK_UNORM_2x16; > break; > + case ir_unop_unpack_unorm_4x8: > + result = op_mask & LOWER_UNPACK_UNORM_4x8; > + break; >case ir_unop_unpack_half_2x16: > result = op_mask & (LOWER_UNPACK_HALF_2x16 | > LOWER_UNPACK_HALF_2x16_TO_SPLIT); > break; > @@ -214,6 +238,30 @@ private: > } > > /** > +* \brief Pack four uint8's into a single uint32. > +* > +* Interpret the given uvec4 as a uint32 quad. Pack the quad into a > uint32 > +* where the least significant bits specify the first element of the > quad. > +* Return the uint32. > +*/ > + ir_rvalue* > + pack_uvec4_to_uint(ir_rvalue *uvec4_rval) > + { > + assert(uvec4_rval->type == glsl_type::uvec4_type); > + > + /* uvec4 u = UVEC4_RVAL; */ > + ir_variable *u = factory.make_temp(glsl_type::uvec4_type, > + "tmp_pack_uvec4_to_uint"); > + factory.emit(assign(u, uvec4_rval)); > Rather than do four scalar bit_and(..., constant(0xffu)) instructions below, how about changing the above line to: factory.emit(assign(u, bit_and(uvec4_rval, constant(0xffu)))); That way we take advantage of vector processing in the GPU to do all four bit_ands at once.
With that fixed (as well as the copy/paste errors Ian spotted), this patch is: Reviewed-by: Paul Berry > + > + /* return ((u.w 0xff) << 24) | ((u.z & 0xff) << 16) | ((u.y & 0xff) > << 8) | (u.x & 0xff); */ > + return bit_or(bit_or(lshift(bit_and(swizzle_w(u), constant(0xffu)), > constant(24u)), > + lshift(bit_and(swizzle_z(u), constant(0xffu)), > constant(16u))), > +bit_or(lshift(bit_and(swizzle_y(u), constant(0xffu)), > constant(8u)), > + bit_and(swizzle_
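For reference, the value that pack_uvec4_to_uint() is meant to produce can be written as plain scalar C. This is only a CPU-side model of the lowered expression, with hypothetical names (pack_uvec4_to_uint_ref, unpack_uint_to_uvec4_ref); it says nothing about how the IR is emitted, only what result to expect when comparing against test output.

```c
#include <assert.h>
#include <stdint.h>

/* Pack the low 8 bits of each of four components into one 32-bit word.
 * Component 0 (x) lands in the least significant byte. */
static uint32_t pack_uvec4_to_uint_ref(const uint32_t u[4])
{
    return ((u[3] & 0xffu) << 24) |
           ((u[2] & 0xffu) << 16) |
           ((u[1] & 0xffu) << 8)  |
            (u[0] & 0xffu);
}

/* Inverse operation: split a 32-bit word into four 8-bit components. */
static void unpack_uint_to_uvec4_ref(uint32_t packed, uint32_t out[4])
{
    out[0] = packed & 0xffu;
    out[1] = (packed >> 8) & 0xffu;
    out[2] = (packed >> 16) & 0xffu;
    out[3] = (packed >> 24) & 0xffu;
}
```

Whether the 0xff mask is applied per component or once across the whole vector, as suggested in the review, the packed result is the same; only the number of instructions differs.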
[Mesa-dev] [PATCH 2/2] configure.ac: Add components to LLVM_COMPONENTS when using llvm shared libs
From: Tom Stellard This is required when LLVM is built with CMake, which creates one shared library for each component. --- configure.ac | 18 -- 1 file changed, 8 insertions(+), 10 deletions(-) diff --git a/configure.ac b/configure.ac index ccf95c5..90085de 100644 --- a/configure.ac +++ b/configure.ac @@ -1662,16 +1662,14 @@ if test "x$enable_gallium_llvm" = xyes; then if test "x$LLVM_CONFIG" != xno; then LLVM_VERSION=`$LLVM_CONFIG --version | sed 's/svn.*//g'` LLVM_VERSION_INT=`echo $LLVM_VERSION | sed -e 's/\([[0-9]]\)\.\([[0-9]]\)/\10\2/g'` -if test "x$with_llvm_shared_libs" != xyes; then -LLVM_COMPONENTS="engine bitwriter" -if $LLVM_CONFIG --components | grep -q '\'; then -LLVM_COMPONENTS="${LLVM_COMPONENTS} mcjit" -fi +LLVM_COMPONENTS="engine bitwriter" +if $LLVM_CONFIG --components | grep -q '\'; then +LLVM_COMPONENTS="${LLVM_COMPONENTS} mcjit" +fi -if test "x$enable_opencl" = xyes; then -LLVM_COMPONENTS="${LLVM_COMPONENTS} ipo linker instrumentation" -fi - fi +if test "x$enable_opencl" = xyes; then +LLVM_COMPONENTS="${LLVM_COMPONENTS} ipo linker instrumentation" +fi LLVM_LDFLAGS=`$LLVM_CONFIG --ldflags` LLVM_BINDIR=`$LLVM_CONFIG --bindir` LLVM_CPPFLAGS=`strip_unwanted_llvm_flags "$LLVM_CONFIG --cppflags"` @@ -1839,7 +1837,7 @@ if test "x$with_gallium_drivers" != x; then if test "x$enable_r600_llvm" = xyes; then USE_R600_LLVM_COMPILER=yes; fi -if test "x$enable_opencl" = xyes -a "x$with_llvm_shared_libs" = xno; then +if test "x$enable_opencl" = xyes; then LLVM_COMPONENTS="${LLVM_COMPONENTS} bitreader asmparser" fi gallium_check_st "radeon/drm" "dri-r600" "xorg-r600" "" "xvmc-r600" "vdpau-r600" -- 1.7.11.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/2] r600g: Don't build llvm_wrapper.cpp when we aren't using LLVM
From: Tom Stellard

We were using the NEED_RADEON_GALLIUM conditional to decide whether or not to build llvm_wrapper.cpp, which is required for using the LLVM backend. llvm_wrapper.cpp needs to be linked against the LLVM IPO library, and this library is only added to LLVM_LIBS if either opencl or the r600-llvm-compiler is enabled.

The NEED_RADEON_GALLIUM conditional is set to true when enabling the radeonsi driver, so if the radeonsi and r600 drivers are enabled without also enabling opencl or r600-llvm-compiler, llvm_wrapper.cpp will be built, but the IPO library won't be added to LLVM_LIBS. This was causing unresolved symbol errors when building with this configuration.

https://bugs.freedesktop.org/show_bug.cgi?id=59831
---
 src/gallium/drivers/r600/Makefile.am | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/r600/Makefile.am b/src/gallium/drivers/r600/Makefile.am
index 995261b..6de7e0f 100644
--- a/src/gallium/drivers/r600/Makefile.am
+++ b/src/gallium/drivers/r600/Makefile.am
@@ -13,7 +13,8 @@ AM_CFLAGS = \
 libr600_la_SOURCES = \
 	$(C_SOURCES)
 
-if NEED_RADEON_GALLIUM
+if USE_R600_LLVM_COMPILER
+if HAVE_GALLIUM_COMPUTE
 
 libr600_la_SOURCES += \
 	$(LLVM_C_SOURCES) \
@@ -28,6 +29,7 @@ AM_CFLAGS += \
 AM_CXXFLAGS= \
 	$(LLVM_CXXFLAGS)
 endif
+endif
 
 if USE_R600_LLVM_COMPILER
 AM_CFLAGS += \
-- 
1.7.11.4
Re: [Mesa-dev] [PATCH 4/8] glsl: Evaluate constant pack/unpack 4x8 expressions
On 24 January 2013 19:47, Matt Turner wrote: > That is, evaluate constant expressions for the following functions: > packSnorm4x8, unpackSnorm4x8 > packUnorm4x8, unpackUnorm4x8 > --- > src/glsl/ir_constant_expression.cpp | 162 > +++ > 1 files changed, 162 insertions(+), 0 deletions(-) > > diff --git a/src/glsl/ir_constant_expression.cpp > b/src/glsl/ir_constant_expression.cpp > index b34c6e8..4796f6f 100644 > --- a/src/glsl/ir_constant_expression.cpp > +++ b/src/glsl/ir_constant_expression.cpp > @@ -76,12 +76,24 @@ bitcast_f2u(float f) > } > > /** > + * Evaluate one component of a floating-point 4x8 unpacking function. > + */ > +typedef uint8_t > +(*pack_1x8_func_t)(float); > + > +/** > * Evaluate one component of a floating-point 2x16 unpacking function. > */ > typedef uint16_t > (*pack_1x16_func_t)(float); > > /** > + * Evaluate one component of a floating-point 4x8 unpacking function. > + */ > +typedef float > +(*unpack_1x8_func_t)(uint8_t); > + > +/** > * Evaluate one component of a floating-point 2x16 unpacking function. > */ > typedef float > @@ -112,6 +124,32 @@ pack_2x16(pack_1x16_func_t pack_1x16, > } > > /** > + * Evaluate a 4x8 floating-point packing function. > + */ > +static uint32_t > +pack_4x8(pack_1x8_func_t pack_1x8, > + float x, float y, float z, float w) > +{ > + /* From section 8.4 of the GLSL 4.30 spec: > +* > +*packSnorm4x8 > +* > +*The first component of the vector will be written to the least > +*significant bits of the output; the last component will be > written to > +*the most significant bits. > +* > +* The specifications for the other packing functions contain similar > +* language. > +*/ > + uint32_t u = 0; > + u |= ((uint32_t) pack_1x8(x) << 0); > + u |= ((uint32_t) pack_1x8(y) << 8); > + u |= ((uint32_t) pack_1x8(z) << 16); > + u |= ((uint32_t) pack_1x8(w) << 24); > + return u; > +} > + > +/** > * Evaluate a 2x16 floating-point unpacking function. 
> */ > static void > @@ -135,6 +173,48 @@ unpack_2x16(unpack_1x16_func_t unpack_1x16, > } > > /** > + * Evaluate a 4x8 floating-point unpacking function. > + */ > +static void > +unpack_4x8(unpack_1x8_func_t unpack_1x8, uint32_t u, > + float *x, float *y, float *z, float *w) > +{ > +/* From section 8.4 of the GLSL 4.30 spec: > + * > + *unpackSnorm4x8 > + *-- > + *The first component of the returned vector will be extracted > from > + *the least significant bits of the input; the last component > will be > + *extracted from the most significant bits. > + * > + * The specifications for the other unpacking functions contain > similar > + * language. > + */ > + *x = unpack_1x8((uint8_t) (u & 0xff)); > + *y = unpack_1x8((uint8_t) (u >> 8)); > + *z = unpack_1x8((uint8_t) (u >> 16)); > + *w = unpack_1x8((uint8_t) (u >> 24)); > +} > + > +/** > + * Evaluate one component of packSnorm4x8. > + */ > +static uint8_t > +pack_snorm_1x8(float x) > +{ > +/* From section 8.4 of the GLSL 4.30 spec: > + * > + *packSnorm4x8 > + * > + *The conversion for component c of v to fixed point is done as > + *follows: > + * > + * packSnorm4x8: round(clamp(c, -1, +1) * 127.0) > + */ > + return (uint8_t) _mesa_round_to_even(CLAMP(x, -1.0f, +1.0f) * 127.0f); > +} > IIRC, Brian Paul has a patch out on the list that changes the return type of _mesa_round_to_even() to float. If & when that patch lands, this conversion will result in undefined behaviour, since casing from a negative float to an unsigned value is undefined by the C standard. I recommend changing this to "return (uint8_t) (int8_t) _mesa_round_to_even(...)" and adding a sentence to the comment to explain why this is necessary. See the existing pack_snorm_1x16() function, which used to have the same issue. With that change, this patch is: Reviewed-by: Paul Berry > + > +/** > * Evaluate one component of packSnorm2x16. 
> */ > static uint16_t > @@ -153,6 +233,24 @@ pack_snorm_1x16(float x) > } > > /** > + * Evaluate one component of unpackSnorm4x8. > + */ > +static float > +unpack_snorm_1x8(uint8_t u) > +{ > +/* From section 8.4 of the GLSL 4.30 spec: > + * > + *unpackSnorm4x8 > + *-- > + *The conversion for unpacked fixed-point value f to floating > point is > + *done as follows: > + * > + * unpackSnorm4x8: clamp(f / 127.0, -1, +1) > + */ > + return CLAMP((int8_t) u / 127.0f, -1.0f, +1.0f); > +} > + > +/** > * Evaluate one component of unpackSnorm2x16. > */ > static float > @@ -171,6 +269,24 @@ unpack_snorm_1x16(uint16_t u) > } > > /** > + * Evaluate one component packUnorm4x8. > + */ > +static uint8_t > +pack_unorm_1x8(float x) > +{ > +/* From section 8.4 of the GLSL 4.30 spec: > + * > +
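To make the cast issue raised in the review concrete, here is a small standalone sketch of the snorm 4x8 helpers with the intermediate int8_t conversion. It is an approximation, not the Mesa code: round_to_even() and clampf() are local stand-ins for _mesa_round_to_even() and CLAMP(), and the names are otherwise borrowed from the patch only for readability.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for CLAMP(). */
static float
clampf(float x, float lo, float hi)
{
   return x < lo ? lo : (x > hi ? hi : x);
}

/* Local stand-in for _mesa_round_to_even(); only meant for the small
 * magnitudes (|v| <= 127) that occur here. */
static int
round_to_even(float v)
{
   int i = (int) v;               /* truncates toward zero */
   float frac = v - (float) i;

   if (frac > 0.5f || (frac == 0.5f && (i & 1)))
      i++;
   else if (frac < -0.5f || (frac == -0.5f && (i & 1)))
      i--;
   return i;
}

static uint8_t
pack_snorm_1x8(float x)
{
   /* Convert to a signed 8-bit value first. Converting a negative
    * floating-point value straight to an unsigned type is undefined
    * in C, which is the concern raised in the review. */
   return (uint8_t) (int8_t) round_to_even(clampf(x, -1.0f, 1.0f) * 127.0f);
}

static float
unpack_snorm_1x8(uint8_t u)
{
   /* The uint8_t -> int8_t conversion is implementation-defined for
    * values above 127; this sketch assumes two's complement wrap. */
   return clampf((int8_t) u / 127.0f, -1.0f, 1.0f);
}

/* First component goes to the least significant byte. */
static uint32_t
pack_snorm_4x8(float x, float y, float z, float w)
{
   return ((uint32_t) pack_snorm_1x8(x) << 0) |
          ((uint32_t) pack_snorm_1x8(y) << 8) |
          ((uint32_t) pack_snorm_1x8(z) << 16) |
          ((uint32_t) pack_snorm_1x8(w) << 24);
}
```

Note that -1.0 packs to 0x81 (that is, -127), and that 0x80 (-128) unpacks to a value slightly below -1 that the clamp pulls back to exactly -1.0.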
Re: [Mesa-dev] [PATCH] mesa: implement GL_ARB_texture_buffer_range v4
On 01/25/2013 08:54 AM, Christoph Bumiller wrote: v2: Record texObj.BufferSize as -1 in TexBuffer(non-Range) instead of the buffer's current size so we know we always have to use the full size of the buffer object (i.e. even if it changes without the user calling TexBuffer again) for the texture. Maybe make this a comment in the code somewhere. Perhaps at the BufferSize declaration in gl_texture_object? Clarify invalid offset alignment error message. v3: Use extra GL_CORE-only section in get_hash_params.py for TEXTURE_BUFFER_OFFSET_ALIGNMENT. v4: Remove unnecessary check for profile in _mesa_TexBufferRange. Add check for extension enable in get_tex_level_parameter_buffer. --- src/mapi/glapi/gen/ARB_texture_buffer_range.xml | 22 ++ src/mapi/glapi/gen/Makefile.am |1 + src/mapi/glapi/gen/gl_API.xml |2 + src/mesa/main/context.c |1 + src/mesa/main/extensions.c |1 + src/mesa/main/get.c |1 + src/mesa/main/get_hash_params.py|6 ++ src/mesa/main/mtypes.h |6 ++ src/mesa/main/teximage.c| 84 ++- src/mesa/main/teximage.h|4 + src/mesa/main/texparam.c| 12 +++ 11 files changed, 123 insertions(+), 17 deletions(-) create mode 100644 src/mapi/glapi/gen/ARB_texture_buffer_range.xml diff --git a/src/mapi/glapi/gen/ARB_texture_buffer_range.xml b/src/mapi/glapi/gen/ARB_texture_buffer_range.xml new file mode 100644 index 000..2176c08 --- /dev/null +++ b/src/mapi/glapi/gen/ARB_texture_buffer_range.xml @@ -0,0 +1,22 @@ + + + + + + + + + + + + + + + + + + + + + + diff --git a/src/mapi/glapi/gen/Makefile.am b/src/mapi/glapi/gen/Makefile.am index f869d28..4d51bbc 100644 --- a/src/mapi/glapi/gen/Makefile.am +++ b/src/mapi/glapi/gen/Makefile.am @@ -108,6 +108,7 @@ API_XML = \ ARB_seamless_cube_map.xml \ ARB_sync.xml \ ARB_texture_buffer_object.xml \ + ARB_texture_buffer_range.xml \ ARB_texture_compression_rgtc.xml \ ARB_texture_float.xml \ ARB_texture_rg.xml \ diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml index 404ccea..8d700a1 100644 --- 
a/src/mapi/glapi/gen/gl_API.xml +++ b/src/mapi/glapi/gen/gl_API.xml @@ -8151,6 +8151,8 @@ http://www.w3.org/2001/XInclude"/> +http://www.w3.org/2001/XInclude"/> + Everywhere else we sort alphabetically by name. Here, however, we sort by assigned extension number. It just happens that the other 3 extensions shown in this hunk have the same sort order for both. http://www.w3.org/2001/XInclude"/> http://www.w3.org/2001/XInclude"/> diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c index 5e9e539..5058c07 100644 --- a/src/mesa/main/context.c +++ b/src/mesa/main/context.c @@ -564,6 +564,7 @@ _mesa_init_constants(struct gl_context *ctx) ctx->Const.MaxTextureMaxAnisotropy = MAX_TEXTURE_MAX_ANISOTROPY; ctx->Const.MaxTextureLodBias = MAX_TEXTURE_LOD_BIAS; ctx->Const.MaxTextureBufferSize = 65536; + ctx->Const.TextureBufferOffsetAlignment = 1; ctx->Const.MaxArrayLockSize = MAX_ARRAY_LOCK_SIZE; ctx->Const.SubPixelBits = SUB_PIXEL_BITS; ctx->Const.MinPointSize = MIN_POINT_SIZE; diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c index 5d01ac8..207572f 100644 --- a/src/mesa/main/extensions.c +++ b/src/mesa/main/extensions.c @@ -130,6 +130,7 @@ static const struct extension extension_table[] = { { "GL_ARB_texture_border_clamp", o(ARB_texture_border_clamp),GLL,2000 }, { "GL_ARB_texture_buffer_object", o(ARB_texture_buffer_object), GLC,2008 }, { "GL_ARB_texture_buffer_object_rgb32", o(ARB_texture_buffer_object_rgb32), GLC,2009 }, + { "GL_ARB_texture_buffer_range", o(ARB_texture_buffer_range),GLC,2012 }, { "GL_ARB_texture_compression", o(dummy_true), GLL,2000 }, { "GL_ARB_texture_compression_rgtc", o(ARB_texture_compression_rgtc),GL, 2004 }, { "GL_ARB_texture_cube_map",o(ARB_texture_cube_map), GLL,1999 }, diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c index 5f4e2fa..da1e01c 100644 --- a/src/mesa/main/get.c +++ b/src/mesa/main/get.c @@ -353,6 +353,7 @@ EXTRA_EXT(ARB_uniform_buffer_object); EXTRA_EXT(ARB_timer_query); 
EXTRA_EXT(ARB_map_buffer_alignment); EXTRA_EXT(ARB_texture_cube_map_array); +EXTRA_EXT(ARB_texture_buffer_range); static const int extra_NV_primitive_restart[] = { diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/mai
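The v2 note about recording BufferSize as -1 may be easier to follow with a tiny model. The sketch below is not the Mesa implementation; the structs, field names and helper functions are all hypothetical, and the range check only approximates the kind of validation the extension calls for (consult the ARB_texture_buffer_range spec for the exact error conditions).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the buffer object and the
 * texture's buffer binding. */
struct buffer_object {
   int64_t size;
};

struct texture_buffer_binding {
   const struct buffer_object *buffer;
   int64_t offset;
   int64_t size;     /* -1 means "always use the buffer's current size" */
};

/* Resolve the size at use time, so a buffer that is resized after
 * TexBuffer (non-Range) is still covered in full. */
static int64_t
effective_size(const struct texture_buffer_binding *b)
{
   return b->size == -1 ? b->buffer->size : b->size;
}

/* Approximate range validation for the Range variant. */
static bool
range_is_valid(const struct buffer_object *buf,
               int64_t offset, int64_t size, int64_t alignment)
{
   if (offset < 0 || size <= 0)
      return false;
   if (offset % alignment != 0)
      return false;
   return offset <= buf->size && size <= buf->size - offset;
}
```

The design point is that -1 is a sentinel resolved lazily, rather than a snapshot of the size taken when the buffer was attached.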
[Mesa-dev] [PATCH V6 6/8] intel: Create a miptree using offsets in intel_set_texture_image_region
When binding a region to a texture image, re-create the miptree base-level considering the offset and dimension information exported by DRIImage. Signed-off-by: Abdiel Janulgue --- src/mesa/drivers/dri/intel/intel_tex_image.c | 31 -- 1 file changed, 24 insertions(+), 7 deletions(-) diff --git a/src/mesa/drivers/dri/intel/intel_tex_image.c b/src/mesa/drivers/dri/intel/intel_tex_image.c index 7361e6a..a4cf883 100644 --- a/src/mesa/drivers/dri/intel/intel_tex_image.c +++ b/src/mesa/drivers/dri/intel/intel_tex_image.c @@ -256,7 +256,11 @@ intel_set_texture_image_region(struct gl_context *ctx, GLenum target, GLenum internalFormat, gl_format format, - uint32_t offset) + uint32_t offset, + GLuint width, + GLuint height, + GLuint tile_x, + GLuint tile_y) { struct intel_context *intel = intel_context(ctx); struct intel_texture_image *intel_image = intel_texture_image(image); @@ -264,14 +268,22 @@ intel_set_texture_image_region(struct gl_context *ctx, struct intel_texture_object *intel_texobj = intel_texture_object(texobj); _mesa_init_teximage_fields(&intel->ctx, image, - region->width, region->height, 1, + width, height, 1, 0, internalFormat, format); ctx->Driver.FreeTextureImageBuffer(ctx, image); - intel_image->mt = intel_miptree_create_for_region(intel, target, -image->TexFormat, -region); + intel_image->mt = intel_miptree_create_layout(intel, target, image->TexFormat, + 0, 0, + width, height, 1, + true, 0 /* num_samples */, + INTEL_MSAA_LAYOUT_NONE); + intel_region_reference(&intel_image->mt->region, region); + intel_image->mt->total_width = width; + intel_image->mt->total_height = height; + intel_image->mt->level[0].slice[0].x_offset = tile_x; + intel_image->mt->level[0].slice[0].y_offset = tile_y; + if (intel_image->mt == NULL) return; intel_texobj->needs_validate = true; @@ -332,7 +344,10 @@ intelSetTexBuffer2(__DRIcontext *pDRICtx, GLint target, _mesa_lock_texture(&intel->ctx, texObj); texImage = _mesa_get_tex_image(ctx, texObj, target, level); 
intel_set_texture_image_region(ctx, texImage, rb->mt->region, target, - internalFormat, texFormat, 0); + internalFormat, texFormat, 0, + rb->mt->region->width, + rb->mt->region->height, + 0, 0); _mesa_unlock_texture(&intel->ctx, texObj); } @@ -363,7 +378,9 @@ intel_image_target_texture_2d(struct gl_context *ctx, GLenum target, intel_set_texture_image_region(ctx, texImage, image->region, target, image->internal_format, - image->format, image->offset); + image->format, image->offset, + image->width, image->height, + image->tile_x, image->tile_y); } void -- 1.7.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH V6 8/8] intel: implement create image from texture
Save miptree level info to DRIImage: - Appropriately-aligned base offset pointing to the image - Additional x/y adjustment offsets from above. In non-tile-aligned surface cases where resolving back to the original image located in mip-levels higher than the base level proves problematic due to offset alignment issues, report INVALID_OPERATION as per spec wording. Signed-off-by: Abdiel Janulgue --- src/mesa/drivers/dri/intel/intel_screen.c | 179 + 1 file changed, 159 insertions(+), 20 deletions(-) diff --git a/src/mesa/drivers/dri/intel/intel_screen.c b/src/mesa/drivers/dri/intel/intel_screen.c index e0fe8c1..d23246a 100644 --- a/src/mesa/drivers/dri/intel/intel_screen.c +++ b/src/mesa/drivers/dri/intel/intel_screen.c @@ -31,11 +31,13 @@ #include "main/context.h" #include "main/framebuffer.h" #include "main/renderbuffer.h" +#include "main/texobj.h" #include "main/hash.h" #include "main/fbobject.h" #include "main/mfeatures.h" #include "main/version.h" #include "swrast/s_renderbuffer.h" +#include "egl/main/eglcurrent.h" #include "utils.h" #include "xmlpool.h" @@ -104,6 +106,10 @@ const GLuint __driNConfigOptions = 15; #include "intel_tex.h" #include "intel_regions.h" +#ifndef I915 +#include "brw_context.h" +#endif + #include "i915_drm.h" #ifdef USE_NEW_INTERFACE @@ -295,6 +301,87 @@ intel_allocate_image(int dri_format, void *loaderPrivate) return image; } +static void +intel_image_set_level_info(__DRIimage *image, struct intel_mipmap_tree *mt, + int level, int slice) +{ + unsigned int draw_x, draw_y; + uint32_t mask_x, mask_y; + + intel_region_get_tile_masks(mt->region, &mask_x, &mask_y, false); + intel_miptree_get_image_offset(mt, level, slice, &draw_x, &draw_y); + + image->width = mt->level[level].width; + image->height = mt->level[level].height; + image->tile_x = draw_x & mask_x; + image->tile_y = draw_y & mask_y; + + image->offset = intel_region_get_aligned_offset(mt->region, + draw_x & ~mask_x, + draw_y & ~mask_y, + false); +} + +/** + * Sets up a DRIImage 
structure to point to our shared image in a region + */ +static bool +intel_setup_image_from_mipmap_tree(struct intel_context *intel, __DRIimage *image, + struct intel_mipmap_tree *mt, GLuint level, + GLuint zoffset) +{ + bool has_surface_tile_offset = false; + uint32_t draw_x, draw_y; + + intel_miptree_check_level_layer(mt, level, zoffset); + intel_miptree_get_tile_offsets(mt, level, zoffset, &draw_x, &draw_y); + +#ifndef I915 + has_surface_tile_offset = brw_context(&intel->ctx)->has_surface_tile_offset; +#endif + if (!has_surface_tile_offset && + (draw_x != 0 || draw_y != 0)) + /* Non-tile aligned sufaces in gen4 hw and earlier have problems resolving + * back to our destination due to alignment issues. Bail-out and report error + */ + return false; + + intel_image_set_level_info(image, mt, level, zoffset); + intel_region_reference(&image->region, mt->region); + + return true; +} + +static void +intel_setup_image_from_dimensions(__DRIimage *image) +{ + image->width= image->region->width; + image->height = image->region->height; + image->tile_x = 0; + image->tile_y = 0; +} + +static inline uint32_t +intel_dri_format(GLuint format) +{ + switch (format) { + case MESA_FORMAT_RGB565: + return __DRI_IMAGE_FORMAT_RGB565; + case MESA_FORMAT_XRGB: + return __DRI_IMAGE_FORMAT_XRGB; + case MESA_FORMAT_ARGB: + return __DRI_IMAGE_FORMAT_ARGB; + case MESA_FORMAT_RGBA_REV: + return __DRI_IMAGE_FORMAT_ABGR; + case MESA_FORMAT_R8: + return __DRI_IMAGE_FORMAT_R8; + case MESA_FORMAT_RG88: + return __DRI_IMAGE_FORMAT_GR88; + } + + return MESA_FORMAT_NONE; +} + static __DRIimage * intel_create_image_from_name(__DRIscreen *screen, int width, int height, int format, @@ -317,6 +404,8 @@ intel_create_image_from_name(__DRIscreen *screen, return NULL; } +intel_setup_image_from_dimensions(image); + return image; } @@ -346,26 +435,69 @@ intel_create_image_from_renderbuffer(__DRIcontext *context, image->offset = 0; image->data = loaderPrivate; intel_region_reference(&image->region, 
irb->mt->region); + intel_setup_image_from_dimensions(image); + image->dri_format = intel_dri_format(image->format); - switch (image->format) { - case MESA_FORMAT_RGB565: - image->dri_format = __DRI_IMAGE_FORMAT_RGB565; - break; - case MESA_FORMAT_XRGB: - image->dri_format = __DRI_IMAGE_FORMAT_XRGB; - break; - case MESA_FORMAT_ARGB: - image->dri_format = __DRI_IMAGE_FORMAT_ARGB; - break; - case MESA_FORMAT_RGBA888
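As an illustration of what intel_image_set_level_info() stores, the sketch below splits a draw offset into a tile-aligned byte offset plus an intra-tile x/y remainder. It is a simplified model, not the driver code: it assumes X tiling with 512-byte by 8-row tiles of 4096 bytes, a pitch given in bytes, and a cpp that divides 512 evenly; struct level_info and split_draw_offset() are made-up names.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed X-tile geometry: 512 bytes wide, 8 rows tall, 4096 bytes. */
enum { TILE_W_BYTES = 512, TILE_H = 8, TILE_BYTES = 4096 };

struct level_info {
   uint32_t tile_x, tile_y;   /* remainder inside the tile (pixels, rows) */
   uint32_t offset;           /* byte offset of the containing tile */
};

static struct level_info
split_draw_offset(uint32_t pitch_bytes, uint32_t cpp,
                  uint32_t draw_x, uint32_t draw_y)
{
   const uint32_t tile_w = TILE_W_BYTES / cpp;   /* pixels per tile row */
   const uint32_t mask_x = tile_w - 1;
   const uint32_t mask_y = TILE_H - 1;
   const uint32_t aligned_x = draw_x & ~mask_x;
   const uint32_t aligned_y = draw_y & ~mask_y;
   struct level_info info;

   /* Low bits: position within the tile. */
   info.tile_x = draw_x & mask_x;
   info.tile_y = draw_y & mask_y;

   /* High bits: whole tile rows down, then whole tiles across. */
   info.offset = (aligned_y / TILE_H) * (TILE_H * pitch_bytes) +
                 (aligned_x / tile_w) * TILE_BYTES;
   return info;
}
```

For example, with a 2048-byte pitch and 4 bytes per pixel, a level at (130, 13) sits 2 pixels and 5 rows into the tile that starts 20480 bytes into the region.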
[Mesa-dev] [PATCH V6 7/8] intel: Account for mt->offset in intel_miptree_map
We need to take into account the offset from the original bo when using glTexSubImage() and other functions that manipulate a subregion of an exported texture. The offset is added to the mapped region address and is also used as the source offset when blitting from a source region.

Signed-off-by: Abdiel Janulgue
---
 src/mesa/drivers/dri/intel/intel_mipmap_tree.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
index 435f12f..ceb5322 100644
--- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
@@ -1126,7 +1126,7 @@ intel_miptree_map_gtt(struct intel_context *intel,
    assert(y % bh == 0);
    y /= bh;
 
-   base = intel_region_map(intel, mt->region, map->mode);
+   base = intel_region_map(intel, mt->region, map->mode) + mt->offset;
 
    if (base == NULL)
       map->ptr = NULL;
@@ -1186,7 +1186,7 @@ intel_miptree_map_blit(struct intel_context *intel,
    if (!intelEmitCopyBlit(intel,
                           mt->region->cpp,
                           mt->region->pitch, mt->region->bo,
-                          0, mt->region->tiling,
+                          mt->offset, mt->region->tiling,
                           map->stride / mt->region->cpp, map->bo,
                           0, I915_TILING_NONE,
                           x, y,
-- 
1.7.9.5
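One detail in the first hunk may deserve a second look: the offset is added before the NULL check, so a failed map combined with a non-zero offset would no longer compare equal to NULL. A possible ordering is sketched below in plain C. This is only an illustration with a hypothetical helper, map_with_offset(), and not a proposed replacement for intel_miptree_map_gtt().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: region_base is whatever the map call returned
 * (NULL on failure), offset is the byte offset into the bo. */
static uint8_t *
map_with_offset(uint8_t *region_base, uint32_t offset)
{
   /* Check for failure before applying the offset, so that a failed
    * map is still reported as NULL to the caller. */
   if (region_base == NULL)
      return NULL;

   return region_base + offset;
}
```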
[Mesa-dev] [PATCH V6 5/8] i965: Account for offsets when updating SURFACE_STATE.
If the offsets are present, this lets us specify a particular level and slice in a shared region using the base level of an exported mip-map tree. Signed-off-by: Abdiel Janulgue --- src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 12 +++- src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 12 ++-- 2 files changed, 21 insertions(+), 3 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c index a2a875f..e37de8d 100644 --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c @@ -804,6 +804,7 @@ brw_update_texture_surface(struct gl_context *ctx, struct gl_sampler_object *sampler = _mesa_get_samplerobj(ctx, unit); uint32_t *surf; int width, height, depth; + uint32_t tile_x, tile_y; if (tObj->Target == GL_TEXTURE_BUFFER) { brw_update_buffer_texture_surface(ctx, unit, binding_table, surf_index); @@ -837,7 +838,16 @@ brw_update_texture_surface(struct gl_context *ctx, surf[4] = 0; - surf[5] = (mt->align_h == 4) ? BRW_SURFACE_VERTICAL_ALIGN_ENABLE : 0; + intel_miptree_get_tile_offsets(intelObj->mt, 0, 0, &tile_x, &tile_y); + assert(brw->has_surface_tile_offset || (tile_x == 0 && tile_y == 0)); + /* Note that the low bits of these fields are missing, so +* there's the possibility of getting in trouble. +*/ + assert(tile_x % 4 == 0); + assert(tile_y % 2 == 0); + surf[5] = ((tile_x / 4) << BRW_SURFACE_X_OFFSET_SHIFT | + (tile_y / 2) << BRW_SURFACE_Y_OFFSET_SHIFT | + (mt->align_h == 4 ? 
BRW_SURFACE_VERTICAL_ALIGN_ENABLE : 0)); /* Emit relocation to surface contents */ drm_intel_bo_emit_reloc(brw->intel.batch.bo, diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c index 1e5af95..0eacd0a 100644 --- a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c +++ b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c @@ -302,6 +302,7 @@ gen7_update_texture_surface(struct gl_context *ctx, struct gl_sampler_object *sampler = _mesa_get_samplerobj(ctx, unit); struct gen7_surface_state *surf; int width, height, depth; + uint32_t tile_x, tile_y; if (tObj->Target == GL_TEXTURE_BUFFER) { gen7_update_buffer_texture_surface(ctx, unit, binding_table, surf_index); @@ -360,12 +361,19 @@ gen7_update_texture_surface(struct gl_context *ctx, /* ss4: ignored? */ + intel_miptree_get_tile_offsets(intelObj->mt, 0, 0, &tile_x, &tile_y); + assert(brw->has_surface_tile_offset || (tile_x == 0 && tile_y == 0)); + /* Note that the low bits of these fields are missing, so +* there's the possibility of getting in trouble. +*/ + assert(tile_x % 4 == 0); + assert(tile_y % 2 == 0); surf->ss5.mip_count = intelObj->_MaxLevel - tObj->BaseLevel; surf->ss5.min_lod = 0; + surf->ss5.x_offset = tile_x / 4; + surf->ss5.y_offset = tile_y / 2; /* ss5 remaining fields: -* - x_offset (N/A for textures?) -* - y_offset (ditto) * - cache_control */ -- 1.7.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
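The "low bits of these fields are missing" comment can be illustrated with a small encoder for the pre-gen7 dword. This is a sketch under assumptions: the shift positions and the vertical-align bit below are placeholders chosen for the example (the authoritative values are the BRW_SURFACE_* definitions in the driver headers), and encode_surf5() is a made-up name.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder bit positions, for illustration only. */
#define X_OFFSET_SHIFT         25
#define Y_OFFSET_SHIFT         20
#define VERTICAL_ALIGN_ENABLE  (1u << 24)

static uint32_t
encode_surf5(uint32_t tile_x, uint32_t tile_y, int align_h)
{
   /* The hardware fields store x in units of 4 pixels and y in units
    * of 2 rows, so any other offset cannot be represented. */
   assert(tile_x % 4 == 0);
   assert(tile_y % 2 == 0);

   return ((tile_x / 4) << X_OFFSET_SHIFT) |
          ((tile_y / 2) << Y_OFFSET_SHIFT) |
          (align_h == 4 ? VERTICAL_ALIGN_ENABLE : 0u);
}
```

The gen7 path in the same patch stores the same two quotients in the ss5 bitfields instead of shifting them into a raw dword.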
[Mesa-dev] [PATCH V6 4/8] intel: add pixel offset calculator for miptree levels
Add a helper that calculates the fine-grained x and y pixel adjustments to an image within a miptree level, for tiled regions.

Signed-off-by: Abdiel Janulgue
---
 src/mesa/drivers/dri/intel/intel_mipmap_tree.c | 15 +++
 src/mesa/drivers/dri/intel/intel_mipmap_tree.h |6 ++
 2 files changed, 21 insertions(+)

diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
index cc74d3c..435f12f 100644
--- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
@@ -688,6 +688,21 @@ intel_miptree_get_image_offset(struct intel_mipmap_tree *mt,
    *y = mt->level[level].slice[slice].y_offset;
 }
 
+void
+intel_miptree_get_tile_offsets(struct intel_mipmap_tree *mt,
+                               GLuint level, GLuint slice,
+                               uint32_t *tile_x,
+                               uint32_t *tile_y)
+{
+   struct intel_region *region = mt->region;
+   uint32_t mask_x, mask_y;
+
+   intel_region_get_tile_masks(region, &mask_x, &mask_y, false);
+
+   *tile_x = mt->level[level].slice[slice].x_offset & mask_x;
+   *tile_y = mt->level[level].slice[slice].y_offset & mask_y;
+}
+
 static void
 intel_miptree_copy_slice(struct intel_context *intel,
                          struct intel_mipmap_tree *dst_mt,
diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.h b/src/mesa/drivers/dri/intel/intel_mipmap_tree.h
index 1b2270a..d822491 100644
--- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.h
@@ -460,6 +460,12 @@ void intel_miptree_get_dimensions_for_image(struct gl_texture_image *image,
                                             int *width, int *height, int *depth);
 
+void
+intel_miptree_get_tile_offsets(struct intel_mipmap_tree *mt,
+                               GLuint level, GLuint slice,
+                               uint32_t *tile_x,
+                               uint32_t *tile_y);
+
 void intel_miptree_set_level_info(struct intel_mipmap_tree *mt,
                                   GLuint level,
                                   GLuint x, GLuint y,
-- 
1.7.9.5
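For context, the masking in this helper can be modelled in isolation. The sketch below is a guess at the general shape, not a copy of intel_region_get_tile_masks(): it assumes X tiles of 512 bytes by 8 rows and Y tiles of 128 bytes by 32 rows, a cpp that divides the tile width evenly, and uses made-up enum and function names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tiling enum for this sketch. */
enum tiling { TILING_NONE, TILING_X, TILING_Y };

/* Masks selecting the intra-tile part of a pixel coordinate.
 * Assumed tile geometry: X = 512 bytes x 8 rows, Y = 128 bytes x 32 rows. */
static void
get_tile_masks(enum tiling tiling, uint32_t cpp,
               uint32_t *mask_x, uint32_t *mask_y)
{
   switch (tiling) {
   case TILING_X:
      *mask_x = 512 / cpp - 1;
      *mask_y = 7;
      break;
   case TILING_Y:
      *mask_x = 128 / cpp - 1;
      *mask_y = 31;
      break;
   default:
      /* Linear surfaces have no intra-tile remainder. */
      *mask_x = 0;
      *mask_y = 0;
      break;
   }
}

static void
get_tile_offsets(enum tiling tiling, uint32_t cpp,
                 uint32_t x_offset, uint32_t y_offset,
                 uint32_t *tile_x, uint32_t *tile_y)
{
   uint32_t mask_x, mask_y;

   get_tile_masks(tiling, cpp, &mask_x, &mask_y);
   *tile_x = x_offset & mask_x;
   *tile_y = y_offset & mask_y;
}
```

A slice whose offset is a whole number of tiles in both directions yields (0, 0), which is the only case hardware without surface tile offsets can handle.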
[Mesa-dev] [PATCH V6 3/8] intel: Expose intel_miptree_create_internal as intel_miptree_create_layout.
Signed-off-by: Abdiel Janulgue --- src/mesa/drivers/dri/intel/intel_mipmap_tree.c | 37 src/mesa/drivers/dri/intel/intel_mipmap_tree.h | 14 - 2 files changed, 31 insertions(+), 20 deletions(-) diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c index 8d814bd..cc74d3c 100644 --- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c +++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c @@ -71,18 +71,18 @@ target_to_target(GLenum target) *intel_miptree_create_for_region(). If true, then do not create *\c stencil_mt. */ -static struct intel_mipmap_tree * -intel_miptree_create_internal(struct intel_context *intel, - GLenum target, - gl_format format, - GLuint first_level, - GLuint last_level, - GLuint width0, - GLuint height0, - GLuint depth0, - bool for_region, - GLuint num_samples, - enum intel_msaa_layout msaa_layout) +struct intel_mipmap_tree * +intel_miptree_create_layout(struct intel_context *intel, +GLenum target, +gl_format format, +GLuint first_level, +GLuint last_level, +GLuint width0, +GLuint height0, +GLuint depth0, +bool for_region, +GLuint num_samples, +enum intel_msaa_layout msaa_layout) { struct intel_mipmap_tree *mt = calloc(sizeof(*mt), 1); int compress_byte = 0; @@ -262,7 +262,7 @@ intel_miptree_create(struct intel_context *intel, tiling = I915_TILING_X; } - mt = intel_miptree_create_internal(intel, target, format, + mt = intel_miptree_create_layout(intel, target, format, first_level, last_level, width0, height0, depth0, false, num_samples, msaa_layout); @@ -305,7 +305,6 @@ intel_miptree_create(struct intel_context *intel, return mt; } - struct intel_mipmap_tree * intel_miptree_create_for_region(struct intel_context *intel, GLenum target, @@ -314,11 +313,11 @@ intel_miptree_create_for_region(struct intel_context *intel, { struct intel_mipmap_tree *mt; - mt = intel_miptree_create_internal(intel, target, format, - 0, 0, - region->width, region->height, 1, - true, 0 /* num_samples */, - 
INTEL_MSAA_LAYOUT_NONE); + mt = intel_miptree_create_layout(intel, target, format, +0, 0, +region->width, region->height, 1, +true, 0 /* num_samples */, +INTEL_MSAA_LAYOUT_NONE); if (!mt) return mt; diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.h b/src/mesa/drivers/dri/intel/intel_mipmap_tree.h index eb4ad7f..1b2270a 100644 --- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.h +++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.h @@ -387,6 +387,19 @@ struct intel_mipmap_tree *intel_miptree_create(struct intel_context *intel, enum intel_msaa_layout msaa_layout); struct intel_mipmap_tree * +intel_miptree_create_layout(struct intel_context *intel, +GLenum target, +gl_format format, +GLuint first_level, +GLuint last_level, +GLuint width0, +GLuint height0, +GLuint depth0, +bool for_region, +GLuint num_samples, +enum intel_msaa_layout msaa_layout); + +struct intel_mipmap_tree * intel_miptree_create_for_region(struct intel_context *intel, GLenum target, gl_format format, @@ -398,7 +411,6 @@ intel_miptree_create_for_dri2_buffer(struct intel_context *intel, gl_format format, uint32_t num_samples, struct intel_region *region); - /** * Create a miptree appropriate as the storage for a non-texture renderbuffer. * The miptree has the following properties: -- 1.7.9.5 ___ mesa-dev mailing list mesa-dev@lists.free
[Mesa-dev] [PATCH V6 2/8] intel: expose dimensions and offsets of a miptree level in DRIImage
Signed-off-by: Abdiel Janulgue
---
 src/mesa/drivers/dri/intel/intel_regions.h |6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/mesa/drivers/dri/intel/intel_regions.h b/src/mesa/drivers/dri/intel/intel_regions.h
index 8737a6d..1eef3b5 100644
--- a/src/mesa/drivers/dri/intel/intel_regions.h
+++ b/src/mesa/drivers/dri/intel/intel_regions.h
@@ -174,6 +174,12 @@ struct __DRIimageRec {
    uint32_t offsets[3];
    struct intel_image_format *planar_format;
 
+   /* particular miptree level */
+   GLuint width;
+   GLuint height;
+   GLuint tile_x;
+   GLuint tile_y;
+
    void *data;
 };
 
-- 
1.7.9.5
[Mesa-dev] [PATCH V6 1/8] dri2: Create image from texture
Add create image from texture extension and bump version. Signed-off-by: Abdiel Janulgue --- include/GL/internal/dri_interface.h | 14 +- src/egl/drivers/dri2/egl_dri2.c | 85 +++ 2 files changed, 98 insertions(+), 1 deletion(-) diff --git a/include/GL/internal/dri_interface.h b/include/GL/internal/dri_interface.h index 568581d..63cb2d6 100644 --- a/include/GL/internal/dri_interface.h +++ b/include/GL/internal/dri_interface.h @@ -937,7 +937,7 @@ struct __DRIdri2ExtensionRec { * extensions. */ #define __DRI_IMAGE "DRI_IMAGE" -#define __DRI_IMAGE_VERSION 5 +#define __DRI_IMAGE_VERSION 6 /** * These formats correspond to the similarly named MESA_FORMAT_* @@ -1086,6 +1086,18 @@ struct __DRIimageExtensionRec { */ __DRIimage *(*fromPlanar)(__DRIimage *image, int plane, void *loaderPrivate); + +/** + * Create image from texture. + * + * \since 6 + */ + __DRIimage *(*createImageFromTexture)(__DRIcontext *context, + int target, + unsigned texture, + int depth, + int level, + void *loaderPrivate); }; diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c index 1f13d79..5d83573 100644 --- a/src/egl/drivers/dri2/egl_dri2.c +++ b/src/egl/drivers/dri2/egl_dri2.c @@ -490,6 +490,11 @@ dri2_setup_screen(_EGLDisplay *disp) disp->Extensions.MESA_drm_image = EGL_TRUE; disp->Extensions.KHR_image_base = EGL_TRUE; disp->Extensions.KHR_gl_renderbuffer_image = EGL_TRUE; + if (dri2_dpy->image->base.version >= 5 && + dri2_dpy->image->createImageFromTexture) { + disp->Extensions.KHR_gl_texture_2D_image = EGL_TRUE; + disp->Extensions.KHR_gl_texture_cubemap_image = EGL_TRUE; + } } } @@ -1210,6 +1215,78 @@ dri2_create_image_wayland_wl_buffer(_EGLDisplay *disp, _EGLContext *ctx, } #endif +static _EGLImage * +dri2_create_image_khr_texture(_EGLDisplay *disp, _EGLContext *ctx, + EGLenum target, + EGLClientBuffer buffer, + const EGLint *attr_list) +{ + struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp); + struct dri2_egl_context *dri2_ctx = dri2_egl_context(ctx); + 
struct dri2_egl_image *dri2_img; + GLuint texture = (GLuint) (uintptr_t) buffer; + _EGLImageAttribs attrs; + GLuint depth; + GLenum gl_target; + + if (texture == 0) { + _eglError(EGL_BAD_PARAMETER, "dri2_create_image_khr"); + return EGL_NO_IMAGE_KHR; + } + + if (_eglParseImageAttribList(&attrs, disp, attr_list) != EGL_SUCCESS) + return EGL_NO_IMAGE_KHR; + + switch (target) { + case EGL_GL_TEXTURE_2D_KHR: + depth = 0; + gl_target = GL_TEXTURE_2D; + break; + case EGL_GL_TEXTURE_3D_KHR: + depth = attrs.GLTextureZOffset; + gl_target = GL_TEXTURE_3D; + break; + case EGL_GL_TEXTURE_CUBE_MAP_POSITIVE_X_KHR: + case EGL_GL_TEXTURE_CUBE_MAP_NEGATIVE_X_KHR: + case EGL_GL_TEXTURE_CUBE_MAP_POSITIVE_Y_KHR: + case EGL_GL_TEXTURE_CUBE_MAP_NEGATIVE_Y_KHR: + case EGL_GL_TEXTURE_CUBE_MAP_POSITIVE_Z_KHR: + case EGL_GL_TEXTURE_CUBE_MAP_NEGATIVE_Z_KHR: + depth = target - EGL_GL_TEXTURE_CUBE_MAP_POSITIVE_X_KHR; + gl_target = GL_TEXTURE_CUBE_MAP; + break; + default: + _eglError(EGL_BAD_PARAMETER, "dri2_create_image_khr"); + return EGL_NO_IMAGE_KHR; + } + + dri2_img = malloc(sizeof *dri2_img); + if (!dri2_img) { + _eglError(EGL_BAD_ALLOC, "dri2_create_image_khr"); + return EGL_NO_IMAGE_KHR; + } + + if (!_eglInitImage(&dri2_img->base, disp)) { + _eglError(EGL_BAD_ALLOC, "dri2_create_image_khr"); + free(dri2_img); + return EGL_NO_IMAGE_KHR; + } + + dri2_img->dri_image = + dri2_dpy->image->createImageFromTexture(dri2_ctx->dri_context, + gl_target, + texture, + depth, + attrs.GLTextureLevel, + dri2_img); + + if (!dri2_img->dri_image) { + free(dri2_img); + return EGL_NO_IMAGE_KHR; + } + return &dri2_img->base; +} + _EGLImage * dri2_create_image_khr(_EGLDriver *drv, _EGLDisplay *disp, _EGLContext *ctx, EGLenum target, @@ -1218,6 +1295,14 @@ dri2_create_image_khr(_EGLDriver *drv, _EGLDisplay *disp, (void) drv; switch (target) { + case EGL_GL_TEXTURE_2D_KHR: + case EGL_GL_TEXTURE_CUBE_MAP_POSITIVE_X_KHR: + case EGL_GL_TEXTURE_CUBE_MAP_NEGATIVE_X_KHR: + case EGL_GL_TEXTURE_CUBE_MAP_POSITIVE_Y_KHR: 
+ case EGL_GL_TEXTURE_CUBE_MAP_NEGATIVE_Y_KHR: + case EGL_GL_TEXTURE_
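For readers following the cube-map case in dri2_create_image_khr_texture above: the patch derives the face index by subtracting the POSITIVE_X target from the requested target. A minimal sketch of that arithmetic follows. The SKETCH_* constants and the function name are hypothetical stand-ins, not the EGL headers, and the only property assumed is that the six face targets are consecutive values starting at POSITIVE_X.

```c
#include <assert.h>

/* Stand-in values for illustration only; real code would use the
 * EGL_GL_TEXTURE_CUBE_MAP_*_KHR tokens from the EGL headers. The sketch
 * relies only on the six face targets being consecutive. */
enum {
   SKETCH_CUBE_POSITIVE_X = 0x30B3,
   SKETCH_CUBE_NEGATIVE_X,
   SKETCH_CUBE_POSITIVE_Y,
   SKETCH_CUBE_NEGATIVE_Y,
   SKETCH_CUBE_POSITIVE_Z,
   SKETCH_CUBE_NEGATIVE_Z
};

/* Return the 0..5 face index for a cube-map face target, or -1 if the
 * target is not one of the six faces. */
static int
sketch_cube_face_index(unsigned target)
{
   if (target < SKETCH_CUBE_POSITIVE_X || target > SKETCH_CUBE_NEGATIVE_Z)
      return -1;

   return (int) (target - SKETCH_CUBE_POSITIVE_X);
}
```

In the patch this index is passed as the depth argument of createImageFromTexture; the range check here is an addition of the sketch, since the patch relies on its switch statement to reject other targets.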
[Mesa-dev] [PATCH V6 0/8] intel: add support for EGL_KHR_gl_image
- Rename draw_x/y to tile_x/y in dri image struct. These are now used as adjustment pixels from our stored aligned offset to the exported image instead of the entire x/y offset from the base address.
- Take into consideration the offset from our bo so that sub-image functions resolve properly to our original image.
- Move mt->stencil_mt check out of misleading comment in intel_setup_image_from_mipmap_tree
Re: [Mesa-dev] [PATCH 5/8] glsl: Add support for lowering 4x8 pack/unpack operations
On 01/24/2013 10:47 PM, Matt Turner wrote: Lower them to arithmetic and bit manipulation expressions. --- src/glsl/ir_optimization.h |6 + src/glsl/lower_packing_builtins.cpp | 279 +++ 2 files changed, 285 insertions(+), 0 deletions(-) diff --git a/src/glsl/ir_optimization.h b/src/glsl/ir_optimization.h index ac90b87..8f33018 100644 --- a/src/glsl/ir_optimization.h +++ b/src/glsl/ir_optimization.h @@ -54,6 +54,12 @@ enum lower_packing_builtins_op { LOWER_PACK_HALF_2x16_TO_SPLIT= 0x0040, LOWER_UNPACK_HALF_2x16_TO_SPLIT = 0x0080, + + LOWER_PACK_SNORM_4x8 = 0x0100, + LOWER_UNPACK_SNORM_4x8 = 0x0200, + + LOWER_PACK_UNORM_4x8 = 0x0400, + LOWER_UNPACK_UNORM_4x8 = 0x0800, }; bool do_common_optimization(exec_list *ir, bool linked, diff --git a/src/glsl/lower_packing_builtins.cpp b/src/glsl/lower_packing_builtins.cpp index 49176cc..aa6765f 100644 --- a/src/glsl/lower_packing_builtins.cpp +++ b/src/glsl/lower_packing_builtins.cpp @@ -85,9 +85,15 @@ public: case LOWER_PACK_SNORM_2x16: *rvalue = lower_pack_snorm_2x16(op0); break; + case LOWER_PACK_SNORM_4x8: + *rvalue = lower_pack_snorm_4x8(op0); + break; case LOWER_PACK_UNORM_2x16: *rvalue = lower_pack_unorm_2x16(op0); break; + case LOWER_PACK_UNORM_4x8: + *rvalue = lower_pack_unorm_4x8(op0); + break; case LOWER_PACK_HALF_2x16: *rvalue = lower_pack_half_2x16(op0); break; @@ -97,9 +103,15 @@ public: case LOWER_UNPACK_SNORM_2x16: *rvalue = lower_unpack_snorm_2x16(op0); break; + case LOWER_UNPACK_SNORM_4x8: + *rvalue = lower_unpack_snorm_4x8(op0); + break; case LOWER_UNPACK_UNORM_2x16: *rvalue = lower_unpack_unorm_2x16(op0); break; + case LOWER_UNPACK_UNORM_4x8: + *rvalue = lower_unpack_unorm_4x8(op0); + break; case LOWER_UNPACK_HALF_2x16: *rvalue = lower_unpack_half_2x16(op0); break; @@ -137,18 +149,30 @@ private: case ir_unop_pack_snorm_2x16: result = op_mask & LOWER_PACK_SNORM_2x16; break; + case ir_unop_pack_snorm_4x8: + result = op_mask & LOWER_PACK_SNORM_4x8; + break; case ir_unop_pack_unorm_2x16: result = op_mask & 
LOWER_PACK_UNORM_2x16; break; + case ir_unop_pack_unorm_4x8: + result = op_mask & LOWER_PACK_UNORM_4x8; + break; case ir_unop_pack_half_2x16: result = op_mask & (LOWER_PACK_HALF_2x16 | LOWER_PACK_HALF_2x16_TO_SPLIT); break; case ir_unop_unpack_snorm_2x16: result = op_mask & LOWER_UNPACK_SNORM_2x16; break; + case ir_unop_unpack_snorm_4x8: + result = op_mask & LOWER_UNPACK_SNORM_4x8; + break; case ir_unop_unpack_unorm_2x16: result = op_mask & LOWER_UNPACK_UNORM_2x16; break; + case ir_unop_unpack_unorm_4x8: + result = op_mask & LOWER_UNPACK_UNORM_4x8; + break; case ir_unop_unpack_half_2x16: result = op_mask & (LOWER_UNPACK_HALF_2x16 | LOWER_UNPACK_HALF_2x16_TO_SPLIT); break; @@ -214,6 +238,30 @@ private: } /** +* \brief Pack four uint8's into a single uint32. +* +* Interpret the given uvec4 as a uint32 quad. Pack the quad into a uint32 +* where the least significant bits specify the first element of the quad. +* Return the uint32. +*/ + ir_rvalue* + pack_uvec4_to_uint(ir_rvalue *uvec4_rval) + { + assert(uvec4_rval->type == glsl_type::uvec4_type); + + /* uvec4 u = UVEC4_RVAL; */ + ir_variable *u = factory.make_temp(glsl_type::uvec4_type, + "tmp_pack_uvec4_to_uint"); + factory.emit(assign(u, uvec4_rval)); + + /* return ((u.w & 0xff) << 24) | ((u.z & 0xff) << 16) | ((u.y & 0xff) << 8) | (u.x & 0xff); */ + return bit_or(bit_or(lshift(bit_and(swizzle_w(u), constant(0xffu)), constant(24u)), + lshift(bit_and(swizzle_z(u), constant(0xffu)), constant(16u))), +bit_or(lshift(bit_and(swizzle_y(u), constant(0xffu)), constant(8u)), + bit_and(swizzle_x(u), constant(0xffu)))); + } + + /** * \brief Unpack a uint32 into two uint16's. * * Interpret the given uint32 as a uint16 pair where the uint32's least @@ -244,6 +292,44 @@ private: } /** +* \brief Unpack a uint32 into four uint8's. +* +* Interpret the given uint32 as a uint8 quad where the uint32's least +* significant bits specify the quad's first element. Return the uint8 +* quad as a uvec4. 
+*/ + ir_rvalue* + unpack_uint_to_uvec4(ir_rvalue *uint_rval) + { + assert(uint_rval->type == glsl_type::uint_
Re: [Mesa-dev] [PATCH 0/8] ARB_shading_language_packing
On 01/24/2013 10:44 PM, Matt Turner wrote:
> Following this email are eight patches that add the 4x8 pack/unpack operations that are the difference between what GLSL ES 3.0 and ARB_shading_language_packing require.
>
> They require Chad's gles3-glsl-packing series and are available at http://cgit.freedesktop.org/~mattst88/mesa/log/?h=ARB_shading_language_packing
>
> I've also added testing support on top of Chad's piglit patch. The {vs,fs}-unpackUnorm4x8 tests currently fail, and I've been unable to spot why.

Do they pass if you modify the tests to use my_unpackUnorm4x8 and hand-code a my_unpackUnorm4x8 that does what your lowering pass generates? In other words, is it possible this is exposing an existing bug?

> Please give it a look. It'd be nice to get this into 9.1.
>
> Thanks,
> Matt
[Mesa-dev] [Bug 59831] undefined symbol _ZN4llvm19createGlobalDCEPassEv in r600g
https://bugs.freedesktop.org/show_bug.cgi?id=59831

Johannes Obermayr changed:

           What    |Removed                        |Added
           Assignee|mesa-dev@lists.freedesktop.org |tstel...@gmail.com
         QA Contact|                               |mesa-dev@lists.freedesktop.org

--- Comment #5 from Johannes Obermayr ---
It was wrong to remove libr600_la_LDFLAGS in this patch: http://cgit.freedesktop.org/mesa/mesa/commit/?id=69d639ba8b3cfd95cfbb12b861dbe2eda53f2e25

And please change all Makefile.am to generate LLVM related LIBADDs this way to avoid stupid dependencies if LLVM was compiled with the better cmake build system which creates shared instead of static libs / one big shared lib and can save memory this way.
[Mesa-dev] [Bug 59851] New: AC_ARG_WITH misusage leading to mesa configure failure
https://bugs.freedesktop.org/show_bug.cgi?id=59851

Priority: medium
Bug ID: 59851
Assignee: mesa-dev@lists.freedesktop.org
Summary: AC_ARG_WITH misusage leading to mesa configure failure
Severity: normal
Classification: Unclassified
OS: All
Reporter: sardemff7+freedesk...@sardemff7.net
Hardware: All
Status: NEW
Version: git
Component: Mesa core
Product: Mesa

Created attachment 73648 --> https://bugs.freedesktop.org/attachment.cgi?id=73648&action=edit
Patch to fix mesa configure

Copy of the commit message:

The third argument of AC_ARG_WITH is evaluated for any provided value, not only on --with-, so it must not force-enable the feature.

Also, setting $with_llvm_shared_libs in the opencl check was overriding the user switch.
Re: [Mesa-dev] [PATCH 1/8] glsl: Add infrastructure for ARB_shading_language_packing
On 01/24/2013 10:47 PM, Matt Turner wrote: --- src/glsl/builtins/tools/generate_builtins.py |1 + src/glsl/glcpp/glcpp-parse.y |3 +++ src/glsl/glsl_parser_extras.cpp |1 + src/glsl/glsl_parser_extras.h|2 ++ src/glsl/standalone_scaffolding.cpp |1 + src/mesa/main/extensions.c |1 + src/mesa/main/mtypes.h |1 + 7 files changed, 10 insertions(+), 0 deletions(-) diff --git a/src/glsl/builtins/tools/generate_builtins.py b/src/glsl/builtins/tools/generate_builtins.py index 2cfb1a3..3db862e 100755 --- a/src/glsl/builtins/tools/generate_builtins.py +++ b/src/glsl/builtins/tools/generate_builtins.py @@ -189,6 +189,7 @@ read_builtins(GLenum target, const char *protos, const char **functions, unsigne st->OES_EGL_image_external_enable = true; st->ARB_shader_bit_encoding_enable = true; st->ARB_texture_cube_map_array_enable = true; + st->ARB_shading_language_packing_enable = true; _mesa_glsl_initialize_types(st); sh->ir = new(sh) exec_list; diff --git a/src/glsl/glcpp/glcpp-parse.y b/src/glsl/glcpp/glcpp-parse.y index 8fba923..e927c7c 100644 --- a/src/glsl/glcpp/glcpp-parse.y +++ b/src/glsl/glcpp/glcpp-parse.y @@ -1227,6 +1227,9 @@ glcpp_parser_create (const struct gl_extensions *extensions, int api) if (extensions->ARB_texture_cube_map_array) add_builtin_define(parser, "GL_ARB_texture_cube_map_array", 1); + + if (extensions->ARB_shading_language_packing) +add_builtin_define(parser, "GL_ARB_shading_language_packing", 1); } } diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp index b460c86..c8dbc89 100644 --- a/src/glsl/glsl_parser_extras.cpp +++ b/src/glsl/glsl_parser_extras.cpp @@ -462,6 +462,7 @@ static const _mesa_glsl_extension _mesa_glsl_supported_extensions[] = { EXT(ARB_uniform_buffer_object, true, false, true, true, false, ARB_uniform_buffer_object), EXT(OES_standard_derivatives, false, false, true, false, true, OES_standard_derivatives), EXT(ARB_texture_cube_map_array, true, false, true, true, false, ARB_texture_cube_map_array), + 
EXT(ARB_shading_language_packing, true, false, true, true, false, ARB_shading_language_packing), This array should be sorted... }; #undef EXT diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h index 2e6bb0b..53df149 100644 --- a/src/glsl/glsl_parser_extras.h +++ b/src/glsl/glsl_parser_extras.h @@ -272,6 +272,8 @@ struct _mesa_glsl_parse_state { bool OES_standard_derivatives_warn; bool ARB_texture_cube_map_array_enable; bool ARB_texture_cube_map_array_warn; + bool ARB_shading_language_packing_enable; + bool ARB_shading_language_packing_warn; /*@}*/ /** Extensions supported by the OpenGL implementation. */ diff --git a/src/glsl/standalone_scaffolding.cpp b/src/glsl/standalone_scaffolding.cpp index ccf5b4f..8b12f81 100644 --- a/src/glsl/standalone_scaffolding.cpp +++ b/src/glsl/standalone_scaffolding.cpp @@ -101,6 +101,7 @@ void initialize_context_to_defaults(struct gl_context *ctx, gl_api api) ctx->Extensions.ARB_shader_bit_encoding = true; ctx->Extensions.OES_standard_derivatives = true; ctx->Extensions.ARB_texture_cube_map_array = true; + ctx->Extensions.ARB_shading_language_packing = true; ctx->Const.GLSLVersion = 120; diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c index fd25d31..fb41760 100644 --- a/src/mesa/main/extensions.c +++ b/src/mesa/main/extensions.c @@ -125,6 +125,7 @@ static const struct extension extension_table[] = { { "GL_ARB_shader_stencil_export", o(ARB_shader_stencil_export), GL, 2009 }, { "GL_ARB_shader_texture_lod", o(ARB_shader_texture_lod), GL, 2009 }, { "GL_ARB_shading_language_100", o(ARB_shading_language_100),GLL,2003 }, + { "GL_ARB_shading_language_packing", o(ARB_shading_language_packing),GL, 2011 }, { "GL_ARB_shadow", o(ARB_shadow), GLL,2001 }, { "GL_ARB_sync",o(ARB_sync), GL, 2003 }, { "GL_ARB_texture_border_clamp", o(ARB_texture_border_clamp),GLL,2000 }, diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h index cba1e16..254679f 100644 --- a/src/mesa/main/mtypes.h +++ 
b/src/mesa/main/mtypes.h @@ -3042,6 +3042,7 @@ struct gl_extensions GLboolean ARB_shader_stencil_export; GLboolean ARB_shader_texture_lod; GLboolean ARB_shading_language_100; + GLboolean ARB_shading_language_packing; GLboolean ARB_sha
[Mesa-dev] [Bug 59831] undefined symbol _ZN4llvm19createGlobalDCEPassEv in r600g
https://bugs.freedesktop.org/show_bug.cgi?id=59831

--- Comment #4 from Tom Stellard ---
The problem is that llvm_wrapper.cpp is being built without --enable-opencl or --enable-r600-llvm-compiler, so the necessary libraries haven't been added to LLVM_LIBS. The fix is to disable building of llvm_wrapper.cpp in this case. I will write a patch.
Re: [Mesa-dev] [PATCH 10/32] glsl: Generate an interface type for uniform blocks
On 01/23/2013 09:49 PM, Paul Berry wrote: On 22 January 2013 00:52, Ian Romanick mailto:i...@freedesktop.org>> wrote: From: Ian Romanick mailto:ian.d.roman...@intel.com>> If the block has an instance name, add the instance name to the symbol table instead of the individual fields. Fixes the piglit test interface-name-access-without-interface-name.vert for real. Signed-off-by: Ian Romanick mailto:ian.d.roman...@intel.com>> --- src/glsl/ast_to_hir.cpp | 167 ++-- 1 file changed, 118 insertions(+), 49 deletions(-) diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp index 575dd84..a740a3c 100644 --- a/src/glsl/ast_to_hir.cpp +++ b/src/glsl/ast_to_hir.cpp @@ -4020,7 +4020,9 @@ ast_process_structure_or_interface_block(exec_list *instructions, struct _mesa_glsl_parse_state *state, exec_list *declarations, YYLTYPE &loc, -glsl_struct_field **fields_ret) +glsl_struct_field **fields_ret, + bool is_interface, + bool block_row_major) { unsigned decl_count = 0; @@ -4062,7 +4064,32 @@ ast_process_structure_or_interface_block(exec_list *instructions, foreach_list_typed (ast_declaration, decl, link, &decl_list->declarations) { -const struct glsl_type *field_type = decl_type; + /* From the GL_ARB_uniform_buffer_object spec: + * + * "Sampler types are not allowed inside of uniform + * blocks. All other types, arrays, and structures + * allowed for uniforms are allowed within a uniform + * block." 
+ */ + const struct glsl_type *field_type = decl_type; + + if (is_interface && field_type->contains_sampler()) { +YYLTYPE loc = decl_list->get_location(); +_mesa_glsl_error(&loc, state, + "Uniform in non-default uniform block contains sampler\n"); + } + + const struct ast_type_qualifier *const qual = +& decl_list->type->qualifier; + if (qual->flags.q.std140 || + qual->flags.q.packed || + qual->flags.q.shared) { +_mesa_glsl_error(&loc, state, + "uniform block layout qualifiers std140, packed, and " + "shared can only be applied to uniform blocks, not " + "members"); + } + if (decl->is_array) { field_type = process_array_type(&loc, decl_type, decl->array_size, state); @@ -4070,6 +4097,26 @@ ast_process_structure_or_interface_block(exec_list *instructions, fields[i].type = (field_type != NULL) ? field_type : glsl_type::error_type; fields[i].name = decl->identifier; + + if (qual->flags.q.row_major || qual->flags.q.column_major) { +if (!field_type->is_matrix() && !field_type->is_record()) { + _mesa_glsl_error(&loc, state, +"uniform block layout qualifiers row_major and " +"column_major can only be applied to matrix and " +"structure types"); +} else + validate_matrix_layout_for_type(state, &loc, field_type); + } + + if (field_type->is_matrix() || + (field_type->is_array() && field_type->fields.array->is_matrix())) { +fields[i].row_major = block_row_major; +if (qual->flags.q.row_major) + fields[i].row_major = true; +else if (qual->flags.q.column_major) + fields[i].row_major = false; + } + i++; } } @@ -4092,7 +4139,9 @@ ast_struct_specifier::hir(exec_list *instructions, state, &this->declarations, loc, - &fields); + &fields, + false, + false); const glsl_type *t = glsl_type::get_record_instance(fields, decl_count, t
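The matrix-layout hunk above boils down to a small precedence rule: a matrix member starts with the block's layout, and an explicit row_major or column_major qualifier on the member overrides it. A minimal sketch of that rule follows; the function and parameter names are hypothetical, and it mirrors only the order of checks visible in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Decide whether a matrix member of a uniform block is row-major.
 * The member inherits the block-level default unless it carries its own
 * qualifier. As in the patch, row_major is tested before column_major. */
static bool
sketch_field_row_major(bool block_row_major,
                       bool qual_row_major,
                       bool qual_column_major)
{
   bool row_major = block_row_major;

   if (qual_row_major)
      row_major = true;
   else if (qual_column_major)
      row_major = false;

   return row_major;
}
```

Note that in the patch this assignment happens only for matrix types and arrays of matrices; other member types leave the field untouched.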
[Mesa-dev] [Bug 59831] undefined symbol _ZN4llvm19createGlobalDCEPassEv in r600g
https://bugs.freedesktop.org/show_bug.cgi?id=59831

--- Comment #3 from Michel Dänzer ---
(In reply to comment #2)
> without. Is that required now?

No, but I do wonder if we shouldn't drop support for linking LLVM statically.
[Mesa-dev] [PATCH] mesa: implement GL_ARB_texture_buffer_range v4
v2: Record texObj.BufferSize as -1 in TexBuffer(non-Range) instead of the buffer's current size so we know we always have to use the full size of the buffer object (i.e. even if it changes without the user calling TexBuffer again) for the texture. Clarify invalid offset alignment error message. v3: Use extra GL_CORE-only section in get_hash_params.py for TEXTURE_BUFFER_OFFSET_ALIGNMENT. v4: Remove unnecessary check for profile in _mesa_TexBufferRange. Add check for extension enable in get_tex_level_parameter_buffer. --- src/mapi/glapi/gen/ARB_texture_buffer_range.xml | 22 ++ src/mapi/glapi/gen/Makefile.am |1 + src/mapi/glapi/gen/gl_API.xml |2 + src/mesa/main/context.c |1 + src/mesa/main/extensions.c |1 + src/mesa/main/get.c |1 + src/mesa/main/get_hash_params.py|6 ++ src/mesa/main/mtypes.h |6 ++ src/mesa/main/teximage.c| 84 ++- src/mesa/main/teximage.h|4 + src/mesa/main/texparam.c| 12 +++ 11 files changed, 123 insertions(+), 17 deletions(-) create mode 100644 src/mapi/glapi/gen/ARB_texture_buffer_range.xml diff --git a/src/mapi/glapi/gen/ARB_texture_buffer_range.xml b/src/mapi/glapi/gen/ARB_texture_buffer_range.xml new file mode 100644 index 000..2176c08 --- /dev/null +++ b/src/mapi/glapi/gen/ARB_texture_buffer_range.xml @@ -0,0 +1,22 @@ + + + + + + + + + + + + + + + + + + + + + + diff --git a/src/mapi/glapi/gen/Makefile.am b/src/mapi/glapi/gen/Makefile.am index f869d28..4d51bbc 100644 --- a/src/mapi/glapi/gen/Makefile.am +++ b/src/mapi/glapi/gen/Makefile.am @@ -108,6 +108,7 @@ API_XML = \ ARB_seamless_cube_map.xml \ ARB_sync.xml \ ARB_texture_buffer_object.xml \ + ARB_texture_buffer_range.xml \ ARB_texture_compression_rgtc.xml \ ARB_texture_float.xml \ ARB_texture_rg.xml \ diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml index 404ccea..8d700a1 100644 --- a/src/mapi/glapi/gen/gl_API.xml +++ b/src/mapi/glapi/gen/gl_API.xml @@ -8151,6 +8151,8 @@ http://www.w3.org/2001/XInclude"/> +http://www.w3.org/2001/XInclude"/> + 
http://www.w3.org/2001/XInclude"/> http://www.w3.org/2001/XInclude"/> diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c index 5e9e539..5058c07 100644 --- a/src/mesa/main/context.c +++ b/src/mesa/main/context.c @@ -564,6 +564,7 @@ _mesa_init_constants(struct gl_context *ctx) ctx->Const.MaxTextureMaxAnisotropy = MAX_TEXTURE_MAX_ANISOTROPY; ctx->Const.MaxTextureLodBias = MAX_TEXTURE_LOD_BIAS; ctx->Const.MaxTextureBufferSize = 65536; + ctx->Const.TextureBufferOffsetAlignment = 1; ctx->Const.MaxArrayLockSize = MAX_ARRAY_LOCK_SIZE; ctx->Const.SubPixelBits = SUB_PIXEL_BITS; ctx->Const.MinPointSize = MIN_POINT_SIZE; diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c index 5d01ac8..207572f 100644 --- a/src/mesa/main/extensions.c +++ b/src/mesa/main/extensions.c @@ -130,6 +130,7 @@ static const struct extension extension_table[] = { { "GL_ARB_texture_border_clamp", o(ARB_texture_border_clamp),GLL,2000 }, { "GL_ARB_texture_buffer_object", o(ARB_texture_buffer_object), GLC,2008 }, { "GL_ARB_texture_buffer_object_rgb32", o(ARB_texture_buffer_object_rgb32), GLC,2009 }, + { "GL_ARB_texture_buffer_range", o(ARB_texture_buffer_range),GLC,2012 }, { "GL_ARB_texture_compression", o(dummy_true), GLL,2000 }, { "GL_ARB_texture_compression_rgtc", o(ARB_texture_compression_rgtc),GL, 2004 }, { "GL_ARB_texture_cube_map",o(ARB_texture_cube_map), GLL,1999 }, diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c index 5f4e2fa..da1e01c 100644 --- a/src/mesa/main/get.c +++ b/src/mesa/main/get.c @@ -353,6 +353,7 @@ EXTRA_EXT(ARB_uniform_buffer_object); EXTRA_EXT(ARB_timer_query); EXTRA_EXT(ARB_map_buffer_alignment); EXTRA_EXT(ARB_texture_cube_map_array); +EXTRA_EXT(ARB_texture_buffer_range); static const int extra_NV_primitive_restart[] = { diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py index 26a722a..b6bed80 100644 --- a/src/mesa/main/get_hash_params.py +++ b/src/mesa/main/get_hash_params.py @@ -701,6 +701,12 @@ descriptor=[ # 
GL_ARB_texture_cube_map_array [ "TEXTURE_BINDING_CUBE_MAP_ARRAY_ARB", "LOC_CUSTOM, TYPE_INT, TEXTURE_CUBE_ARRAY_INDEX, extra_ARB_texture_cube_map_array" ], +]}, + +# Enums restricted to OpenGL Core profile +{ "apis": ["GL_CORE"], "p
Re: [Mesa-dev] [PATCH 24/32] glsl: Make the align function available elsewhere in the linker
On 01/24/2013 08:40 PM, Kenneth Graunke wrote:
> On 01/22/2013 12:52 AM, Ian Romanick wrote:
>> From: Ian Romanick
>>
>> Signed-off-by: Ian Romanick
>> ---
>>  src/glsl/glsl_types.cpp          | 12 +++-
>>  src/glsl/glsl_types.h            |  6 ++
>>  src/glsl/link_uniforms.cpp       | 14 --
>>  src/glsl/lower_ubo_reference.cpp | 19 +++
>>  4 files changed, 20 insertions(+), 31 deletions(-)
>>
>> diff --git a/src/glsl/glsl_types.cpp b/src/glsl/glsl_types.cpp
>> index 0075550..ddd0148 100644
>> --- a/src/glsl/glsl_types.cpp
>> +++ b/src/glsl/glsl_types.cpp
>> @@ -863,12 +863,6 @@ glsl_type::std140_base_alignment(bool row_major) const
>>     return -1;
>>  }
>>
>> -static unsigned
>> -align(unsigned val, unsigned align)
>> -{
>> -   return (val + align - 1) / align * align;
>> -}
>> -
>
> Why not just eliminate this function altogether and use ALIGN() from macros.h? (The implementation is slightly different, but I think it should work.)

I thought about that. The ALIGN macro only works when align is a power of two, and it wasn't obvious to me that all the uses of this function met that requirement. I did this refactor right before sending this series out, and it felt a little like the 11th hour to do something that could have a functional change. I'd prefer to revisit this after the release.
unsigned glsl_type::std140_size(bool row_major) const { @@ -970,11 +964,11 @@ glsl_type::std140_size(bool row_major) const for (unsigned i = 0; i < this->length; i++) { const struct glsl_type *field_type = this->fields.structure[i].type; unsigned align = field_type->std140_base_alignment(row_major); - size = (size + align - 1) / align * align; + size = glsl_align(size, align); size += field_type->std140_size(row_major); } - size = align(size, - this->fields.structure[0].type->std140_base_alignment(row_major)); + size = glsl_align(size, + this->fields.structure[0].type->std140_base_alignment(row_major)); return size; } diff --git a/src/glsl/glsl_types.h b/src/glsl/glsl_types.h index 8588685..b0db2bf 100644 --- a/src/glsl/glsl_types.h +++ b/src/glsl/glsl_types.h @@ -601,6 +601,12 @@ struct glsl_struct_field { bool row_major; }; +static inline unsigned int +glsl_align(unsigned int a, unsigned int align) +{ + return (a + align - 1) / align * align; +} + #endif /* __cplusplus */ #endif /* GLSL_TYPES_H */ diff --git a/src/glsl/link_uniforms.cpp b/src/glsl/link_uniforms.cpp index 2a1af6b..439b711 100644 --- a/src/glsl/link_uniforms.cpp +++ b/src/glsl/link_uniforms.cpp @@ -29,12 +29,6 @@ #include "program/hash_table.h" #include "program.h" -static inline unsigned int -align(unsigned int a, unsigned int align) -{ - return (a + align - 1) / align * align; -} - /** * \file link_uniforms.cpp * Assign locations for GLSL uniforms. 
@@ -421,13 +415,13 @@ private: this->uniforms[id].block_index = this->ubo_block_index; unsigned alignment = type->std140_base_alignment(ubo_row_major); - this->ubo_byte_offset = align(this->ubo_byte_offset, alignment); + this->ubo_byte_offset = glsl_align(this->ubo_byte_offset, alignment); this->uniforms[id].offset = this->ubo_byte_offset; this->ubo_byte_offset += type->std140_size(ubo_row_major); if (type->is_array()) { this->uniforms[id].array_stride = - align(type->fields.array->std140_size(ubo_row_major), 16); + glsl_align(type->fields.array->std140_size(ubo_row_major), 16); } else { this->uniforms[id].array_stride = 0; } @@ -564,7 +558,7 @@ link_assign_uniform_block_offsets(struct gl_shader *shader) unsigned alignment = type->std140_base_alignment(ubo_var->RowMajor); unsigned size = type->std140_size(ubo_var->RowMajor); - offset = align(offset, alignment); + offset = glsl_align(offset, alignment); ubo_var->Offset = offset; offset += size; } @@ -580,7 +574,7 @@ link_assign_uniform_block_offsets(struct gl_shader *shader) * and rounding up to the next multiple of the base * alignment required for a vec4." */ - block->UniformBufferSize = align(offset, 16); + block->UniformBufferSize = glsl_align(offset, 16); } } diff --git a/src/glsl/lower_ubo_reference.cpp b/src/glsl/lower_ubo_reference.cpp index 1d08009..8d13ec1 100644 --- a/src/glsl/lower_ubo_reference.cpp +++ b/src/glsl/lower_ubo_reference.cpp @@ -61,12 +61,6 @@ public: bool progress; }; -static inline unsigned int -align(unsigned int a, unsigned int align) -{ - return (a + align - 1) / align * align; -} - void lower_ubo_reference_visitor::handle_rvalue(ir_rvalue **rvalue) { @@ -113,7 +107,7 @@ lower_ubo_reference_visitor::handle_rvalue(ir_rvalue **rvalue) array_stride = 4; } else { array_stride = deref_array->type->std140_size(row_major); -array_stride = align(array_stride, 16); +array_stride = glsl_align
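The concern raised in this thread, that a mask-based alignment macro is only correct for power-of-two alignments while the division-based helper works for any non-zero alignment, is easy to demonstrate. The sketch below uses two hypothetical functions: sketch_align_div copies the arithmetic of glsl_align from the patch, and sketch_align_mask is a generic mask-based round-up written for this example, assumed to behave like the ALIGN() macro the thread refers to.

```c
#include <assert.h>

/* Division-based round-up, same arithmetic as glsl_align in the patch.
 * Correct for any non-zero alignment. */
static unsigned
sketch_align_div(unsigned a, unsigned align)
{
   assert(align != 0);
   return (a + align - 1) / align * align;
}

/* Mask-based round-up. Only correct when align is a power of two,
 * because ~(align - 1) is a clean bit mask only in that case. */
static unsigned
sketch_align_mask(unsigned a, unsigned align)
{
   assert(align != 0);
   return (a + align - 1) & ~(align - 1);
}
```

For an alignment of 16 both give the same answer, but for an alignment of 12 the mask version rounds 10 up to 20, which is not a multiple of 12. Since the std140 base alignments used by the callers in this patch are passed in as plain integers, keeping the division form avoids having to prove that every caller passes a power of two.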
[Mesa-dev] [Bug 59831] undefined symbol _ZN4llvm19createGlobalDCEPassEv in r600g
https://bugs.freedesktop.org/show_bug.cgi?id=59831

--- Comment #2 from Alex Deucher ---
without. Is that required now? My configure options are:

./autogen.sh --prefix=/usr --libdir=/usr/lib64 --with-dri-drivers=radeon,r200 --with-gallium-drivers=r300,r600,radeonsi,swrast --enable-gles1 --enable-gles2 --enable-xorg --enable-vdpau --enable-shared-glapi --enable-gbm --enable-gallium-llvm --with-egl-platforms=drm --enable-glx-tls --enable-debug

Also, I'm using llvm c5c65f9ad0e1e897f6d828248bdf25a6714cdd09 from Tom's tree.
[Mesa-dev] [Bug 59831] undefined symbol _ZN4llvm19createGlobalDCEPassEv in r600g
https://bugs.freedesktop.org/show_bug.cgi?id=59831

--- Comment #1 from Michel Dänzer ---
Is that with or without --with-llvm-shared-libs for the Mesa build?
Re: [Mesa-dev] [PATCH 00/32] UBOs for OpenGL ES 3.0
23 & 26-31 Reviewed-by: Jordan Justen

On Tue, Jan 22, 2013 at 12:51 AM, Ian Romanick wrote:
> So here it is.
>
> This is the last of the UBO instance and array instance rework for the linker. It's a giant pile of patches, so let me explain what's going on.
>
> Previous to this patch series, information about the layout of a UBO was created at compile-time during ast-to-ir translation. This made it somewhere between difficult and impossible to implement several required features for OpenGL ES 3.0 conformance.
>
> 1. Uniform blocks with an instance name. These blocks have different scoping rules, and the fields are exposed to applications differently through the GL API. In the shader, these are accessed like structures.
>
> 2. Arrays of uniform blocks. These basically compound the issues of instance names. For example, to query the layout of an instance array block, you do *not* use the array index.
>
> 3. Marking unused block members and unused blocks as not active. This was actually way more annoying to deal with than I had expected. Even with the std140 layout, if a block member is never used in a shader, it should not show up in the active list.
>
> All of these issues led me to a design that does all of the layout during linking. This allows our usual dead variable elimination and a bunch of other nice things.
>
> To do this, I added a new type called GLSL_TYPE_INTERFACE. Interfaces work mostly like structures, but they have additional semantic limitations (imposed by the language). Once that was in place in the compiler front-end, the linker just needed to detect unused blocks and block members, cross-validate the blocks, and assign the offsets.
>
> The bulk of the added code is in link_uniform_blocks. This is the real work-horse of the whole deal. The functions that do all the intra-shader layouts and name assignments for the blocks live here.
>
> Other than the few cases mentioned in individual commit messages, there are no commit-to-commit piglit or gles3conform regressions. I don't believe there are any commit-to-commit build failures, but I'll double check that before I push.
>
> With this series, i965 passes all of the gles3conform UBO tests on IVB. I believe there is still one issue on SNB, but I haven't tested it.
>
>  src/glsl/Makefile.sources                      |   2 +
>  src/glsl/ast.h                                 |  12 ++-
>  src/glsl/ast_to_hir.cpp                        | 248 +++---
>  src/glsl/builtin_types.h                       |  74 +++
>  src/glsl/glsl_parser.yy                        |  82 -
>  src/glsl/glsl_symbol_table.cpp                 |  14 +--
>  src/glsl/glsl_symbol_table.h                   |   1 -
>  src/glsl/glsl_types.cpp                        |  94 +++
>  src/glsl/glsl_types.h                          |  43 -
>  src/glsl/hir_field_selection.cpp               |   3 +-
>  src/glsl/ir.cpp                                |   1 -
>  src/glsl/ir.h                                  |  33 ---
>  src/glsl/ir_clone.cpp                          |  12 ++-
>  src/glsl/link_uniform_block_active_visitor.cpp | 162 +
>  src/glsl/link_uniform_block_active_visitor.h   |  62 +
>  src/glsl/link_uniform_blocks.cpp               | 313
>  src/glsl/link_uniform_initializers.cpp         |   6 +-
>  src/glsl/link_uniforms.cpp                     | 250 +++
>  src/glsl/linker.cpp                            |  25 ++
>  src/glsl/linker.h                              |  45 +-
>  src/glsl/lower_ubo_reference.cpp               | 104 ++---
>  src/glsl/opt_dead_code.cpp                     |   7 +-
>  src/glsl/tests/uniform_initializer_utils.cpp   |   3 +
>  src/mesa/drivers/dri/i965/brw_fs.cpp           |   8 +-
>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp   |   6 +-
>  src/mesa/drivers/dri/i965/brw_shader.cpp       |   8 +-
>  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp |  10 ++-
>  src/mesa/main/mtypes.h                         |  27 ++
>  src/mesa/main/uniforms.c                       |   2 +-
>  src/mesa/program/ir_to_mesa.cpp                |  26 --
>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp     |   8 +-
>  31 files changed, 1355 insertions(+), 336 deletions(-)
Re: [Mesa-dev] [PATCH 28/32] glsl: Add link_uniform_blocks to calculate all UBO data at link-time
On Tue, Jan 22, 2013 at 12:52 AM, Ian Romanick wrote:
> From: Ian Romanick
>
> Calculate all of the block member offsets, the IndexNames, and
> everything else to do with every UBO.
>
> Signed-off-by: Ian Romanick
> ---
>  src/glsl/link_uniform_blocks.cpp | 248 +++
>  src/glsl/linker.h                |   7 ++
>  2 files changed, 255 insertions(+)
>
> diff --git a/src/glsl/link_uniform_blocks.cpp b/src/glsl/link_uniform_blocks.cpp
> index c9cbde9..74fe1e2 100644
> --- a/src/glsl/link_uniform_blocks.cpp
> +++ b/src/glsl/link_uniform_blocks.cpp
> @@ -25,8 +25,256 @@
>  #include "ir.h"
>  #include "linker.h"
>  #include "ir_uniform.h"
> +#include "link_uniform_block_active_visitor.h"
> +#include "main/hash_table.h"
>  #include "program.h"
>
> +class ubo_visitor : public uniform_field_visitor {
> +public:
> +   ubo_visitor(void *mem_ctx, gl_uniform_buffer_variable *variables,
> +               unsigned num_variables)
> +      : index(0), offset(0), buffer_size(0), variables(variables),
> +        num_variables(num_variables), mem_ctx(mem_ctx), is_array_instance(false)
> +   {
> +      /* empty */
> +   }
> +
> +   void process(const glsl_type *type, const char *name)
> +   {
> +      this->offset = 0;
> +      this->buffer_size = 0;
> +      this->is_array_instance = strchr(name, ']') != NULL;
> +      this->uniform_field_visitor::process(type, name);
> +   }
> +
> +   unsigned index;
> +   unsigned offset;
> +   unsigned buffer_size;
> +   gl_uniform_buffer_variable *variables;
> +   unsigned num_variables;
> +   void *mem_ctx;
> +   bool is_array_instance;
> +
> +private:
> +   virtual void visit_field(const glsl_type *type, const char *name,
> +                            bool row_major)
> +   {
> +      assert(this->index < this->num_variables);
> +
> +      gl_uniform_buffer_variable *v = &this->variables[this->index++];
> +
> +      v->Name = ralloc_strdup(mem_ctx, name);
> +      v->Type = type;
> +      v->RowMajor = row_major;
> +
> +      if (this->is_array_instance) {
> +         v->IndexName = ralloc_strdup(mem_ctx, name);
> +
> +         char *open_bracket = strchr(v->IndexName, '[');
> +         assert(open_bracket != NULL);
> +
> +         char *close_bracket = strchr(open_bracket, ']');
> +         assert(close_bracket != NULL);
> +
> +         /* Length of the tail without the ']' but with the NUL.
> +          */
> +         unsigned len = strlen(close_bracket + 1) + 1;
> +
> +         memmove(open_bracket, close_bracket + 1, len);
> +      } else {

Missing a space of indentation.

-Jordan

> +         v->IndexName = v->Name;
> +      }
> +
> +      unsigned alignment = type->std140_base_alignment(v->RowMajor);
> +      unsigned size = type->std140_size(v->RowMajor);
> +
> +      this->offset = glsl_align(this->offset, alignment);
> +      v->Offset = this->offset;
> +      this->offset += size;
> +
> +      /* From the GL_ARB_uniform_buffer_object spec:
> +       *
> +       *     "For uniform blocks laid out according to [std140] rules, the
> +       *      minimum buffer object size returned by the
> +       *      UNIFORM_BLOCK_DATA_SIZE query is derived by taking the offset of
> +       *      the last basic machine unit consumed by the last uniform of the
> +       *      uniform block (including any end-of-array or end-of-structure
> +       *      padding), adding one, and rounding up to the next multiple of
> +       *      the base alignment required for a vec4."
> +       */
> +      this->buffer_size = glsl_align(this->offset, 16);
> +   }
> +
> +   virtual void visit_field(const glsl_struct_field *field)
> +   {
> +      this->offset = glsl_align(this->offset,
> +                                field->type->std140_base_alignment(false));
> +   }
> +};
> +
> +class count_block_size : public uniform_field_visitor {
> +public:
> +   count_block_size() : num_active_uniforms(0)
> +   {
> +      /* empty */
> +   }
> +
> +   unsigned num_active_uniforms;
> +
> +private:
> +   virtual void visit_field(const glsl_type *type, const char *name,
> +                            bool row_major)
> +   {
> +      (void) type;
> +      (void) name;
> +      (void) row_major;
> +      this->num_active_uniforms++;
> +   }
> +};
> +
> +struct block {
> +   const glsl_type *type;
> +   bool has_instance_name;
> +};
> +
> +int
> +link_uniform_blocks(void *mem_ctx,
> +                    struct gl_shader_program *prog,
> +                    struct gl_shader **shader_list,
> +                    unsigned num_shaders,
> +                    struct gl_uniform_block **blocks_ret)
> +{
> +   /* This hash table will track all of the uniform blocks that have been
> +    * encountered. Since blocks with the same block-name must be the same,
> +    * the hash is organized by block-name.
> +    */
> +   struct hash_table *block_hash =
> +      _mesa_hash_table_create(mem_ctx, _mesa_key_string_equal);
> +
> +   /* Determine which uniform blocks are active.
> +
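The offset bookkeeping in the quoted visit_field (align, record, advance, then round the running total up to 16 for the buffer size) can be tried outside the linker. The following is a minimal standalone sketch, not Mesa code: align_up stands in for glsl_align, and the member struct and layout_block function are hypothetical names invented for illustration. Alignments and sizes are supplied by the caller rather than computed from a glsl_type.

```cpp
#include <vector>

// Stand-in for glsl_align: round value up to a multiple of alignment.
static unsigned align_up(unsigned value, unsigned alignment)
{
   return (value + alignment - 1) / alignment * alignment;
}

// Hypothetical description of one block member; alignment and size are
// the std140 base alignment and size in bytes, provided by the caller.
struct member {
   const char *name;
   unsigned alignment;
   unsigned size;
   unsigned offset;   // filled in by layout_block
};

// Assign an offset to every member and return the block's buffer size,
// rounded up to the base alignment of a vec4 (16 bytes).
unsigned layout_block(std::vector<member> &members)
{
   unsigned offset = 0;
   for (member &m : members) {
      offset = align_up(offset, m.alignment);
      m.offset = offset;
      offset += m.size;
   }
   return align_up(offset, 16);
}
```

For a block containing a float (alignment 4, size 4), a vec3 (alignment 16, size 12) and a vec2 (alignment 8, size 8), this yields offsets 0, 16 and 32 and a buffer size of 48.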
Re: [Mesa-dev] [PATCH 2/2] st/mesa: do proper error checking for u_upload_alloc() calls
Series is
Reviewed-by: Jose Fonseca

----- Original Message -----
> We weren't properly checking the return value of these calls (and
> calls to u_upload_data()) to detect OOM errors.
> ---
>  src/mesa/state_tracker/st_cb_bitmap.c     |    5 ++---
>  src/mesa/state_tracker/st_cb_clear.c      |    5 ++---
>  src/mesa/state_tracker/st_cb_drawpixels.c |    5 ++---
>  src/mesa/state_tracker/st_cb_drawtex.c    |    7 +++
>  src/mesa/state_tracker/st_draw.c          |   21 +
>  5 files changed, 26 insertions(+), 17 deletions(-)
>
> diff --git a/src/mesa/state_tracker/st_cb_bitmap.c b/src/mesa/state_tracker/st_cb_bitmap.c
> index 843dc5b..63dbdb2 100644
> --- a/src/mesa/state_tracker/st_cb_bitmap.c
> +++ b/src/mesa/state_tracker/st_cb_bitmap.c
> @@ -350,9 +350,8 @@ setup_bitmap_vertex_data(struct st_context *st, bool normalized,
>        tBot = (GLfloat) height;
>     }
>
> -   u_upload_alloc(st->uploader, 0, 4 * sizeof(vertices[0]), vbuf_offset, vbuf,
> -                  (void**)&vertices);
> -   if (!vbuf) {
> +   if (u_upload_alloc(st->uploader, 0, 4 * sizeof(vertices[0]),
> +                      vbuf_offset, vbuf, (void **) &vertices) != PIPE_OK) {
>        return;
>     }
>
> diff --git a/src/mesa/state_tracker/st_cb_clear.c b/src/mesa/state_tracker/st_cb_clear.c
> index d01236e..a5aa8f4 100644
> --- a/src/mesa/state_tracker/st_cb_clear.c
> +++ b/src/mesa/state_tracker/st_cb_clear.c
> @@ -141,9 +141,8 @@ draw_quad(struct st_context *st,
>     GLuint i, offset;
>     float (*vertices)[2][4]; /**< vertex pos + color */
>
> -   u_upload_alloc(st->uploader, 0, 4 * sizeof(vertices[0]), &offset, &vbuf,
> -                  (void**)&vertices);
> -   if (!vbuf) {
> +   if (u_upload_alloc(st->uploader, 0, 4 * sizeof(vertices[0]),
> +                      &offset, &vbuf, (void **) &vertices) != PIPE_OK) {
>        return;
>     }
>
> diff --git a/src/mesa/state_tracker/st_cb_drawpixels.c b/src/mesa/state_tracker/st_cb_drawpixels.c
> index 65f1160..c944b81 100644
> --- a/src/mesa/state_tracker/st_cb_drawpixels.c
> +++ b/src/mesa/state_tracker/st_cb_drawpixels.c
> @@ -568,9 +568,8 @@ draw_quad(struct gl_context *ctx, GLfloat x0, GLfloat y0, GLfloat z,
>     struct pipe_resource *buf = NULL;
>     unsigned offset;
>
> -   u_upload_alloc(st->uploader, 0, 4 * sizeof(verts[0]), &offset, &buf,
> -                  (void**)&verts);
> -   if (!buf) {
> +   if (u_upload_alloc(st->uploader, 0, 4 * sizeof(verts[0]), &offset,
> +                      &buf, (void **) &verts) != PIPE_OK) {
>        return;
>     }
>
> diff --git a/src/mesa/state_tracker/st_cb_drawtex.c b/src/mesa/state_tracker/st_cb_drawtex.c
> index 269068d..5ca0970 100644
> --- a/src/mesa/state_tracker/st_cb_drawtex.c
> +++ b/src/mesa/state_tracker/st_cb_drawtex.c
> @@ -148,10 +148,9 @@ st_DrawTex(struct gl_context *ctx, GLfloat x, GLfloat y, GLfloat z,
>        GLfloat *vbuf = NULL;
>        GLuint attr;
>
> -      u_upload_alloc(st->uploader, 0,
> -                     numAttribs * 4 * 4 * sizeof(GLfloat),
> -                     &offset, &vbuffer, (void**)&vbuf);
> -      if (!vbuffer) {
> +      if (u_upload_alloc(st->uploader, 0,
> +                         numAttribs * 4 * 4 * sizeof(GLfloat),
> +                         &offset, &vbuffer, (void **) &vbuf) != PIPE_OK) {
>           return;
>        }
>
> diff --git a/src/mesa/state_tracker/st_draw.c b/src/mesa/state_tracker/st_draw.c
> index de539ca..de62264 100644
> --- a/src/mesa/state_tracker/st_draw.c
> +++ b/src/mesa/state_tracker/st_draw.c
> @@ -84,7 +84,12 @@ all_varyings_in_vbos(const struct gl_client_array *arrays[])
>  }
>
>
> -static void
> +/**
> + * Basically, translate Mesa's index buffer information into
> + * a pipe_index_buffer object.
> + * \return TRUE or FALSE for success/failure
> + */
> +static boolean
>  setup_index_buffer(struct st_context *st,
>                     const struct _mesa_index_buffer *ib,
>                     struct pipe_index_buffer *ibuffer)
> @@ -100,8 +105,12 @@ setup_index_buffer(struct st_context *st,
>        ibuffer->offset = pointer_to_offset(ib->ptr);
>     }
>     else if (st->indexbuf_uploader) {
> -      u_upload_data(st->indexbuf_uploader, 0, ib->count * ibuffer->index_size,
> -                    ib->ptr, &ibuffer->offset, &ibuffer->buffer);
> +      if (u_upload_data(st->indexbuf_uploader, 0,
> +                        ib->count * ibuffer->index_size, ib->ptr,
> +                        &ibuffer->offset, &ibuffer->buffer) != PIPE_OK) {
> +         /* out of memory */
> +         return FALSE;
> +      }
>        u_upload_unmap(st->indexbuf_uploader);
>     }
>     else {
> @@ -110,6 +119,7 @@ setup_index_buffer(struct st_context *st,
>     }
>
>     cso_set_index_buffer(st->cso_context, ibuffer);
> +   return TRUE;
>  }
>
>
> @@ -220,7 +230,10 @@ st_draw_vbo(struct gl_context *ctx,
>        vbo_get_minmax_indices(ctx, prims, ib, &min_index, &max_ind