[Mesa-dev] [Bug 61821] src/mesa/drivers/dri/common/xmlpool.h:96:29: fatal error: xmlpool/options.h
https://bugs.freedesktop.org/show_bug.cgi?id=61821

Vinson Lee v...@freedesktop.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |---

--- Comment #5 from Vinson Lee v...@freedesktop.org ---
mesa: a6bb7a94957468453c436e3860ee2dd47575c461 (master)

This is still failing for me here. Failure case and output is still identical to comment #0.

--
You are receiving this mail because:
You are on the CC list for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] radeonsi: Add compute support v2
On Tue, 2013-03-12 at 17:42 -0400, Alex Deucher wrote:
> On Tue, Mar 12, 2013 at 4:23 PM, Tom Stellard t...@stellard.net wrote:
> > From: Tom Stellard thomas.stell...@amd.com
> >
> > v2:
> >   - Only dump shaders when env variable is set.
>
> A couple of comments below, other than that, looks good.
>
> Reviewed-by: Alex Deucher alexander.deuc...@amd.com

Likewise,

Reviewed-by: Michel Dänzer michel.daen...@amd.com

> > @@ -139,6 +140,11 @@ void si_pm4_inval_texture_cache(struct si_pm4_state *state)
> >  	state->cp_coher_cntl |= S_0085F0_TC_ACTION_ENA(1);
> >  }
> >
> > +void si_pm4_inval_texture_l1_cache(struct si_pm4_state *state)
> > +{
> > +	state->cp_coher_cntl |= S_0085F0_TCL1_ACTION_ENA(1);
> > +}
> > +
>
> Is there any value in keeping the L1 flush separate?

I don't think so: TC_ACTION_ENA should take care of L1 as well (search for INVL2 in the register spec).

> Would it make more sense to just add it to si_pm4_inval_texture_cache()?

Yeah, for clarity's sake it might be a good idea to make the above explicit by adding S_0085F0_TCL1_ACTION_ENA where S_0085F0_TC_ACTION_ENA is used.

> On a somewhat related note, I'm also not sure it's worth having a
> separate si_pm4_inval_vertex_cache() since there is no VC anymore and
> the function is identical to si_pm4_inval_texture_cache().

Right, I think the main reason I kept it was in case it might help share more code with r600g again.

--
Earthling Michel Dänzer           |                   http://www.amd.com
Libre software enthusiast         |          Debian, X and DRI developer
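A minimal sketch of the suggestion discussed above, setting the L1 bit explicitly alongside the texture cache bit in a single invalidate function. The struct, function name, and bit positions below are hypothetical stand-ins for illustration; the real code would use the S_0085F0_* macros and struct si_pm4_state from the radeonsi sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit positions, for illustration only. The real values come
 * from the S_0085F0_TC_ACTION_ENA / S_0085F0_TCL1_ACTION_ENA macros. */
#define DEMO_TC_ACTION_ENA   (UINT32_C(1) << 23)
#define DEMO_TCL1_ACTION_ENA (UINT32_C(1) << 22)

/* Hypothetical stand-in for struct si_pm4_state. */
struct demo_pm4_state {
    uint32_t cp_coher_cntl;
};

/* Invalidate the texture cache and name the L1 bit explicitly,
 * instead of keeping a separate L1-only helper. */
static void demo_inval_texture_cache(struct demo_pm4_state *state)
{
    assert(state != NULL);
    state->cp_coher_cntl |= DEMO_TC_ACTION_ENA | DEMO_TCL1_ACTION_ENA;
}
```

Whether the extra bit is needed at all depends on the hardware behaviour Michel describes (TC_ACTION_ENA already covering L1); the sketch only shows the "make it explicit" variant.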
[Mesa-dev] [Bug 61821] src/mesa/drivers/dri/common/xmlpool.h:96:29: fatal error: xmlpool/options.h
https://bugs.freedesktop.org/show_bug.cgi?id=61821

--- Comment #6 from Marc marvi...@gmx.de ---
works fine here.

git clean -f -d -x -e b.sh
./autogen.sh \
    --prefix=/usr \
    --libdir=/usr/lib64 \
    --disable-debug \
    --enable-texture-float \
    --enable-gles1 \
    --enable-gles2 \
    --enable-openvg \
    --enable-xorg \
    --enable-xvmc \
    --enable-vdpau \
    --enable-shared-glapi \
    --enable-glx-tls \
    --enable-gallium-llvm \
    --enable-gallium-egl \
    --enable-gbm \
    --enable-gallium-gbm \
    --with-gallium-drivers=swrast,r600 \
    --with-llvm-prefix=/source/dri-project/llvm/Release \
    --with-llvm-shared-libs \
    --enable-r600-llvm-compiler \
    --with-dri-drivers= \
    --with-dri-driverdir=/usr/lib64/dri
make
[Mesa-dev] OSMesa VTK
My tests of nightly VTK built against nightly OSMesa failed last night.

The onscreen builds are fine.

The OSMesa build tests now fail with all black output but I haven't found any error message to inform me of what's going on.

I know osmesa just switched to Gallium: that is reflected in VTK's LoadOpenGLExtension test output :

GL_VERSION: 2.1 Mesa 9.2-devel (git-6173cc1)
GL_RENDERER: Mesa OffScreen

became

GL_VERSION: 2.1 Mesa 9.2.0 (git-f7ef83c)
GL_RENDERER: Gallium 0.4 on llvmpipe (LLVM 3.0, 128 bits)

Mesa was built :

./autogen.sh \
    --prefix=/home/kevin/mesa_nightly \
    --enable-glx \
    --enable-dri \
    --enable-shared-glapi \
    --enable-gallium-llvm \
    --with-gallium-drivers=nouveau,swrast \
    --enable-osmesa
Re: [Mesa-dev] glxgears is faster but 3D render is so slow
Hi Brian,

Sorry for not being clear, let me clarify it.

On 3/13/13, Brian Paul bri...@vmware.com wrote:
> Well, the Xlib/swrast driver does everything in software, unlike a DRI
> driver which does most things with the GPU. Xlib will always be slower.

My local test machine graphic card does not have hardware acceleration, it does not support OpenGL, it does not have NVIDIA. I guess the DRI driver may still implement software GL even though it might access some basic functions of the low budget graphic card, but correct me, if I am using wrong terminology.

> The gallium llvmpipe driver should be quite a bit faster. Just install
> LLVM first, then reconfigure/rebuild Mesa, set your LD_LIBRARY_PATH to
> the lib/gallium/ directory and you should get llvmpipe.

Yes, I have already built the Mesalib with LLVM, I posted the configuration in my last email, let me post it again.

${SOURCE}/${CONFIGURE} --prefix=${INSTALL} --enable-xlib-glx --disable-dri --enable-gallium-llvm --with-llvm-shared-libs

It did not produce a faster result, it was virtually not much differences when I run Chimera comparing to use of --with-gallium-drivers=swrast unless my configuration is wrong. Please let me know a correct version of configuration for llvmpipe. The libdrm version is 2.4.42. The libllvm version is 3.2.

I can also use a test program to measure Mesa in different drivers if you could let me know which test program can be used for benchmarking the Mesalib using different drivers of swrast, or llvm or DRI? And where is the test program source code I can download from?

Please also see attached glxinfo for Mesa llvm.

Thank you.

Kind regards,

Jupiter

> -Brian
>
> On 03/12/2013 07:37 AM, jupiter wrote:
> > Hi Brian,
> >
> > You are right, setting MESA_GLX_DEPTH_BITS to 24 bit does not change
> > anything. So why Xlib has such poor performance to run following
> > application?
> >
> > http://www.cgl.ucsf.edu/chimera/download.html
> >
> > Are there any other things I can try to make Xlib driver performance
> > equals to DRI?
> >
> > Thank you.
Kind regards, Jupiter On 3/12/13, Brian Paulbri...@vmware.com wrote: I don't think you have to worry about the difference in buffer depths. If you really want a 24-bit depth buffer you can do 'export MESA_GLX_DEPTH_BITS=24' -Brian On 03/09/2013 12:48 AM, jupiter wrote: Hi Brian, Please see attached config.log. Le me make a correction, I mean 32 buffer bit and 24 depth bit in DRI and 24 buffer bit and 16 bit depth bit in xlib driver. Will it make difference if setting 32 buffer bit and 24 depth bit for xlib? If so, how to do it? Thank you. Kind regards. Jupiter On 3/8/13, jupiterjupiter@gmail.com wrote: Hi Brian, I finally built Mesa with configuration --enable-xlib-glx --disable-dri --enable-gallium-llvm --with-llvm-shared-libs, with dependencies of llvm and drm. It does not work either, please see following glxinfo. Please let me know if my configuration is not correct, or if there are any other ways I can try to make it work. $ glxinfo name of display: :0.0 display: :0 screen: 0 direct rendering: Yes server glx vendor string: Brian Paul server glx version string: 1.4 Mesa 9.1-devel server glx extensions: GLX_MESA_copy_sub_buffer, GLX_MESA_pixmap_colormap, GLX_MESA_release_buffers, GLX_ARB_get_proc_address, GLX_EXT_texture_from_pixmap, GLX_EXT_visual_info, GLX_EXT_visual_rating, GLX_SGIX_fbconfig, GLX_SGIX_pbuffer client glx vendor string: Brian Paul client glx version string: 1.4 Mesa 9.1-devel client glx extensions: GLX_MESA_copy_sub_buffer, GLX_MESA_pixmap_colormap, GLX_MESA_release_buffers, GLX_ARB_get_proc_address, GLX_EXT_texture_from_pixmap, GLX_EXT_visual_info, GLX_EXT_visual_rating, GLX_SGIX_fbconfig, GLX_SGIX_pbuffer GLX version: 1.4 GLX extensions: GLX_MESA_copy_sub_buffer, GLX_MESA_pixmap_colormap, GLX_MESA_release_buffers, GLX_ARB_get_proc_address, GLX_EXT_texture_from_pixmap, GLX_EXT_visual_info, GLX_EXT_visual_rating, GLX_SGIX_fbconfig, GLX_SGIX_pbuffer OpenGL vendor string: Brian Paul OpenGL renderer string: Mesa X11 OpenGL version string: 2.1 Mesa 
9.1-devel OpenGL shading language version string: 1.20 OpenGL extensions: GL_ARB_multisample, GL_EXT_abgr, GL_EXT_bgra, GL_EXT_blend_color, GL_EXT_blend_minmax, GL_EXT_blend_subtract, GL_EXT_copy_texture, GL_EXT_polygon_offset, GL_EXT_subtexture, GL_EXT_texture_object, GL_EXT_vertex_array, GL_EXT_compiled_vertex_array, GL_EXT_texture, GL_EXT_texture3D, GL_IBM_rasterpos_clip, GL_ARB_point_parameters, GL_EXT_draw_range_elements, GL_EXT_packed_pixels, GL_EXT_point_parameters, GL_EXT_rescale_normal, GL_EXT_separate_specular_color, GL_EXT_texture_edge_clamp, GL_SGIS_generate_mipmap, GL_SGIS_texture_border_clamp, GL_SGIS_texture_edge_clamp, GL_SGIS_texture_lod, GL_ARB_multitexture, GL_IBM_multimode_draw_arrays,
Re: [Mesa-dev] [RFC] GLX_MESA_query_renderer
On 12 March 2013 17:46, Ian Romanick i...@freedesktop.org wrote:
> Right... the extension also adds an attribute that can only be used with
> glXCreateContextAttribsARB.

Yeah, all I was saying is that it probably wouldn't be too hard to word things along the lines of "If glXCreateContextAttribsARB() isn't available GLX_RENDERER_ID_MESA goes away, and only one renderer is available / visible.". Perhaps it's not worth it though.

> My thinking was that it will be very rare for multiple renderers to
> support the same GL versions and different extension strings... at
> least in a way that would cause apps to make different context creation
> decisions.

I guess that makes sense in the very coarse "I need at least GL3" way.

> Part of the thinking is that it would force regularity in how the
> version is advertised. Otherwise everyone will have a different kind of
> string, and the currently annoying situation of parsing implementation
> dependent strings continues.
>
> Maybe GLX_RENDERER_VERSION_MESA should also be allowed with
> glXQueryRendererStringMESA?

Yeah, I think that makes sense.

> I also based this on ISV feedback. Some just wanted to know what the
> hardware was, and others wanted to know that and who made the driver. I
> was really trying to get away from "just parse this random string" for
> as much of the API as possible. It seems like this should only make
> things easier for apps... should.

In theory you could add a GL vendor ID similar to the PCI vendor ID, but then you'd have to allocate those globally, which would probably be annoying. So, yeah.
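To illustrate the string parsing that the thread wants applications to stop doing, here is a minimal sketch that extracts the GL version from the start of a GL_VERSION string with sscanf(). The helper name is hypothetical and not part of any GLX or Mesa API; it only works when the string happens to begin with "major.minor", which is exactly the kind of implementation-dependent assumption an integer query would remove.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: parse "major.minor" from the start of a GL_VERSION
 * string such as "2.1 Mesa 9.2-devel (git-6173cc1)". Returns false when the
 * string does not start with two dot-separated integers. */
static bool demo_parse_gl_version(const char *version, int *major, int *minor)
{
    if (version == NULL || major == NULL || minor == NULL)
        return false;
    return sscanf(version, "%d.%d", major, minor) == 2;
}
```

Anything after the leading numbers (vendor name, driver release, git hash) has no fixed layout, so an application that needs those still ends up with per-vendor parsing code.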
Re: [Mesa-dev] [PATCH 1/2] i965: Apply depthstencil alignment workaround when doing fast clears.
On 12 March 2013 12:53, Paul Berry stereotype...@gmail.com wrote:
> On 12 March 2013 12:28, Eric Anholt e...@anholt.net wrote:
> > Paul Berry stereotype...@gmail.com writes:
> >
> > > Fast depth clears have the same depth/stencil alignment requirements
> > > as other drawing operations. Therefore, we need to call
> > > brw_workaround_depthstencil_alignment() from both the clear and
> > > drawing paths.
> > >
> > > Without this fix, we get image corruption if the following conditions
> > > hold: (a) the first ever drawing operation to a depth miplevel (or the
> > > first drawing operation after having used the texture for sampling) is
> > > a clear, (b) the depth miplevel has a size that is eligible for fast
> > > depth clears, and (c) the depth miplevel has an offset within the
> > > miptree that isn't 8x8 aligned.
> > >
> > > Fixes piglit depthstencil-render-miplevels tests with size 273.
> > >
> > > NOTE: This is a candidate for stable branches
> > > ---
> > >  src/mesa/drivers/dri/i965/brw_clear.c | 6 +++++-
> > >  1 file changed, 5 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/src/mesa/drivers/dri/i965/brw_clear.c b/src/mesa/drivers/dri/i965/brw_clear.c
> > > index 53d8e54..cde1a06 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_clear.c
> > > +++ b/src/mesa/drivers/dri/i965/brw_clear.c
> > > @@ -40,6 +40,8 @@
> > >  #include "intel_mipmap_tree.h"
> > >  #include "intel_regions.h"
> > >
> > > +#include "brw_context.h"
> > > +
> > >  #define FILE_DEBUG_FLAG DEBUG_BLIT
> > >
> > >  static const char *buffer_names[] = {
> > > @@ -219,7 +221,8 @@ brw_fast_clear_depth(struct gl_context *ctx)
> > >  static void
> > >  brw_clear(struct gl_context *ctx, GLbitfield mask)
> > >  {
> > > -   struct intel_context *intel = intel_context(ctx);
> > > +   struct brw_context *brw = brw_context(ctx);
> > > +   struct intel_context *intel = &brw->intel;
> > >
> > >     if (!_mesa_check_conditional_render(ctx))
> > >        return;
> > > @@ -229,6 +232,7 @@ brw_clear(struct gl_context *ctx, GLbitfield mask)
> > >     }
> > >
> > >     intel_prepare_render(intel);
> > > +   brw_workaround_depthstencil_alignment(brw);
> >
> > It seems like this should be happening in brw_fast_clear(), either
> > before calling blorp or inside of it, instead of in the potential
> > caller of brw_fast_clear(). Makes sense, though.
>
> Chad made the same comment to me in person yesterday. The reason I put
> it here is to accommodate patch 2/2 (which allows
> brw_workaround_depthstencil_alignment to avoid an unnecessary copy when
> clearing the whole miplevel). If I move the call to
> brw_workaround_depthstencil_alignment into brw_fast_clear_depth(), then
> the unnecessary copy will only be avoided when doing depth clears. If I
> leave it here, the unnecessary copy will be avoided for all clears.

Correction: when I wrote this I momentarily forgot that the workaround is only needed for depth and stencil buffers. So leaving the call to brw_workaround_depthstencil_alignment here allows us to avoid the unnecessary copy for both depth and stencil clears, not just depth clears. I still think it's worth it, but it's a far less convincing case.
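Condition (c) in the commit message above can be sketched as a small predicate. This is an illustration only: the helper name and its parameters are hypothetical, and the real driver derives the miplevel offsets from the miptree rather than taking them as arguments.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when a miplevel's (x, y) offset within the
 * miptree falls on an 8x8 block boundary, i.e. both offsets are multiples
 * of 8. A level for which this is false would need the workaround. */
static bool demo_offset_is_8x8_aligned(uint32_t x_offset, uint32_t y_offset)
{
    /* OR the offsets together so one mask test covers both axes. */
    return ((x_offset | y_offset) & 7u) == 0;
}
```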
Re: [Mesa-dev] [PATCH 1/4] i965: Remove fixed-function texture projection avoidance optimization.
Hi,

I have tested your changes but it looks like they fail to compile on Android.

== Tested the patch(es) on top of the following commits:
f7ef83c scons: Define PACKAGE_xxx
6f86b93 docs: rewrite the OSMesa info / instructions
79eac7d configure: wire-up new OSMesa gallium state tracker and target
be51f12 target/osmesa: add new Makefile.am
94263da targets/osmesa: new OSMesa gallium target
7114b6a st/osmesa: add new Makefile.am
73436a9 st/osmesa: new OSMesa gallium state tracker

=== Failed to build for android
f7ef83c scons: Define PACKAGE_xxx
6f86b93 docs: rewrite the OSMesa info / instructions
79eac7d configure: wire-up new OSMesa gallium state tracker and target
be51f12 target/osmesa: add new Makefile.am
94263da targets/osmesa: new OSMesa gallium target
7114b6a st/osmesa: add new Makefile.am
73436a9 st/osmesa: new OSMesa gallium state tracker

src/mesa/drivers/dri/i965/brw_state_dump.c: In function 'dump_depth_stencil_state':
src/mesa/drivers/dri/i965/brw_state_dump.c:373:71: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith]
src/mesa/drivers/dri/i965/brw_state_dump.c: In function 'dump_cc_state_gen6':
src/mesa/drivers/dri/i965/brw_state_dump.c:407:68: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith]
src/mesa/drivers/dri/i965/brw_state_dump.c: In function 'dump_scissor':
src/mesa/drivers/dri/i965/brw_state_dump.c:436:65: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith]
src/mesa/drivers/dri/i965/brw_state_dump.c: In function 'dump_vs_constants':
src/mesa/drivers/dri/i965/brw_state_dump.c:449:49: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith]
src/mesa/drivers/dri/i965/brw_state_dump.c:450:47: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith]
src/mesa/drivers/dri/i965/brw_state_dump.c: In function 'dump_wm_constants':
src/mesa/drivers/dri/i965/brw_state_dump.c:466:49: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith]
src/mesa/drivers/dri/i965/brw_state_dump.c:467:47: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith]
src/mesa/drivers/dri/i965/brw_state_dump.c: In function 'dump_binding_table':
src/mesa/drivers/dri/i965/brw_state_dump.c:483:50: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith]
src/mesa/drivers/dri/i965/brw_state_dump.c: In function 'dump_prog_cache':
src/mesa/drivers/dri/i965/brw_state_dump.c:511:33: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith]
src/mesa/drivers/dri/i965/brw_vs_surface_state.c: In function 'brw_upload_vs_pull_constants':
src/mesa/drivers/dri/i965/brw_vs_surface_state.c:77:40: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith]
src/mesa/main/errors.c: In function '_mesa_problem':
src/mesa/main/errors.c:851:15: error: 'PACKAGE_VERSION' undeclared (first use in this function)
src/mesa/main/errors.c:851:15: note: each undeclared identifier is reported only once for each function it appears in
src/mesa/main/errors.c:852:43: error: expected ')' before 'PACKAGE_BUGREPORT'
make: *** [out/target/product/samsungxe700t/obj/STATIC_LIBRARIES/libmesa_dricore_intermediates/main/errors.o] Error 1

FAILURE

--
Regards, Aiaiai
Re: [Mesa-dev] [PATCH] DRI2: don't advertise GLX_INTEL_swap_event if it can't
(switching over mesa-dev.. sent to the wrong list initially)

On Wed, Mar 13, 2013 at 8:25 AM, Paul Menzel paulepan...@users.sourceforge.net wrote:
> Dear Rob,
>
> On Tuesday, 12.03.2013, 19:44 -0400, Rob Clark wrote:
>
> »it« sounds strange in commit summary.
>
> > If ddx does not support swap, don't advertise it.

Hmm, yeah.. I somehow was having trouble coming up with something short enough

> So how is `dri2BindExtensions` changed. Some things passed beforehand
> are already available in `struct dri2_screen *psc`?

yeah, I suppose I didn't have to remove the extensions arg, but the code seemed a bit cleaner this way and was trying to avoid dri2BindExtensions() growing to a huge # of args

> Are bugs fixed by this or did you find this reading through the code?

yes, with "DRI2: Don't disable GLX_INTEL_swap_event unconditionally" and without this patch, gnome-shell (and probably I guess anything built on clutter) will be broken for ddx drivers which don't support swap. I noticed this when rebasing freedreno to latest mesa (since currently I have no good kernel interface for page flipping, so I only advertise DRI2 1.1 (DRI2InfoRec version==3)).

> > We might also be able to get rid of the vmwgfx check (I'm not quite
> > sure the purpose of that check vs. just checking dri2Minor.
>
> Missing »)«.

oh, whoops.. well that is easy enough to fix at least

BR,
-R

> > Signed-off-by: Rob Clark robdcl...@gmail.com
> > ---
> >  src/glx/dri2_glx.c | 12 ++++++++----
> >  1 file changed, 8 insertions(+), 4 deletions(-)
> >
> > diff --git a/src/glx/dri2_glx.c b/src/glx/dri2_glx.c
> > index c4f6996..b2d712c 100644
> > --- a/src/glx/dri2_glx.c
> > +++ b/src/glx/dri2_glx.c
> > @@ -1051,11 +1051,16 @@ static const struct glx_context_vtable dri2_context_vtable = {
> >  };
> >
> >  static void
> > -dri2BindExtensions(struct dri2_screen *psc, const __DRIextension **extensions,
> > +dri2BindExtensions(struct dri2_screen *psc, struct glx_display * priv,
>
> No space after the * in `* priv`?
>
> >                     const char *driverName)
> >  {
> > +   const struct dri2_display *const pdp = (struct dri2_display *)
> > +      priv->dri2Display;
> > +   const __DRIextension **extensions;
> >     int i;
> >
> > +   extensions = psc->core->getExtensions(psc->driScreen);
> > +
> >     __glXEnableDirectExtension(&psc->base, "GLX_SGI_video_sync");
> >     __glXEnableDirectExtension(&psc->base, "GLX_SGI_swap_control");
> >     __glXEnableDirectExtension(&psc->base, "GLX_MESA_swap_control");
> > @@ -1069,7 +1074,7 @@ dri2BindExtensions(struct dri2_screen *psc, const __DRIextension **extensions,
> >      * of disabling it uncondtionally, just disable it for drivers
> >      * which are known to not support it.
> >      */
> > -   if (strcmp(driverName, "vmwgfx") != 0) {
> > +   if (pdp->swapAvailable && strcmp(driverName, "vmwgfx") != 0) {
> >        __glXEnableDirectExtension(&psc->base, "GLX_INTEL_swap_event");
> >     }
> >
> > @@ -1212,8 +1217,7 @@ dri2CreateScreen(int screen, struct glx_display * priv)
> >        goto handle_error;
> >     }
> >
> > -   extensions = psc->core->getExtensions(psc->driScreen);
> > -   dri2BindExtensions(psc, extensions, driverName);
> > +   dri2BindExtensions(psc, priv, driverName);
> >
> >     configs = driConvertConfigs(psc->core, psc->base.configs, driver_configs);
> >     visuals = driConvertConfigs(psc->core, psc->base.visuals, driver_configs);
>
> Thanks,
>
> Paul
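The condition the patch introduces can be isolated as a small predicate, sketched below. The function name and its boolean parameter are hypothetical; in the patch the flag is read from pdp->swapAvailable and the result gates the call to __glXEnableDirectExtension().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper mirroring the patched condition: advertise
 * GLX_INTEL_swap_event only when the DDX reports swap support and the
 * driver is not vmwgfx, which the existing code already excludes by name. */
static bool demo_should_advertise_swap_event(bool swap_available,
                                             const char *driver_name)
{
    assert(driver_name != NULL);
    return swap_available && strcmp(driver_name, "vmwgfx") != 0;
}
```

If the vmwgfx name check turns out to be redundant with the swap-availability check, as the commit message speculates, the predicate would reduce to the first operand alone.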
Re: [Mesa-dev] glxgears is faster but 3D render is so slow
On 03/13/2013 04:11 AM, jupiter wrote:
> Hi Brian,
>
> Sorry for not being clear, let me clarify it.
>
> On 3/13/13, Brian Paul bri...@vmware.com wrote:
> > Well, the Xlib/swrast driver does everything in software, unlike a DRI
> > driver which does most things with the GPU. Xlib will always be slower.
>
> My local test machine graphic card does not have hardware acceleration,
> it does not support OpenGL, it does not have NVIDIA. I guess the DRI
> driver may still implement software GL even though it might access some
> basic functions of the low budget graphic card, but correct me, if I am
> using wrong terminology.

Yes, swrast may also be used via DRI, when there's no DRI driver for the hardware GPU or when the hardware driver needs a software fallback.

> > The gallium llvmpipe driver should be quite a bit faster. Just install
> > LLVM first, then reconfigure/rebuild Mesa, set your LD_LIBRARY_PATH to
> > the lib/gallium/ directory and you should get llvmpipe.
>
> Yes, I have already built the Mesalib with LLVM, I posted the
> configuration in my last email, let me post it again.
>
> ${SOURCE}/${CONFIGURE} --prefix=${INSTALL} --enable-xlib-glx --disable-dri --enable-gallium-llvm --with-llvm-shared-libs
>
> It did not produce a faster result, it was virtually not much
> differences when I run Chimera comparing to use of
> --with-gallium-drivers=swrast unless my configuration is wrong. Please
> let me know a correct version of configuration for llvmpipe. The libdrm
> version is 2.4.42. The libllvm version is 3.2.

libdrm is irrelevant for llvmpipe.

> I can also use a test program to measure Mesa in different drivers if
> you could let me know which test program can be used for benchmarking
> the Mesalib using different drivers of swrast, or llvm or DRI? And
> where is the test program source code I can download from?

The Mesa demos git tree can be cloned per http://www.mesa3d.org/repository.html

> Please also see attached glxinfo for Mesa llvm.

The key line is "OpenGL renderer string: Mesa X11". That's not llvmpipe. If you're really using llvmpipe it should say something like "OpenGL renderer string: Gallium 0.4 on llvmpipe (LLVM 3.2, 128 bits)".

Perhaps your LD_LIBRARY_PATH env is pointing at the wrong libGL.so

-Brian
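The check Brian describes, looking at the renderer string to see which driver is actually in use, can be sketched as a plain substring test. The helper name is hypothetical; in a real program the string would come from glGetString(GL_RENDERER) with a current context, which this sketch leaves out so that it stays self-contained.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true when a GL_RENDERER string mentions llvmpipe.
 * The exact wording of renderer strings is implementation dependent, so
 * this is a diagnostic aid rather than a reliable feature test. */
static bool demo_renderer_is_llvmpipe(const char *renderer)
{
    return renderer != NULL && strstr(renderer, "llvmpipe") != NULL;
}
```

Applied to the two strings quoted in the reply, "Mesa X11" would not match and "Gallium 0.4 on llvmpipe (LLVM 3.2, 128 bits)" would.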
Re: [Mesa-dev] OSMesa VTK
On 03/13/2013 04:04 AM, Kevin H. Hobbs wrote:
> My tests of nightly VTK built against nightly OSMesa failed last night.
>
> The onscreen builds are fine.
>
> The OSMesa build tests now fail with all black output but I haven't
> found any error message to inform me of what's going on.
>
> I know osmesa just switched to Gallium: that is reflected in VTK's
> LoadOpenGLExtension test output :
>
> GL_VERSION: 2.1 Mesa 9.2-devel (git-6173cc1)
> GL_RENDERER: Mesa OffScreen
>
> became
>
> GL_VERSION: 2.1 Mesa 9.2.0 (git-f7ef83c)
> GL_RENDERER: Gallium 0.4 on llvmpipe (LLVM 3.0, 128 bits)
>
> Mesa was built :
>
> ./autogen.sh \
>     --prefix=/home/kevin/mesa_nightly \
>     --enable-glx \
>     --enable-dri \
>     --enable-shared-glapi \
>     --enable-gallium-llvm \
>     --with-gallium-drivers=nouveau,swrast \
>     --enable-osmesa

I just rebuilt with those options (but a different prefix) and OSMesa seems OK here.

Can you tell me what the parameters are for your OSMesaCreateContext() and OSMesaMakeCurrent() calls (in particular the image format/type)?

You could also try setting GALLIUM_DRIVER=softpipe and see if the softpipe driver works.

-Brian
[Mesa-dev] [Bug 58718] Crash in src_register() during glClear() call
https://bugs.freedesktop.org/show_bug.cgi?id=58718

--- Comment #15 from José Fonseca jfons...@vmware.com ---
(In reply to comment #14)
I posted a series of patches to mesa3d-dev which seems to fix the inline issue.

I pushed these now, the most important being

commit 57cd1d1454653f778837eec0ee5d4060bc59c5ba
Author: José Fonseca jfons...@vmware.com
Date:   Tue Mar 12 20:37:47 2013 +

    include: Fix build with VS 11 (i.e, 2012).

    NOTE: Candidate for the stable branches.

    Reviewed-by: Brian Paul bri...@vmware.com

After rebuilding Mesa with VS 2012 the framebuffer2.trace no longer crashes.
Re: [Mesa-dev] Mesa (master): mesa, gallium, egl, mapi: One definition of C99 inline/ __func__ to rule them all.
I think Vinson fixed that issue. Let me know if there are still build issues.

Jose

----- Original Message -----
> Sorry. I'll look into it. I did try running make locally, but it was not
> representative because it didn't have everything enabled.
>
> BTW, scons is also busted with the recent autotools changes.
>
> Jose
>
> ----- Original Message -----
> > I'm pretty sure this commit broke 'make check'.
> >
> > On 03/12/2013 03:07 PM, Jose Fonseca wrote:
> > > Module: Mesa
> > > Branch: master
> > > Commit: 70fe7c6d3e1c7534f6598c4616bebf672f42668b
> > > URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=70fe7c6d3e1c7534f6598c4616bebf672f42668b
> > >
> > > Author: José Fonseca jfons...@vmware.com
> > > Date: Tue Mar 12 11:17:49 2013 +
> > >
> > > mesa,gallium,egl,mapi: One definition of C99 inline/__func__ to rule them all.
> > >
> > > We were in four already...
> > >
> > > NOTE: Candidate for the stable branches.
> > >
> > > Reviewed-by: Brian Paul bri...@vmware.com
> > >
> > > ---
> > >
> > >  include/c99_compat.h                  | 105
> > >  src/egl/main/eglcompiler.h            |  44
> > >  src/gallium/include/pipe/p_compiler.h |  74
> > >  src/mapi/mapi/u_compiler.h            |  26
> > >  src/mesa/main/compiler.h              |  56
> > >  5 files changed, 125 insertions(+), 180 deletions(-)
> > >
> > > diff --git a/include/c99_compat.h b/include/c99_compat.h
> > > new file mode 100644
> > > index 000..39f958f
> > > --- /dev/null
> > > +++ b/include/c99_compat.h
> > > @@ -0,0 +1,105 @@
> > > +/**
> > > + *
> > > + * Copyright 2007-2013 VMware, Inc.
> > > + * All Rights Reserved.
> > > + *
> > > + * Permission is hereby granted, free of charge, to any person obtaining a
> > > + * copy of this software and associated documentation files (the
> > > + * "Software"), to deal in the Software without restriction, including
> > > + * without limitation the rights to use, copy, modify, merge, publish,
> > > + * distribute, sub license, and/or sell copies of the Software, and to
> > > + * permit persons to whom the Software is furnished to do so, subject to
> > > + * the following conditions:
> > > + *
> > > + * The above copyright notice and this permission notice (including the
> > > + * next paragraph) shall be included in all copies or substantial portions
> > > + * of the Software.
> > > + *
> > > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
> > > + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> > > + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
> > > + * IN NO EVENT SHALL VMWARE AND/OR ITS SUPPLIERS BE LIABLE FOR
> > > + * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
> > > + * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
> > > + * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
> > > + *
> > > + **/
> > > +
> > > +#ifndef _C99_COMPAT_H_
> > > +#define _C99_COMPAT_H_
> > > +
> > > +
> > > +/*
> > > + * C99 inline keyword
> > > + */
> > > +#ifndef inline
> > > +#  ifdef __cplusplus
> > > +     /* C++ supports inline keyword */
> > > +#  elif defined(__GNUC__)
> > > +#    define inline __inline__
> > > +#  elif defined(_MSC_VER)
> > > +#    define inline __inline
> > > +#  elif defined(__ICL)
> > > +#    define inline __inline
> > > +#  elif defined(__INTEL_COMPILER)
> > > +     /* Intel compiler supports inline keyword */
> > > +#  elif defined(__WATCOMC__) && (__WATCOMC__ >= 1100)
> > > +#    define inline __inline
> > > +#  elif defined(__SUNPRO_C) && defined(__C99FEATURES__)
> > > +     /* C99 supports inline keyword */
> > > +#  elif (__STDC_VERSION__ >= 199901L)
> > > +     /* C99 supports inline keyword */
> > > +#  else
> > > +#    define inline
> > > +#  endif
> > > +#endif
> > > +
> > > +
> > > +/*
> > > + * C99 restrict keyword
> > > + *
> > > + * See also:
> > > + * - http://cellperformance.beyond3d.com/articles/2006/05/demystifying-the-restrict-keyword.html
> > > + */
> > > +#ifndef restrict
> > > +#  if (__STDC_VERSION__ >= 199901L)
> > > +     /* C99 */
> > > +#  elif defined(__SUNPRO_C) && defined(__C99FEATURES__)
> > > +     /* C99 */
> > > +#  elif defined(__GNUC__)
> > > +#    define restrict __restrict__
> > > +#  elif defined(_MSC_VER)
> > > +#    define restrict __restrict
> > > +#  else
> > > +#    define restrict /* */
> > > +#  endif
> > > +#endif
> > > +
> > > +
> > > +/*
> > > + * C99 __func__ macro
> > > + */
> > > +#ifndef __func__
> > > +#  if (__STDC_VERSION__ >= 199901L)
> > > +     /* C99 */
> > > +#  elif defined(__SUNPRO_C)
Re: [Mesa-dev] [PATCH 10/12] Get rid of _mesa_vert_result_to_frag_attrib().
Paul Berry stereotype...@gmail.com writes:

> Now that there is no difference between the enums that represent
> vertex outputs and fragment inputs, there's no need for a conversion
> function. But we still need to be able to detect when a given vertex
> output has no corresponding fragment input. So it is replaced by a
> new function, _mesa_varying_slot_in_fs(), which tells whether the
> given varying slot exists as an FS input or not.
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp        | 12
>  src/mesa/drivers/dri/i965/brw_vs_constval.c | 13
>  src/mesa/main/mtypes.h                      | 38
>  3 files changed, 27 insertions(+), 36 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 86f8cbb..ea4a56c 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -1265,7 +1265,7 @@ fs_visitor::calculate_urb_setup()
>              continue;
>
>           if (c->key.vp_outputs_written & BITFIELD64_BIT(i)) {
> -            int fp_index = _mesa_vert_result_to_frag_attrib((gl_varying_slot) i);
> +            bool exists_in_fs = _mesa_varying_slot_in_fs((gl_varying_slot) i);

I'd rather see this call moved into the single usage in the if statement below, like has been done elsewhere (now that the function name explicitly talks about what's being tested in the if anyway)

>              /* The back color slot is skipped when the front color is
>               * also written to. In addition, some slots can be
> @@ -1273,8 +1273,8 @@ fs_visitor::calculate_urb_setup()
>               * fragment shader. So the register number must always be
>               * incremented, mapped or not.
>               */
> -            if (fp_index >= 0)
> -               urb_setup[fp_index] = urb_next;
> +            if (exists_in_fs)
> +               urb_setup[i] = urb_next;
>              urb_next++;

>  /**
> - * Convert from a gl_varying_slot value for a vertex output to the
> - * corresponding gl_frag_attrib.
> - *
> - * Varying output values which have no corresponding gl_frag_attrib
> - * (VARYING_SLOT_PSIZ, VARYING_SLOT_BFC0, VARYING_SLOT_BFC1, and
> - * VARYING_SLOT_EDGE) are converted to a value of -1.
> + * Determine if the given gl_varying_slot appears in the fragment shader.
>   */
> -static inline int
> -_mesa_vert_result_to_frag_attrib(gl_varying_slot vert_result)
> +static inline GLboolean
> +_mesa_varying_slot_in_fs(gl_varying_slot slot)
>  {
> -   if (vert_result <= VARYING_SLOT_TEX7)
> -      return vert_result;
> -   else if (vert_result < VARYING_SLOT_CLIP_DIST0)
> -      return -1;
> -   else if (vert_result <= VARYING_SLOT_CLIP_DIST1)
> -      return vert_result;
> -   else if (vert_result < VARYING_SLOT_VAR0)
> -      return -1;
> -   else
> -      return vert_result;
> +   switch (slot) {
> +   case VARYING_SLOT_PSIZ:
> +   case VARYING_SLOT_BFC0:
> +   case VARYING_SLOT_BFC1:
> +   case VARYING_SLOT_EDGE:
> +   case VARYING_SLOT_CLIP_VERTEX:
> +   case VARYING_SLOT_LAYER:
> +      return GL_FALSE;
> +   default:
> +      return GL_TRUE;
> +   }
>  }

I bet the compiler does a big switch statement instead of doing what we could do better with bitfields. Not a blocker, just a potential improvement.

Other than that, I'm glad to see this series happen.

Reviewed-by: Eric Anholt e...@anholt.net
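A minimal sketch of the bitfield variant Eric hints at: collect the vertex-only slots into one 64-bit mask and test membership with a single AND. The enum below is a hypothetical, abbreviated stand-in for gl_varying_slot (the real enum in mtypes.h has more entries and different values), and the function is not the one that was committed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, abbreviated stand-in for gl_varying_slot. */
enum demo_varying_slot {
    DEMO_SLOT_POS,
    DEMO_SLOT_COL0,
    DEMO_SLOT_PSIZ,
    DEMO_SLOT_BFC0,
    DEMO_SLOT_BFC1,
    DEMO_SLOT_EDGE,
    DEMO_SLOT_CLIP_VERTEX,
    DEMO_SLOT_LAYER,
    DEMO_SLOT_VAR0,
    DEMO_SLOT_MAX
};

#define DEMO_BIT64(slot) (UINT64_C(1) << (slot))

/* Slots written by the vertex stage that have no fragment shader input,
 * matching the cases that return GL_FALSE in the switch above. */
#define DEMO_VS_ONLY_SLOTS                                   \
    (DEMO_BIT64(DEMO_SLOT_PSIZ) | DEMO_BIT64(DEMO_SLOT_BFC0) | \
     DEMO_BIT64(DEMO_SLOT_BFC1) | DEMO_BIT64(DEMO_SLOT_EDGE) | \
     DEMO_BIT64(DEMO_SLOT_CLIP_VERTEX) | DEMO_BIT64(DEMO_SLOT_LAYER))

/* True when the slot exists as a fragment shader input. */
static bool demo_varying_slot_in_fs(enum demo_varying_slot slot)
{
    return (DEMO_BIT64(slot) & DEMO_VS_ONLY_SLOTS) == 0;
}
```

This only works while every slot value stays below 64, which is an assumption of the sketch; whether it actually beats the compiler's code for the switch would need measuring.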
Re: [Mesa-dev] [PATCH] DRI2: don't advertise GLX_INTEL_swap_event if it can't
> well, I'm more familiar w/ EGL where we don't have the xserver advertising anything, and it is all on the client side.. but when it is an inexpensive check, it seems reasonable to want mesa to do the right thing where possible.

It's simply silly. In the same sense that adding yet another if (ptr) to

   if (ptr) if (ptr) FREE(ptr);

while not technically wrong is simply silly. Like I said we already check whether those extensions are advertised by the server and don't advertise the ones that aren't.

> Probably there are other cases where we should do the same thing. I can update my patch to also exclude other extensions

No, the point is that we don't want to do that. It's fundamentally broken and you know that it's broken because you'll notice that this extension is still advertised by the server (for our sake that's all required to fix Clutter, but it's still broken). It's a weird thing for an extension which is implemented by the server to be advertised by the server and yet having a client which is essentially not involved at all, not be advertising it.

The only reason we have to worry about this is that the server is broken. So while we might want to make things easier on us by not forcing users to keep repatching the Xserver we shouldn't have any illusions about what this is: it's a nasty hack required by a bug in the Xserver. As such that code has only two requirements:

1) That all drivers requiring that hack go through the same codepath and that it's as minimal as possible so it's trivial to remove it once a fixed Xserver gets into most distros.
2) That it's clearly documented as a hack thanks to which anyone reading this code will immediately understand what's the purpose of the weird code and what are the prerequisites for removing it.

Everything else is of no consequence in this case. So whether you'll decide to use names or any number of other extensions that came after dri2inforec version 4 to check for makes no difference as long as it fulfills the two above goals.

> true, it is not shipping in any distro yet, so anyone who wants to try it gets to try git master of mesa, which runs into problems because of advertising the INTEL_swap extension. Asking everyone to rebuild xserver with some extra patch which is not merged yet is a big pita.

Sure, but at the same time adding hacks to shared mesa code to make it easier to try a dev driver doesn't make terribly convincing argument. In the end though, at least in this case, the bug is severe enough that a hack in mesa makes sense and we've spent too much time discussing a very simple issue, so whatever you do just please make sure to fulfill the two requirements above and everything will be ok.

z
Re: [Mesa-dev] [PATCH 1/2] i965: Apply depthstencil alignment workaround when doing fast clears.
Paul Berry stereotype...@gmail.com writes:

> On 12 March 2013 12:53, Paul Berry stereotype...@gmail.com wrote:
>> On 12 March 2013 12:28, Eric Anholt e...@anholt.net wrote:
>>> Paul Berry stereotype...@gmail.com writes:
>>>> Fast depth clears have the same depth/stencil alignment requirements as other drawing operations. Therefore, we need to call brw_workaround_depthstencil_alignment() from both the clear and drawing paths.
>>>>
>>>> Without this fix, we get image corruption if the following conditions hold:
>>>> (a) the first ever drawing operation to a depth miplevel (or the first drawing operation after having used the texture for sampling) is a clear,
>>>> (b) the depth miplevel has a size that is eligible for fast depth clears, and
>>>> (c) the depth miplevel has an offset within the miptree that isn't 8x8 aligned.
>>>>
>>>> Fixes piglit depthstencil-render-miplevels tests with size 273.
>>>>
>>>> NOTE: This is a candidate for stable branches
>>>> ---
>>>>  src/mesa/drivers/dri/i965/brw_clear.c | 6 +-
>>>>  1 file changed, 5 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/src/mesa/drivers/dri/i965/brw_clear.c b/src/mesa/drivers/dri/i965/brw_clear.c
>>>> index 53d8e54..cde1a06 100644
>>>> --- a/src/mesa/drivers/dri/i965/brw_clear.c
>>>> +++ b/src/mesa/drivers/dri/i965/brw_clear.c
>>>> @@ -40,6 +40,8 @@
>>>>  #include "intel_mipmap_tree.h"
>>>>  #include "intel_regions.h"
>>>> +#include "brw_context.h"
>>>> +
>>>>  #define FILE_DEBUG_FLAG DEBUG_BLIT
>>>>  static const char *buffer_names[] = {
>>>> @@ -219,7 +221,8 @@ brw_fast_clear_depth(struct gl_context *ctx)
>>>>  static void
>>>>  brw_clear(struct gl_context *ctx, GLbitfield mask)
>>>>  {
>>>> -   struct intel_context *intel = intel_context(ctx);
>>>> +   struct brw_context *brw = brw_context(ctx);
>>>> +   struct intel_context *intel = &brw->intel;
>>>>     if (!_mesa_check_conditional_render(ctx))
>>>>        return;
>>>> @@ -229,6 +232,7 @@ brw_clear(struct gl_context *ctx, GLbitfield mask)
>>>>     }
>>>>     intel_prepare_render(intel);
>>>> +   brw_workaround_depthstencil_alignment(brw);
>>>
>>> It seems like this should be happening in brw_fast_clear(), either before calling blorp or inside of it, instead of in the potential caller of brw_fast_clear(). Makes sense, though.
>>
>> Chad made the same comment to me in person yesterday. The reason I put it here is to accommodate patch 2/2 (which allows brw_workaround_depthstencil_alignment to avoid an unnecessary copy when clearing the whole miplevel). If I move the call to brw_workaround_depthstencil_alignment into brw_fast_clear_depth(), then the unnecessary copy will only be avoided when doing depth clears. If I leave it here, the unnecessary copy will be avoided for all clears.
>
> Correction: when I wrote this I momentarily forgot that the workaround is only needed for depth and stencil buffers. So leaving the call to brw_workaround_depthstencil_alignment here allows us to avoid the unnecessary copy for both depth and stencil clears, not just depth clears. I still think it's worth it, but it's a far less convincing case.

You convinced me, though.

Reviewed-by: Eric Anholt e...@anholt.net for both.
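As a small illustration of condition (c) in the commit message quoted above, the check for an "8x8 aligned" miplevel offset can be sketched as below. This is not code from the i965 driver: demo_offset_is_8x8_aligned is a hypothetical helper, and it assumes the offsets are expressed in pixels within the miptree.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not a Mesa function. Returns true when both the
 * x and y offsets (assumed to be in pixels) are multiples of 8, i.e. the
 * miplevel starts on an 8x8 block boundary. */
static inline bool
demo_offset_is_8x8_aligned(uint32_t x_offset, uint32_t y_offset)
{
   /* OR the two offsets together: if either has any of its low three
    * bits set, the result does too. */
   return ((x_offset | y_offset) & 7u) == 0;
}
```

In the scenario the patch describes, a miplevel for which such a check fails is the case where skipping the alignment workaround before a fast clear leads to corruption.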
Re: [Mesa-dev] [PATCH] DRI2: don't advertise GLX_INTEL_swap_event if it can't
On Wed, Mar 13, 2013 at 11:19 AM, Zack Rusin za...@vmware.com wrote:

>> well, I'm more familiar w/ EGL where we don't have the xserver advertising anything, and it is all on the client side.. but when it is an inexpensive check, it seems reasonable to want mesa to do the right thing where possible.
>
> It's simply silly. In the same sense that adding yet another if (ptr) to
>
>    if (ptr) if (ptr) FREE(ptr);
>
> while not technically wrong is simply silly. Like I said we already check whether those extensions are advertised by the server and don't advertise the ones that aren't.

well, if the other component that provided FREE() had a bug that it didn't check for null, it wouldn't be completely silly. But still a hack/workaround..

>> Probably there are other cases where we should do the same thing. I can update my patch to also exclude other extensions
>
> No, the point is that we don't want to do that. It's fundamentally broken and you know that it's broken because you'll notice that this extension is still advertised by the server (for our sake that's all required to fix Clutter, but it's still broken). It's a weird thing for an extension which is implemented by the server to be advertised by the server and yet having a client which is essentially not involved at all, not be advertising it.
>
> The only reason we have to worry about this is that the server is broken. So while we might want to make things easier on us by not forcing users to keep repatching the Xserver we shouldn't have any illusions about what this is: it's a nasty hack required by a bug in the Xserver. As such that code has only two requirements:
>
> 1) That all drivers requiring that hack go through the same codepath and that it's as minimal as possible so it's trivial to remove it once a fixed Xserver gets into most distros.
> 2) That it's clearly documented as a hack thanks to which anyone reading this code will immediately understand what's the purpose of the weird code and what are the prerequisites for removing it.
>
> Everything else is of no consequence in this case. So whether you'll decide to use names or any number of other extensions that came after dri2inforec version 4 to check for makes no difference as long as it fulfills the two above goals.

I'm ok with documenting it as a hack, and removing it once updated xserver is in most distros. But it does seem useful to have at least in the short term.

>> true, it is not shipping in any distro yet, so anyone who wants to try it gets to try git master of mesa, which runs into problems because of advertising the INTEL_swap extension. Asking everyone to rebuild xserver with some extra patch which is not merged yet is a big pita.
>
> Sure, but at the same time adding hacks to shared mesa code to make it easier to try a dev driver doesn't make terribly convincing argument. In the end though, at least in this case, the bug is severe enough that a hack in mesa makes sense and we've spent too much time discussing a very simple issue, so whatever you do just please make sure to fulfill the two requirements above and everything will be ok.

true, I suppose.. although there are currently enough challenges getting proper linux running on some of these devices, I don't really like to make it harder than it already is.

I'll re-submit the patch making it more clear that it is a hack. I think point #1 is already met, it is a pretty localized hack and should be easy to remove later.

BR,
-R

> z
[Mesa-dev] [PATCH] DRI2: HACK: no GLX_INTEL_swap_event if no ScheduleSwap
If ddx does not support swap, don't advertise it. This is a hack to work around current xservers which advertise this extension even when it is clearly not supported. When:

  http://lists.x.org/archives/xorg-devel/2013-February/035449.html

is merged in upstream xserver and makes its way into most distros then this hack can be removed. In the meantime, it is required to allow gnome-shell/clutter/etc to work properly with a DDX driver which does not support ScheduleSwap.

Signed-off-by: Rob Clark robdcl...@gmail.com
---
 src/glx/dri2_glx.c | 21 +++--
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/src/glx/dri2_glx.c b/src/glx/dri2_glx.c
index c4f6996..7ce5775 100644
--- a/src/glx/dri2_glx.c
+++ b/src/glx/dri2_glx.c
@@ -1051,11 +1051,16 @@ static const struct glx_context_vtable dri2_context_vtable = {
 };

 static void
-dri2BindExtensions(struct dri2_screen *psc, const __DRIextension **extensions,
+dri2BindExtensions(struct dri2_screen *psc, struct glx_display * priv,
                    const char *driverName)
 {
+   const struct dri2_display *const pdp = (struct dri2_display *)
+      priv->dri2Display;
+   const __DRIextension **extensions;
    int i;

+   extensions = psc->core->getExtensions(psc->driScreen);
+
    __glXEnableDirectExtension(&psc->base, "GLX_SGI_video_sync");
    __glXEnableDirectExtension(&psc->base, "GLX_SGI_swap_control");
    __glXEnableDirectExtension(&psc->base, "GLX_MESA_swap_control");
@@ -1066,10 +1071,15 @@ dri2BindExtensions(struct dri2_screen *psc, const __DRIextension **extensions,
     * currently unconditionally enabled. This completely breaks
     * systems running on drivers which don't support that extension.
     * There's no way to test for its presence on this side, so instead
-    * of disabling it uncondtionally, just disable it for drivers
-    * which are known to not support it.
+    * of disabling it unconditionally, just disable it for drivers
+    * which are known to not support it, or for DDX drivers supporting
+    * only an older (pre-ScheduleSwap) version of DRI2.
+    *
+    * This is a hack which is required until:
+    * http://lists.x.org/archives/xorg-devel/2013-February/035449.html
+    * is merged and updated xserver makes it's way into distros:
     */
-   if (strcmp(driverName, "vmwgfx") != 0) {
+   if (pdp->swapAvailable && strcmp(driverName, "vmwgfx") != 0) {
       __glXEnableDirectExtension(&psc->base, "GLX_INTEL_swap_event");
    }

@@ -1212,8 +1222,7 @@ dri2CreateScreen(int screen, struct glx_display * priv)
       goto handle_error;
    }

-   extensions = psc->core->getExtensions(psc->driScreen);
-   dri2BindExtensions(psc, extensions, driverName);
+   dri2BindExtensions(psc, priv, driverName);

    configs = driConvertConfigs(psc->core, psc->base.configs, driver_configs);
    visuals = driConvertConfigs(psc->core, psc->base.visuals, driver_configs);
--
1.8.1.4
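The decision this patch encodes can be restated in isolation as a small predicate. The sketch below is only an illustration of that logic, not code from dri2_glx.c: demo_should_advertise_swap_event is a hypothetical name, and its boolean parameter stands in for the pdp->swapAvailable field that the patch reads.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the condition added by the patch:
 * GLX_INTEL_swap_event is advertised only when the DDX reports swap
 * support and the driver is not on the known-unsupported list
 * (just "vmwgfx" in the patch). */
static bool
demo_should_advertise_swap_event(bool ddx_swap_available,
                                 const char *driver_name)
{
   return ddx_swap_available && strcmp(driver_name, "vmwgfx") != 0;
}
```

Keeping the whole workaround in one condition like this matches the two requirements raised in the review thread: a single shared code path, and a hack that is trivial to delete once fixed X servers are widespread.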
Re: [Mesa-dev] [PATCH 1/4] i965: Remove fixed-function texture projection avoidance optimization.
Kenneth Graunke kenn...@whitecape.org writes:

> This optimization attempts to avoid extra attribute interpolation instructions for texture coordinates where the W-component is 1.0.
>
> Unfortunately, it requires a lot of complexity: the brw_wm_input_sizes state atom (all the brw_vs_constval.c code) needs to run on each draw. It computes the input_size_masks array, then uses that to compute proj_attrib_mask. Differences in proj_attrib_mask can cause state-dependent fragment shader recompiles. We also often fail to guess proj_attrib_mask for the fragment shader precompile, causing us to needlessly compile it twice.
>
> Furthermore, this optimization only applies to fixed-function programs; it does not help modern GLSL-based programs at all. Generally, older fixed-function programs run fine on modern hardware anyway.
>
> The optimization has existed in some form since the initial commit. When we rewrote the fragment shader backend, we dropped it for a while. Eric readded it in commit eb30820f268608cf451da32de69723036dddbc62 as part of an attempt to cure a ~1% performance regression caused by converting the fixed-function fragment shader generation code from Mesa IR to GLSL IR. However, no performance data was included in the commit message, so it's unclear whether or not it was successful.
>
> Time has passed, so I decided to re-measure this. Surprisingly, Eric's OpenArena timedemo actually runs /faster/ after removing this and the brw_wm_input_sizes atom. On Ivybridge at 1024x768, I measured a 1.39532% +/- 0.91833% increase in FPS (n = 55).

Removing it on SNB+ makes sense to me. But given the higher cost of math pre-gen6, I think we should test on one of those too.
Re: [Mesa-dev] [PATCH 2/3] Fix glapi/tests/check_table.cpp for standardized OpenGL function names
On 02/27/2013 08:36 AM, Jon TURNEY wrote: It looks like this has been broken since commit 1a1db1746db82efc7f0643508886dfc78a15eb71 Standardize names of OpenGL functions. As far as I can tell, I run this test with every build, and I've never seen it fail. The ARB names are part of the Linux ABI, so if they're not working, something is catastrophically broken... changing the test just ignores the problem. Signed-off-by: Jon TURNEY jon.tur...@dronecode.org.uk --- src/mapi/glapi/tests/check_table.cpp | 528 +- 1 files changed, 264 insertions(+), 264 deletions(-) diff --git a/src/mapi/glapi/tests/check_table.cpp b/src/mapi/glapi/tests/check_table.cpp index 807d3c3..dffec83 100644 --- a/src/mapi/glapi/tests/check_table.cpp +++ b/src/mapi/glapi/tests/check_table.cpp @@ -523,40 +523,40 @@ const struct name_offset linux_gl_abi[] = { { glTexImage3D, 371 }, { glTexSubImage3D, 372 }, { glCopyTexSubImage3D, 373 }, - { glActiveTextureARB, 374 }, - { glClientActiveTextureARB, 375 }, - { glMultiTexCoord1dARB, 376 }, - { glMultiTexCoord1dvARB, 377 }, + { glActiveTexture, 374 }, + { glClientActiveTexture, 375 }, + { glMultiTexCoord1d, 376 }, + { glMultiTexCoord1dv, 377 }, { glMultiTexCoord1fARB, 378 }, { glMultiTexCoord1fvARB, 379 }, - { glMultiTexCoord1iARB, 380 }, - { glMultiTexCoord1ivARB, 381 }, - { glMultiTexCoord1sARB, 382 }, - { glMultiTexCoord1svARB, 383 }, - { glMultiTexCoord2dARB, 384 }, - { glMultiTexCoord2dvARB, 385 }, + { glMultiTexCoord1i, 380 }, + { glMultiTexCoord1iv, 381 }, + { glMultiTexCoord1s, 382 }, + { glMultiTexCoord1sv, 383 }, + { glMultiTexCoord2d, 384 }, + { glMultiTexCoord2dv, 385 }, { glMultiTexCoord2fARB, 386 }, { glMultiTexCoord2fvARB, 387 }, - { glMultiTexCoord2iARB, 388 }, - { glMultiTexCoord2ivARB, 389 }, - { glMultiTexCoord2sARB, 390 }, - { glMultiTexCoord2svARB, 391 }, - { glMultiTexCoord3dARB, 392 }, - { glMultiTexCoord3dvARB, 393 }, + { glMultiTexCoord2i, 388 }, + { glMultiTexCoord2iv, 389 }, + { glMultiTexCoord2s, 390 }, + { glMultiTexCoord2sv, 
391 }, + { glMultiTexCoord3d, 392 }, + { glMultiTexCoord3dv, 393 }, { glMultiTexCoord3fARB, 394 }, { glMultiTexCoord3fvARB, 395 }, - { glMultiTexCoord3iARB, 396 }, - { glMultiTexCoord3ivARB, 397 }, - { glMultiTexCoord3sARB, 398 }, - { glMultiTexCoord3svARB, 399 }, - { glMultiTexCoord4dARB, 400 }, - { glMultiTexCoord4dvARB, 401 }, + { glMultiTexCoord3i, 396 }, + { glMultiTexCoord3iv, 397 }, + { glMultiTexCoord3s, 398 }, + { glMultiTexCoord3sv, 399 }, + { glMultiTexCoord4d, 400 }, + { glMultiTexCoord4dv, 401 }, { glMultiTexCoord4fARB, 402 }, { glMultiTexCoord4fvARB, 403 }, - { glMultiTexCoord4iARB, 404 }, - { glMultiTexCoord4ivARB, 405 }, - { glMultiTexCoord4sARB, 406 }, - { glMultiTexCoord4svARB, 407 }, + { glMultiTexCoord4i, 404 }, + { glMultiTexCoord4iv, 405 }, + { glMultiTexCoord4s, 406 }, + { glMultiTexCoord4sv, 407 }, { NULL, 0 } }; @@ -937,40 +937,40 @@ const struct name_offset known_dispatch[] = { { glTexImage3D, _O(TexImage3D) }, { glTexSubImage3D, _O(TexSubImage3D) }, { glCopyTexSubImage3D, _O(CopyTexSubImage3D) }, - { glActiveTextureARB, _O(ActiveTextureARB) }, - { glClientActiveTextureARB, _O(ClientActiveTextureARB) }, - { glMultiTexCoord1dARB, _O(MultiTexCoord1dARB) }, - { glMultiTexCoord1dvARB, _O(MultiTexCoord1dvARB) }, + { glActiveTexture, _O(ActiveTexture) }, + { glClientActiveTexture, _O(ClientActiveTexture) }, + { glMultiTexCoord1d, _O(MultiTexCoord1d) }, + { glMultiTexCoord1dv, _O(MultiTexCoord1dv) }, { glMultiTexCoord1fARB, _O(MultiTexCoord1fARB) }, { glMultiTexCoord1fvARB, _O(MultiTexCoord1fvARB) }, - { glMultiTexCoord1iARB, _O(MultiTexCoord1iARB) }, - { glMultiTexCoord1ivARB, _O(MultiTexCoord1ivARB) }, - { glMultiTexCoord1sARB, _O(MultiTexCoord1sARB) }, - { glMultiTexCoord1svARB, _O(MultiTexCoord1svARB) }, - { glMultiTexCoord2dARB, _O(MultiTexCoord2dARB) }, - { glMultiTexCoord2dvARB, _O(MultiTexCoord2dvARB) }, + { glMultiTexCoord1i, _O(MultiTexCoord1i) }, + { glMultiTexCoord1iv, _O(MultiTexCoord1iv) }, + { glMultiTexCoord1s, _O(MultiTexCoord1s) 
}, + { glMultiTexCoord1sv, _O(MultiTexCoord1sv) }, + { glMultiTexCoord2d, _O(MultiTexCoord2d) }, + { glMultiTexCoord2dv, _O(MultiTexCoord2dv) }, { glMultiTexCoord2fARB, _O(MultiTexCoord2fARB) }, { glMultiTexCoord2fvARB, _O(MultiTexCoord2fvARB) }, - { glMultiTexCoord2iARB, _O(MultiTexCoord2iARB) }, - { glMultiTexCoord2ivARB, _O(MultiTexCoord2ivARB) }, - { glMultiTexCoord2sARB, _O(MultiTexCoord2sARB) }, - { glMultiTexCoord2svARB, _O(MultiTexCoord2svARB) }, - { glMultiTexCoord3dARB, _O(MultiTexCoord3dARB) }, - { glMultiTexCoord3dvARB, _O(MultiTexCoord3dvARB) }, + { glMultiTexCoord2i, _O(MultiTexCoord2i) }, + { glMultiTexCoord2iv, _O(MultiTexCoord2iv) }, + { glMultiTexCoord2s,
Re: [Mesa-dev] [PATCH 1/4] i965: Remove fixed-function texture projection avoidance optimization.
On 03/13/2013 06:07 AM, Adrian M Negreanu wrote:

> Hi,
> I have tested your changes but it looks like they fail to compile on Android.
>
> == Tested the patch(es) on top of the following commits:
> f7ef83c scons: Define PACKAGE_xxx
> 6f86b93 docs: rewrite the OSMesa info / instructions
> 79eac7d configure: wire-up new OSMesa gallium state tracker and target
> be51f12 target/osmesa: add new Makefile.am
> 94263da targets/osmesa: new OSMesa gallium target
> 7114b6a st/osmesa: add new Makefile.am
> 73436a9 st/osmesa: new OSMesa gallium state tracker
>
> === Failed to build for android
> f7ef83c scons: Define PACKAGE_xxx
> 6f86b93 docs: rewrite the OSMesa info / instructions
> 79eac7d configure: wire-up new OSMesa gallium state tracker and target
> be51f12 target/osmesa: add new Makefile.am
> 94263da targets/osmesa: new OSMesa gallium target
> 7114b6a st/osmesa: add new Makefile.am
> 73436a9 st/osmesa: new OSMesa gallium state tracker
>
> src/mesa/drivers/dri/i965/brw_state_dump.c: In function 'dump_depth_stencil_state':
> src/mesa/drivers/dri/i965/brw_state_dump.c:373:71: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith]
> src/mesa/drivers/dri/i965/brw_state_dump.c: In function 'dump_cc_state_gen6':
> src/mesa/drivers/dri/i965/brw_state_dump.c:407:68: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith]
> src/mesa/drivers/dri/i965/brw_state_dump.c: In function 'dump_scissor':
> src/mesa/drivers/dri/i965/brw_state_dump.c:436:65: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith]
> src/mesa/drivers/dri/i965/brw_state_dump.c: In function 'dump_vs_constants':
> src/mesa/drivers/dri/i965/brw_state_dump.c:449:49: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith]
> src/mesa/drivers/dri/i965/brw_state_dump.c:450:47: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith]
> src/mesa/drivers/dri/i965/brw_state_dump.c: In function 'dump_wm_constants':
> src/mesa/drivers/dri/i965/brw_state_dump.c:466:49: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith]
> src/mesa/drivers/dri/i965/brw_state_dump.c:467:47: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith]
> src/mesa/drivers/dri/i965/brw_state_dump.c: In function 'dump_binding_table':
> src/mesa/drivers/dri/i965/brw_state_dump.c:483:50: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith]
> src/mesa/drivers/dri/i965/brw_state_dump.c: In function 'dump_prog_cache':
> src/mesa/drivers/dri/i965/brw_state_dump.c:511:33: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith]
> src/mesa/drivers/dri/i965/brw_vs_surface_state.c: In function 'brw_upload_vs_pull_constants':
> src/mesa/drivers/dri/i965/brw_vs_surface_state.c:77:40: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith]
> src/mesa/main/errors.c: In function '_mesa_problem':
> src/mesa/main/errors.c:851:15: error: 'PACKAGE_VERSION' undeclared (first use in this function)

That's definitely Matt's build system changes, not my code.
Re: [Mesa-dev] Mesa (master): mesa, gallium, egl, mapi: One definition of C99 inline/ __func__ to rule them all.
On 03/13/2013 01:14 AM, Jose Fonseca wrote:

> Sorry. I'll look into it. I did try running make locally, but it was not representative because it didn't have everything enabled.
>
> BTW, scons is also busted with the recent autotools changes.

Blarg. Of course it did. :( Has the break been reported to the guilty party? I haven't been following any of the build system discussion very closely over the last few weeks...

> Jose
>
> ----- Original Message -----
>> I'm pretty sure this commit broke 'make check'.
>>
>> On 03/12/2013 03:07 PM, Jose Fonseca wrote:
>>> Module: Mesa
>>> Branch: master
>>> Commit: 70fe7c6d3e1c7534f6598c4616bebf672f42668b
>>> URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=70fe7c6d3e1c7534f6598c4616bebf672f42668b
>>>
>>> Author: José Fonseca jfons...@vmware.com
>>> Date: Tue Mar 12 11:17:49 2013 +
>>>
>>> mesa,gallium,egl,mapi: One definition of C99 inline/__func__ to rule them all.
>>>
>>> We were in four already...
>>>
>>> NOTE: Candidate for the stable branches.
>>>
>>> Reviewed-by: Brian Paul bri...@vmware.com
>>> ---
>>>  include/c99_compat.h                  | 105 +
>>>  src/egl/main/eglcompiler.h            | 44 ++
>>>  src/gallium/include/pipe/p_compiler.h | 74 ++-
>>>  src/mapi/mapi/u_compiler.h            | 26 +---
>>>  src/mesa/main/compiler.h              | 56 ++
>>>  5 files changed, 125 insertions(+), 180 deletions(-)
>>>
>>> diff --git a/include/c99_compat.h b/include/c99_compat.h
>>> new file mode 100644
>>> index 000..39f958f
>>> --- /dev/null
>>> +++ b/include/c99_compat.h
>>> @@ -0,0 +1,105 @@
>>> +/**
>>> + *
>>> + * Copyright 2007-2013 VMware, Inc.
>>> + * All Rights Reserved.
>>> + *
>>> + * Permission is hereby granted, free of charge, to any person obtaining a
>>> + * copy of this software and associated documentation files (the
>>> + * "Software"), to deal in the Software without restriction, including
>>> + * without limitation the rights to use, copy, modify, merge, publish,
>>> + * distribute, sub license, and/or sell copies of the Software, and to
>>> + * permit persons to whom the Software is furnished to do so, subject to
>>> + * the following conditions:
>>> + *
>>> + * The above copyright notice and this permission notice (including the
>>> + * next paragraph) shall be included in all copies or substantial portions
>>> + * of the Software.
>>> + *
>>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
>>> + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
>>> + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
>>> + * IN NO EVENT SHALL VMWARE AND/OR ITS SUPPLIERS BE LIABLE FOR
>>> + * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
>>> + * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
>>> + * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
>>> + *
>>> + **/
>>> +
>>> +#ifndef _C99_COMPAT_H_
>>> +#define _C99_COMPAT_H_
>>> +
>>> +
>>> +/*
>>> + * C99 inline keyword
>>> + */
>>> +#ifndef inline
>>> +#  ifdef __cplusplus
>>> +     /* C++ supports inline keyword */
>>> +#  elif defined(__GNUC__)
>>> +#    define inline __inline__
>>> +#  elif defined(_MSC_VER)
>>> +#    define inline __inline
>>> +#  elif defined(__ICL)
>>> +#    define inline __inline
>>> +#  elif defined(__INTEL_COMPILER)
>>> +     /* Intel compiler supports inline keyword */
>>> +#  elif defined(__WATCOMC__) && (__WATCOMC__ >= 1100)
>>> +#    define inline __inline
>>> +#  elif defined(__SUNPRO_C) && defined(__C99FEATURES__)
>>> +     /* C99 supports inline keyword */
>>> +#  elif (__STDC_VERSION__ >= 199901L)
>>> +     /* C99 supports inline keyword */
>>> +#  else
>>> +#    define inline
>>> +#  endif
>>> +#endif
>>> +
>>> +
>>> +/*
>>> + * C99 restrict keyword
>>> + *
>>> + * See also:
>>> + * - http://cellperformance.beyond3d.com/articles/2006/05/demystifying-the-restrict-keyword.html
>>> + */
>>> +#ifndef restrict
>>> +#  if (__STDC_VERSION__ >= 199901L)
>>> +     /* C99 */
>>> +#  elif defined(__SUNPRO_C) && defined(__C99FEATURES__)
>>> +     /* C99 */
>>> +#  elif defined(__GNUC__)
>>> +#    define restrict __restrict__
>>> +#  elif defined(_MSC_VER)
>>> +#    define restrict __restrict
>>> +#  else
>>> +#    define restrict /* */
>>> +#  endif
>>> +#endif
>>> +
>>> +
>>> +/*
>>> + * C99 __func__ macro
>>> + */
>>> +#ifndef __func__
>>> +#  if (__STDC_VERSION__ >= 199901L)
>>> +     /* C99 */
>>> +#  elif defined(__SUNPRO_C) && defined(__C99FEATURES__)
>>> +     /* C99 */
>>> +#  elif defined(__GNUC__)
>>> +#    if __GNUC__ >= 2
>>> +#      define __func__ __FUNCTION__
>>> +#    else
>>> +#      define __func__ "<unknown>"
>>> +#    endif
>>> +#  elif defined(_MSC_VER)
>>> +#    if _MSC_VER >= 1300
>>> +#      define __func__
Re: [Mesa-dev] Mesa (master): mesa, gallium, egl, mapi: One definition of C99 inline/ __func__ to rule them all.
----- Original Message -----
> On 03/13/2013 01:14 AM, Jose Fonseca wrote:
>> Sorry. I'll look into it. I did try running make locally, but it was not representative because it didn't have everything enabled.
>>
>> BTW, scons is also busted with the recent autotools changes.
>
> Blarg. Of course it did. :( Has the break been reported to the guilty party?

It's fixed now. It was simple stuff.

> I haven't been following any of the build system discussion very closely over the last few weeks...

I also tend to skim over autotools review requests, as I'm not very familiar with it, but I keep forgetting to look out for impact on scons.

Anyway, no biggie, at least as far as I'm concerned. It's so hard to build mesa with all possible build systems, platforms, and required dependencies, so I don't think it's anybody's fault builds get broken every now and then, just the nature of the beast. As long as the build gets fixed in a timely fashion, then bisecting through the past shouldn't be too painful.

Jose
Re: [Mesa-dev] [PATCH] Fix for dispatch table entries and es2-compatibility mode
Please don't reply off list. Subscribe to the list and participate. We can't make decisions that affect everyone behind closed doors.

On 03/13/2013 02:32 AM, Zawistowski, Bartosz L wrote:

> Hi Ian,
>
> Thank you for quick feedback.
>
> This fix is aimed at separating EXT and ARB framebuffer_object extensions. According to GL spec entry points of these two extensions need to have different implementations. There is no such distinction now in DRI drivers nor in glapi interface. This patch allows other DRI drivers (not delivered with mesa) that use glapi to take advantage of both extensions. In order to satisfy compilation of mesa DRI drivers, name wrappers have been provided for new api calls.

We've had some discussion about this on the mesa-dev list before. You should search the archives. I believe the only functions that are different are glBindRenderbuffer and glBindFramebuffer. All of the other functions have the same GLX protocol opcode, so they are indistinguishable in indirect rendering. Splitting functions other than glBind{Frame,Render}buffer is definitely wrong.

> Could you please provide more details regarding compatibility issues between libGL and existing DRI drivers you have mentioned?

The driver asks libGL where various functions belong in the dispatch table. Right now, drivers only ask for one of the names and assume all variations will go through the same dispatch entry. With this change, existing driver binaries won't set the dispatch pointer for glBindFramebuffer (only for glBindFramebufferEXT), for example. Since the same mechanism is used inside the xserver, changing the drivers would also require changes in the server's GLX code.

What we have done is implement the ARB behavior for both versions. If you ask 100 developers what the difference is, I would bet at least 97 respond, "There's a difference?" :) And the other 3 will wonder what the point of using non-Gen names is.

In short... Yes, there's a bug, but fixing it makes things worse for very little, if any, actual benefit.

The bigger thing to worry about is that glBindFramebuffer in OpenGL ES 2.0 (and 3.0) behaves differently from glBindFramebuffer in desktop OpenGL 3.0. Since you know which API was used to create the context, just select the behavior in the implementation of glBindFramebuffer based on the API.
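The suggestion that closes the message above, a single entry point that picks its behavior from the API the context was created with, could be sketched roughly as follows. This is not Mesa code. The enum and function are hypothetical, and the specific difference modeled here (whether a framebuffer name that did not come from glGenFramebuffers is accepted) is an assumption inferred from the "non-Gen names" remark in the thread; the relevant specifications should be consulted for the exact rules.

```c
#include <stdbool.h>

/* Hypothetical API tags, for illustration only. */
enum demo_api {
   DEMO_API_DESKTOP_GL3,
   DEMO_API_GLES2
};

/* One shared implementation decides per context whether a bind is
 * acceptable, instead of splitting the dispatch table entries. */
static bool
demo_bind_framebuffer_name_ok(enum demo_api api, bool name_was_generated)
{
   if (name_was_generated)
      return true;

   /* Assumption (see lead-in): ES 2.0 style contexts accept names chosen
    * by the application, desktop GL 3.0 style contexts reject them. */
   return api == DEMO_API_GLES2;
}
```

The point of the sketch is only the shape of the solution: the branch lives inside the implementation, keyed on the context's API, so the dispatch table and existing driver binaries are left untouched.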
[Mesa-dev] Google Summer of Code ideas needed
Hi,

It's time again for Google Summer of Code, so we need to start updating the X.Org ideas page (http://www.x.org/wiki/SummerOfCodeIdeas) with new ideas. Since there have been a few issues with the wikis lately, if you have any ideas please respond to this thread, and I will make sure they get onto the official ideas page (but still feel free to update the wiki page yourself if you can).

A good project description should contain:
- A brief description of the project
- A difficulty rating (e.g. easy, medium, hard)
- The skills / programming languages required

Also, I am going to purge all the old ideas from the ideas page in the next week, so if there are any of the old ideas that you think are still relevant, let me know and I will keep it.

The ideas page is used as one of the criteria by Google for selecting mentoring organizations and part of the reason X.Org was not selected last year was that the ideas page was not up to par, so if we want to participate in Google Summer of Code this year, it is important we have a good ideas page with lots of ideas.

Thanks,
Tom Stellard
[Mesa-dev] [PATCH] i965: Split shader_time entries into separate cachelines.
This avoids some snooping overhead between EUs processing separate shaders (so VS versus FS). Improves performance of a minecraft trace with shader_time by 28.9% +/- 18.3% (n=7), and performance of my old GLSL demo by 93.7% +/- 0.8% (n=4).

v2: Add a define for the stride with a comment explaining its units and why.
---
 src/mesa/drivers/dri/i965/brw_context.h |    8 ++++++++
 src/mesa/drivers/dri/i965/brw_fs.cpp    |    2 +-
 src/mesa/drivers/dri/i965/brw_program.c |    5 +++--
 src/mesa/drivers/dri/i965/brw_vec4.cpp  |    2 +-
 4 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index c34d6b1..d042dd6 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -571,6 +571,14 @@ struct brw_vs_prog_data {
 #define SURF_INDEX_SOL_BINDING(t)    ((t))
 #define BRW_MAX_GS_SURFACES          SURF_INDEX_SOL_BINDING(BRW_MAX_SOL_BINDINGS)

+/**
+ * Stride in bytes between shader_time entries.
+ *
+ * We separate entries by a cacheline to reduce traffic between EUs writing to
+ * different entries.
+ */
+#define SHADER_TIME_STRIDE 64
+
 enum brw_cache_id {
    BRW_BLEND_STATE,
    BRW_DEPTH_STENCIL_STATE,
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 8ce3954..8476bb5 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -621,7 +621,7 @@ fs_visitor::emit_shader_time_write(enum shader_time_shader_type type,

    fs_reg offset_mrf = fs_reg(MRF, base_mrf);
    offset_mrf.type = BRW_REGISTER_TYPE_UD;
-   emit(MOV(offset_mrf, fs_reg(shader_time_index * 4)));
+   emit(MOV(offset_mrf, fs_reg(shader_time_index * SHADER_TIME_STRIDE)));

    fs_reg time_mrf = fs_reg(MRF, base_mrf + 1);
    time_mrf.type = BRW_REGISTER_TYPE_UD;
diff --git a/src/mesa/drivers/dri/i965/brw_program.c b/src/mesa/drivers/dri/i965/brw_program.c
index 75eb6bc..62954d3 100644
--- a/src/mesa/drivers/dri/i965/brw_program.c
+++ b/src/mesa/drivers/dri/i965/brw_program.c
@@ -228,7 +228,8 @@ brw_init_shader_time(struct brw_context *brw)

    const int max_entries = 4096;
    brw->shader_time.bo = drm_intel_bo_alloc(intel->bufmgr, "shader time",
-                                            max_entries * 4, 4096);
+                                            max_entries * SHADER_TIME_STRIDE,
+                                            4096);
    brw->shader_time.programs = rzalloc_array(brw, struct gl_shader_program *,
                                              max_entries);
    brw->shader_time.types = rzalloc_array(brw, enum shader_time_shader_type,
@@ -409,7 +410,7 @@ brw_collect_shader_time(struct brw_context *brw)
    uint32_t *times = brw->shader_time.bo->virtual;

    for (int i = 0; i < brw->shader_time.num_entries; i++) {
-      brw->shader_time.cumulative[i] += times[i];
+      brw->shader_time.cumulative[i] += times[i * SHADER_TIME_STRIDE / 4];
    }

    /* Zero the BO out to clear it out for our next collection.
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index f319f32..d759710 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -1225,7 +1225,7 @@ vec4_visitor::emit_shader_time_write(enum shader_time_shader_type type,

    dst_reg offset_mrf = dst_reg(MRF, base_mrf);
    offset_mrf.type = BRW_REGISTER_TYPE_UD;
-   emit(MOV(offset_mrf, src_reg(shader_time_index * 4)));
+   emit(MOV(offset_mrf, src_reg(shader_time_index * SHADER_TIME_STRIDE)));

    dst_reg time_mrf = dst_reg(MRF, base_mrf + 1);
    time_mrf.type = BRW_REGISTER_TYPE_UD;
--
1.7.10.4
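To illustrate the indexing change in the patch above, here is a small standalone sketch of how a 64-byte stride maps entry indices to byte offsets and to `uint32_t` element indices. Only `SHADER_TIME_STRIDE` and its value come from the patch; the helper names and the plain array standing in for the buffer object are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stride in bytes between entries, as defined in the patch. */
#define SHADER_TIME_STRIDE 64

/* Byte offset the shader writes to for a given entry (hypothetical helper). */
static size_t entry_byte_offset(unsigned index)
{
   return (size_t)index * SHADER_TIME_STRIDE;
}

/* Index into a uint32_t view of the buffer for the same entry. */
static size_t entry_dword_index(unsigned index)
{
   return entry_byte_offset(index) / sizeof(uint32_t);
}

/* CPU-side collection: read one dword per cacheline-sized slot. */
static void collect(uint64_t *cumulative, const uint32_t *times,
                    unsigned num_entries)
{
   assert(cumulative != NULL && times != NULL);
   for (unsigned i = 0; i < num_entries; i++)
      cumulative[i] += times[entry_dword_index(i)];
}
```

The buffer grows from 4 bytes to 64 bytes per entry; the trade is memory for fewer writers touching the same cacheline.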
Re: [Mesa-dev] [PATCH 1/4] i965: Remove fixed-function texture projection avoidance optimization.
On Mar 13, 2013 9:25 AM, Kenneth Graunke kenn...@whitecape.org wrote:
> On 03/13/2013 06:07 AM, Adrian M Negreanu wrote:
> > Hi,
> >
> > I have tested your changes but it looks like they fail to compile on Android.
> >
> > == Tested the patch(es) on top of the following commits:
> > f7ef83c scons: Define PACKAGE_xxx
> > 6f86b93 docs: rewrite the OSMesa info / instructions
> > 79eac7d configure: wire-up new OSMesa gallium state tracker and target
> > be51f12 target/osmesa: add new Makefile.am
> > 94263da targets/osmesa: new OSMesa gallium target
> > 7114b6a st/osmesa: add new Makefile.am
> > 73436a9 st/osmesa: new OSMesa gallium state tracker
> >
> > === Failed to build for android
> > f7ef83c scons: Define PACKAGE_xxx
> > 6f86b93 docs: rewrite the OSMesa info / instructions
> > 79eac7d configure: wire-up new OSMesa gallium state tracker and target
> > be51f12 target/osmesa: add new Makefile.am
> > 94263da targets/osmesa: new OSMesa gallium target
> > 7114b6a st/osmesa: add new Makefile.am
> > 73436a9 st/osmesa: new OSMesa gallium state tracker
> >
> > src/mesa/drivers/dri/i965/brw_state_dump.c: In function 'dump_depth_stencil_state':
> > src/mesa/drivers/dri/i965/brw_state_dump.c:373:71: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith]
> > src/mesa/drivers/dri/i965/brw_state_dump.c: In function 'dump_cc_state_gen6':
> > src/mesa/drivers/dri/i965/brw_state_dump.c:407:68: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith]
> > src/mesa/drivers/dri/i965/brw_state_dump.c: In function 'dump_scissor':
> > src/mesa/drivers/dri/i965/brw_state_dump.c:436:65: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith]
> > src/mesa/drivers/dri/i965/brw_state_dump.c: In function 'dump_vs_constants':
> > src/mesa/drivers/dri/i965/brw_state_dump.c:449:49: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith]
> > src/mesa/drivers/dri/i965/brw_state_dump.c:450:47: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith]
> > src/mesa/drivers/dri/i965/brw_state_dump.c: In function 'dump_wm_constants':
> > src/mesa/drivers/dri/i965/brw_state_dump.c:466:49: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith]
> > src/mesa/drivers/dri/i965/brw_state_dump.c:467:47: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith]
> > src/mesa/drivers/dri/i965/brw_state_dump.c: In function 'dump_binding_table':
> > src/mesa/drivers/dri/i965/brw_state_dump.c:483:50: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith]
> > src/mesa/drivers/dri/i965/brw_state_dump.c: In function 'dump_prog_cache':
> > src/mesa/drivers/dri/i965/brw_state_dump.c:511:33: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith]
> > src/mesa/drivers/dri/i965/brw_vs_surface_state.c: In function 'brw_upload_vs_pull_constants':
> > src/mesa/drivers/dri/i965/brw_vs_surface_state.c:77:40: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith]
> > src/mesa/main/errors.c: In function '_mesa_problem':
> > src/mesa/main/errors.c:851:15: error: 'PACKAGE_VERSION' undeclared (first use in this function)
>
> That's definitely Matt's build system changes, not my code.

It was an automated message, but after checking the merged patches I saw Matt's patch replacing the MESA PACKAGE variable. Apologies for the wrong report.
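As a side note on the repeated `-Wpointer-arith` warnings in the log above: they come from doing arithmetic directly on a `void *`, which GCC accepts as an extension (treating the pointee size as 1) but ISO C does not define. A common way to avoid the warning is to cast to a byte pointer first. The sketch below is generic; `offset_ptr()` and the buffer are made up for the example and nothing here is taken from brw_state_dump.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: advance an untyped pointer by a byte offset. */
static void *offset_ptr(void *base, size_t offset)
{
   assert(base != NULL);
   /* Writing `base + offset` on a void * is the GNU extension the
    * warning flags; going through a byte pointer yields the same
    * address in portable C.
    */
   return (uint8_t *)base + offset;
}
```

The warnings are harmless for a GCC-only build; the actual build failure in the log is the separate `PACKAGE_VERSION` error.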
Re: [Mesa-dev] OSMesa VTK
On 03/13/2013 10:40 AM, Brian Paul wrote:
> Can you tell me what the parameters are for your OSMesaCreateContext() and OSMesaMakeCurrent() calls (in particular the image format/type)?

I do not know. I'll do some digging into the VTK source and ask on the VTK list.

> You could also try setting GALLIUM_DRIVER=softpipe and see if the softpipe driver works.

The test still fails, however it fails differently, instead of a black image I get a strip of colored snow at the top of the output image... How about I just show you?

The output (boring black image) of last night's VTK LoadOpenGLExtension test which used llvmpipe is here : http://open.cdash.org/testDetails.php?test=180871306&build=2844006

The output of:

$ env GALLIUM_DRIVER=softpipe \
    /home/kevin/kitware/VTK_OSMesa_Build/bin/vtkRenderingOpenGLCxxTests \
    LoadOpenGLExtension -D /home/kevin/kitware/VTKData \
    -T /home/kevin/kitware/VTK_OSMesa_Build/Testing/Temporary \
    -V Baseline/Rendering/LoadOpenGLExtension.png

is here : http://crab-lab.zool.ohiou.edu/kevin/LoadOpenGLExtension.png
[Mesa-dev] [PATCH 1/4] gallium: PIPE_COMPUTE_CAP_IR_TARGET - allow drivers to specify a processor v2
From: Tom Stellard thomas.stell...@amd.com This target string now contains four values instead of three. The old processor field (which was really being interpreted as arch) has been split into two fields: processor and arch. This allows drivers to pass a more a more detailed description of the hardware to compiler frontends. v2: - Adapt to libclc changes --- src/gallium/docs/source/screen.rst |8 +- src/gallium/drivers/r600/r600_llvm.c | 63 - src/gallium/drivers/r600/r600_llvm.h |2 - src/gallium/drivers/r600/r600_pipe.c | 74 ++- src/gallium/drivers/r600/r600_pipe.h |2 + src/gallium/drivers/radeonsi/radeonsi_pipe.c | 11 +++ src/gallium/drivers/radeonsi/radeonsi_pipe.h |1 + src/gallium/drivers/radeonsi/radeonsi_shader.c |4 +- .../state_trackers/clover/llvm/invocation.cpp | 18 -- 9 files changed, 104 insertions(+), 79 deletions(-) diff --git a/src/gallium/docs/source/screen.rst b/src/gallium/docs/source/screen.rst index 68d1a35..10836f1 100644 --- a/src/gallium/docs/source/screen.rst +++ b/src/gallium/docs/source/screen.rst @@ -222,10 +222,10 @@ PIPE_COMPUTE_CAP_* Compute-specific capabilities. They can be queried using pipe_screen::get_compute_param. -* ``PIPE_COMPUTE_CAP_IR_TARGET``: A description of the target as a target - triple specification of the form ``processor-manufacturer-os`` that will - be passed on to the compiler. This CAP is only relevant for drivers - that specify PIPE_SHADER_IR_LLVM for their preferred IR. +* ``PIPE_COMPUTE_CAP_IR_TARGET``: A description of the target of the form + ``processor-arch-manufacturer-os`` that will be passed on to the compiler. + This CAP is only relevant for drivers that specify PIPE_SHADER_IR_LLVM for + their preferred IR. Value type: null-terminated string. * ``PIPE_COMPUTE_CAP_GRID_DIMENSION``: Number of supported dimensions for grid and block coordinates. Value type: ``uint64_t``. 
diff --git a/src/gallium/drivers/r600/r600_llvm.c b/src/gallium/drivers/r600/r600_llvm.c index 042193c..1552ccb 100644 --- a/src/gallium/drivers/r600/r600_llvm.c +++ b/src/gallium/drivers/r600/r600_llvm.c @@ -561,69 +561,6 @@ LLVMModuleRef r600_tgsi_llvm( return ctx-gallivm.module; } -const char * r600_llvm_gpu_string(enum radeon_family family) -{ - const char * gpu_family; - - switch (family) { - case CHIP_R600: - case CHIP_RV610: - case CHIP_RV630: - case CHIP_RV620: - case CHIP_RV635: - case CHIP_RV670: - case CHIP_RS780: - case CHIP_RS880: - gpu_family = r600; - break; - case CHIP_RV710: - gpu_family = rv710; - break; - case CHIP_RV730: - gpu_family = rv730; - break; - case CHIP_RV740: - case CHIP_RV770: - gpu_family = rv770; - break; - case CHIP_PALM: - case CHIP_CEDAR: - gpu_family = cedar; - break; - case CHIP_SUMO: - case CHIP_SUMO2: - case CHIP_REDWOOD: - gpu_family = redwood; - break; - case CHIP_JUNIPER: - gpu_family = juniper; - break; - case CHIP_HEMLOCK: - case CHIP_CYPRESS: - gpu_family = cypress; - break; - case CHIP_BARTS: - gpu_family = barts; - break; - case CHIP_TURKS: - gpu_family = turks; - break; - case CHIP_CAICOS: - gpu_family = caicos; - break; - case CHIP_CAYMAN: -case CHIP_ARUBA: - gpu_family = cayman; - break; - default: - gpu_family = ; - fprintf(stderr, Chip not supported by r600 llvm - backend, please file a bug at PACKAGE_BUGREPORT \n); - break; - } - return gpu_family; -} - unsigned r600_llvm_compile( LLVMModuleRef mod, unsigned char ** inst_bytes, diff --git a/src/gallium/drivers/r600/r600_llvm.h b/src/gallium/drivers/r600/r600_llvm.h index 090d909..b5e2af2 100644 --- a/src/gallium/drivers/r600/r600_llvm.h +++ b/src/gallium/drivers/r600/r600_llvm.h @@ -15,8 +15,6 @@ LLVMModuleRef r600_tgsi_llvm( struct radeon_llvm_context * ctx, const struct tgsi_token * tokens); -const char * r600_llvm_gpu_string(enum radeon_family family); - unsigned r600_llvm_compile( LLVMModuleRef mod, unsigned char ** inst_bytes, diff --git 
a/src/gallium/drivers/r600/r600_pipe.c b/src/gallium/drivers/r600/r600_pipe.c index 60a0247..66dac62 100644 --- a/src/gallium/drivers/r600/r600_pipe.c +++ b/src/gallium/drivers/r600/r600_pipe.c @@ -760,18 +760,84 @@ static int r600_get_video_param(struct pipe_screen *screen, } } +const char *
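The four-field target string described in the commit message above (`processor-arch-manufacturer-os`) can be pictured with a small parser. The sketch below is an illustration only: `struct ir_target`, `take_field()`, `parse_ir_target()` and the sample string used in the usage note are hypothetical and are not what any driver is confirmed to return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define FIELD_LEN 32

/* Hypothetical container for the four fields of the target string. */
struct ir_target {
   char processor[FIELD_LEN];
   char arch[FIELD_LEN];
   char manufacturer[FIELD_LEN];
   char os[FIELD_LEN];
};

/* Copy the text up to the next '-' (or to the end of the string for the
 * last field) into dst. Returns a pointer to the rest of the input, or
 * NULL if the separator is missing or the field does not fit.
 */
static const char *take_field(const char *src, char *dst, bool last)
{
   const char *end = last ? src + strlen(src) : strchr(src, '-');
   size_t len;

   if (end == NULL)
      return NULL;
   len = (size_t)(end - src);
   if (len >= FIELD_LEN)
      return NULL;
   memcpy(dst, src, len);
   dst[len] = '\0';
   return last ? end : end + 1;
}

/* Split "processor-arch-manufacturer-os"; fields may be empty. */
static bool parse_ir_target(const char *s, struct ir_target *out)
{
   assert(s != NULL && out != NULL);
   s = take_field(s, out->processor, false);
   if (s)
      s = take_field(s, out->arch, false);
   if (s)
      s = take_field(s, out->manufacturer, false);
   if (s)
      s = take_field(s, out->os, true);
   return s != NULL;
}
```

With a made-up input such as `"cayman-r600--"` this yields processor `cayman`, arch `r600` and empty manufacturer and os fields, while an old-style three-field string is rejected.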
[Mesa-dev] [PATCH 2/4] radeonsi: Remove si_pm4_inval_vertex_cache()
From: Tom Stellard thomas.stell...@amd.com

This function is a holdover from r600g and is identical to si_pm4_inval_texture_cache(), so it is not needed.
---
 src/gallium/drivers/radeonsi/radeonsi_pm4.c  |    6 ------
 src/gallium/drivers/radeonsi/radeonsi_pm4.h  |    1 -
 src/gallium/drivers/radeonsi/si_state_draw.c |    2 +-
 3 files changed, 1 insertions(+), 8 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/radeonsi_pm4.c b/src/gallium/drivers/radeonsi/radeonsi_pm4.c
index 79a2521..9a884f7 100644
--- a/src/gallium/drivers/radeonsi/radeonsi_pm4.c
+++ b/src/gallium/drivers/radeonsi/radeonsi_pm4.c
@@ -139,12 +139,6 @@ void si_pm4_inval_texture_cache(struct si_pm4_state *state)
 	state->cp_coher_cntl |= S_0085F0_TC_ACTION_ENA(1);
 }

-void si_pm4_inval_vertex_cache(struct si_pm4_state *state)
-{
-	/* Some GPUs don't have the vertex cache and must use the texture cache instead. */
-	state->cp_coher_cntl |= S_0085F0_TC_ACTION_ENA(1);
-}
-
 void si_pm4_inval_fb_cache(struct si_pm4_state *state, unsigned nr_cbufs)
 {
 	state->cp_coher_cntl |= S_0085F0_CB_ACTION_ENA(1);
diff --git a/src/gallium/drivers/radeonsi/radeonsi_pm4.h b/src/gallium/drivers/radeonsi/radeonsi_pm4.h
index 2ad62d6..bdeb930 100644
--- a/src/gallium/drivers/radeonsi/radeonsi_pm4.h
+++ b/src/gallium/drivers/radeonsi/radeonsi_pm4.h
@@ -75,7 +75,6 @@ void si_pm4_sh_data_end(struct si_pm4_state *state, unsigned base, unsigned idx)

 void si_pm4_inval_shader_cache(struct si_pm4_state *state);
 void si_pm4_inval_texture_cache(struct si_pm4_state *state);
-void si_pm4_inval_vertex_cache(struct si_pm4_state *state);
 void si_pm4_inval_fb_cache(struct si_pm4_state *state, unsigned nr_cbufs);
 void si_pm4_inval_zsbuf_cache(struct si_pm4_state *state);

diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi/si_state_draw.c
index 1049d2b..b78f20a 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -416,7 +416,7 @@ static void si_vertex_buffer_update(struct r600_context *rctx)
 	unsigned i, count;
 	uint64_t va;

-	si_pm4_inval_vertex_cache(pm4);
+	si_pm4_inval_texture_cache(pm4);

 	/* bind vertex buffer once */
 	count = rctx->vertex_elements->count;
--
1.7.3.4
[Mesa-dev] [PATCH 3/4] radeonsi: Use TCL1_ACTION_ENA when invalidating the texture cache
From: Tom Stellard thomas.stell...@amd.com

---
 src/gallium/drivers/radeonsi/radeonsi_pm4.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/radeonsi_pm4.c b/src/gallium/drivers/radeonsi/radeonsi_pm4.c
index 9a884f7..4ea30f6 100644
--- a/src/gallium/drivers/radeonsi/radeonsi_pm4.c
+++ b/src/gallium/drivers/radeonsi/radeonsi_pm4.c
@@ -137,6 +137,7 @@ void si_pm4_inval_shader_cache(struct si_pm4_state *state)
 void si_pm4_inval_texture_cache(struct si_pm4_state *state)
 {
 	state->cp_coher_cntl |= S_0085F0_TC_ACTION_ENA(1);
+	state->cp_coher_cntl |= S_0085F0_TCL1_ACTION_ENA(1);
 }

 void si_pm4_inval_fb_cache(struct si_pm4_state *state, unsigned nr_cbufs)
--
1.7.3.4
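As a rough picture of what the invalidation helpers in these patches do, each one ORs an enable bit into a shared coherency-control word that is emitted later. The sketch below mirrors that pattern with made-up names; the bit positions are placeholders chosen for the example and must not be read as the real S_0085F0_* register layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder bit positions, NOT the hardware values. */
#define FAKE_TCL1_ACTION_ENA (UINT32_C(1) << 0)
#define FAKE_TC_ACTION_ENA   (UINT32_C(1) << 1)
#define FAKE_CB_ACTION_ENA   (UINT32_C(1) << 2)

/* Hypothetical stand-in for the state object that accumulates flags. */
struct fake_pm4_state {
   uint32_t cp_coher_cntl;
};

/* Request both texture cache levels explicitly, as in patch 3/4. */
static void inval_texture_cache(struct fake_pm4_state *state)
{
   assert(state != NULL);
   state->cp_coher_cntl |= FAKE_TC_ACTION_ENA;
   state->cp_coher_cntl |= FAKE_TCL1_ACTION_ENA;
}

/* Request a framebuffer (color buffer) cache flush. */
static void inval_fb_cache(struct fake_pm4_state *state)
{
   assert(state != NULL);
   state->cp_coher_cntl |= FAKE_CB_ACTION_ENA;
}
```

Because the helpers only set bits, calling one twice is harmless, which is why replacing the vertex-cache helper with the texture-cache one changes nothing at the call site.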
[Mesa-dev] [PATCH 4/4] radeonsi: Add compute support v3
From: Tom Stellard thomas.stell...@amd.com v2: - Only dump shaders when env variable is set. v3: - Don't emit VGT registers --- src/gallium/drivers/radeon/radeon_llvm_util.c |2 +- src/gallium/drivers/radeon/radeon_llvm_util.h |2 + src/gallium/drivers/radeonsi/Makefile.sources |1 + src/gallium/drivers/radeonsi/radeonsi_compute.c | 234 +++ src/gallium/drivers/radeonsi/radeonsi_pipe.c| 61 ++- src/gallium/drivers/radeonsi/radeonsi_pipe.h| 10 + src/gallium/drivers/radeonsi/radeonsi_pm4.c |5 +- src/gallium/drivers/radeonsi/radeonsi_pm4.h |2 + src/gallium/drivers/radeonsi/radeonsi_shader.c | 96 ++ src/gallium/drivers/radeonsi/radeonsi_shader.h |4 + src/gallium/drivers/radeonsi/sid.h |6 + 11 files changed, 378 insertions(+), 45 deletions(-) create mode 100644 src/gallium/drivers/radeonsi/radeonsi_compute.c diff --git a/src/gallium/drivers/radeon/radeon_llvm_util.c b/src/gallium/drivers/radeon/radeon_llvm_util.c index b2ecb1a..2582d9c 100644 --- a/src/gallium/drivers/radeon/radeon_llvm_util.c +++ b/src/gallium/drivers/radeon/radeon_llvm_util.c @@ -30,7 +30,7 @@ #include llvm-c/BitReader.h #include llvm-c/Core.h -static LLVMModuleRef radeon_llvm_parse_bitcode(const unsigned char * bitcode, +LLVMModuleRef radeon_llvm_parse_bitcode(const unsigned char * bitcode, unsigned bitcode_len) { LLVMMemoryBufferRef buf; diff --git a/src/gallium/drivers/radeon/radeon_llvm_util.h b/src/gallium/drivers/radeon/radeon_llvm_util.h index 7db25bb..b851648 100644 --- a/src/gallium/drivers/radeon/radeon_llvm_util.h +++ b/src/gallium/drivers/radeon/radeon_llvm_util.h @@ -29,6 +29,8 @@ #include llvm-c/Core.h +LLVMModuleRef radeon_llvm_parse_bitcode(const unsigned char * bitcode, + unsigned bitcode_len); unsigned radeon_llvm_get_num_kernels(const unsigned char *bitcode, unsigned bitcode_len); LLVMModuleRef radeon_llvm_get_kernel_module(unsigned index, const unsigned char *bitcode, unsigned bitcode_len); diff --git a/src/gallium/drivers/radeonsi/Makefile.sources 
b/src/gallium/drivers/radeonsi/Makefile.sources index 65da1ac..5e1cc4f 100644 --- a/src/gallium/drivers/radeonsi/Makefile.sources +++ b/src/gallium/drivers/radeonsi/Makefile.sources @@ -9,6 +9,7 @@ C_SOURCES := \ r600_texture.c \ r600_translate.c \ radeonsi_pm4.c \ + radeonsi_compute.c \ si_state.c \ si_state_streamout.c \ si_state_draw.c \ diff --git a/src/gallium/drivers/radeonsi/radeonsi_compute.c b/src/gallium/drivers/radeonsi/radeonsi_compute.c new file mode 100644 index 000..1e8978c --- /dev/null +++ b/src/gallium/drivers/radeonsi/radeonsi_compute.c @@ -0,0 +1,234 @@ +#include util/u_memory.h + +#include radeonsi_pipe.h +#include radeonsi_shader.h + +#include radeon_llvm_util.h + +struct si_pipe_compute { + struct r600_context *ctx; + + unsigned local_size; + unsigned private_size; + unsigned input_size; + struct si_pipe_shader shader; + unsigned num_user_sgprs; + +struct si_pm4_state *pm4_buffers; + +}; + +static void *radeonsi_create_compute_state( + struct pipe_context *ctx, + const struct pipe_compute_state *cso) +{ + struct r600_context *rctx = (struct r600_context *)ctx; + struct si_pipe_compute *program = + CALLOC_STRUCT(si_pipe_compute); + const struct pipe_llvm_program_header *header; + const unsigned char *code; + LLVMModuleRef mod; + + header = cso-prog; + code = cso-prog + sizeof(struct pipe_llvm_program_header); + + program-ctx = rctx; + program-local_size = cso-req_local_mem; + program-private_size = cso-req_private_mem; + program-input_size = cso-req_input_mem; + + mod = radeon_llvm_parse_bitcode(code, header-num_bytes); + si_compile_llvm(rctx, program-shader, mod); + + return program; +} + +static void radeonsi_bind_compute_state(struct pipe_context *ctx, void *state) +{ + struct r600_context *rctx = (struct r600_context*)ctx; + rctx-cs_shader_state.program = (struct si_pipe_compute*)state; +} + +static void radeonsi_set_global_binding( + struct pipe_context *ctx, unsigned first, unsigned n, + struct pipe_resource **resources, + uint32_t 
**handles) +{ + unsigned i; + struct r600_context *rctx = (struct r600_context*)ctx; + struct si_pipe_compute *program = rctx-cs_shader_state.program; + struct si_pm4_state *pm4; + + if (!program-pm4_buffers) { + program-pm4_buffers = CALLOC_STRUCT(si_pm4_state); + } + pm4 = program-pm4_buffers; + pm4-compute_pkt = true; + + if (!resources) { + return; + } + + for (i = first; i first + n; i++) {
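radeonsi_create_compute_state() in the patch above treats `cso->prog` as a small header followed directly by LLVM bitcode. That layout can be sketched on its own as below. `struct program_header`, `struct program_view` and `split_program()` are stand-ins invented for the example (the patch itself uses `struct pipe_llvm_program_header`, whose definition is not shown in this thread), and the single `num_bytes` field is an assumption based on how the patch reads it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header: byte count of the payload that follows it. */
struct program_header {
   uint32_t num_bytes;
};

/* Hypothetical view of the payload after the header. */
struct program_view {
   const unsigned char *code;
   uint32_t num_bytes;
};

/* Split a "header + payload" blob without copying the payload. */
static struct program_view split_program(const void *prog)
{
   struct program_header header;
   struct program_view view;

   assert(prog != NULL);
   /* memcpy avoids assuming prog is aligned for the header type. */
   memcpy(&header, prog, sizeof(header));
   view.code = (const unsigned char *)prog + sizeof(header);
   view.num_bytes = header.num_bytes;
   return view;
}
```

The resulting pointer and length are what a bitcode parser would then consume, in the same way the patch passes `code` and `header->num_bytes` on.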
Re: [Mesa-dev] [PATCH 1/4] gallium: PIPE_COMPUTE_CAP_IR_TARGET - allow drivers to specify a processor v2
On Wed, Mar 13, 2013 at 2:11 PM, Tom Stellard t...@stellard.net wrote: From: Tom Stellard thomas.stell...@amd.com This target string now contains four values instead of three. The old processor field (which was really being interpreted as arch) has been split into two fields: processor and arch. This allows drivers to pass a more a more detailed description of the hardware to compiler frontends. v2: - Adapt to libclc changes for the series: Reviewed-by: Alex Deucher alexander.deuc...@amd.com --- src/gallium/docs/source/screen.rst |8 +- src/gallium/drivers/r600/r600_llvm.c | 63 - src/gallium/drivers/r600/r600_llvm.h |2 - src/gallium/drivers/r600/r600_pipe.c | 74 ++- src/gallium/drivers/r600/r600_pipe.h |2 + src/gallium/drivers/radeonsi/radeonsi_pipe.c | 11 +++ src/gallium/drivers/radeonsi/radeonsi_pipe.h |1 + src/gallium/drivers/radeonsi/radeonsi_shader.c |4 +- .../state_trackers/clover/llvm/invocation.cpp | 18 -- 9 files changed, 104 insertions(+), 79 deletions(-) diff --git a/src/gallium/docs/source/screen.rst b/src/gallium/docs/source/screen.rst index 68d1a35..10836f1 100644 --- a/src/gallium/docs/source/screen.rst +++ b/src/gallium/docs/source/screen.rst @@ -222,10 +222,10 @@ PIPE_COMPUTE_CAP_* Compute-specific capabilities. They can be queried using pipe_screen::get_compute_param. -* ``PIPE_COMPUTE_CAP_IR_TARGET``: A description of the target as a target - triple specification of the form ``processor-manufacturer-os`` that will - be passed on to the compiler. This CAP is only relevant for drivers - that specify PIPE_SHADER_IR_LLVM for their preferred IR. +* ``PIPE_COMPUTE_CAP_IR_TARGET``: A description of the target of the form + ``processor-arch-manufacturer-os`` that will be passed on to the compiler. + This CAP is only relevant for drivers that specify PIPE_SHADER_IR_LLVM for + their preferred IR. Value type: null-terminated string. * ``PIPE_COMPUTE_CAP_GRID_DIMENSION``: Number of supported dimensions for grid and block coordinates. Value type: ``uint64_t``. 
diff --git a/src/gallium/drivers/r600/r600_llvm.c b/src/gallium/drivers/r600/r600_llvm.c index 042193c..1552ccb 100644 --- a/src/gallium/drivers/r600/r600_llvm.c +++ b/src/gallium/drivers/r600/r600_llvm.c @@ -561,69 +561,6 @@ LLVMModuleRef r600_tgsi_llvm( return ctx-gallivm.module; } -const char * r600_llvm_gpu_string(enum radeon_family family) -{ - const char * gpu_family; - - switch (family) { - case CHIP_R600: - case CHIP_RV610: - case CHIP_RV630: - case CHIP_RV620: - case CHIP_RV635: - case CHIP_RV670: - case CHIP_RS780: - case CHIP_RS880: - gpu_family = r600; - break; - case CHIP_RV710: - gpu_family = rv710; - break; - case CHIP_RV730: - gpu_family = rv730; - break; - case CHIP_RV740: - case CHIP_RV770: - gpu_family = rv770; - break; - case CHIP_PALM: - case CHIP_CEDAR: - gpu_family = cedar; - break; - case CHIP_SUMO: - case CHIP_SUMO2: - case CHIP_REDWOOD: - gpu_family = redwood; - break; - case CHIP_JUNIPER: - gpu_family = juniper; - break; - case CHIP_HEMLOCK: - case CHIP_CYPRESS: - gpu_family = cypress; - break; - case CHIP_BARTS: - gpu_family = barts; - break; - case CHIP_TURKS: - gpu_family = turks; - break; - case CHIP_CAICOS: - gpu_family = caicos; - break; - case CHIP_CAYMAN: -case CHIP_ARUBA: - gpu_family = cayman; - break; - default: - gpu_family = ; - fprintf(stderr, Chip not supported by r600 llvm - backend, please file a bug at PACKAGE_BUGREPORT \n); - break; - } - return gpu_family; -} - unsigned r600_llvm_compile( LLVMModuleRef mod, unsigned char ** inst_bytes, diff --git a/src/gallium/drivers/r600/r600_llvm.h b/src/gallium/drivers/r600/r600_llvm.h index 090d909..b5e2af2 100644 --- a/src/gallium/drivers/r600/r600_llvm.h +++ b/src/gallium/drivers/r600/r600_llvm.h @@ -15,8 +15,6 @@ LLVMModuleRef r600_tgsi_llvm( struct radeon_llvm_context * ctx, const struct tgsi_token * tokens); -const char * r600_llvm_gpu_string(enum radeon_family family); - unsigned r600_llvm_compile( LLVMModuleRef mod, unsigned char ** inst_bytes, diff --git 
a/src/gallium/drivers/r600/r600_pipe.c
Re: [Mesa-dev] [PATCH 2/2] i965: Avoid unnecessary copy when depthstencil workaround invoked by clear.
On 12 March 2013 16:33, Eric Anholt e...@anholt.net wrote:

> Paul Berry stereotype...@gmail.com writes:
> >  void
> > -brw_workaround_depthstencil_alignment(struct brw_context *brw)
> > +brw_workaround_depthstencil_alignment(struct brw_context *brw,
> > +                                      GLbitfield clear_mask)
> >  {
> >     struct intel_context *intel = &brw->intel;
> >     struct gl_context *ctx = &intel->ctx;
> > @@ -341,10 +343,24 @@ brw_workaround_depthstencil_alignment(struct brw_context *brw)
> >     struct intel_mipmap_tree *stencil_mt = get_stencil_miptree(stencil_irb);
> >     uint32_t tile_x = 0, tile_y = 0, stencil_tile_x = 0, stencil_tile_y = 0;
> >     uint32_t stencil_draw_x = 0, stencil_draw_y = 0;
> > +   bool invalidate_depth = clear_mask & GL_DEPTH_BUFFER_BIT;
> > +   bool invalidate_stencil = clear_mask & GL_STENCIL_BUFFER_BIT;
> >
> >     if (depth_irb)
> >        depth_mt = depth_irb->mt;
> >
> > +   if (depth_irb && invalidate_depth &&
> > +       _mesa_is_depthstencil_format(
> > +          _mesa_get_format_base_format(depth_mt->format)) &&
> > +       !depth_mt->stencil_mt) {
>
> The only _mesa_is_depthstencil_format() returned by _mesa_get_format_base_format() is GL_DEPTH_STENCIL, so calling that seems kinda overkill.

Good point. I'll fix that before pushing. I'll also make a follow-up patch to fix the function I borrowed this test from (intel_miptree_create_layout).

> If depth_mt->stencil_mt, then depth_mt->format's base format will not be GL_DEPTH_STENCIL. I'm concerned that you're going to lose the depth_mt->stencil_mt contents of a gl-level packed depth/stencil texture that's backed by separate stencil.

I think you inverted your logic there (if depth_mt->stencil_mt, then depth_mt->format's base format *will* be GL_DEPTH_STENCIL).

Am I correct in inferring that the cases you're worried about are cases like:

- Client creates a GL_DEPTH_STENCIL texture on a platform such as Gen7 that uses separate stencil
- Client executes a glClear(GL_DEPTH_BIT) on a miplevel that needs the workaround, expecting stencil data to be preserved

I think you may be right to worry about this--previously I had assumed that calling intel_renderbuffer_move_to_temp(depth_mt) on a depth/stencil texture backed by separate stencil would only relocate the depth miptree layer, and leave the stencil miptree layer alone. But rereading the code again, I'm less certain of that. I'll write a piglit test to exercise exactly this situation, and post a v2 of the patch if necessary.
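The scenario discussed above comes down to which buffers a clear is allowed to discard. The sketch below is one possible reading, not the code from the patch (whose `if` body is cut off in the quote): it assumes that when depth and stencil are packed into a single miptree, the old contents may only be thrown away if both are being cleared. The function and struct names are hypothetical; the two bit values are the standard GL ones.

```c
#include <assert.h>
#include <stdbool.h>

/* Standard GL clear-mask bits, defined locally to stay self-contained. */
#define GL_DEPTH_BUFFER_BIT   0x00000100
#define GL_STENCIL_BUFFER_BIT 0x00000400

/* Hypothetical result: which old contents need not be copied. */
struct discard_plan {
   bool depth;
   bool stencil;
};

static struct discard_plan plan_discard(unsigned clear_mask,
                                        bool packed_depth_stencil)
{
   struct discard_plan plan;

   plan.depth = (clear_mask & GL_DEPTH_BUFFER_BIT) != 0;
   plan.stencil = (clear_mask & GL_STENCIL_BUFFER_BIT) != 0;

   /* Assumption: with depth and stencil interleaved in one miptree, a
    * depth-only (or stencil-only) clear must preserve the other half,
    * so nothing can be discarded unless both bits are set.
    */
   if (packed_depth_stencil && !(plan.depth && plan.stencil)) {
      plan.depth = false;
      plan.stencil = false;
   }
   return plan;
}
```

The open question in the thread is the separate-stencil case, where a depth-only clear must still leave the stencil miptree untouched; the sketch treats the two buffers as independent there.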
[Mesa-dev] [Bug 61416] Clover doesn't work on a PRIME system when run under X
https://bugs.freedesktop.org/show_bug.cgi?id=61416

--- Comment #5 from Mike Lothian m...@fireburn.co.uk ---
Created attachment 76494
  --> https://bugs.freedesktop.org/attachment.cgi?id=76494&action=edit
Xorg log old

--
You are receiving this mail because:
You are the assignee for the bug.
[Mesa-dev] [Bug 61416] Clover doesn't work on a PRIME system when run under X
https://bugs.freedesktop.org/show_bug.cgi?id=61416

Mike Lothian m...@fireburn.co.uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #75466|0                           |1
        is obsolete|                            |

--- Comment #6 from Mike Lothian m...@fireburn.co.uk ---
Created attachment 76495
  --> https://bugs.freedesktop.org/attachment.cgi?id=76495&action=edit
Xorg log

--
You are receiving this mail because:
You are the assignee for the bug.
[Mesa-dev] [PATCH] gallium: add TGSI_SEMANTIC_TEXCOORD,PCOORD
Second attempt, 2 years ago no one replied or cared ...

We really need to know about these on nvc0 because there are only 8 fixed hardware locations that can be overwritten by sprite coordinates, and one location that represents gl_PointCoord and unconditionally returns sprite coordinates.

So far this was solved via a hack, which works since the locations the state tracker picks aren't dynamic (and likely never will be, to facilitate ARB_separate_shader_objects), but it still isn't nice to do it this way.

It looks like nv30 was using a hack, too, since it had a check for Semantic.Index == 9, which is what mesa uses for PointCoord.

Implementing a safe, non-mesa-dependent way without these SEMANTICs would be jumping through hoops and doing expensive shader recompilations just because we like to destroy information at the gallium threshold, and that's unacceptable.

I started to (try to) fix up the other drivers, but maybe we just want a CAP for this instead, since the default solution - if this is TEXCOORD then treat it as GENERIC with semantic index += MAX_TEXCOORDS - doesn't look much nicer either.

E.g. if PIPE_CAP_RESTRICTED_SPRITE_COORDS is advertised, the state tracker should use the TEXCOORD and PCOORD semantics, otherwise it should just use GENERICs as before.
--- src/gallium/auxiliary/draw/draw_pipe_wide_point.c | 39 src/gallium/auxiliary/tgsi/tgsi_dump.c |1 + src/gallium/auxiliary/tgsi/tgsi_strings.c |2 + src/gallium/docs/source/cso/rasterizer.rst |2 +- src/gallium/docs/source/tgsi.rst | 23 +- src/gallium/drivers/freedreno/freedreno_compiler.c |2 + src/gallium/drivers/i915/i915_fpc_translate.c |2 + src/gallium/drivers/i915/i915_state_derived.c |4 ++ src/gallium/drivers/llvmpipe/lp_setup_point.c | 29 ++-- src/gallium/drivers/nv30/nvfx_fragprog.c | 39 src/gallium/drivers/nv50/nv50_shader_state.c |8 +-- src/gallium/drivers/nv50/nv50_surface.c|5 +- src/gallium/drivers/nvc0/nvc0_program.c| 37 +-- src/gallium/drivers/r300/r300_fs.c |2 + src/gallium/drivers/r300/r300_shader_semantics.h |3 +- src/gallium/drivers/r300/r300_vs.c |2 + src/gallium/drivers/r600/evergreen_state.c |7 ++- src/gallium/drivers/r600/r600_shader.c |3 +- src/gallium/drivers/r600/r600_state.c |7 ++- src/gallium/drivers/radeonsi/radeonsi_shader.c |1 + src/gallium/drivers/radeonsi/si_state.c|2 +- src/gallium/drivers/radeonsi/si_state_draw.c |5 +- src/gallium/include/pipe/p_shader_tokens.h | 36 +-- src/gallium/include/pipe/p_state.h |2 +- src/mesa/state_tracker/st_atom_rasterizer.c|6 +-- src/mesa/state_tracker/st_program.c| 48 +-- 26 files changed, 162 insertions(+), 155 deletions(-) diff --git a/src/gallium/auxiliary/draw/draw_pipe_wide_point.c b/src/gallium/auxiliary/draw/draw_pipe_wide_point.c index 8e0a117..d4ed0f7 100644 --- a/src/gallium/auxiliary/draw/draw_pipe_wide_point.c +++ b/src/gallium/auxiliary/draw/draw_pipe_wide_point.c @@ -233,28 +233,29 @@ widepoint_first_point(struct draw_stage *stage, wide-num_texcoord_gen = 0; - /* Loop over fragment shader inputs looking for generic inputs - * for which bit 'k' in sprite_coord_enable is set. + /* Loop over fragment shader inputs looking for the PCOORD input or + * TEXCOORD inputs for which bit 'k' in sprite_coord_enable is set. 
*/ for (i = 0; i fs-info.num_inputs; i++) { - if (fs-info.input_semantic_name[i] == TGSI_SEMANTIC_GENERIC) { -const int generic_index = fs-info.input_semantic_index[i]; -/* Note that sprite_coord enable is a bitfield of - * PIPE_MAX_SHADER_OUTPUTS bits. - */ -if (generic_index PIPE_MAX_SHADER_OUTPUTS -(rast-sprite_coord_enable (1 generic_index))) { - /* OK, this generic attribute needs to be replaced with a -* texcoord (see above). -*/ - int slot = draw_alloc_extra_vertex_attrib(draw, - TGSI_SEMANTIC_GENERIC, - generic_index); - - /* add this slot to the texcoord-gen list */ - wide-texcoord_gen_slot[wide-num_texcoord_gen++] = slot; -} + int slot; + const unsigned sn = fs-info.input_semantic_name[i]; + const unsigned si = fs-info.input_semantic_index[i]; + + if (sn == TGSI_SEMANTIC_TEXCOORD) { +/* Note that sprite_coord enable is a bitfield of 8 bits. */ +if (si = 8 || !(rast-sprite_coord_enable (1 si))) +
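The wide-point change quoted above tests each fragment shader input's semantic name and, for TEXCOORD inputs, bit `si` of `sprite_coord_enable`. A standalone sketch of that selection logic follows. The enum, struct and function names are invented for the example, the limit of 8 comes from the comment in the patch, and treating PCOORD as always replaced is an assumption drawn from the cover text, since the diff is cut off before that branch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the TGSI semantic names. */
enum semantic { SEM_GENERIC, SEM_TEXCOORD, SEM_PCOORD };

struct fs_input {
   enum semantic name;
   unsigned index;
};

/* sprite_coord_enable is described in the patch as a bitfield of 8 bits. */
#define MAX_SPRITE_TEXCOORDS 8

/* Does this input receive generated sprite coordinates? */
static bool input_gets_sprite_coord(const struct fs_input *in,
                                    unsigned sprite_coord_enable)
{
   assert(in != NULL);
   switch (in->name) {
   case SEM_PCOORD:
      /* Assumed: the point-coordinate input is always replaced. */
      return true;
   case SEM_TEXCOORD:
      return in->index < MAX_SPRITE_TEXCOORDS &&
             (sprite_coord_enable & (1u << in->index)) != 0;
   default:
      /* GENERIC inputs are no longer candidates in this scheme. */
      return false;
   }
}

/* Count how many inputs would need an extra generated attribute. */
static unsigned count_sprite_coord_inputs(const struct fs_input *inputs,
                                          unsigned num_inputs,
                                          unsigned sprite_coord_enable)
{
   unsigned i, count = 0;

   for (i = 0; i < num_inputs; i++) {
      if (input_gets_sprite_coord(&inputs[i], sprite_coord_enable))
         count++;
   }
   return count;
}
```

Keeping the semantic name in the decision is the point of the proposal: a driver with fixed hardware slots can tell sprite-capable inputs apart without guessing from GENERIC indices.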
[Mesa-dev] [Bug 61416] Clover doesn't work on a PRIME system when run under X
https://bugs.freedesktop.org/show_bug.cgi?id=61416

--- Comment #7 from Mike Lothian m...@fireburn.co.uk ---
Created attachment 76498
  --> https://bugs.freedesktop.org/attachment.cgi?id=76498&action=edit
Glibc error

This shows the glibc linked lists issue
[Mesa-dev] [Bug 61416] Clover doesn't authenticate when not run as a privileged user
https://bugs.freedesktop.org/show_bug.cgi?id=61416

Mike Lothian m...@fireburn.co.uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|Clover doesn't work on a    |Clover doesn't authenticate
                   |PRIME system when run under |when not run as a
                   |X                           |privileged user
Re: [Mesa-dev] [PATCH 10/12] Get rid of _mesa_vert_result_to_frag_attrib().
On 12 March 2013 19:29, Eric Anholt e...@anholt.net wrote:

> Paul Berry stereotype...@gmail.com writes:
>
> > Now that there is no difference between the enums that represent
> > vertex outputs and fragment inputs, there's no need for a conversion
> > function.  But we still need to be able to detect when a given vertex
> > output has no corresponding fragment input.  So it is replaced by a
> > new function, _mesa_varying_slot_in_fs(), which tells whether the
> > given varying slot exists as an FS input or not.
> > ---
> >  src/mesa/drivers/dri/i965/brw_fs.cpp        | 12 -
> >  src/mesa/drivers/dri/i965/brw_vs_constval.c | 13 --
> >  src/mesa/main/mtypes.h                      | 38 +
> >  3 files changed, 27 insertions(+), 36 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
> > index 86f8cbb..ea4a56c 100644
> > --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> > @@ -1265,7 +1265,7 @@ fs_visitor::calculate_urb_setup()
> >              continue;
> >
> >           if (c->key.vp_outputs_written & BITFIELD64_BIT(i)) {
> > -            int fp_index = _mesa_vert_result_to_frag_attrib((gl_varying_slot) i);
> > +            bool exists_in_fs = _mesa_varying_slot_in_fs((gl_varying_slot) i);
>
> I'd rather see this call moved into the single usage in the if statement
> below, like has been done elsewhere (now that the function name
> explicitly talks about what's being tested in the if anyway)

Fair enough--I'll fix that.

> >              /* The back color slot is skipped when the front color is
> >               * also written to.  In addition, some slots can be
> > @@ -1273,8 +1273,8 @@ fs_visitor::calculate_urb_setup()
> >               * fragment shader.  So the register number must always be
> >               * incremented, mapped or not.
> >               */
> > -            if (fp_index >= 0)
> > -               urb_setup[fp_index] = urb_next;
> > +            if (exists_in_fs)
> > +               urb_setup[i] = urb_next;
> >              urb_next++;
> >
> >  /**
> > - * Convert from a gl_varying_slot value for a vertex output to the
> > - * corresponding gl_frag_attrib.
> > - *
> > - * Varying output values which have no corresponding gl_frag_attrib
> > - * (VARYING_SLOT_PSIZ, VARYING_SLOT_BFC0, VARYING_SLOT_BFC1, and
> > - * VARYING_SLOT_EDGE) are converted to a value of -1.
> > + * Determine if the given gl_varying_slot appears in the fragment shader.
> >   */
> > -static inline int
> > -_mesa_vert_result_to_frag_attrib(gl_varying_slot vert_result)
> > +static inline GLboolean
> > +_mesa_varying_slot_in_fs(gl_varying_slot slot)
> >  {
> > -   if (vert_result <= VARYING_SLOT_TEX7)
> > -      return vert_result;
> > -   else if (vert_result < VARYING_SLOT_CLIP_DIST0)
> > -      return -1;
> > -   else if (vert_result <= VARYING_SLOT_CLIP_DIST1)
> > -      return vert_result;
> > -   else if (vert_result < VARYING_SLOT_VAR0)
> > -      return -1;
> > -   else
> > -      return vert_result;
> > +   switch (slot) {
> > +   case VARYING_SLOT_PSIZ:
> > +   case VARYING_SLOT_BFC0:
> > +   case VARYING_SLOT_BFC1:
> > +   case VARYING_SLOT_EDGE:
> > +   case VARYING_SLOT_CLIP_VERTEX:
> > +   case VARYING_SLOT_LAYER:
> > +      return GL_FALSE;
> > +   default:
> > +      return GL_TRUE;
> > +   }
> >  }
>
> I bet the compiler does a big switch statement instead of doing what we
> could do better with bitfields.  Not a blocker, just a potential
> improvement.

Hmm, now I'm curious.  Amazingly enough, gcc with -O2 is actually smart
enough to use a bitfield:

_Z24_mesa_varying_slot_in_fs15gl_varying_slot:
.LFB0:
	.cfi_startproc
	cmpl	$20, %edi
	ja	.L4
	movl	$1, %edx
	movl	%edi, %ecx
	xorl	%eax, %eax
	salq	%cl, %rdx
	testl	$1175552, %edx
	je	.L4
	rep
	ret
	.p2align 4,,10
	.p2align 3
.L4:
	movl	$1, %eax
	ret
	.cfi_endproc

I'm impressed.
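For readers who want the bitfield version spelled out, here is a minimal C sketch of what the gcc output above computes. The slot numbers are an assumption: they are chosen so that the six excluded slots produce the constant 1175552 (0x11F000) seen in the assembly, and are not copied from mtypes.h. The names `SLOT_*`, `NOT_IN_FS_MASK` and `varying_slot_in_fs` are hypothetical.

```c
#include <stdbool.h>

/* Assumed slot numbering, picked so the mask below equals the constant
 * 1175552 (0x11F000) from the compiler output; not taken from mtypes.h. */
enum {
    SLOT_PSIZ        = 12,
    SLOT_BFC0        = 13,
    SLOT_BFC1        = 14,
    SLOT_EDGE        = 15,
    SLOT_CLIP_VERTEX = 16,
    SLOT_LAYER       = 20
};

/* One bit per varying slot that has no fragment shader input. */
#define NOT_IN_FS_MASK                                            \
    ((1u << SLOT_PSIZ) | (1u << SLOT_BFC0) | (1u << SLOT_BFC1) |  \
     (1u << SLOT_EDGE) | (1u << SLOT_CLIP_VERTEX) | (1u << SLOT_LAYER))

/* Same result as the switch in the patch, written as a range check
 * followed by a single mask test. */
bool varying_slot_in_fs(unsigned slot)
{
    if (slot > SLOT_LAYER)
        return true;    /* beyond the highest excluded slot */
    return (NOT_IN_FS_MASK & (1u << slot)) == 0;
}
```

The range check mirrors the `cmpl $20` / `ja` pair in the assembly and keeps the shift count small.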
[Mesa-dev] [PATCH] R600: Factorize code handling Const Read Port limitation
---
 lib/Target/R600/AMDILISelDAGToDAG.cpp    | 34 ++
 lib/Target/R600/R600InstrInfo.cpp        | 54 ++
 lib/Target/R600/R600InstrInfo.h          |  3 ++
 lib/Target/R600/R600MachineScheduler.cpp | 77
 lib/Target/R600/R600MachineScheduler.h   |  3 +-
 test/CodeGen/R600/kcache-fold-2.ll       | 52 +
 6 files changed, 144 insertions(+), 79 deletions(-)
 create mode 100644 test/CodeGen/R600/kcache-fold-2.ll

diff --git a/lib/Target/R600/AMDILISelDAGToDAG.cpp b/lib/Target/R600/AMDILISelDAGToDAG.cpp
index 0c7880d..05a1ea7 100644
--- a/lib/Target/R600/AMDILISelDAGToDAG.cpp
+++ b/lib/Target/R600/AMDILISelDAGToDAG.cpp
@@ -336,6 +336,7 @@ SDNode *AMDGPUDAGToDAGISel::Select(SDNode *N) {
   return Result;
 }

+
 bool AMDGPUDAGToDAGISel::FoldOperands(unsigned Opcode,
     const R600InstrInfo *TII, std::vector<SDValue> &Ops) {
   int OperandIdx[] = {
@@ -365,17 +366,34 @@ bool AMDGPUDAGToDAGISel::FoldOperands(unsigned Opcode,
     SDValue Operand = Ops[OperandIdx[i] - 1];
     switch (Operand.getOpcode()) {
     case AMDGPUISD::CONST_ADDRESS: {
-      if (i == 2)
-        break;
       SDValue CstOffset;
-      if (!Operand.getValueType().isVector() &&
-          SelectGlobalValueConstantOffset(Operand.getOperand(0), CstOffset)) {
-        Ops[OperandIdx[i] - 1] = CurDAG->getRegister(AMDGPU::ALU_CONST, MVT::f32);
-        Ops[SelIdx[i] - 1] = CstOffset;
-        return true;
+      if (Operand.getValueType().isVector() ||
+          !SelectGlobalValueConstantOffset(Operand.getOperand(0), CstOffset))
+        break;
+
+      // Gather others constants values
+      std::vector<unsigned> Consts;
+      for (unsigned j = 0; j < 3; j++) {
+        int SrcIdx = OperandIdx[j];
+        if (SrcIdx < 0)
+          break;
+        if (RegisterSDNode *Reg = dyn_cast<RegisterSDNode>(Ops[SrcIdx - 1])) {
+          if (Reg->getReg() == AMDGPU::ALU_CONST) {
+            ConstantSDNode *Cst = dyn_cast<ConstantSDNode>(Ops[SelIdx[j] - 1]);
+            Consts.push_back(Cst->getZExtValue());
+          }
+        }
       }
+
+      ConstantSDNode *Cst = dyn_cast<ConstantSDNode>(CstOffset);
+      Consts.push_back(Cst->getZExtValue());
+      if (!TII->fitsConstReadLimitations(Consts))
+        break;
+
+      Ops[OperandIdx[i] - 1] = CurDAG->getRegister(AMDGPU::ALU_CONST, MVT::f32);
+      Ops[SelIdx[i] - 1] = CstOffset;
+      return true;
     }
-    break;
     case ISD::FNEG:
       if (NegIdx[i] < 0)
         break;
diff --git a/lib/Target/R600/R600InstrInfo.cpp b/lib/Target/R600/R600InstrInfo.cpp
index be3318a..0865098 100644
--- a/lib/Target/R600/R600InstrInfo.cpp
+++ b/lib/Target/R600/R600InstrInfo.cpp
@@ -139,6 +139,60 @@ bool R600InstrInfo::isALUInstr(unsigned Opcode) const {
           (TargetFlags & R600_InstFlag::OP3));
 }

+bool
+R600InstrInfo::fitsConstReadLimitations(const std::vector<unsigned> &Consts)
+    const {
+  assert (Consts.size() <= 12 && "Too many operands in instructions group");
+  unsigned Pair1 = 0, Pair2 = 0;
+  for (unsigned i = 0, n = Consts.size(); i < n; ++i) {
+    unsigned ReadConstHalf = Consts[i] & 2;
+    unsigned ReadConstIndex = Consts[i] & (~3);
+    unsigned ReadHalfConst = ReadConstIndex | ReadConstHalf;
+    if (!Pair1) {
+      Pair1 = ReadHalfConst;
+      continue;
+    }
+    if (Pair1 == ReadHalfConst)
+      continue;
+    if (!Pair2) {
+      Pair2 = ReadHalfConst;
+      continue;
+    }
+    if (Pair2 != ReadHalfConst)
+      return false;
+  }
+  return true;
+}
+
+bool
+R600InstrInfo::canBundle(const std::vector<MachineInstr *> &MIs) const {
+  std::vector<unsigned> Consts;
+  for (unsigned i = 0, n = MIs.size(); i < n; i++) {
+    const MachineInstr *MI = MIs[i];
+
+    const R600Operands::Ops OpTable[3][2] = {
+      {R600Operands::SRC0, R600Operands::SRC0_SEL},
+      {R600Operands::SRC1, R600Operands::SRC1_SEL},
+      {R600Operands::SRC2, R600Operands::SRC2_SEL},
+    };
+
+    if (!isALUInstr(MI->getOpcode()))
+      continue;
+
+    for (unsigned j = 0; j < 3; j++) {
+      int SrcIdx = getOperandIdx(MI->getOpcode(), OpTable[j][0]);
+      if (SrcIdx < 0)
+        break;
+      if (MI->getOperand(SrcIdx).getReg() == AMDGPU::ALU_CONST) {
+        unsigned Const = MI->getOperand(
+            getOperandIdx(MI->getOpcode(), OpTable[j][1])).getImm();
+        Consts.push_back(Const);
+      }
+    }
+  }
+  return fitsConstReadLimitations(Consts);
+}
+
 DFAPacketizer *R600InstrInfo::CreateTargetScheduleState(const TargetMachine *TM,
     const ScheduleDAG *DAG) const {
   const InstrItineraryData *II = TM->getInstrItineraryData();
diff --git a/lib/Target/R600/R600InstrInfo.h b/lib/Target/R600/R600InstrInfo.h
index efe721c..bf9569e 100644
--- a/lib/Target/R600/R600InstrInfo.h
+++ b/lib/Target/R600/R600InstrInfo.h
@@ -53,6 +53,9 @@ namespace llvm {
   /// \returns true if this \p Opcode represents an ALU instruction.
   bool isALUInstr(unsigned Opcode) const;

+  bool fitsConstReadLimitations(const std::vector<unsigned>&) const;
+
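The check in fitsConstReadLimitations() can be read in isolation: each constant selector is reduced to a key made of its index (low two bits cleared) and its half (bit 1), and at most two distinct keys may appear in one instruction group. A minimal C sketch of that rule follows. It is an illustration rather than the LLVM code: the function name is hypothetical, it takes a plain array, and it tracks the seen keys with a small array and a counter instead of the patch's "zero means unused" convention for Pair1 and Pair2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-alone version of the rule: returns true when the
 * selectors touch at most two distinct (index, half) constant pairs. */
bool fits_const_read_limitations(const unsigned *consts, size_t n)
{
    unsigned pairs[2];
    size_t used = 0;

    assert(n <= 12 && "Too many operands in instructions group");

    for (size_t i = 0; i < n; ++i) {
        unsigned half  = consts[i] & 2u;    /* which half of the vec4 */
        unsigned index = consts[i] & ~3u;   /* selector without channel bits */
        unsigned key   = index | half;
        bool seen = false;

        for (size_t k = 0; k < used; ++k) {
            if (pairs[k] == key)
                seen = true;
        }
        if (seen)
            continue;
        if (used == 2)
            return false;   /* a third distinct pair does not fit */
        pairs[used++] = key;
    }
    return true;
}
```

With this encoding, selectors 4 and 5 share one pair, while 4, 6 and 8 need three pairs and are rejected.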
Re: [Mesa-dev] [PATCH] gallium: add TGSI_SEMANTIC_TEXCOORD,PCOORD
----- Original Message -----
> Second attempt, 2 years ago no one replied or cared ...
>
> We really need to know about these on nvc0 because there are only 8
> fixed hardware locations that can be overwritten by sprite coordinates,
> and one location that represents gl_PointCoord and unconditionally
> returns sprite coordinates.
>
> So far this was solved via a hack, which works since the locations the
> state tracker picks aren't dynamic (and likely will never be, to
> facilitate ARB_separate_shader_objects), but it still isn't nice to do
> it this way.
>
> It looks like nv30 was using a hack, too, since it had a check for
> Semantic.Index == 9, which is what mesa uses for PointCoord.
>
> Implementing a safe, non-mesa-dependent way without these SEMANTICs
> would be jumping through hoops and doing expensive shader recompilations
> just because we like to destroy information at the gallium threshold,
> and that's unacceptable.
>
> I started to (try) fix up the other drivers, but maybe we just want a
> CAP for this instead, since the default solution - if this is TEXCOORD
> then treat it as GENERIC with semantic index += MAX_TEXCOORDS - doesn't
> really look much nicer either.
>
> E.g. if PIPE_CAP_RESTRICTED_SPRITE_COORDS is advertised, the state
> tracker should use the TEXCOORD and PCOORD semantics, otherwise it
> should just use GENERICs as before.

Personally I have no objection to this FWIW.

But please append the new TGSI_SEMANTIC_xxx without renumbering the existing ones.

Jose
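The "default solution" mentioned in the quoted text (treat TEXCOORD as GENERIC with the semantic index shifted by MAX_TEXCOORDS) could look roughly like the C sketch below. Everything in it is an assumption made for illustration: the enum, the struct, the value 8 for `MAX_TEXCOORDS` and the helper name are hypothetical and are not the TGSI definitions, and leaving PCOORD untouched is a guess because the message does not say how a driver would lower it.

```c
/* Hypothetical semantic names, not the TGSI_SEMANTIC_* enums. */
enum semantic { SEM_GENERIC, SEM_TEXCOORD, SEM_PCOORD };

struct varying {
    enum semantic name;
    unsigned index;
};

/* Assumed value: one slot per sprite-capable location. */
#define MAX_TEXCOORDS 8u

/* Driver-side fallback: fold TEXCOORD[i] into GENERIC[i + MAX_TEXCOORDS].
 * GENERIC and PCOORD pass through unchanged (the latter is a guess). */
struct varying lower_texcoord_to_generic(struct varying v)
{
    if (v.name == SEM_TEXCOORD) {
        v.name = SEM_GENERIC;
        v.index += MAX_TEXCOORDS;
    }
    return v;
}
```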
Re: [Mesa-dev] [PATCH] R600: Factorize code handling Const Read Port limitation
On Wed, Mar 13, 2013 at 09:12:41PM +0100, Vincent Lejeune wrote:
> ---
>  lib/Target/R600/AMDILISelDAGToDAG.cpp    | 34 ++
>  lib/Target/R600/R600InstrInfo.cpp        | 54 ++
>  lib/Target/R600/R600InstrInfo.h          |  3 ++
>  lib/Target/R600/R600MachineScheduler.cpp | 77
>  lib/Target/R600/R600MachineScheduler.h   |  3 +-
>  test/CodeGen/R600/kcache-fold-2.ll       | 52 +
>  6 files changed, 144 insertions(+), 79 deletions(-)
>  create mode 100644 test/CodeGen/R600/kcache-fold-2.ll
>
> diff --git a/lib/Target/R600/AMDILISelDAGToDAG.cpp b/lib/Target/R600/AMDILISelDAGToDAG.cpp
> index 0c7880d..05a1ea7 100644
> --- a/lib/Target/R600/AMDILISelDAGToDAG.cpp
> +++ b/lib/Target/R600/AMDILISelDAGToDAG.cpp
> @@ -336,6 +336,7 @@ SDNode *AMDGPUDAGToDAGISel::Select(SDNode *N) {
>    return Result;
>  }
>
> +

Whitespace

>  bool AMDGPUDAGToDAGISel::FoldOperands(unsigned Opcode,
>      const R600InstrInfo *TII, std::vector<SDValue> &Ops) {
>    int OperandIdx[] = {
> @@ -365,17 +366,34 @@ bool AMDGPUDAGToDAGISel::FoldOperands(unsigned Opcode,
>      SDValue Operand = Ops[OperandIdx[i] - 1];
>      switch (Operand.getOpcode()) {
>      case AMDGPUISD::CONST_ADDRESS: {
> -      if (i == 2)
> -        break;
>        SDValue CstOffset;
> -      if (!Operand.getValueType().isVector() &&
> -          SelectGlobalValueConstantOffset(Operand.getOperand(0), CstOffset)) {
> -        Ops[OperandIdx[i] - 1] = CurDAG->getRegister(AMDGPU::ALU_CONST, MVT::f32);
> -        Ops[SelIdx[i] - 1] = CstOffset;
> -        return true;
> +      if (Operand.getValueType().isVector() ||
> +          !SelectGlobalValueConstantOffset(Operand.getOperand(0), CstOffset))
> +        break;
> +
> +      // Gather others constants values
> +      std::vector<unsigned> Consts;
> +      for (unsigned j = 0; j < 3; j++) {
> +        int SrcIdx = OperandIdx[j];
> +        if (SrcIdx < 0)
> +          break;
> +        if (RegisterSDNode *Reg = dyn_cast<RegisterSDNode>(Ops[SrcIdx - 1])) {
> +          if (Reg->getReg() == AMDGPU::ALU_CONST) {
> +            ConstantSDNode *Cst = dyn_cast<ConstantSDNode>(Ops[SelIdx[j] - 1]);
> +            Consts.push_back(Cst->getZExtValue());
> +          }
> +        }
>        }
> +
> +      ConstantSDNode *Cst = dyn_cast<ConstantSDNode>(CstOffset);
> +      Consts.push_back(Cst->getZExtValue());
> +      if (!TII->fitsConstReadLimitations(Consts))
> +        break;
> +
> +      Ops[OperandIdx[i] - 1] = CurDAG->getRegister(AMDGPU::ALU_CONST, MVT::f32);
> +      Ops[SelIdx[i] - 1] = CstOffset;
> +      return true;
>      }
> -    break;
>      case ISD::FNEG:
>        if (NegIdx[i] < 0)
>          break;
> diff --git a/lib/Target/R600/R600InstrInfo.cpp b/lib/Target/R600/R600InstrInfo.cpp
> index be3318a..0865098 100644
> --- a/lib/Target/R600/R600InstrInfo.cpp
> +++ b/lib/Target/R600/R600InstrInfo.cpp
> @@ -139,6 +139,60 @@ bool R600InstrInfo::isALUInstr(unsigned Opcode) const {
>            (TargetFlags & R600_InstFlag::OP3));
>  }
>
> +bool
> +R600InstrInfo::fitsConstReadLimitations(const std::vector<unsigned> &Consts)
> +    const {
> +  assert (Consts.size() <= 12 && "Too many operands in instructions group");
> +  unsigned Pair1 = 0, Pair2 = 0;
> +  for (unsigned i = 0, n = Consts.size(); i < n; ++i) {
> +    unsigned ReadConstHalf = Consts[i] & 2;
> +    unsigned ReadConstIndex = Consts[i] & (~3);
> +    unsigned ReadHalfConst = ReadConstIndex | ReadConstHalf;
> +    if (!Pair1) {
> +      Pair1 = ReadHalfConst;
> +      continue;
> +    }
> +    if (Pair1 == ReadHalfConst)
> +      continue;
> +    if (!Pair2) {
> +      Pair2 = ReadHalfConst;
> +      continue;
> +    }
> +    if (Pair2 != ReadHalfConst)
> +      return false;
> +  }
> +  return true;
> +}
> +
> +bool
> +R600InstrInfo::canBundle(const std::vector<MachineInstr *> &MIs) const {
> +  std::vector<unsigned> Consts;
> +  for (unsigned i = 0, n = MIs.size(); i < n; i++) {
> +    const MachineInstr *MI = MIs[i];
> +
> +    const R600Operands::Ops OpTable[3][2] = {
> +      {R600Operands::SRC0, R600Operands::SRC0_SEL},
> +      {R600Operands::SRC1, R600Operands::SRC1_SEL},
> +      {R600Operands::SRC2, R600Operands::SRC2_SEL},
> +    };
> +
> +    if (!isALUInstr(MI->getOpcode()))
> +      continue;
> +
> +    for (unsigned j = 0; j < 3; j++) {
> +      int SrcIdx = getOperandIdx(MI->getOpcode(), OpTable[j][0]);
> +      if (SrcIdx < 0)
> +        break;
> +      if (MI->getOperand(SrcIdx).getReg() == AMDGPU::ALU_CONST) {
> +        unsigned Const = MI->getOperand(
> +            getOperandIdx(MI->getOpcode(), OpTable[j][1])).getImm();
> +        Consts.push_back(Const);
> +      }
> +    }
> +  }
> +  return fitsConstReadLimitations(Consts);
> +}
> +
>  DFAPacketizer *R600InstrInfo::CreateTargetScheduleState(const TargetMachine *TM,
>      const ScheduleDAG *DAG) const {
>    const InstrItineraryData *II = TM->getInstrItineraryData();
> diff --git a/lib/Target/R600/R600InstrInfo.h b/lib/Target/R600/R600InstrInfo.h
> index efe721c..bf9569e 100644
> --- a/lib/Target/R600/R600InstrInfo.h
> +++
[Mesa-dev] [PATCH 1/3] softpipe: don't assert when creating surfaces with multiple layers
From: Roland Scheidegger srol...@vmware.com

We can't handle them yet, however we can safely just warn (we will just
render to the first layer, which is fine since we can't handle the
rendertarget system value either).
Also make behavior more predictable with buffer surfaces (it would
sometimes hit bogus asserts because of the union in the surface; instead
create the surface but assert when trying to set a buffer in the
framebuffer).
---
 src/gallium/drivers/softpipe/sp_texture.c    | 30 +-
 src/gallium/drivers/softpipe/sp_tile_cache.c | 18 ++--
 2 files changed, 32 insertions(+), 16 deletions(-)

diff --git a/src/gallium/drivers/softpipe/sp_texture.c b/src/gallium/drivers/softpipe/sp_texture.c
index 0d1481a..2db0de8 100644
--- a/src/gallium/drivers/softpipe/sp_texture.c
+++ b/src/gallium/drivers/softpipe/sp_texture.c
@@ -283,10 +283,6 @@ softpipe_create_surface(struct pipe_context *pipe,
                         const struct pipe_surface *surf_tmpl)
 {
    struct pipe_surface *ps;
-   unsigned level = surf_tmpl->u.tex.level;
-
-   assert(level <= pt->last_level);
-   assert(surf_tmpl->u.tex.first_layer == surf_tmpl->u.tex.last_layer);

    ps = CALLOC_STRUCT(pipe_surface);
    if (ps) {
@@ -294,12 +290,26 @@ softpipe_create_surface(struct pipe_context *pipe,
       pipe_resource_reference(&ps->texture, pt);
       ps->context = pipe;
       ps->format = surf_tmpl->format;
-      ps->width = u_minify(pt->width0, level);
-      ps->height = u_minify(pt->height0, level);
-
-      ps->u.tex.level = level;
-      ps->u.tex.first_layer = surf_tmpl->u.tex.first_layer;
-      ps->u.tex.last_layer = surf_tmpl->u.tex.last_layer;
+      if (pt->target != PIPE_BUFFER) {
+         assert(surf_tmpl->u.tex.level <= pt->last_level);
+         ps->width = u_minify(pt->width0, surf_tmpl->u.tex.level);
+         ps->height = u_minify(pt->height0, surf_tmpl->u.tex.level);
+         ps->u.tex.level = surf_tmpl->u.tex.level;
+         ps->u.tex.first_layer = surf_tmpl->u.tex.first_layer;
+         ps->u.tex.last_layer = surf_tmpl->u.tex.last_layer;
+         if (ps->u.tex.first_layer != ps->u.tex.last_layer) {
+            debug_printf("creating surface with multiple layers, rendering to first layer only\n");
+         }
+      }
+      else {
+         /* setting width as number of elements should get us correct renderbuffer width */
+         ps->width = surf_tmpl->u.buf.last_element - surf_tmpl->u.buf.first_element + 1;
+         ps->height = pt->height0;
+         ps->u.buf.first_element = surf_tmpl->u.buf.first_element;
+         ps->u.buf.last_element = surf_tmpl->u.buf.last_element;
+         assert(ps->u.buf.first_element <= ps->u.buf.last_element);
+         assert(ps->u.buf.last_element < ps->width);
+      }
    }
    return ps;
 }
diff --git a/src/gallium/drivers/softpipe/sp_tile_cache.c b/src/gallium/drivers/softpipe/sp_tile_cache.c
index dded0e1..b6dd6af 100644
--- a/src/gallium/drivers/softpipe/sp_tile_cache.c
+++ b/src/gallium/drivers/softpipe/sp_tile_cache.c
@@ -170,12 +170,18 @@ sp_tile_cache_set_surface(struct softpipe_tile_cache *tc,
    tc->surface = ps;

    if (ps) {
-      tc->transfer_map = pipe_transfer_map(pipe, ps->texture,
-                                           ps->u.tex.level, ps->u.tex.first_layer,
-                                           PIPE_TRANSFER_READ_WRITE |
-                                           PIPE_TRANSFER_UNSYNCHRONIZED,
-                                           0, 0, ps->width, ps->height,
-                                           &tc->transfer);
+      if (ps->texture->target != PIPE_BUFFER) {
+         tc->transfer_map = pipe_transfer_map(pipe, ps->texture,
+                                              ps->u.tex.level, ps->u.tex.first_layer,
+                                              PIPE_TRANSFER_READ_WRITE |
+                                              PIPE_TRANSFER_UNSYNCHRONIZED,
+                                              0, 0, ps->width, ps->height,
+                                              &tc->transfer);
+      }
+      else {
+         /* can't render to buffers */
+         assert(0);
+      }

       tc->depth_stencil = util_format_is_depth_or_stencil(ps->format);
    }
--
1.7.9.5
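The two ways the patch sizes a surface can be shown with plain arithmetic. The C sketch below is an illustration with hypothetical helper names; `minify` is a stand-in written for this example and is only assumed to behave like the `u_minify` call in the patch (halve per mip level, never below 1).

```c
/* Stand-in for the minify step: size of one dimension at a mip level,
 * clamped so it never drops below 1.  level is assumed to be < 32. */
unsigned minify(unsigned size, unsigned level)
{
    unsigned s = size >> level;
    return s ? s : 1;
}

/* Width of a buffer surface as computed in the patch: the number of
 * elements in the inclusive range [first_element, last_element]. */
unsigned buffer_surface_width(unsigned first_element, unsigned last_element)
{
    return last_element - first_element + 1;
}
```

For a 256-texel-wide texture, level 3 gives a width of 32; a buffer surface covering elements 4 through 11 is 8 elements wide.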
[Mesa-dev] [PATCH 2/3] llvmpipe: don't assert when trying to render to surfaces with multiple layers
From: Roland Scheidegger srol...@vmware.com

instead just warn when creating the surface, rendering will simply happen
to first layer.
---
 src/gallium/drivers/llvmpipe/lp_scene.c   | 2 --
 src/gallium/drivers/llvmpipe/lp_texture.c | 3 +++
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/llvmpipe/lp_scene.c b/src/gallium/drivers/llvmpipe/lp_scene.c
index a0912eb..a888586 100644
--- a/src/gallium/drivers/llvmpipe/lp_scene.c
+++ b/src/gallium/drivers/llvmpipe/lp_scene.c
@@ -157,7 +157,6 @@ lp_scene_begin_rasterization(struct lp_scene *scene)
    for (i = 0; i < scene->fb.nr_cbufs; i++) {
       struct pipe_surface *cbuf = scene->fb.cbufs[i];
       if (llvmpipe_resource_is_texture(cbuf->texture)) {
-         assert(cbuf->u.tex.first_layer == cbuf->u.tex.last_layer);
          scene->cbufs[i].stride = llvmpipe_resource_stride(cbuf->texture,
                                                            cbuf->u.tex.level);

@@ -178,7 +177,6 @@ lp_scene_begin_rasterization(struct lp_scene *scene)

    if (fb->zsbuf) {
       struct pipe_surface *zsbuf = scene->fb.zsbuf;
-      assert(zsbuf->u.tex.first_layer == zsbuf->u.tex.last_layer);
       scene->zsbuf.stride = llvmpipe_resource_stride(zsbuf->texture, zsbuf->u.tex.level);
       scene->zsbuf.blocksize =
          util_format_get_blocksize(zsbuf->texture->format);
diff --git a/src/gallium/drivers/llvmpipe/lp_texture.c b/src/gallium/drivers/llvmpipe/lp_texture.c
index 9de05e7..99bd6d3 100644
--- a/src/gallium/drivers/llvmpipe/lp_texture.c
+++ b/src/gallium/drivers/llvmpipe/lp_texture.c
@@ -593,6 +593,9 @@ llvmpipe_create_surface(struct pipe_context *pipe,
          ps->u.tex.level = surf_tmpl->u.tex.level;
          ps->u.tex.first_layer = surf_tmpl->u.tex.first_layer;
          ps->u.tex.last_layer = surf_tmpl->u.tex.last_layer;
+         if (ps->u.tex.first_layer != ps->u.tex.last_layer) {
+            debug_printf("creating surface with multiple layers, rendering to first layer only\n");
+         }
       }
       else {
          /* setting width as number of elements should get us correct renderbuffer width */
--
1.7.9.5
[Mesa-dev] [PATCH 3/3] tgsi: fix sample_d emit for arrays
From: Roland Scheidegger srol...@vmware.com

Those cases were apparently forgotten.
---
 src/gallium/auxiliary/tgsi/tgsi_exec.c | 30 +++---
 1 file changed, 11 insertions(+), 19 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c b/src/gallium/auxiliary/tgsi/tgsi_exec.c
index 4488397..3df3ac3 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_exec.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_exec.c
@@ -2371,50 +2371,42 @@ exec_sample_d(struct tgsi_exec_machine *mach,
    /* always fetch all 3 offsets, overkill but keeps code simple */
    fetch_texel_offsets(mach, inst, offsets);

+   FETCH(&r[0], 0, TGSI_CHAN_X);
+
    switch (mach->SamplerViews[resource_unit].Resource) {
    case TGSI_TEXTURE_1D:
-      FETCH(&r[0], 0, TGSI_CHAN_X);
+   case TGSI_TEXTURE_1D_ARRAY:
+      /* only 1D array actually needs Y */
+      FETCH(&r[1], 0, TGSI_CHAN_Y);

       fetch_assign_deriv_channel(mach, inst, 3, TGSI_CHAN_X, derivs[0]);

       fetch_texel(mach->Sampler, resource_unit, sampler_unit,
-                  &r[0], &ZeroVec, &ZeroVec, &ZeroVec, &ZeroVec,   /* S, T, P, C, LOD */
+                  &r[0], &r[1], &ZeroVec, &ZeroVec, &ZeroVec,   /* S, T, P, C, LOD */
                   derivs, offsets, tgsi_sampler_derivs_explicit,
                   &r[0], &r[1], &r[2], &r[3]);           /* R, G, B, A */
       break;

    case TGSI_TEXTURE_2D:
    case TGSI_TEXTURE_RECT:
-      FETCH(&r[0], 0, TGSI_CHAN_X);
+   case TGSI_TEXTURE_2D_ARRAY:
+      /* only 2D array actually needs Z */
       FETCH(&r[1], 0, TGSI_CHAN_Y);
+      FETCH(&r[2], 0, TGSI_CHAN_Z);

       fetch_assign_deriv_channel(mach, inst, 3, TGSI_CHAN_X, derivs[0]);
       fetch_assign_deriv_channel(mach, inst, 3, TGSI_CHAN_Y, derivs[1]);

       fetch_texel(mach->Sampler, resource_unit, sampler_unit,
-                  &r[0], &r[1], &ZeroVec, &ZeroVec, &ZeroVec,   /* inputs */
+                  &r[0], &r[1], &r[2], &ZeroVec, &ZeroVec,   /* inputs */
                   derivs, offsets, tgsi_sampler_derivs_explicit,
                   &r[0], &r[1], &r[2], &r[3]);     /* outputs */
       break;

    case TGSI_TEXTURE_3D:
    case TGSI_TEXTURE_CUBE:
-      FETCH(&r[0], 0, TGSI_CHAN_X);
-      FETCH(&r[1], 0, TGSI_CHAN_Y);
-      FETCH(&r[2], 0, TGSI_CHAN_Z);
-
-      fetch_assign_deriv_channel(mach, inst, 3, TGSI_CHAN_X, derivs[0]);
-      fetch_assign_deriv_channel(mach, inst, 3, TGSI_CHAN_Y, derivs[1]);
-      fetch_assign_deriv_channel(mach, inst, 3, TGSI_CHAN_Z, derivs[2]);
-
-      fetch_texel(mach->Sampler, resource_unit, sampler_unit,
-                  &r[0], &r[1], &r[2], &ZeroVec, &ZeroVec,
-                  derivs, offsets, tgsi_sampler_derivs_explicit,
-                  &r[0], &r[1], &r[2], &r[3]);
-      break;
-
    case TGSI_TEXTURE_CUBE_ARRAY:
-      FETCH(&r[0], 0, TGSI_CHAN_X);
+      /* only cube array actually needs W */
       FETCH(&r[1], 0, TGSI_CHAN_Y);
       FETCH(&r[2], 0, TGSI_CHAN_Z);
       FETCH(&r[3], 0, TGSI_CHAN_W);
--
1.7.9.5
[Mesa-dev] [PATCH] i965: Simplify separate stencil check
The only format returned by _mesa_get_format_base_format() that satisfies
_mesa_is_depthstencil_format() is GL_DEPTH_STENCIL, so we can simplify the
check.
---
 src/mesa/drivers/dri/intel/intel_mipmap_tree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
index 1640590..a47f6d8 100644
--- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
@@ -247,7 +247,7 @@ intel_miptree_create_layout(struct intel_context *intel,
    mt->physical_depth0 = depth0;

    if (!for_region &&
-       _mesa_is_depthstencil_format(_mesa_get_format_base_format(format)) &&
+       _mesa_get_format_base_format(format) == GL_DEPTH_STENCIL &&
        (intel->must_use_separate_stencil ||
         (intel->has_separate_stencil &&
          intel->vtbl.is_hiz_depth_format(intel, format)))) {
--
1.8.1.5
[Mesa-dev] [PATCH] R600: Lower clamp constant to constant
---
 lib/Target/R600/R600ISelLowering.cpp | 23 +++
 test/CodeGen/R600/clamp-constants.ll | 20
 2 files changed, 43 insertions(+)
 create mode 100644 test/CodeGen/R600/clamp-constants.ll

diff --git a/lib/Target/R600/R600ISelLowering.cpp b/lib/Target/R600/R600ISelLowering.cpp
index a73691d..96686e6 100644
--- a/lib/Target/R600/R600ISelLowering.cpp
+++ b/lib/Target/R600/R600ISelLowering.cpp
@@ -394,6 +394,29 @@ SDValue R600TargetLowering::LowerOperation(SDValue Op, SelectionDAG &DAG) const
       return SDValue(interp, slot % 2);
     }

+    case AMDGPUIntrinsic::AMDIL_clamp: {
+      ConstantFPSDNode *Min = dyn_cast<ConstantFPSDNode>(Op.getOperand(2));
+      ConstantFPSDNode *Max = dyn_cast<ConstantFPSDNode>(Op.getOperand(3));
+      if (ConstantFPSDNode *C = dyn_cast<ConstantFPSDNode>(Op.getOperand(1))) {
+        switch (C->getValueAPF().compare(Max->getValueAPF())) {
+        case APFloat::cmpGreaterThan:
+        case APFloat::cmpEqual:
+          return Op.getOperand(3);
+        default:
+          break;
+        }
+
+        switch (C->getValueAPF().compare(Min->getValueAPF())) {
+        case APFloat::cmpLessThan:
+        case APFloat::cmpEqual:
+          return Op.getOperand(2);
+        default:
+          break;
+        }
+        return Op.getOperand(1);
+      }
+      break;
+    }
     case r600_read_ngroups_x:
       return LowerImplicitParameter(DAG, VT, DL, 0);

diff --git a/test/CodeGen/R600/clamp-constants.ll b/test/CodeGen/R600/clamp-constants.ll
new file mode 100644
index 000..cf4d35f
--- /dev/null
+++ b/test/CodeGen/R600/clamp-constants.ll
@@ -0,0 +1,20 @@
+;RUN: llc < %s -march=r600 -mcpu=redwood | FileCheck %s
+
+;CHECK-NOT: MOV
+
+define void @main() {
+main_body:
+  %0 = call float @llvm.AMDIL.clamp.(float 1.50e+00, float 0.00e+00, float 1.00e+00)
+  %1 = call float @llvm.AMDIL.clamp.(float 0.00e+00, float 0.00e+00, float 1.00e+00)
+  %2 = call float @llvm.AMDIL.clamp.(float 1.00e+00, float 0.00e+00, float 1.00e+00)
+  %3 = call float @llvm.AMDIL.clamp.(float -0.50e+00, float 0.00e+00, float 1.00e+00)
+  %4 = insertelement <4 x float> undef, float %0, i32 0
+  %5 = insertelement <4 x float> %4, float %1, i32 1
+  %6 = insertelement <4 x float> %5, float %2, i32 2
+  %7 = insertelement <4 x float> %6, float %3, i32 3
+  call void @llvm.R600.store.swizzle(<4 x float> %7, i32 0, i32 0)
+  ret void
+}
+
+declare float @llvm.AMDIL.clamp.(float, float, float) readnone
+declare void @llvm.R600.store.swizzle(<4 x float>, i32, i32)
--
1.8.1.4
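The folding rule in the lowering code above is small enough to restate on plain floats. This C sketch is an illustration, not the LLVM code: `fold_clamp` is a hypothetical name, and it applies the same order of comparisons as the patch (upper bound first, then lower bound, otherwise the value itself).

```c
/* Constant-fold clamp(x, lo, hi) the way the patch orders its checks:
 * compare against the upper bound first, then the lower bound. */
float fold_clamp(float x, float lo, float hi)
{
    if (x >= hi)
        return hi;
    if (x <= lo)
        return lo;
    return x;   /* strictly between the bounds (or unordered, e.g. NaN) */
}
```

Applied to the four calls in the test file, the inputs 1.5, 0.0, 1.0 and -0.5 with bounds [0.0, 1.0] fold to 1.0, 0.0, 1.0 and 0.0.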
Re: [Mesa-dev] [PATCH] R600: Lower clamp constant to constant
On Wed, Mar 13, 2013 at 10:26:38PM +0100, Vincent Lejeune wrote:
> ---
>  lib/Target/R600/R600ISelLowering.cpp | 23 +++
>  test/CodeGen/R600/clamp-constants.ll | 20
>  2 files changed, 43 insertions(+)
>  create mode 100644 test/CodeGen/R600/clamp-constants.ll

I like this idea, but I think a better solution would be to replace
llvm.AMDIL.clamp with LLVM IR in the frontend and add a pattern for clamp
in the backend.  This way the LLVM optimizers would handle the constant
folding for us, and we would also be able to optimize open-coded clamps.

-Tom

> diff --git a/lib/Target/R600/R600ISelLowering.cpp b/lib/Target/R600/R600ISelLowering.cpp
> index a73691d..96686e6 100644
> --- a/lib/Target/R600/R600ISelLowering.cpp
> +++ b/lib/Target/R600/R600ISelLowering.cpp
> @@ -394,6 +394,29 @@ SDValue R600TargetLowering::LowerOperation(SDValue Op, SelectionDAG &DAG) const
>        return SDValue(interp, slot % 2);
>      }
>
> +    case AMDGPUIntrinsic::AMDIL_clamp: {
> +      ConstantFPSDNode *Min = dyn_cast<ConstantFPSDNode>(Op.getOperand(2));
> +      ConstantFPSDNode *Max = dyn_cast<ConstantFPSDNode>(Op.getOperand(3));
> +      if (ConstantFPSDNode *C = dyn_cast<ConstantFPSDNode>(Op.getOperand(1))) {
> +        switch (C->getValueAPF().compare(Max->getValueAPF())) {
> +        case APFloat::cmpGreaterThan:
> +        case APFloat::cmpEqual:
> +          return Op.getOperand(3);
> +        default:
> +          break;
> +        }
> +
> +        switch (C->getValueAPF().compare(Min->getValueAPF())) {
> +        case APFloat::cmpLessThan:
> +        case APFloat::cmpEqual:
> +          return Op.getOperand(2);
> +        default:
> +          break;
> +        }
> +        return Op.getOperand(1);
> +      }
> +      break;
> +    }
>      case r600_read_ngroups_x:
>        return LowerImplicitParameter(DAG, VT, DL, 0);
>
> diff --git a/test/CodeGen/R600/clamp-constants.ll b/test/CodeGen/R600/clamp-constants.ll
> new file mode 100644
> index 000..cf4d35f
> --- /dev/null
> +++ b/test/CodeGen/R600/clamp-constants.ll
> @@ -0,0 +1,20 @@
> +;RUN: llc < %s -march=r600 -mcpu=redwood | FileCheck %s
> +
> +;CHECK-NOT: MOV
> +
> +define void @main() {
> +main_body:
> +  %0 = call float @llvm.AMDIL.clamp.(float 1.50e+00, float 0.00e+00, float 1.00e+00)
> +  %1 = call float @llvm.AMDIL.clamp.(float 0.00e+00, float 0.00e+00, float 1.00e+00)
> +  %2 = call float @llvm.AMDIL.clamp.(float 1.00e+00, float 0.00e+00, float 1.00e+00)
> +  %3 = call float @llvm.AMDIL.clamp.(float -0.50e+00, float 0.00e+00, float 1.00e+00)
> +  %4 = insertelement <4 x float> undef, float %0, i32 0
> +  %5 = insertelement <4 x float> %4, float %1, i32 1
> +  %6 = insertelement <4 x float> %5, float %2, i32 2
> +  %7 = insertelement <4 x float> %6, float %3, i32 3
> +  call void @llvm.R600.store.swizzle(<4 x float> %7, i32 0, i32 0)
> +  ret void
> +}
> +
> +declare float @llvm.AMDIL.clamp.(float, float, float) readnone
> +declare void @llvm.R600.store.swizzle(<4 x float>, i32, i32)
> --
> 1.8.1.4
Re: [Mesa-dev] [PATCH] R600: Factorize code handling Const Read Port limitation
I fixed the coding style issue. The iostream include was a leftover debug line; it shouldn't be there.

----- Original Message -----
From: Tom Stellard t...@stellard.net
To: Vincent Lejeune v...@ovi.com
Cc: llvm-comm...@cs.uiuc.edu; mesa-dev@lists.freedesktop.org
Sent: Wednesday, March 13, 2013 21:49
Subject: Re: [PATCH] R600: Factorize code handling Const Read Port limitation

On Wed, Mar 13, 2013 at 09:12:41PM +0100, Vincent Lejeune wrote:
---
 lib/Target/R600/AMDILISelDAGToDAG.cpp    | 34 ++
 lib/Target/R600/R600InstrInfo.cpp        | 54 ++
 lib/Target/R600/R600InstrInfo.h          |  3 ++
 lib/Target/R600/R600MachineScheduler.cpp | 77
 lib/Target/R600/R600MachineScheduler.h   |  3 +-
 test/CodeGen/R600/kcache-fold-2.ll       | 52 +
 6 files changed, 144 insertions(+), 79 deletions(-)
 create mode 100644 test/CodeGen/R600/kcache-fold-2.ll

diff --git a/lib/Target/R600/AMDILISelDAGToDAG.cpp b/lib/Target/R600/AMDILISelDAGToDAG.cpp
index 0c7880d..05a1ea7 100644
--- a/lib/Target/R600/AMDILISelDAGToDAG.cpp
+++ b/lib/Target/R600/AMDILISelDAGToDAG.cpp
@@ -336,6 +336,7 @@ SDNode *AMDGPUDAGToDAGISel::Select(SDNode *N) {
   return Result;
 }

+

Whitespace

 bool AMDGPUDAGToDAGISel::FoldOperands(unsigned Opcode,
     const R600InstrInfo *TII, std::vector<SDValue> &Ops) {
   int OperandIdx[] = {
@@ -365,17 +366,34 @@ bool AMDGPUDAGToDAGISel::FoldOperands(unsigned Opcode,
     SDValue Operand = Ops[OperandIdx[i] - 1];
     switch (Operand.getOpcode()) {
     case AMDGPUISD::CONST_ADDRESS: {
-      if (i == 2)
-        break;
       SDValue CstOffset;
-      if (!Operand.getValueType().isVector() &&
-          SelectGlobalValueConstantOffset(Operand.getOperand(0), CstOffset)) {
-        Ops[OperandIdx[i] - 1] = CurDAG->getRegister(AMDGPU::ALU_CONST, MVT::f32);
-        Ops[SelIdx[i] - 1] = CstOffset;
-        return true;
+      if (Operand.getValueType().isVector() ||
+          !SelectGlobalValueConstantOffset(Operand.getOperand(0), CstOffset))
+        break;
+
+      // Gather others constants values
+      std::vector<unsigned> Consts;
+      for (unsigned j = 0; j < 3; j++) {
+        int SrcIdx = OperandIdx[j];
+        if (SrcIdx < 0)
+          break;
+        if (RegisterSDNode *Reg = dyn_cast<RegisterSDNode>(Ops[SrcIdx - 1])) {
+          if (Reg->getReg() == AMDGPU::ALU_CONST) {
+            ConstantSDNode *Cst = dyn_cast<ConstantSDNode>(Ops[SelIdx[j] - 1]);
+            Consts.push_back(Cst->getZExtValue());
+          }
+        }
       }
+
+      ConstantSDNode *Cst = dyn_cast<ConstantSDNode>(CstOffset);
+      Consts.push_back(Cst->getZExtValue());
+      if (!TII->fitsConstReadLimitations(Consts))
+        break;
+
+      Ops[OperandIdx[i] - 1] = CurDAG->getRegister(AMDGPU::ALU_CONST, MVT::f32);
+      Ops[SelIdx[i] - 1] = CstOffset;
+      return true;
     }
-    break;
     case ISD::FNEG:
       if (NegIdx[i] < 0)
         break;
diff --git a/lib/Target/R600/R600InstrInfo.cpp b/lib/Target/R600/R600InstrInfo.cpp
index be3318a..0865098 100644
--- a/lib/Target/R600/R600InstrInfo.cpp
+++ b/lib/Target/R600/R600InstrInfo.cpp
@@ -139,6 +139,60 @@ bool R600InstrInfo::isALUInstr(unsigned Opcode) const {
           (TargetFlags & R600_InstFlag::OP3));
 }

+bool
+R600InstrInfo::fitsConstReadLimitations(const std::vector<unsigned> &Consts)
+    const {
+  assert (Consts.size() <= 12 && "Too many operands in instructions group");
+  unsigned Pair1 = 0, Pair2 = 0;
+  for (unsigned i = 0, n = Consts.size(); i < n; ++i) {
+    unsigned ReadConstHalf = Consts[i] & 2;
+    unsigned ReadConstIndex = Consts[i] & (~3);
+    unsigned ReadHalfConst = ReadConstIndex | ReadConstHalf;
+    if (!Pair1) {
+      Pair1 = ReadHalfConst;
+      continue;
+    }
+    if (Pair1 == ReadHalfConst)
+      continue;
+    if (!Pair2) {
+      Pair2 = ReadHalfConst;
+      continue;
+    }
+    if (Pair2 != ReadHalfConst)
+      return false;
+  }
+  return true;
+}
+
+bool
+R600InstrInfo::canBundle(const std::vector<MachineInstr *> &MIs) const {
+  std::vector<unsigned> Consts;
+  for (unsigned i = 0, n = MIs.size(); i < n; i++) {
+    const MachineInstr *MI = MIs[i];
+
+    const R600Operands::Ops OpTable[3][2] = {
+      {R600Operands::SRC0, R600Operands::SRC0_SEL},
+      {R600Operands::SRC1, R600Operands::SRC1_SEL},
+      {R600Operands::SRC2, R600Operands::SRC2_SEL},
+    };
+
+    if (!isALUInstr(MI->getOpcode()))
+      continue;
+
+    for (unsigned j = 0; j < 3; j++) {
+      int SrcIdx = getOperandIdx(MI->getOpcode(), OpTable[j][0]);
+      if (SrcIdx < 0)
+        break;
+      if (MI->getOperand(SrcIdx).getReg() == AMDGPU::ALU_CONST) {
+        unsigned Const = MI->getOperand(
+
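For readers following the thread without an LLVM tree at hand, here is a minimal Python sketch of the check that fitsConstReadLimitations performs in the patch above. It assumes, going by the variable names in the patch, that the two low bits of each selector pick a channel, that bit 1 distinguishes the two halves of a constant, and that the remaining bits are the constant index; that reading is an interpretation, not something the thread states. The function name fits_const_read_limitations is only a Python-style rendering of the C++ name.

```python
def fits_const_read_limitations(consts):
    """Return True if the selectors touch at most two (index, half) pairs.

    Mirrors the loop in the patch, including its quirk that a pair value
    of 0 is treated as "slot not yet used".
    """
    assert len(consts) <= 12, "Too many operands in instructions group"
    pair1 = 0
    pair2 = 0
    for sel in consts:
        read_const_half = sel & 2       # which half of the constant (assumed meaning)
        read_const_index = sel & ~3     # constant index with channel bits cleared
        read_half_const = read_const_index | read_const_half
        if not pair1:
            pair1 = read_half_const
            continue
        if pair1 == read_half_const:
            continue
        if not pair2:
            pair2 = read_half_const
            continue
        if pair2 != read_half_const:
            return False
    return True
```

Selectors 4, 5, 6 and 7 all address the same constant and need only two half reads, so they fit; adding selector 8 would require a third pair and the check rejects it.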
Re: [Mesa-dev] [PATCH] gallium: add TGSI_SEMANTIC_TEXCOORD,PCOORD
On Wed, Mar 13, 2013 at 3:51 PM, Christoph Bumiller e0425...@student.tuwien.ac.at wrote:

Second attempt, 2 years ago no one replied or cared ...

We really need to know about these on nvc0 because there are only 8 fixed hardware locations that can be overwritten by sprite coordinates, and one location that represents gl_PointCoord and unconditionally returns sprite coordinates.

So far this was solved via a hack, which works since the locations the state tracker picks aren't dynamic (and likely never will be, to facilitate ARB_separate_shader_objects), but it still isn't nice to do it this way. It looks like nv30 was using a hack, too, since it had a check for Semantic.Index == 9, which is what mesa uses for PointCoord.

Implementing a safe, non-mesa-dependent way without these SEMANTICs would mean jumping through hoops and doing expensive shader recompilations just because we like to destroy information at the gallium threshold, and that's unacceptable.

I started to (try to) fix up the other drivers, but maybe we just want a CAP for this instead, since the default solution - if this is TEXCOORD then treat it as GENERIC with semantic index += MAX_TEXCOORDS - doesn't really look much nicer either. E.g. if PIPE_CAP_RESTRICTED_SPRITE_COORDS is advertised, the state tracker should use the TEXCOORD and PCOORD semantics, otherwise it should just use GENERICs as before.
---
 src/gallium/auxiliary/draw/draw_pipe_wide_point.c  | 39
 src/gallium/auxiliary/tgsi/tgsi_dump.c             |  1 +
 src/gallium/auxiliary/tgsi/tgsi_strings.c          |  2 +
 src/gallium/docs/source/cso/rasterizer.rst         |  2 +-
 src/gallium/docs/source/tgsi.rst                   | 23 +-
 src/gallium/drivers/freedreno/freedreno_compiler.c |  2 +
 src/gallium/drivers/i915/i915_fpc_translate.c      |  2 +
 src/gallium/drivers/i915/i915_state_derived.c      |  4 ++
 src/gallium/drivers/llvmpipe/lp_setup_point.c      | 29 ++--
 src/gallium/drivers/nv30/nvfx_fragprog.c           | 39
 src/gallium/drivers/nv50/nv50_shader_state.c       |  8 +--
 src/gallium/drivers/nv50/nv50_surface.c            |  5 +-
 src/gallium/drivers/nvc0/nvc0_program.c            | 37 +--
 src/gallium/drivers/r300/r300_fs.c                 |  2 +
 src/gallium/drivers/r300/r300_shader_semantics.h   |  3 +-
 src/gallium/drivers/r300/r300_vs.c                 |  2 +
 src/gallium/drivers/r600/evergreen_state.c         |  7 ++-
 src/gallium/drivers/r600/r600_shader.c             |  3 +-
 src/gallium/drivers/r600/r600_state.c              |  7 ++-
 src/gallium/drivers/radeonsi/radeonsi_shader.c     |  1 +
 src/gallium/drivers/radeonsi/si_state.c            |  2 +-
 src/gallium/drivers/radeonsi/si_state_draw.c       |  5 +-
 src/gallium/include/pipe/p_shader_tokens.h         | 36 +--
 src/gallium/include/pipe/p_state.h                 |  2 +-
 src/mesa/state_tracker/st_atom_rasterizer.c        |  6 +--
 src/mesa/state_tracker/st_program.c                | 48 +--
 26 files changed, 162 insertions(+), 155 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_pipe_wide_point.c b/src/gallium/auxiliary/draw/draw_pipe_wide_point.c
index 8e0a117..d4ed0f7 100644
--- a/src/gallium/auxiliary/draw/draw_pipe_wide_point.c
+++ b/src/gallium/auxiliary/draw/draw_pipe_wide_point.c
@@ -233,28 +233,29 @@ widepoint_first_point(struct draw_stage *stage,

    wide->num_texcoord_gen = 0;

-   /* Loop over fragment shader inputs looking for generic inputs
-    * for which bit 'k' in sprite_coord_enable is set.
+   /* Loop over fragment shader inputs looking for the PCOORD input or
+    * TEXCOORD inputs for which bit 'k' in sprite_coord_enable is set.
     */
    for (i = 0; i < fs->info.num_inputs; i++) {
-      if (fs->info.input_semantic_name[i] == TGSI_SEMANTIC_GENERIC) {
-         const int generic_index = fs->info.input_semantic_index[i];
-         /* Note that sprite_coord enable is a bitfield of
-          * PIPE_MAX_SHADER_OUTPUTS bits.
-          */
-         if (generic_index < PIPE_MAX_SHADER_OUTPUTS &&
-             (rast->sprite_coord_enable & (1 << generic_index))) {
-            /* OK, this generic attribute needs to be replaced with a
-             * texcoord (see above).
-             */
-            int slot = draw_alloc_extra_vertex_attrib(draw,
-                                                      TGSI_SEMANTIC_GENERIC,
-                                                      generic_index);
-
-            /* add this slot to the texcoord-gen list */
-            wide->texcoord_gen_slot[wide->num_texcoord_gen++] = slot;
-         }
+      int slot;
+      const unsigned sn = fs->info.input_semantic_name[i];
+      const unsigned si = fs->info.input_semantic_index[i];
+
+      if (sn ==
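The quoted hunk is cut off before the new condition, so the following is only a sketch of the selection rule that the updated comment describes: pick the PCOORD input, plus any TEXCOORD input whose bit is set in sprite_coord_enable. The string constants are placeholders for the TGSI enums, and sprite_coord_inputs is a hypothetical helper, not a function from the patch.

```python
# Placeholder names standing in for the TGSI semantic enums; the real
# definitions live in p_shader_tokens.h and are not reproduced here.
TGSI_SEMANTIC_GENERIC = "GENERIC"
TGSI_SEMANTIC_TEXCOORD = "TEXCOORD"
TGSI_SEMANTIC_PCOORD = "PCOORD"


def sprite_coord_inputs(fs_inputs, sprite_coord_enable):
    """Select the fragment shader inputs that should receive sprite coordinates.

    fs_inputs is a list of (semantic_name, semantic_index) tuples and
    sprite_coord_enable is a bitfield indexed by TEXCOORD semantic index.
    """
    selected = []
    for name, index in fs_inputs:
        if name == TGSI_SEMANTIC_PCOORD:
            # Assumed from the comment: PCOORD always carries sprite coordinates.
            selected.append((name, index))
        elif name == TGSI_SEMANTIC_TEXCOORD and sprite_coord_enable & (1 << index):
            selected.append((name, index))
    return selected
```

With this rule a GENERIC input is never replaced, which is the behavioural difference from the removed code in the hunk above.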
Re: [Mesa-dev] OSMesa VTK
On 03/13/2013 12:08 PM, Kevin H. Hobbs wrote:
On 03/13/2013 10:40 AM, Brian Paul wrote:

Can you tell me what the parameters are for your OSMesaCreateContext() and OSMesaMakeCurrent() calls (in particular the image format/type)?

I do not know. I'll do some digging into the VTK source and ask on the VTK list.

You could also try setting GALLIUM_DRIVER=softpipe and see if the softpipe driver works.

The test still fails, but it fails differently: instead of a black image I get a strip of colored snow at the top of the output image... How about I just show you?

The output (boring black image) of last night's VTK LoadOpenGLExtension test, which used llvmpipe, is here:
http://open.cdash.org/testDetails.php?test=180871306&build=2844006

The output of:

$ env GALLIUM_DRIVER=softpipe \
    /home/kevin/kitware/VTK_OSMesa_Build/bin/vtkRenderingOpenGLCxxTests \
    LoadOpenGLExtension -D /home/kevin/kitware/VTKData \
    -T /home/kevin/kitware/VTK_OSMesa_Build/Testing/Temporary \
    -V Baseline/Rendering/LoadOpenGLExtension.png

is here:
http://crab-lab.zool.ohiou.edu/kevin/LoadOpenGLExtension.png

Could I get a binary of one of your test programs? Linux 64-bit?

-Brian
Re: [Mesa-dev] OSMesa VTK
Sure: http://crab-lab.zool.ohiou.edu/kevin/vtkRenderingOpenGLCxxTests
Re: [Mesa-dev] [PATCH] [libclc] configure: Enable building separate libraries for target variants
The python changes in this file look good to me. I haven't done a line-by-line review of the SI changes.

I tested this patch and v2 of the related mesa series on r600g (radeon 6850) with a recent LLVM and fresh mesa master as of this evening. No real change in the piglit CL test success/failure rate.

Do you have any interest in trying to merge your changes to date back into the upstream libclc codebase? If you think it's a good idea, but don't have time to do it yourself, let me know and I'll try to re-base the series of patches.

--Aaron

On Tue, Mar 12, 2013 at 3:20 PM, Tom Stellard t...@stellard.net wrote:
From: Tom Stellard thomas.stell...@amd.com
---
 configure.py | 119 -
 1 files changed, 75 insertions(+), 44 deletions(-)

diff --git a/configure.py b/configure.py
index d861c24..dfd9a8f 100755
--- a/configure.py
+++ b/configure.py
@@ -68,6 +68,15 @@ llvm_clang = os.path.join(llvm_bindir, 'clang')
 llvm_link = os.path.join(llvm_bindir, 'llvm-link')
 llvm_opt = os.path.join(llvm_bindir, 'opt')

+available_targets = {
+  'r600--' : { 'devices' :
+               [{'gpu' : 'cedar',   'aliases' : ['palm', 'sumo', 'sumo2', 'redwood', 'juniper']},
+                {'gpu' : 'cypress', 'aliases' : ['hemlock']},
+                {'gpu' : 'barts',   'aliases' : ['turks', 'caicos']},
+                {'gpu' : 'cayman',  'aliases' : ['aruba']},
+                {'gpu' : 'tahiti',  'aliases' : ['pitcairn', 'verde', 'oland']}]}
+}
+
 default_targets = ['r600--']

 targets = args
@@ -127,50 +136,72 @@ for target in targets:

   clang_cl_includes = ' '.join(["-I%s" % incdir for incdir in incdirs])

-  # The rule for building a .bc file for the specified architecture using clang.
-  clang_bc_flags = "-target %s -I`dirname $in` %s " \
-                   "-Dcl_clang_storage_class_specifiers " \
-                   "-Dcl_khr_fp64 " \
-                   "-emit-llvm" % (target, clang_cl_includes)
-  clang_bc_rule = "CLANG_CL_BC_" + target
-  c_compiler_rule(b, clang_bc_rule, "LLVM-CC", llvm_clang, clang_bc_flags)
-
-  objects = []
-  sources_seen = set()
-
-  for libdir in libdirs:
-    subdir_list_file = os.path.join(libdir, 'SOURCES')
-    manifest_deps.add(subdir_list_file)
-    override_list_file = os.path.join(libdir, 'OVERRIDES')
-
-    # Add target overrides
-    if os.path.exists(override_list_file):
-      for override in open(override_list_file).readlines():
-        override = override.rstrip()
-        sources_seen.add(override)
-
-    for src in open(subdir_list_file).readlines():
-      src = src.rstrip()
-      if src not in sources_seen:
-        sources_seen.add(src)
-        obj = os.path.join(target, 'lib', src + '.bc')
-        objects.append(obj)
-        src_file = os.path.join(libdir, src)
-        ext = os.path.splitext(src)[1]
-        if ext == '.ll':
-          b.build(obj, 'LLVM_AS', src_file)
-        else:
-          b.build(obj, clang_bc_rule, src_file)
-
-  builtins_link_bc = os.path.join(target, 'lib', 'builtins.link.bc')
-  builtins_opt_bc = os.path.join(target, 'lib', 'builtins.opt.bc')
-  builtins_bc = os.path.join('built_libs', target + '.bc')
-  b.build(builtins_link_bc, "LLVM_LINK", objects)
-  b.build(builtins_opt_bc, "OPT", builtins_link_bc)
-  b.build(builtins_bc, "PREPARE_BUILTINS", builtins_opt_bc, prepare_builtins)
-  install_files_bc.append((builtins_bc, builtins_bc))
-  install_deps.append(builtins_bc)
-  b.default(builtins_bc)
+  for device in available_targets[target]['devices']:
+    # The rule for building a .bc file for the specified architecture using clang.
+    clang_bc_flags = "-target %s -I`dirname $in` %s " \
+                     "-Dcl_clang_storage_class_specifiers " \
+                     "-Dcl_khr_fp64 " \
+                     "-emit-llvm" % (target, clang_cl_includes)
+    if device['gpu'] != '':
+      clang_bc_flags += ' -mcpu=' + device['gpu']
+    clang_bc_rule = "CLANG_CL_BC_" + target
+    c_compiler_rule(b, clang_bc_rule, "LLVM-CC", llvm_clang, clang_bc_flags)
+
+    objects = []
+    sources_seen = set()
+
+    if device['gpu'] == '':
+      full_target_name = target
+      obj_suffix = ''
+    else:
+      full_target_name = device['gpu'] + '-' + target
+      obj_suffix = '.' + device['gpu']
+
+    for libdir in libdirs:
+      subdir_list_file = os.path.join(libdir, 'SOURCES')
+      manifest_deps.add(subdir_list_file)
+      override_list_file = os.path.join(libdir, 'OVERRIDES')
+
+      # Add target overrides
+      if os.path.exists(override_list_file):
+        for override in open(override_list_file).readlines():
+          override = override.rstrip()
+          sources_seen.add(override)
+
+      for src in open(subdir_list_file).readlines():
+        src = src.rstrip()
+        # Only add the base filename (e.g. Add get_global_id instead of
+        # get_global_id.cl) to sources_seen.
+        #
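The naming scheme the patch introduces for per-device libraries can be tried on its own. The sketch below lifts only the full_target_name / obj_suffix logic from the hunk above into a hypothetical helper, device_names, and uses a shortened copy of the device table; nothing here is part of configure.py itself.

```python
def device_names(target, gpu):
    """Return (full_target_name, obj_suffix) the way the patch derives them.

    An empty gpu string means "no device variant", in which case the
    plain target triple is used and object files get no extra suffix.
    """
    if gpu == '':
        return target, ''
    return gpu + '-' + target, '.' + gpu


# Shortened copy of the table from the patch, for illustration only.
available_targets = {
    'r600--': {'devices': [
        {'gpu': 'cedar', 'aliases': ['palm', 'sumo', 'sumo2', 'redwood', 'juniper']},
        {'gpu': 'tahiti', 'aliases': ['pitcairn', 'verde', 'oland']},
    ]},
}

# One (full_target_name, obj_suffix) pair per device variant of each target.
variants = [
    device_names(target, device['gpu'])
    for target, info in available_targets.items()
    for device in info['devices']
]
```

For the table above, variants holds one entry for cedar and one for tahiti, each prefixed onto the 'r600--' triple.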
Re: [Mesa-dev] [PATCH] i965: Split shader_time entries into separate cachelines.
On 03/13/2013 10:45 AM, Eric Anholt wrote:

This avoids some snooping overhead between EUs processing separate shaders (so VS versus FS).

Plausible!

Improves performance of a minecraft trace with shader_time by 28.9% +/- 18.3% (n=7), and performance of my old GLSL demo by 93.7% +/- 0.8% (n=4).

+/- 18.3%...lol. Still, nice improvement. This should make the tool much nicer to use.

Reviewed-by: Kenneth Graunke kenn...@whitecape.org

In case I forgot, all shader_time patches you've sent so far get a R-b. I remember reading them, just not whether I acked them.

v2: Add a define for the stride with a comment explaining its units and why.
---
 src/mesa/drivers/dri/i965/brw_context.h | 8
 src/mesa/drivers/dri/i965/brw_fs.cpp    | 2 +-
 src/mesa/drivers/dri/i965/brw_program.c | 5 +++--
 src/mesa/drivers/dri/i965/brw_vec4.cpp  | 2 +-
 4 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index c34d6b1..d042dd6 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -571,6 +571,14 @@ struct brw_vs_prog_data {
 #define SURF_INDEX_SOL_BINDING(t)    ((t))
 #define BRW_MAX_GS_SURFACES          SURF_INDEX_SOL_BINDING(BRW_MAX_SOL_BINDINGS)

+/**
+ * Stride in bytes between shader_time entries.
+ *
+ * We separate entries by a cacheline to reduce traffic between EUs writing to
+ * different entries.
+ */
+#define SHADER_TIME_STRIDE 64
+
 enum brw_cache_id {
    BRW_BLEND_STATE,
    BRW_DEPTH_STENCIL_STATE,
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 8ce3954..8476bb5 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -621,7 +621,7 @@ fs_visitor::emit_shader_time_write(enum shader_time_shader_type type,

    fs_reg offset_mrf = fs_reg(MRF, base_mrf);
    offset_mrf.type = BRW_REGISTER_TYPE_UD;
-   emit(MOV(offset_mrf, fs_reg(shader_time_index * 4)));
+   emit(MOV(offset_mrf, fs_reg(shader_time_index * SHADER_TIME_STRIDE)));

    fs_reg time_mrf = fs_reg(MRF, base_mrf + 1);
    time_mrf.type = BRW_REGISTER_TYPE_UD;
diff --git a/src/mesa/drivers/dri/i965/brw_program.c b/src/mesa/drivers/dri/i965/brw_program.c
index 75eb6bc..62954d3 100644
--- a/src/mesa/drivers/dri/i965/brw_program.c
+++ b/src/mesa/drivers/dri/i965/brw_program.c
@@ -228,7 +228,8 @@ brw_init_shader_time(struct brw_context *brw)

    const int max_entries = 4096;
    brw->shader_time.bo = drm_intel_bo_alloc(intel->bufmgr, "shader time",
-                                            max_entries * 4, 4096);
+                                            max_entries * SHADER_TIME_STRIDE,
+                                            4096);
    brw->shader_time.programs = rzalloc_array(brw, struct gl_shader_program *,
                                              max_entries);
    brw->shader_time.types = rzalloc_array(brw, enum shader_time_shader_type,
@@ -409,7 +410,7 @@ brw_collect_shader_time(struct brw_context *brw)
    uint32_t *times = brw->shader_time.bo->virtual;

    for (int i = 0; i < brw->shader_time.num_entries; i++) {
-      brw->shader_time.cumulative[i] += times[i];
+      brw->shader_time.cumulative[i] += times[i * SHADER_TIME_STRIDE / 4];
    }

    /* Zero the BO out to clear it out for our next collection.
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index f319f32..d759710 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -1225,7 +1225,7 @@ vec4_visitor::emit_shader_time_write(enum shader_time_shader_type type,

    dst_reg offset_mrf = dst_reg(MRF, base_mrf);
    offset_mrf.type = BRW_REGISTER_TYPE_UD;
-   emit(MOV(offset_mrf, src_reg(shader_time_index * 4)));
+   emit(MOV(offset_mrf, src_reg(shader_time_index * SHADER_TIME_STRIDE)));

    dst_reg time_mrf = dst_reg(MRF, base_mrf + 1);
    time_mrf.type = BRW_REGISTER_TYPE_UD;
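The index arithmetic in this patch is easy to check in isolation. The sketch below models the buffer as a flat list of 32-bit slots and shows how the 64-byte stride changes the write offset, the read index and the allocation size. The helper names (entry_byte_offset, entry_dword_index, buffer_size, collect) are made up for illustration; only SHADER_TIME_STRIDE and the 4-byte entry size come from the patch.

```python
SHADER_TIME_STRIDE = 64   # bytes between entries, one cacheline as in the patch
DWORD_SIZE = 4            # each entry is read back through a uint32_t pointer


def entry_byte_offset(index):
    """Byte offset the shader writes to (shader_time_index * SHADER_TIME_STRIDE)."""
    return index * SHADER_TIME_STRIDE


def entry_dword_index(index):
    """Index into the uint32_t view of the buffer (i * SHADER_TIME_STRIDE / 4)."""
    return index * SHADER_TIME_STRIDE // DWORD_SIZE


def buffer_size(max_entries):
    """Allocation size in bytes (max_entries * SHADER_TIME_STRIDE)."""
    return max_entries * SHADER_TIME_STRIDE


def collect(times, cumulative):
    """Accumulate every entry from the strided buffer, as the collect loop does."""
    for i in range(len(cumulative)):
        cumulative[i] += times[entry_dword_index(i)]
    return cumulative
```

With 4096 entries the buffer grows from 16 KiB to 256 KiB, which is the price paid for keeping each entry in its own cacheline.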
[Mesa-dev] [Bug 62319] New: libtxc: missing m4 folder preventing build
https://bugs.freedesktop.org/show_bug.cgi?id=62319

Priority: medium
Bug ID: 62319
Assignee: mesa-dev@lists.freedesktop.org
Summary: libtxc: missing m4 folder preventing build
Severity: blocker
Classification: Unclassified
OS: All
Reporter: alexandre.f.dem...@gmail.com
Hardware: All
Status: NEW
Version: git
Component: Other
Product: Mesa

When building from libtxc's latest git repository, I receive the following error:

couldn't open directory 'm4': No such file or directory

Everything was working fine until the switch to autotools. Manually creating the m4 directory and then launching ./autogen.sh works fine. If the m4 directory is missing, it fails. Is it possible we are missing an m4 directory in libtxc's git tree?

--
You are receiving this mail because:
You are the assignee for the bug.
[Mesa-dev] [Bug 62319] libtxc: missing m4 folder preventing build
https://bugs.freedesktop.org/show_bug.cgi?id=62319

--- Comment #1 from Alexandre Demers alexandre.f.dem...@gmail.com ---
Also, I think there should be no m4 in the .gitignore file (I'm supposing this from mesa's .gitignore file).
[Mesa-dev] [Bug 62319] libtxc: missing m4 folder preventing build
https://bugs.freedesktop.org/show_bug.cgi?id=62319

Alexandre Demers alexandre.f.dem...@gmail.com changed:

           What    |Removed                     |Added
             Status|NEW                         |ASSIGNED
           Assignee|mesa-dev@lists.freedesktop. |alexandre.f.dem...@gmail.co
                   |org                         |m

--- Comment #2 from Alexandre Demers alexandre.f.dem...@gmail.com ---
Created attachment 76508
  --> https://bugs.freedesktop.org/attachment.cgi?id=76508&action=edit
Fixes m4 error

removed m4 from .gitignore
added m4 folder
added m4/.gitignore