[Mesa-dev] [PATCH] llvmpipe: fix pipeline statistics with a null ps
If the fragment shader is null then pixel shader invocations have to
be equal to zero. And if we're running a null ps then clipper
invocations and primitives should be equal to zero but only if both
stencil and depth testing are disabled.

Signed-off-by: Zack Rusin za...@vmware.com
---
 src/gallium/drivers/llvmpipe/lp_query.c | 30 ++
 1 file changed, 26 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/llvmpipe/lp_query.c b/src/gallium/drivers/llvmpipe/lp_query.c
index cea2d07..fb24c36 100644
--- a/src/gallium/drivers/llvmpipe/lp_query.c
+++ b/src/gallium/drivers/llvmpipe/lp_query.c
@@ -32,6 +32,7 @@
 #include "draw/draw_context.h"
 #include "pipe/p_defines.h"
+#include "tgsi/tgsi_scan.h"
 #include "util/u_memory.h"
 #include "os/os_time.h"
 #include "lp_context.h"
@@ -95,6 +96,7 @@ llvmpipe_get_query_result(struct pipe_context *pipe,
                           union pipe_query_result *vresult)
 {
    struct llvmpipe_screen *screen = llvmpipe_screen(pipe->screen);
+   struct llvmpipe_context *llvmpipe = llvmpipe_context(pipe);
    unsigned num_threads = MAX2(1, screen->num_threads);
    struct llvmpipe_query *pq = llvmpipe_query(q);
    uint64_t *result = (uint64_t *)vresult;
@@ -166,11 +168,31 @@ llvmpipe_get_query_result(struct pipe_context *pipe,
    case PIPE_QUERY_PIPELINE_STATISTICS: {
       struct pipe_query_data_pipeline_statistics *stats =
          (struct pipe_query_data_pipeline_statistics *)vresult;
-      /* only ps_invocations come from binned query */
-      for (i = 0; i < num_threads; i++) {
-         pq->stats.ps_invocations += pq->end[i];
+      /* If we're running on what's considered a null fragment
+       * shader, i.e. fragment shader consisting of a single
+       * END opcode or if the fragment shader is null then
+       * the number of ps_invocations should be zero */
+      if (llvmpipe->fs && llvmpipe->fs->info.base.num_tokens > 1) {
+         /* only ps_invocations come from binned query */
+         for (i = 0; i < num_threads; i++) {
+            pq->stats.ps_invocations += pq->end[i];
+         }
+         pq->stats.ps_invocations *=
+            LP_RASTER_BLOCK_SIZE * LP_RASTER_BLOCK_SIZE;
+      } else {
+         /*
+          * Clipper primitives and invocations are equal to zero
+          * if we're running a null fragment shader but only
+          * if both stencil and depth testing are disabled.
+          */
+         if (!llvmpipe->depth_stencil->depth.enabled &&
+             !llvmpipe->depth_stencil->stencil[0].enabled &&
+             !llvmpipe->depth_stencil->stencil[1].enabled) {
+            pq->stats.c_primitives = 0;
+            pq->stats.c_invocations = 0;
+         }
+         pq->stats.ps_invocations = 0;
       }
-      pq->stats.ps_invocations *= LP_RASTER_BLOCK_SIZE * LP_RASTER_BLOCK_SIZE;
       *stats = pq->stats;
    }
       break;
-- 
1.7.10.4
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
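The decision this patch makes can be looked at in isolation. What follows is only a minimal sketch under stated assumptions, not the llvmpipe code: every `sketch_*` name and the simplified state struct are hypothetical stand-ins, the block size of 4 is an assumed value for LP_RASTER_BLOCK_SIZE, and the shader test is read as "a shader is bound and it has more than one token".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed value standing in for LP_RASTER_BLOCK_SIZE. */
#define SKETCH_BLOCK_SIZE 4

/* Hypothetical subset of the pipeline statistics touched by the patch. */
struct sketch_stats {
   uint64_t ps_invocations;
   uint64_t c_primitives;
   uint64_t c_invocations;
};

/* Hypothetical flattened view of the context state the patch inspects. */
struct sketch_state {
   bool have_fs;               /* a fragment shader is bound */
   unsigned fs_num_tokens;     /* token count of the bound shader */
   bool depth_enabled;
   bool stencil_front_enabled;
   bool stencil_back_enabled;
};

static void
sketch_fixup_stats(const struct sketch_state *st,
                   const uint64_t *per_thread_blocks,
                   unsigned num_threads,
                   struct sketch_stats *stats)
{
   unsigned i;

   assert(num_threads >= 1);

   if (st->have_fs && st->fs_num_tokens > 1) {
      /* Real shader: sum the per-thread block counts and scale to pixels. */
      for (i = 0; i < num_threads; i++)
         stats->ps_invocations += per_thread_blocks[i];
      stats->ps_invocations *= SKETCH_BLOCK_SIZE * SKETCH_BLOCK_SIZE;
   } else {
      /* Null shader: clipper counters are zeroed only when neither
       * depth nor stencil testing is enabled. */
      if (!st->depth_enabled &&
          !st->stencil_front_enabled &&
          !st->stencil_back_enabled) {
         stats->c_primitives = 0;
         stats->c_invocations = 0;
      }
      stats->ps_invocations = 0;
   }
}
```

In this sketch a null shader with depth testing enabled still reports its clipper counters, which is the case the commit message singles out.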
Re: [Mesa-dev] Patches: R600: Merge R600 and SI vector op expansions
On Mon, 2013-08-12 at 15:25 -0700, Tom Stellard wrote:
> The attached patches expand a few more vector operations and also move
> the expansion code into AMDGPUISelLowering.cpp so it can be shared
> between R600 and SI.

This series is

Reviewed-by: Michel Dänzer michel.daen...@amd.com

-- 
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Debian, X and DRI developer
Re: [Mesa-dev] OpenGL ES only configuration (without desktop OpenGL support)
On Mon, Aug 12, 2013 at 10:09 PM, Chad Versace chad.vers...@linux.intel.com wrote:
> On 08/06/2013 09:44 PM, Siarhei Siamashka wrote:
>> On Tue, 6 Aug 2013 15:54:57 -0700 Matt Turner matts...@gmail.com wrote:
>>> On Tue, Aug 6, 2013 at 2:13 PM, Siarhei Siamashka siarhei.siamas...@gmail.com wrote:
>>>> But if upstream Mesa treats this configuration as unsupported, then
>>>> I also don't see it progressing anywhere in Gentoo. So could you
>>>> please re-consider this decision?
>>>
>>> As far as I'm aware, ES without Desktop GL is disallowed only because
>>> it was discovered to be broken which is because no one working on
>>> Mesa appears to test it.
>>
>> I have not done any really serious testing. I'm just playing around
>> [...]
>>
>>> If you can test it (and provide patches when you notice that it's
>>> broken) I don't have a problem with allowing ES-only builds.
>
> I agree. If you can fix Mesa to support ES-only builds and do *serious*
> testing with Piglit and some real ES applications to prove that it
> works, then I'm not opposed to supporting that configuration.

If you're willing to give a go, I'm willing to spend some time on it.

-- 
Regards,
Tom

"Where's the kaboom!? There was supposed to be an earth-shattering kaboom!" Marvin Martian
Tech Lead, Graphics Working Group | Linaro.org │ Open source software for ARM SoCs
w) tom.gall att linaro.org
h) tom_gall att mac.com
Re: [Mesa-dev] Patches: R600/SI: Rework resource descriptor types and fix piglit compute hangs
On Mon, 2013-08-12 at 12:53 -0700, Tom Stellard wrote:
> Hi,
>
> The attached patches make the v1i32 type which was used for sample
> coordinates and the v16i8 type which was used for resource descriptors
> illegal. There is a new pass which will convert v1i32 to i32 and v16i8
> to i128 for all non-compute shaders.
>
> Since v16i8 is a legal type in OpenCL, using this type for resource
> descriptors and making it legal was over-complicating the type
> legalizer and causing some piglit tests to hang. The v1i32 type is
> identical to i32 on R600 and there is really no benefit in having it be
> a legal type.

I currently get 231 piglit failures from quick-driver.tests on SI, so it
would be quite an achievement for patch 3 to fix 364 piglit tests. ;)
Please fix / clarify that paragraph of the commit log.

In patch 5, it would be better to reference the URL of the bug report
itself instead of a bug attachment. And you can verify using llc and the
IR in the attachment that the bug is fixed.

-- 
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Debian, X and DRI developer
Re: [Mesa-dev] [PATCH] draw: make sure that the stages setup outputs
On 09.08.2013 16:13, Zack Rusin wrote:
> Calling the prepare outputs cleans up the slot assignments
> for outputs, unfortunately aapoint and aaline didn't have
> code to reset their slots after the initial setup, this
> was messing up our slot assignments. The unfilled stage
> was just missing the initial assignment of the face slot.
> This fixes all of the reported piglit failures.
>
> Signed-off-by: Zack Rusin za...@vmware.com
> ---
>  src/gallium/auxiliary/draw/draw_context.c       |  2 +
>  src/gallium/auxiliary/draw/draw_pipe.h          |  5 +-
>  src/gallium/auxiliary/draw/draw_pipe_aaline.c   | 27 ---
>  src/gallium/auxiliary/draw/draw_pipe_aapoint.c  | 56 ++-
>  src/gallium/auxiliary/draw/draw_pipe_unfilled.c |  2 +
>  5 files changed, 62 insertions(+), 30 deletions(-)
>
> diff --git a/src/gallium/auxiliary/draw/draw_context.c b/src/gallium/auxiliary/draw/draw_context.c
> index 2d4843e..d1fac0c 100644
> --- a/src/gallium/auxiliary/draw/draw_context.c
> +++ b/src/gallium/auxiliary/draw/draw_context.c
> @@ -564,6 +564,8 @@ draw_prepare_shader_outputs(struct draw_context *draw)
>     draw_remove_extra_vertex_attribs(draw);
>     draw_prim_assembler_prepare_outputs(draw->ia);
>     draw_unfilled_prepare_outputs(draw, draw->pipeline.unfilled);
> +   draw_aapoint_prepare_outputs(draw, draw->pipeline.aapoint);
> +   draw_aaline_prepare_outputs(draw, draw->pipeline.aaline);
>  }
>
>  /**
> diff --git a/src/gallium/auxiliary/draw/draw_pipe.h b/src/gallium/auxiliary/draw/draw_pipe.h
> index 7c9ed6c..ad3165f 100644
> --- a/src/gallium/auxiliary/draw/draw_pipe.h
> +++ b/src/gallium/auxiliary/draw/draw_pipe.h
> @@ -101,7 +101,10 @@ void draw_pipe_passthrough_tri(struct draw_stage *stage, struct prim_header *hea
>  void draw_pipe_passthrough_line(struct draw_stage *stage, struct prim_header *header);
>  void draw_pipe_passthrough_point(struct draw_stage *stage, struct prim_header *header);
>
> -
> +void draw_aapoint_prepare_outputs(struct draw_context *context,
> +                                  struct draw_stage *stage);
> +void draw_aaline_prepare_outputs(struct draw_context *context,
> +                                 struct draw_stage *stage);
>  void draw_unfilled_prepare_outputs(struct draw_context *context,
>                                     struct draw_stage *stage);
>
> diff --git a/src/gallium/auxiliary/draw/draw_pipe_aaline.c b/src/gallium/auxiliary/draw/draw_pipe_aaline.c
> index aa88459..c44c236 100644
> --- a/src/gallium/auxiliary/draw/draw_pipe_aaline.c
> +++ b/src/gallium/auxiliary/draw/draw_pipe_aaline.c
> @@ -692,13 +692,7 @@ aaline_first_line(struct draw_stage *stage, struct prim_header *header)
>        return;
>     }
>
> -   /* update vertex attrib info */
> -   aaline->pos_slot = draw_current_shader_position_output(draw);;
> -
> -   /* allocate the extra post-transformed vertex attribute */
> -   aaline->tex_slot = draw_alloc_extra_vertex_attrib(draw,
> -                                                      TGSI_SEMANTIC_GENERIC,
> -                                                      aaline->fs->generic_attrib);
> +   draw_aaline_prepare_outputs(draw, draw->pipeline.aaline);
>
>     /* how many samplers? */
>     /* we'll use sampler/texture[pstip->sampler_unit] for the stipple */
> @@ -953,6 +947,25 @@ aaline_set_sampler_views(struct pipe_context *pipe,
>  }
>
>
> +void
> +draw_aaline_prepare_outputs(struct draw_context *draw,
> +                            struct draw_stage *stage)
> +{
> +   struct aaline_stage *aaline = aaline_stage(stage);
> +   const struct pipe_rasterizer_state *rast = draw->rasterizer;
> +
> +   /* update vertex attrib info */
> +   aaline->pos_slot = draw_current_shader_position_output(draw);;
> +
> +   if (!rast->line_smooth)
> +      return;
> +
> +   /* allocate the extra post-transformed vertex attribute */
> +   aaline->tex_slot = draw_alloc_extra_vertex_attrib(draw,
> +                                                      TGSI_SEMANTIC_GENERIC,
> +                                                      aaline->fs->generic_attrib);
> +}
> +
>  /**
>   * Called by drivers that want to install this AA line prim stage
>   * into the draw module's pipeline. This will not be used if the
> diff --git a/src/gallium/auxiliary/draw/draw_pipe_aapoint.c b/src/gallium/auxiliary/draw/draw_pipe_aapoint.c
> index 0d7b88e..7ae1ddd 100644
> --- a/src/gallium/auxiliary/draw/draw_pipe_aapoint.c
> +++ b/src/gallium/auxiliary/draw/draw_pipe_aapoint.c
> @@ -696,28 +696,7 @@ aapoint_first_point(struct draw_stage *stage, struct prim_header *header)
>      */
>     bind_aapoint_fragment_shader(aapoint);
>
> -   /* update vertex attrib info */
> -   aapoint->pos_slot = draw_current_shader_position_output(draw);
> -
> -   /* allocate the extra post-transformed vertex attribute */
> -   aapoint->tex_slot = draw_alloc_extra_vertex_attrib(draw,
> -                                                       TGSI_SEMANTIC_GENERIC,
> -                                                       aapoint->fs->generic_attrib);
> -
Re: [Mesa-dev] [PATCH] tgsi_build: fix order of arguments for ind register build
On 08/12/2013 06:14 PM, Dave Airlie wrote:
> From: Dave Airlie airl...@redhat.com
>
> This was broken when arrayid was added.
>
> Signed-off-by: Dave Airlie airl...@redhat.com
> ---
>  src/gallium/auxiliary/tgsi/tgsi_build.c |  2 +-
>  src/gallium/renderer/virgl_hw.h         | 39 +
>  2 files changed, 40 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/auxiliary/tgsi/tgsi_build.c b/src/gallium/auxiliary/tgsi/tgsi_build.c
> index 626faad..9c00cb6 100644
> --- a/src/gallium/auxiliary/tgsi/tgsi_build.c
> +++ b/src/gallium/auxiliary/tgsi/tgsi_build.c
> @@ -875,8 +875,8 @@ static struct tgsi_ind_register
>  tgsi_build_ind_register(
>     unsigned file,
>     unsigned swizzle,
> -   unsigned arrayid,
>     int index,
> +   unsigned arrayid,
>     struct tgsi_instruction *instruction,
>     struct tgsi_header *header )
>  {

For that part, Reviewed-by: Brian Paul bri...@vmware.com

The rest of this patch below is unrelated afaict.

-Brian

> diff --git a/src/gallium/renderer/virgl_hw.h b/src/gallium/renderer/virgl_hw.h
> index 2a8be61..71989cc 100644
> --- a/src/gallium/renderer/virgl_hw.h
> +++ b/src/gallium/renderer/virgl_hw.h
> @@ -276,4 +276,43 @@ enum virgl_formats {
>      VIRGL_FORMAT_MAX,
>  };
>
> +struct virgl_caps_bool_set1 {
> +    unsigned indep_blend_enable:1;
> +    unsigned indep_blend_func:1;
> +    unsigned cube_map_array:1;
> +    unsigned shader_stencil_export:1;
> +    unsigned conditional_render:1;
> +    unsigned start_instance:1;
> +    unsigned primitive_restart:1;
> +    unsigned blend_eq_sep:1;
> +    unsigned instanceid:1;
> +    unsigned vertex_element_instance_divisor:1;
> +    unsigned seamless_cube_map:1;
> +    unsigned occlusion_query:1;
> +    unsigned timer_query:1;
> +    unsigned streamout_pause_resume:1;
> +};
> +
> +/* endless expansion capabilities - current gallium has 252 formats */
> +struct virgl_supported_format_mask {
> +    uint32_t bitmask[16];
> +};
> +/* capabilities set 2 - version 1 - 32-bit and float values */
> +struct virgl_caps_v1 {
> +    struct virgl_caps_bool_set1 bset;
> +    uint32_t glsl_level;
> +    uint32_t max_texture_array_layers;
> +    uint32_t max_streamout_buffers;
> +    uint32_t max_dual_source_render_targets;
> +    uint32_t max_render_targets;
> +    struct virgl_supported_format_mask sampler;
> +    struct virgl_supported_format_mask fb;
> +    struct virgl_supported_format_mask depthstencil;
> +    struct virgl_supported_format_mask vertexbuffer;
> +};
> +
> +union virgl_caps {
> +    uint32_t max_version;
> +    struct virgl_caps_v1 v1;
> +};
>  #endif
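Why a swapped pair of parameters like this goes unnoticed can be shown with a small sketch. It is not the TGSI code: the `sketch_*` names and the trimmed register struct are hypothetical, the instruction and header parameters are left out, and it assumes call sites pass index before arrayid, which the patch implies but does not show. Because `int` and `unsigned` convert implicitly in C, both parameter orders typically compile without a diagnostic and the values simply land in the wrong fields.

```c
#include <assert.h>

/* Hypothetical, trimmed-down stand-in for the indirect register. */
struct sketch_ind_register {
   unsigned file;
   unsigned swizzle;
   int index;
   unsigned arrayid;
};

/* Parameter order before the fix: arrayid precedes index. */
static struct sketch_ind_register
sketch_build_old(unsigned file, unsigned swizzle, unsigned arrayid, int index)
{
   struct sketch_ind_register reg = { file, swizzle, index, arrayid };
   return reg;
}

/* Parameter order after the fix: index precedes arrayid. */
static struct sketch_ind_register
sketch_build_fixed(unsigned file, unsigned swizzle, int index, unsigned arrayid)
{
   struct sketch_ind_register reg = { file, swizzle, index, arrayid };
   return reg;
}

/* A call site written for the (index, arrayid) order. */
static void
sketch_check_call_sites(void)
{
   struct sketch_ind_register good = sketch_build_fixed(1, 0, 3, 7);
   struct sketch_ind_register bad = sketch_build_old(1, 0, 3, 7);

   /* The fixed order stores what the caller meant. */
   assert(good.index == 3 && good.arrayid == 7);
   /* The old order silently swaps the two values. */
   assert(bad.index == 7 && bad.arrayid == 3);
}
```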
Re: [Mesa-dev] segfault in pstip_bind_sampler_states
On 08/12/2013 11:30 AM, Kevin H. Hobbs wrote:
> On 08/12/2013 10:29 AM, Brian Paul wrote:
>> Can you run with valgrind? That should give us some useful info if
>> there's a use-after-free.
>
> Sure,
>
> $ valgrind /home/kevin/kitware/VTK_OSMesa_Build/bin/vtkpython --enable-bt /home/kevin/kitware/VTK_OSMesa_Build/Utilities/vtkTclTest2Py/rtImageTest.py /home/kevin/kitware/VTK/Filters/Hybrid/Testing/Python/largeImageOffset.py -D /home/kevin/kitware/VTK_OSMesa_Build/ExternalData/Testing -T /home/kevin/kitware/VTK_OSMesa_Build/Testing/Temporary -V /home/kevin/kitware/VTK_OSMesa_Build/ExternalData/Filters/Hybrid/Testing/Data/Baseline/largeImageOffset.png -A /home/kevin/kitware/VTK_OSMesa_Build/Utilities/vtkTclTest2Py > /tmp/osmesa_valgrind.txt 2>&1
>
> [...]
> ==30166==
> --30166-- VALGRIND INTERNAL ERROR: Valgrind received a signal 11 (SIGSEGV) - exiting
> --30166-- si_code=80; Faulting address: 0x0; sp: 0x4030fdd50
>
> valgrind: the 'impossible' happened:
>    Killed by fatal signal

Well, that's not too helpful.

Can you send me an executable? Or, is it simple to build the test case?

-Brian
Re: [Mesa-dev] Patches: R600: Improve load / store support for 8-bit and 16-bit types
I've finished running comparison tests on Cayman/Pitcairn.

The descriptors series and 8/16-bit load/store support series both look
like they're in good condition for the cards I was able to test: Cedar
(5400), A6-3500 Llano and Pitcairn (7850). No regressions spotted, just
improvements.

And as you said, the descriptors series fixed compute hangs for the 7850
on quite a few kernels which did comparison operations (max/clamp
kernels mostly, maybe some min).

You can definitely get a tested-by for both the descriptors series and
this:

Tested-by: Aaron Watry awa...@gmail.com

Quite a few of the tablegen changes are still a bit above my head, so I
don't feel qualified to give a comprehensive review on that.

--Aaron

On Mon, Aug 12, 2013 at 6:00 PM, Aaron Watry awa...@gmail.com wrote:
> It'll take me a while to attempt to parse everything that's going on in
> these patches (and your resource descriptor types series that this
> depends on), but I have sent it all through a piglit run on Evergreen
> (Cedar). Everything was latest Mesa/LLVM/libclc upstream code as of
> today.
>
> Baseline: 567/855 tests passed
> Descriptors Series: 575/855 tests passed (main differences here were
> with some int3 load/store issues which were just exposed recently and
> fixed by this series)
> Descriptors + char/short load/store series: 880/1119 tests passed (most
> of the additional tests and passes were char/short tests that no longer
> crash out).
>
> Specifically, I've double-checked the char/short/uchar/ushort built-in
> functions, as well as the char/short arithmetic tests, and things are
> looking good so far.
>
> I'll try to test on Cayman/SI later.
>
> --Aaron
>
> On Mon, Aug 12, 2013 at 2:56 PM, Tom Stellard t...@stellard.net wrote:
>> Hi,
>>
>> The attached patches improve support for i8 and i16 loads and stores
>> for Evergreen and newer GPUs. This means that byte-addressable stores
>> are now supported.
>>
>> Please review/test.
>>
>> -Tom
Re: [Mesa-dev] OpenGL ES only configuration (without desktop OpenGL support)
On Thu, 08 Aug 2013 16:19:28 -0700 Ian Romanick i...@freedesktop.org wrote:
> On 08/06/2013 02:13 PM, Siarhei Siamashka wrote:
>> Hello,
>>
>> Some months ago, the commit "configure.ac: Allow OpenGL ES1 and ES2
>> only with enabled OpenGL" dropped support for the OpenGL-free
>> configuration.
>> http://lists.freedesktop.org/archives/mesa-dev/2013-February/033909.html
>> http://lists.freedesktop.org/archives/mesa-commit/2013-February/041708.html
>>
>> Could this be possibly reverted to allow me to continue shooting
>> myself in the foot?
>>
>> The support for OpenGL ES is pretty horrible in the open source
>> software. One nice exception is Qt5 which is doing pretty well. But
>> the rest of the software does not generally work out of the box
>> without patches or tweaks. You can also hardly find a problem-free
>> OpenGL ES compatible open source game (other than Quake3).
>>
>> I have an open feature request for Gentoo, which is a very
>> configurable Linux distribution and should not have any troubles
>> working either with or without OpenGL (the choice is up to the user):
>> https://bugs.gentoo.org/show_bug.cgi?id=476524
>>
>> But if upstream Mesa treats this configuration as unsupported, then I
>> also don't see it progressing anywhere in Gentoo. So could you please
>> re-consider this decision?
>
> We've removed all of the #ifdef code inside Mesa that would have made
> any difference. It was a nightmare to maintain, and we almost always
> got it wrong... because nobody was testing that configuration.

I believe this can be changed :-)

That's a bit of a chicken/egg problem. The OpenGL ES support in free
software applications and libraries is so broken that it's currently a
big pain to try this configuration for anything practical. And the
applications/libraries can't be fixed without having a non-OpenGL
environment for development and testing.

The needed tweaks for Mesa are really trivial. Maybe one could also just
compile everything, but delete GL headers, gl.pc and libGL.so after
compilation and before installing Mesa to the system. Still, it is a bit
ugly to have the configure script claim that OpenGL ES is not supported
without OpenGL, while in fact it works.

> The only thing this is possibly going to gain you is a trivial amount
> of build time (by not building libGL, etc.).

The compilation time is irrelevant. But it is very useful to be able to
install Mesa without OpenGL headers and without libGL.so, so that the
problematic software just fails at compile time instead of exhibiting
hard to debug problems at runtime. It seems to be a rather common
failure scenario when some big bloatware application loads both libGL.so
(provided by Mesa) and libGLESv2.so (provided by some proprietary OpenGL
ES driver on ARM hardware) into the same process via indirect library
dependencies. These shared libraries are providing overlapping function
names, but are backed by totally different implementations. And
everything blows up as a result when the application is run, or maybe it
even mostly works if you are lucky.

What's the point of installing both Mesa and the proprietary OpenGL ES
drivers on the same system? I would surely love to have open source
hardware accelerated OpenGL ES drivers on ARM systems today. But they
are not quite here yet. And even assuming that we get perfectly
functional free software OpenGL ES drivers for embedded hardware, the
current buggy applications are not going to be magically fixed by
themselves. Somebody still needs to debug and fix the OpenGL ES
compatibility problems.

The easiest way forward seems to be just allowing Mesa to be compiled
without desktop OpenGL. It is going to provide:

1. On x86 desktop systems - the development environment for testing
   OpenGL ES applications.
2. On ARM hardware via softpipe/llvmpipe - some reference fallback
   implementation.
3. Have both the existing proprietary drivers and Mesa installed on ARM
   hardware (with the ability to switch between them at any time) - the
   applications can run at full speed and be profiled/benchmarked.

Somebody may argue that I'm exaggerating and OpenGL ES support seems to
be not so bad. There has been a lot of OpenGL ES related news and many
announcements. There is also the Linaro/Ubuntu distribution, and some
videos on youtube showing how it successfully runs something in 3D on
ARM. Still, the problem is that in many applications the said OpenGL ES
support is either in a work-in-progress state, or it was contributed by
somebody some time ago and has already bitrotten. Also Linaro bundles a
bunch of OpenGL ES hacks, which don't seem to be actively pushed
upstream. This all is less than perfect and needs to be improved. That
is, unless we are happy with having OpenGL ES just exclusively for
running Qt5 and a few compliant compositing window managers.

-- 
Best regards,
Siarhei Siamashka
Re: [Mesa-dev] OpenGL ES only configuration (without desktop OpenGL support)
On Mon, 12 Aug 2013 14:09:39 -0700 Chad Versace chad.vers...@linux.intel.com wrote:
> On 08/06/2013 09:44 PM, Siarhei Siamashka wrote:
>> On Tue, 6 Aug 2013 15:54:57 -0700 Matt Turner matts...@gmail.com wrote:
>>> On Tue, Aug 6, 2013 at 2:13 PM, Siarhei Siamashka siarhei.siamas...@gmail.com wrote:
>>>> But if upstream Mesa treats this configuration as unsupported, then
>>>> I also don't see it progressing anywhere in Gentoo. So could you
>>>> please re-consider this decision?
>>>
>>> As far as I'm aware, ES without Desktop GL is disallowed only because
>>> it was discovered to be broken which is because no one working on
>>> Mesa appears to test it.
>>
>> I have not done any really serious testing. I'm just playing around
>> [...]
>>
>>> If you can test it (and provide patches when you notice that it's
>>> broken) I don't have a problem with allowing ES-only builds.
>
> I agree. If you can fix Mesa to support ES-only builds and do *serious*
> testing with Piglit and some real ES applications to prove that it
> works, then I'm not opposed to supporting that configuration.

That's a really good point about Piglit.

Also, if you wonder what can be run with OpenGL ES only (and without
OpenGL), here is an initial list: Qt5 works, KWin works, glmark2 works,
OGRE works to some extent (simple demos run, but not full-fledged
games). WebGL in Firefox also works, but still has some issues:
https://bugzilla.mozilla.org/show_bug.cgi?id=788319

But anything that wants GLU as a dependency simply does not build. Some
applications desperately want to link with -lGL for no reason. If you
want an example of such a problematic package, a good one is mesa demos
- http://cgit.freedesktop.org/mesa/demos

While it is supposed to provide the es2gears test program, it does not
build out of the box:

checking for GL... no
checking GL/gl.h usability... no
checking GL/gl.h presence... no
checking for GL/gl.h... no
configure: error: GL not found

And so on. There is definitely some work to do on the applications
front.

PS. I would be grateful if somebody could suggest a highly dynamic and
enjoyable OpenGL ES compatible open source 3D game (not Quake3!). Tux
Rider World Challenge seems to be promising (as a GLESv1 testcase), but
needs some porting back to Linux and X11 EGL:
http://www.barlow-server.com/tuxriderworldchallenge

-- 
Best regards,
Siarhei Siamashka
Re: [Mesa-dev] segfault in pstip_bind_sampler_states
On 08/13/2013 09:50 AM, Brian Paul wrote:
> On 08/12/2013 11:30 AM, Kevin H. Hobbs wrote:
>> --30166-- VALGRIND INTERNAL ERROR: Valgrind received a signal 11 (SIGSEGV) - exiting
>
> Well, that's not too helpful.

I think it may have been helpful. Before Valgrind crashed it mentions
that osmesa_st_framebuffer_flush_front wrote to an address it should not
have.

In the test VTK uses a filter that tiles the rendering of a (not very)
large image. The render window is initially 150x150 and then after some
fooling around magnification is set to 3.

I think this results in what I see in gdb :

res->width0 = 450
res->height0 = 450
osbuffer->width = 150
bytes = 1800
dst_stride = -600

If I read that last right then in the for loop we write 1800 bytes to
dst, move back 600 bytes and write another 1800 bytes. Are we
overwriting 2/3 of what we just wrote?

> Can you send me an executable?

Not quickly.

> Or, is it simple to build the test case?

I lose track of what's simple and what's not; for me there's a cron job
that just runs a bunch of scripts while I sleep. The test is a python
wrapped test so the whole hour long build of vtk is hard to avoid.

They are attached for good measure but all I do is:

The mesa I use is built nightly with :

./autogen.sh \
  --prefix=/home/kevin/mesa_nightly \
  --enable-glx \
  --enable-dri \
  --enable-shared-glapi \
  --enable-gallium-llvm \
  --with-gallium-drivers=nouveau,swrast \
  --enable-osmesa

I have VTK cloned :

git clone http://vtk.org/VTK.git

I happen to be on the nightly-master branch for the dashboard but that
shouldn't matter.

ctest builds and tests VTK with :

ctest -S vtk_osmesa.cmake

Since you only want one test a build just like mine is :

VTK_BUILD=~/VTK_Build
VTK_SRC=~/VTK
mkdir $VTK_BUILD
cd $VTK_BUILD
cmake \
  \
  -DBUILD_EXAMPLES:BOOL=ON\
  -DBUILD_SHARED_LIBS:BOOL=ON\
  \
  -DVTK_BUILD_ALL_MODULES:BOOL=OFF\
  -DVTK_Group_Imaging:BOOL=ON\
  -DVTK_Group_MPI:BOOL=ON\
  -DVTK_Group_Rendering:BOOL=ON\
  -DVTK_Group_StandAlone:BOOL=ON\
  -DVTK_Group_Views:BOOL=ON\
  \
  -DVTK_WRAP_JAVA:BOOL=OFF\
  -DVTK_WRAP_PYTHON:BOOL=ON\
  -DVTK_WRAP_TCL:BOOL=ON\
  \
  -DOPENGL_INCLUDE_DIR:PATH=/home/kevin/mesa_nightly/include\
  -DOPENGL_gl_LIBRARY:FILEPATH=/home/kevin/mesa_nightly/lib/libGL.so\
  -DOPENGL_glu_LIBRARY:FILEPATH=/home/kevin/mesa_nightly/lib/libGLU.so\
  -DVTK_OPENGL_HAS_OSMESA:BOOL=ON\
  -DOSMESA_INCLUDE_DIR:PATH=/home/kevin/mesa_nightly/include\
  -DOSMESA_LIBRARY:FILEPATH=/home/kevin/mesa_nightly/lib/libOSMesa.so\
  \
  -DVTK_USE_OFFSCREEN:BOOL=ON\
  -DVTK_USE_X:BOOL=OFF\
  -DVTK_USE_TK:BOOL=OFF\
  $VTK_SRC

and the test is :

$VTK_BUILD/bin/vtkpython --enable-bt $VTK_BUILD/Utilities/vtkTclTest2Py/rtImageTest.py $VTK_SRC/Filters/Hybrid/Testing/Python/largeImageOffset.py -D $VTK_BUILD/ExternalData/Testing -T $VTK_BUILD/Testing/Temporary -V $VTK_BUILD/ExternalData/Filters/Hybrid/Testing/Data/Baseline/largeImageOffset.png -A $VTK_BUILD/Utilities/vtkTclTest2Py

ctest usually downloads the validation image largeImageOffset.png and
the input file mentioned in the test, iflamigm.3ds. I don't know if this
happens without ctest.

# Client maintainer: hob...@ohio.edu
set(CTEST_SITE "bubbles.hooperlab")
set(CTEST_BUILD_NAME "Fedora-18_OSMesaDevel-x86_64")
set(CTEST_CONFIGURATION_TYPE Release)
set(CTEST_CMAKE_GENERATOR "Unix Makefiles")
set( dashboard_model Nightly )
set( CTEST_BUILD_FLAGS "-ij8" )
set( CTEST_TEST_ARGS PARALLEL_LEVEL 8 )
set( CTEST_DASHBOARD_ROOT "${CTEST_SCRIPT_DIRECTORY}" )
set( CTEST_SOURCE_DIRECTORY "${CTEST_DASHBOARD_ROOT}/VTK" )
set( CTEST_BINARY_DIRECTORY "${CTEST_DASHBOARD_ROOT}/VTK_OSMesa_Build" )
set( VTK_USE_LARGE_DATA ON )
set( dashboard_cache "
BUILD_EXAMPLES:BOOL=ON
BUILD_SHARED_LIBS:BOOL=ON
VTK_BUILD_ALL_MODULES:BOOL=OFF
VTK_Group_Imaging:BOOL=ON
VTK_Group_MPI:BOOL=ON
VTK_Group_Rendering:BOOL=ON
VTK_Group_StandAlone:BOOL=ON
VTK_Group_Views:BOOL=ON
VTK_WRAP_JAVA:BOOL=OFF
VTK_WRAP_PYTHON:BOOL=ON
VTK_WRAP_TCL:BOOL=ON
OPENGL_INCLUDE_DIR:PATH=/home/kevin/mesa_nightly/include
OPENGL_gl_LIBRARY:FILEPATH=/home/kevin/mesa_nightly/lib/libGL.so
OPENGL_glu_LIBRARY:FILEPATH=/home/kevin/mesa_nightly/lib/libGLU.so
VTK_OPENGL_HAS_OSMESA:BOOL=ON
OSMESA_INCLUDE_DIR:PATH=/home/kevin/mesa_nightly/include
OSMESA_LIBRARY:FILEPATH=/home/kevin/mesa_nightly/lib/libOSMesa.so
VTK_USE_OFFSCREEN:BOOL=ON
VTK_USE_X:BOOL=OFF
VTK_USE_TK:BOOL=OFF
" )
include(${CTEST_SCRIPT_DIRECTORY}/VTKScripts/vtk_common.cmake)

update_kitware.sh
Description: application/shellscript

update_mesa.sh
Description: application/shellscript

signature.asc
Description: OpenPGP digital signature
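The arithmetic behind the dst_stride question in the message above can be checked with a small sketch. It models only the byte counts quoted from gdb (1800 bytes copied per row, a destination stride of -600); the `sketch_*` functions are hypothetical helpers, not part of the osmesa state tracker, and the sketch says nothing about where the copy loop actually lives.

```c
#include <assert.h>
#include <stddef.h>

/* Distance in bytes between the starts of two consecutive destination rows. */
static size_t
sketch_row_step(ptrdiff_t dst_stride)
{
   return (size_t)(dst_stride < 0 ? -dst_stride : dst_stride);
}

/* Number of bytes of one row that the next row's copy writes over again.
 * Zero means consecutive rows do not overlap. */
static size_t
sketch_row_overlap(size_t row_bytes, ptrdiff_t dst_stride)
{
   size_t step = sketch_row_step(dst_stride);

   assert(row_bytes > 0);
   return step >= row_bytes ? 0 : row_bytes - step;
}
```

With the gdb values, `sketch_row_overlap(1800, -600)` is 1200, two thirds of each 1800-byte row, which matches the reading in the message. With a stride whose magnitude equals the row size (for example 1800 and -1800) the overlap is zero. Whether this overlap is what triggers the segfault is not established by the sketch.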
[Mesa-dev] [PATCH 0/4] i965: Some small refactors for Broadwell
Here are some small refactors for Broadwell. I'd like to see them merged
upstream now to ease the maintenance of internal trees. There's nothing
exciting here, so I request your rubberstamp.

Chad Versace (2):
  i965: Refactor names of sample_positions_8/4x arrays
  i965: Move arrays brw_multisample_positions* to new header

Kenneth Graunke (2):
  i965: Mark a few brw_draw_upload.c functions as non-static
  i965/gen7+: Mark upload_3dstate_so_decl_list as non-static (v2)

 src/mesa/drivers/dri/i965/brw_context.h            |  5 ++
 src/mesa/drivers/dri/i965/brw_draw_upload.c        | 16 ++---
 src/mesa/drivers/dri/i965/brw_multisample_state.h  | 72 ++
 src/mesa/drivers/dri/i965/brw_state.h              |  4 ++
 src/mesa/drivers/dri/i965/gen6_multisample_state.c | 57 ++---
 src/mesa/drivers/dri/i965/gen7_sol_state.c         |  6 +-
 6 files changed, 99 insertions(+), 61 deletions(-)
 create mode 100644 src/mesa/drivers/dri/i965/brw_multisample_state.h

-- 
1.8.3.1
[Mesa-dev] [PATCH 1/4] i965: Mark a few brw_draw_upload.c functions as non-static
From: Kenneth Graunke kenn...@whitecape.org

We will reuse these for Broadwell.

Reviewed-by: Chad Versace chad.vers...@linux.intel.com
---
 src/mesa/drivers/dri/i965/brw_context.h     |  5 +
 src/mesa/drivers/dri/i965/brw_draw_upload.c | 16 +---
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index 00dd2b4..74e38f1 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -1362,6 +1362,11 @@ int brw_disasm (FILE *file, struct brw_instruction *inst, int gen);
 /* brw_vs.c */
 gl_clip_plane *brw_select_clip_planes(struct gl_context *ctx);
 
+/* brw_draw_upload.c */
+unsigned brw_get_vertex_surface_type(struct brw_context *brw,
+                                     const struct gl_client_array *glarray);
+unsigned brw_get_index_type(GLenum type);
+
 /* brw_wm_surface_state.c */
 void brw_init_surface_formats(struct brw_context *brw);
 void
diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c b/src/mesa/drivers/dri/i965/brw_draw_upload.c
index 897e733..158c9e5 100644
--- a/src/mesa/drivers/dri/i965/brw_draw_upload.c
+++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c
@@ -222,9 +222,9 @@ static GLuint byte_types_scale[5] = {
  * the appopriate hardware surface type.
  * Format will be GL_RGBA or possibly GL_BGRA for GLubyte[4] color arrays.
  */
-static unsigned
-get_surface_type(struct brw_context *brw,
-                 const struct gl_client_array *glarray)
+unsigned
+brw_get_vertex_surface_type(struct brw_context *brw,
+                            const struct gl_client_array *glarray)
 {
    int size = glarray->Size;
 
@@ -342,7 +342,8 @@ get_surface_type(struct brw_context *brw,
    }
 }
 
-static GLuint get_index_type(GLenum type)
+unsigned
+brw_get_index_type(GLenum type)
 {
    switch (type) {
    case GL_UNSIGNED_BYTE: return BRW_INDEX_BYTE;
@@ -687,7 +688,7 @@ static void brw_emit_vertices(struct brw_context *brw)
    OUT_BATCH((_3DSTATE_VERTEX_ELEMENTS << 16) | (2 * nr_elements - 1));
    for (i = 0; i < brw->vb.nr_enabled; i++) {
       struct brw_vertex_element *input = brw->vb.enabled[i];
-      uint32_t format = get_surface_type(brw, input->glarray);
+      uint32_t format = brw_get_vertex_surface_type(brw, input->glarray);
       uint32_t comp0 = BRW_VE1_COMPONENT_STORE_SRC;
       uint32_t comp1 = BRW_VE1_COMPONENT_STORE_SRC;
       uint32_t comp2 = BRW_VE1_COMPONENT_STORE_SRC;
@@ -748,7 +749,8 @@ static void brw_emit_vertices(struct brw_context *brw)
    }
 
    if (brw->gen >= 6 && gen6_edgeflag_input) {
-      uint32_t format = get_surface_type(brw, gen6_edgeflag_input->glarray);
+      uint32_t format =
+         brw_get_vertex_surface_type(brw, gen6_edgeflag_input->glarray);
 
       OUT_BATCH((gen6_edgeflag_input->buffer << GEN6_VE0_INDEX_SHIFT) |
                 GEN6_VE0_VALID |
@@ -900,7 +902,7 @@ static void brw_emit_index_buffer(struct brw_context *brw)
    BEGIN_BATCH(3);
    OUT_BATCH(CMD_INDEX_BUFFER << 16 |
              cut_index_setting |
-             get_index_type(index_buffer->type) << 8 |
+             brw_get_index_type(index_buffer->type) << 8 |
              1);
    OUT_RELOC(brw->ib.bo,
              I915_GEM_DOMAIN_VERTEX, 0,
-- 
1.8.3.1
[Mesa-dev] [PATCH 2/4] i965/gen7+: Mark upload_3dstate_so_decl_list as non-static (v2)
From: Kenneth Graunke <kenn...@whitecape.org>

We will reuse this for Broadwell.

v2: Prefix function name with 'gen7'. (chadv)

Reviewed-by: Chad Versace <chad.vers...@linux.intel.com>
---
 src/mesa/drivers/dri/i965/brw_state.h      | 4 ++++
 src/mesa/drivers/dri/i965/gen7_sol_state.c | 6 +++---
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_state.h b/src/mesa/drivers/dri/i965/brw_state.h
index 321bffe..a1236b7 100644
--- a/src/mesa/drivers/dri/i965/brw_state.h
+++ b/src/mesa/drivers/dri/i965/brw_state.h
@@ -200,6 +200,10 @@ void gen7_init_vtable_surface_functions(struct brw_context *brw);
 void gen7_create_shader_time_surface(struct brw_context *brw,
                                      uint32_t *out_offset);
 
+/* gen7_sol_state.c */
+void gen7_upload_3dstate_so_decl_list(struct brw_context *brw,
+                                      const struct brw_vue_map *vue_map);
+
 /* brw_wm_sampler_state.c */
 uint32_t translate_wrap_mode(GLenum wrap, bool using_nearest);
 void upload_default_color(struct brw_context *brw,
diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c b/src/mesa/drivers/dri/i965/gen7_sol_state.c
index 034efe8..185e422 100644
--- a/src/mesa/drivers/dri/i965/gen7_sol_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c
@@ -97,9 +97,9 @@ upload_3dstate_so_buffers(struct brw_context *brw)
  * stream. We only have one stream of rendering coming out of the GS unit, so
  * we only emit stream 0 (low 16 bits) SO_DECLs.
  */
-static void
-upload_3dstate_so_decl_list(struct brw_context *brw,
-                            const struct brw_vue_map *vue_map)
+void
+gen7_upload_3dstate_so_decl_list(struct brw_context *brw,
+                                 const struct brw_vue_map *vue_map)
 {
    struct gl_context *ctx = &brw->ctx;
    /* BRW_NEW_VERTEX_PROGRAM */
-- 
1.8.3.1
[Mesa-dev] [PATCH 3/4] i965: Refactor names of sample_positions_8/4x arrays
Place each array in the brw namespace by renaming it:

    sample_positions_4x -> brw_multisample_positions_4x
    sample_positions_8x -> brw_multisample_positions_8x

This prepares for moving the arrays to a header shared by gen6 and gen8.

CC: Paul Berry <stereotype...@gmail.com>
Signed-off-by: Chad Versace <chad.vers...@linux.intel.com>
---
 src/mesa/drivers/dri/i965/gen6_multisample_state.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen6_multisample_state.c b/src/mesa/drivers/dri/i965/gen6_multisample_state.c
index 268dc79..0ba3642 100644
--- a/src/mesa/drivers/dri/i965/gen6_multisample_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_multisample_state.c
@@ -34,7 +34,7 @@
  * e     3
  */
 static uint32_t
-sample_positions_4x[] = { 0xae2ae662 };
+brw_multisample_positions_4x[] = { 0xae2ae662 };
 /* Sample positions are based on a solution to the 8 queens puzzle.
  * Rationale: in a solution to the 8 queens puzzle, no two queens share
  * a row, column, or diagonal. This is a desirable property for samples
@@ -69,7 +69,7 @@ sample_positions_4x[] = { 0xae2ae662 };
  * f   7
  */
 static uint32_t
-sample_positions_8x[] = { 0xdbb39d79, 0x3ff55117 };
+brw_multisample_positions_8x[] = { 0xdbb39d79, 0x3ff55117 };
 
 
 void
@@ -82,13 +82,13 @@ gen6_get_sample_position(struct gl_context *ctx,
       result[0] = result[1] = 0.5f;
       break;
    case 4: {
-      uint8_t val = (uint8_t)(sample_positions_4x[0] >> (8*index));
+      uint8_t val = (uint8_t)(brw_multisample_positions_4x[0] >> (8*index));
       result[0] = ((val >> 4) & 0xf) / 16.0f;
       result[1] = (val & 0xf) / 16.0f;
       break;
    }
    case 8: {
-      uint8_t val = (uint8_t)(sample_positions_8x[index>>2] >> (8*(index & 3)));
+      uint8_t val = (uint8_t)(brw_multisample_positions_8x[index>>2] >> (8*(index & 3)));
       result[0] = ((val >> 4) & 0xf) / 16.0f;
       result[1] = (val & 0xf) / 16.0f;
       break;
@@ -116,12 +116,12 @@ gen6_emit_3dstate_multisample(struct brw_context *brw,
       break;
    case 4:
       number_of_multisamples = MS_NUMSAMPLES_4;
-      sample_positions_3210 = sample_positions_4x[0];
+      sample_positions_3210 = brw_multisample_positions_4x[0];
       break;
    case 8:
       number_of_multisamples = MS_NUMSAMPLES_8;
-      sample_positions_3210 = sample_positions_8x[0];
-      sample_positions_7654 = sample_positions_8x[1];
+      sample_positions_3210 = brw_multisample_positions_8x[0];
+      sample_positions_7654 = brw_multisample_positions_8x[1];
       break;
    default:
       assert(!"Unrecognized num_samples in gen6_emit_3dstate_multisample");
-- 
1.8.3.1
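For readers following the shifts in gen6_get_sample_position(): each byte of the packed word appears to hold one sample, with the X offset in the high nibble and the Y offset in the low nibble, both in 1/16ths of a pixel. The following is a small standalone sketch of that decoding, not driver code. The struct and function names (sample_pos, unpack_byte, decode_sample_4x, decode_sample_8x) are made up for illustration; only the two constants come from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Packed sample positions, copied from the patch. */
static const uint32_t positions_4x[] = { 0xae2ae662 };
static const uint32_t positions_8x[] = { 0xdbb39d79, 0x3ff55117 };

/* Hypothetical helper type: a position inside the pixel, in [0, 1). */
struct sample_pos {
   float x;
   float y;
};

/* Unpack one byte: high nibble is X, low nibble is Y, in 1/16 pixel units. */
static struct sample_pos
unpack_byte(uint8_t val)
{
   struct sample_pos pos;
   pos.x = ((val >> 4) & 0xf) / 16.0f;
   pos.y = (val & 0xf) / 16.0f;
   return pos;
}

/* Sample `index` (0..3) of the 4x pattern: one byte per sample. */
struct sample_pos
decode_sample_4x(unsigned index)
{
   assert(index < 4);
   return unpack_byte((uint8_t)(positions_4x[0] >> (8 * index)));
}

/* Sample `index` (0..7) of the 8x pattern: four samples per 32-bit word. */
struct sample_pos
decode_sample_8x(unsigned index)
{
   assert(index < 8);
   return unpack_byte((uint8_t)(positions_8x[index >> 2] >> (8 * (index & 3))));
}
```

With this, decode_sample_4x(0) yields (6/16, 2/16), which agrees with sample 0 sitting in column 6, row 2 of the diagram in the source comment.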
[Mesa-dev] [PATCH 4/4] i965: Move arrays brw_multisample_positions* to new header
Move the arrays to the new header brw_multisample_state.h, which will be shared with Broadwell code. CC: Paul Berry stereotype...@gmail.com Signed-off-by: Chad Versace chad.vers...@linux.intel.com --- src/mesa/drivers/dri/i965/brw_multisample_state.h | 72 ++ src/mesa/drivers/dri/i965/gen6_multisample_state.c | 47 +- 2 files changed, 73 insertions(+), 46 deletions(-) create mode 100644 src/mesa/drivers/dri/i965/brw_multisample_state.h diff --git a/src/mesa/drivers/dri/i965/brw_multisample_state.h b/src/mesa/drivers/dri/i965/brw_multisample_state.h new file mode 100644 index 000..79566f0 --- /dev/null +++ b/src/mesa/drivers/dri/i965/brw_multisample_state.h @@ -0,0 +1,72 @@ +/* + * Copyright © 2013 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. 
+ */ + +#include stdint.h + +/** + * Sample positions: + * 2 6 a e + * 2 0 + * 6 1 + * a 2 + * e 3 + */ +static const uint32_t +brw_multisample_positions_4x[] = { 0xae2ae662 }; + +/** + * Sample positions are based on a solution to the 8 queens puzzle. + * Rationale: in a solution to the 8 queens puzzle, no two queens share + * a row, column, or diagonal. This is a desirable property for samples + * in a multisampling pattern, because it ensures that the samples are + * relatively uniformly distributed through the pixel. + * + * There are several solutions to the 8 queens puzzle (see + * http://en.wikipedia.org/wiki/Eight_queens_puzzle). This solution was + * chosen because it has a queen close to the center; this should + * improve the accuracy of centroid interpolation, since the hardware + * implements centroid interpolation by choosing the centermost sample + * that overlaps with the primitive being drawn. + * + * Note: from the Ivy Bridge PRM, Vol2 Part1 p304 (3DSTATE_MULTISAMPLE: + * Programming Notes): + * + * When programming the sample offsets (for NUMSAMPLES_4 or _8 and + * MSRASTMODE_xxx_PATTERN), the order of the samples 0 to 3 (or 7 + * for 8X) must have monotonically increasing distance from the + * pixel center. This is required to get the correct centroid + * computation in the device. 
+ * + * Sample positions: + * 1 3 5 7 9 b d f + * 1 5 + * 3 2 + * 5 6 + * 7 4 + * 9 0 + * b 3 + * d 1 + * f 7 + */ +static const uint32_t +brw_multisample_positions_8x[] = { 0xdbb39d79, 0x3ff55117 }; diff --git a/src/mesa/drivers/dri/i965/gen6_multisample_state.c b/src/mesa/drivers/dri/i965/gen6_multisample_state.c index 0ba3642..c94c900 100644 --- a/src/mesa/drivers/dri/i965/gen6_multisample_state.c +++ b/src/mesa/drivers/dri/i965/gen6_multisample_state.c @@ -25,52 +25,7 @@ #include brw_context.h #include brw_defines.h - -/* Sample positions: - * 2 6 a e - * 2 0 - * 6 1 - * a 2 - * e 3 - */ -static uint32_t -brw_multisample_positions_4x[] = { 0xae2ae662 }; -/* Sample positions are based on a solution to the 8 queens puzzle. - * Rationale: in a solution to the 8 queens puzzle, no two queens share - * a row, column, or diagonal. This is a desirable property for samples - * in a multisampling pattern, because it ensures that the samples are - * relatively uniformly distributed through the pixel. - * - * There are several solutions to the 8 queens puzzle (see - * http://en.wikipedia.org/wiki/Eight_queens_puzzle). This solution was - * chosen because it has a queen close to the center; this should - * improve the accuracy of centroid interpolation, since the hardware - * implements centroid interpolation by choosing the centermost sample - * that overlaps with the primitive being drawn. - * - * Note: from the Ivy Bridge PRM, Vol2 Part1 p304 (3DSTATE_MULTISAMPLE: - * Programming Notes): - * - * When programming the sample offsets (for NUMSAMPLES_4 or _8 and - * MSRASTMODE_xxx_PATTERN), the order of the samples 0 to 3 (or 7 - * for
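The two properties claimed in the header comment (an 8 queens layout, and samples ordered by distance from the pixel center) can be checked mechanically. The sketch below is an illustration only: the helper names are invented, and it assumes the byte layout used by gen6_get_sample_position() (X in the high nibble, Y in the low nibble, pixel center at 8/16). Note that it tests for non-decreasing distance, since some neighbouring samples in this pattern are equally far from the center.

```c
#include <stdint.h>

/* Packed 8x sample positions, copied from brw_multisample_state.h. */
static const uint32_t positions_8x[] = { 0xdbb39d79, 0x3ff55117 };

/* Extract sample i: X in the high nibble, Y in the low nibble. */
static void
sample_xy(unsigned i, int *x, int *y)
{
   uint8_t val = (uint8_t)(positions_8x[i >> 2] >> (8 * (i & 3)));
   *x = (val >> 4) & 0xf;
   *y = val & 0xf;
}

/* Returns 1 if no two samples share a row, column, or diagonal. */
int
is_queens_layout(void)
{
   for (unsigned i = 0; i < 8; i++) {
      for (unsigned j = i + 1; j < 8; j++) {
         int xi, yi, xj, yj;
         sample_xy(i, &xi, &yi);
         sample_xy(j, &xj, &yj);
         if (xi == xj || yi == yj ||
             xi - yi == xj - yj || xi + yi == xj + yj)
            return 0;
      }
   }
   return 1;
}

/* Returns 1 if squared distance from the center (8, 8) never decreases. */
int
is_center_ordered(void)
{
   int prev = -1;
   for (unsigned i = 0; i < 8; i++) {
      int x, y;
      sample_xy(i, &x, &y);
      int d2 = (x - 8) * (x - 8) + (y - 8) * (y - 8);
      if (d2 < prev)
         return 0;
      prev = d2;
   }
   return 1;
}
```

Both checks pass for the constants in the patch, which is a cheap sanity test to rerun if the table is ever edited.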
[Mesa-dev] [PATCH] configure: link against -lLLVM to determine build type
Fixes a build failure of 9.2 on ubuntu, because libLLVM-3.3.so is not present in /usr/lib/llvm-3.2/lib.

Signed-off-by: Maarten Lankhorst <maarten.lankho...@canonical.com>
---
diff --git a/configure.ac b/configure.ac
index 35f6797..579d8d4 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1870,7 +1870,18 @@ if test x$MESA_LLVM != x0; then
     if test x$with_llvm_shared_libs = xyes; then
         dnl We can't use $LLVM_VERSION because it has 'svn' stripped out,
         LLVM_SO_NAME=LLVM-`$LLVM_CONFIG --version`
-        AS_IF([test -f $LLVM_LIBDIR/lib$LLVM_SO_NAME.so], [llvm_have_one_so=yes])
+
+        AC_MSG_CHECKING([whether $LLVM_SO_NAME is a monolithic blob])
+        save_LIBS="$LIBS"
+        save_LDFLAGS="$LDFLAGS"
+        LDFLAGS="$LDFLAGS $LLVM_LDFLAGS"
+        LIBS="$LIBS -l$LLVM_SO_NAME"
+
+        AC_LINK_IFELSE([AC_LANG_CALL([], [LLVMInitializeCore])],
+                       [llvm_have_one_so=yes], [llvm_have_one_so=no])
+        LIBS="$save_LIBS"
+        LDFLAGS="$save_LDFLAGS"
+        AC_MSG_RESULT([$llvm_have_one_so])
 
         if test x$llvm_have_one_so = xyes; then
             dnl LLVM was built using auto*, so there is only one shared object.
Re: [Mesa-dev] [Patch] Sharing flags should disable tiling
On 08/12/2013 03:15 PM, Stéphane Marchesin wrote:
> On Mon, Aug 12, 2013 at 3:05 PM, Marek Olšák <mar...@gmail.com> wrote:
>> On Mon, Aug 12, 2013 at 11:36 PM, Stéphane Marchesin
>> <stephane.marche...@gmail.com> wrote:
>>>> Other than hybrid systems (of which there are none with i915
>>>> graphics), is there any case where __DRI_IMAGE_USE_SHARE can occur?
>>>
>>> You could do interesting things like cross-process sharing with it.
>>> I think it's worth doing it, no matter what. It's easy to pick up
>>> now, and hard to fix up later.
>>
>> Cross-process sharing is mandatory already and exposed via
>> resource_from_handle and resource_get_handle. I don't think this is
>> useful for cross-process sharing anyway, because it disables tiling.
>
> Well, for Chrome we're thinking of using it. If one end can map linear
> memory and write texture data to it from the CPU, and the other end can
> use it as a GL texture, then we have a zero copy cross-process texture
> upload. I realize it's not your normal use case, but... :)

Stéphane, have you considered using EGL_EXT_dma_buf_import with GL_OES_EGL_image_external for Chrome texture sharing? It seems like a good fit for what Chrome is doing: sharing non-mipmapped 2d texture memory. Both extensions recently landed on master for i965.

On that note, do you have any interest in moving Chrome/Intel away from GLX to EGL? Some people on the Intel Media team have been discussing a desire to do that for some of the media components.
[Mesa-dev] [PATCH] radeon/llvm: fix compile error with -Werror=format-security
Signed-off-by: Maarten Lankhorst <maarten.lankho...@canonical.com>
---
diff --git a/src/gallium/drivers/radeon/radeon_llvm_emit.c b/src/gallium/drivers/radeon/radeon_llvm_emit.c
index 1a4d4fd..2dd7bf7 100644
--- a/src/gallium/drivers/radeon/radeon_llvm_emit.c
+++ b/src/gallium/drivers/radeon/radeon_llvm_emit.c
@@ -124,7 +124,7 @@ unsigned radeon_llvm_compile(LLVMModuleRef M, struct radeon_llvm_binary *binary,
 	r = LLVMTargetMachineEmitToMemoryBuffer(tm, M, LLVMObjectFile, &err,
 						 &out_buffer);
 	if (r) {
-		fprintf(stderr, err);
+		fprintf(stderr, "%s", err);
 		FREE(err);
 		return 1;
 	}
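As a reminder of why -Werror=format-security rejects the old call: when a runtime string is passed as the format argument, any '%' sequence it happens to contain is interpreted as a conversion. A minimal sketch of the safe pattern follows; format_message() is a made-up helper for illustration, not part of Mesa or LLVM.

```c
#include <stdio.h>

/*
 * Copy an untrusted message into buf. The message is passed as an argument
 * to a constant "%s" format, so '%' characters inside it are never parsed
 * as conversion specifiers. Returns whatever snprintf() returns.
 */
int
format_message(char *buf, size_t size, const char *msg)
{
   return snprintf(buf, size, "%s", msg);
}
```

The same reasoning applies to the fprintf(stderr, "%s", err) form in the patch: the error text from LLVM is treated purely as data.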
Re: [Mesa-dev] [PATCH 4/4] i965: Move arrays brw_multisample_positions* to new header
On 08/13/2013 08:53 AM, Chad Versace wrote:
> Move the arrays to the new header brw_multisample_state.h, which will be
> shared with Broadwell code.
>
> CC: Paul Berry <stereotype...@gmail.com>
> Signed-off-by: Chad Versace <chad.vers...@linux.intel.com>

Hmm. Looks like I botched my patches in a rebase, but I'm pretty sure my new plan was to just put the Broadwell code in gen6_multisample_state.c (rather than introducing a new file) to avoid having to do this. It's not much code.

Then again, I don't really mind doing this either. Patches 3-4 get my R-b, and I'm fine with landing the series.
[Mesa-dev] [PATCH 1/4] ilo: implement new float comparison instructions
From: Roland Scheidegger srol...@vmware.com untested. --- src/gallium/drivers/ilo/shader/toy_tgsi.c | 20 1 file changed, 12 insertions(+), 8 deletions(-) diff --git a/src/gallium/drivers/ilo/shader/toy_tgsi.c b/src/gallium/drivers/ilo/shader/toy_tgsi.c index d5a3f2f..830aa57 100644 --- a/src/gallium/drivers/ilo/shader/toy_tgsi.c +++ b/src/gallium/drivers/ilo/shader/toy_tgsi.c @@ -209,15 +209,18 @@ aos_set_on_cond(struct toy_compiler *tc, case TGSI_OPCODE_SLT: case TGSI_OPCODE_ISLT: case TGSI_OPCODE_USLT: + case TGSI_OPCODE_FSLT: cond = BRW_CONDITIONAL_L; break; case TGSI_OPCODE_SGE: case TGSI_OPCODE_ISGE: case TGSI_OPCODE_USGE: + case TGSI_OPCODE_FSGE: cond = BRW_CONDITIONAL_GE; break; case TGSI_OPCODE_SEQ: case TGSI_OPCODE_USEQ: + case TGSI_OPCODE_FSEQ: cond = BRW_CONDITIONAL_EQ; break; case TGSI_OPCODE_SGT: @@ -228,6 +231,7 @@ aos_set_on_cond(struct toy_compiler *tc, break; case TGSI_OPCODE_SNE: case TGSI_OPCODE_USNE: + case TGSI_OPCODE_FSNE: cond = BRW_CONDITIONAL_NEQ; break; default: @@ -935,10 +939,10 @@ static const toy_tgsi_translate aos_translate_table[TGSI_OPCODE_LAST] = { [105] = aos_unsupported, [106] = aos_unsupported, [TGSI_OPCODE_NOP] = aos_simple, - [108] = aos_unsupported, - [109] = aos_unsupported, - [110] = aos_unsupported, - [111] = aos_unsupported, + [TGSI_OPCODE_FSEQ] = aos_set_on_cond, + [TGSI_OPCODE_FSGE] = aos_set_on_cond, + [TGSI_OPCODE_FSLT] = aos_set_on_cond, + [TGSI_OPCODE_FSNE] = aos_set_on_cond, [TGSI_OPCODE_NRM4] = aos_NRM4, [TGSI_OPCODE_CALLNZ] = aos_unsupported, [TGSI_OPCODE_BREAKC] = aos_unsupported, @@ -1551,10 +1555,10 @@ static const toy_tgsi_translate soa_translate_table[TGSI_OPCODE_LAST] = { [105] = soa_unsupported, [106] = soa_unsupported, [TGSI_OPCODE_NOP] = soa_passthrough, - [108] = soa_unsupported, - [109] = soa_unsupported, - [110] = soa_unsupported, - [111] = soa_unsupported, + [TGSI_OPCODE_FSEQ] = soa_per_channel, + [TGSI_OPCODE_FSGE] = soa_per_channel, + [TGSI_OPCODE_FSLT] = soa_per_channel, + [TGSI_OPCODE_FSNE] = 
soa_per_channel, [TGSI_OPCODE_NRM4] = soa_NRM4, [TGSI_OPCODE_CALLNZ] = soa_unsupported, [TGSI_OPCODE_BREAKC] = soa_unsupported, -- 1.7.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/4] r600/radeonsi: implement new float comparison instructions
From: Roland Scheidegger srol...@vmware.com Also use ordered comparisons for old cmp instructions. Untested. --- src/gallium/drivers/r600/r600_shader.c | 18 --- .../drivers/radeon/radeon_setup_tgsi_llvm.c| 49 2 files changed, 48 insertions(+), 19 deletions(-) diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c index 37298cc..fb766c4 100644 --- a/src/gallium/drivers/r600/r600_shader.c +++ b/src/gallium/drivers/r600/r600_shader.c @@ -5743,11 +5743,10 @@ static struct r600_shader_tgsi_instruction r600_shader_tgsi_instruction[] = { {105, 0, ALU_OP0_NOP, tgsi_unsupported}, {106, 0, ALU_OP0_NOP, tgsi_unsupported}, {TGSI_OPCODE_NOP, 0, ALU_OP0_NOP, tgsi_unsupported}, - /* gap */ - {108, 0, ALU_OP0_NOP, tgsi_unsupported}, - {109, 0, ALU_OP0_NOP, tgsi_unsupported}, - {110, 0, ALU_OP0_NOP, tgsi_unsupported}, - {111, 0, ALU_OP0_NOP, tgsi_unsupported}, + {TGSI_OPCODE_FSEQ, 0, ALU_OP2_SETE_DX10, tgsi_op2}, + {TGSI_OPCODE_FSGE, 0, ALU_OP2_SETGE_DX10, tgsi_op2}, + {TGSI_OPCODE_FSLT, 0, ALU_OP2_SETGT_DX10, tgsi_op2_swap}, + {TGSI_OPCODE_FSNE, 0, ALU_OP2_SETNE_DX10, tgsi_op2_swap}, {TGSI_OPCODE_NRM4, 0, ALU_OP0_NOP, tgsi_unsupported}, {TGSI_OPCODE_CALLNZ,0, ALU_OP0_NOP, tgsi_unsupported}, /* gap */ @@ -5936,11 +5935,10 @@ static struct r600_shader_tgsi_instruction eg_shader_tgsi_instruction[] = { {105, 0, ALU_OP0_NOP, tgsi_unsupported}, {106, 0, ALU_OP0_NOP, tgsi_unsupported}, {TGSI_OPCODE_NOP, 0, ALU_OP0_NOP, tgsi_unsupported}, - /* gap */ - {108, 0, ALU_OP0_NOP, tgsi_unsupported}, - {109, 0, ALU_OP0_NOP, tgsi_unsupported}, - {110, 0, ALU_OP0_NOP, tgsi_unsupported}, - {111, 0, ALU_OP0_NOP, tgsi_unsupported}, + {TGSI_OPCODE_FSEQ, 0, ALU_OP2_SETE_DX10, tgsi_op2}, + {TGSI_OPCODE_FSGE, 0, ALU_OP2_SETGE_DX10, tgsi_op2}, + {TGSI_OPCODE_FSLT, 0, ALU_OP2_SETGT_DX10, tgsi_op2_swap}, + {TGSI_OPCODE_FSNE, 0, ALU_OP2_SETNE_DX10, tgsi_op2_swap}, {TGSI_OPCODE_NRM4, 0, ALU_OP0_NOP, tgsi_unsupported}, {TGSI_OPCODE_CALLNZ,0, ALU_OP0_NOP, tgsi_unsupported}, 
/* gap */ diff --git a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c index 7a47746..8ff9abd 100644 --- a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c +++ b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c @@ -850,18 +850,16 @@ static void emit_cmp( LLVMRealPredicate pred; LLVMValueRef cond; - /* XXX I'm not sure whether to do unordered or ordered comparisons, -* but llvmpipe uses unordered comparisons, so for consistency we use -* unordered. (The authors of llvmpipe aren't sure about using -* unordered vs ordered comparisons either. + /* Use ordered for everything but NE (which is usual for +* float comparisons) */ switch (emit_data-inst-Instruction.Opcode) { - case TGSI_OPCODE_SGE: pred = LLVMRealUGE; break; - case TGSI_OPCODE_SEQ: pred = LLVMRealUEQ; break; - case TGSI_OPCODE_SLE: pred = LLVMRealULE; break; - case TGSI_OPCODE_SLT: pred = LLVMRealULT; break; + case TGSI_OPCODE_SGE: pred = LLVMRealOGE; break; + case TGSI_OPCODE_SEQ: pred = LLVMRealOEQ; break; + case TGSI_OPCODE_SLE: pred = LLVMRealOLE; break; + case TGSI_OPCODE_SLT: pred = LLVMRealOLT; break; case TGSI_OPCODE_SNE: pred = LLVMRealUNE; break; - case TGSI_OPCODE_SGT: pred = LLVMRealUGT; break; + case TGSI_OPCODE_SGT: pred = LLVMRealOGT; break; default: assert(!unknown instruction); pred = 0; break; } @@ -872,6 +870,35 @@ static void emit_cmp( cond, bld_base-base.one, bld_base-base.zero, ); } +static void emit_fcmp( + const struct lp_build_tgsi_action *action, + struct lp_build_tgsi_context * bld_base, + struct lp_build_emit_data * emit_data) +{ + LLVMBuilderRef builder = bld_base-base.gallivm-builder; + LLVMContextRef context = bld_base-base.gallivm-context; + LLVMRealPredicate pred; + + /* Use ordered for everything but NE (which is usual for +* float comparisons) +*/ + switch (emit_data-inst-Instruction.Opcode) { + case TGSI_OPCODE_FSEQ: pred = LLVMRealOEQ; break; + case TGSI_OPCODE_FSGE: pred = LLVMRealOGE; break; + case 
TGSI_OPCODE_FSLT: pred = LLVMRealOLT; break; + case TGSI_OPCODE_FSNE: pred = LLVMRealUNE; break; + default: assert(!unknown instruction); pred = 0; break; +
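For anyone unsure what the ordered/unordered split means in practice: an ordered predicate is false whenever either operand is NaN, while an unordered one is true in that case. C's own relational operators behave like the ordered forms, except != which behaves like unordered-not-equal. The helpers below are an illustrative sketch in plain C with names invented here; they are not LLVM or TGSI code.

```c
#include <math.h>

/* Ordered less-than: false if either operand is NaN (like LLVMRealOLT). */
int ordered_lt(float a, float b)
{
   return a < b;
}

/* Unordered less-than: true if either operand is NaN (like LLVMRealULT). */
int unordered_lt(float a, float b)
{
   return isnan(a) || isnan(b) || a < b;
}

/* Ordered equal: false if either operand is NaN (like LLVMRealOEQ). */
int ordered_eq(float a, float b)
{
   return a == b;
}

/* Unordered not-equal: true if either operand is NaN (like LLVMRealUNE). */
int unordered_ne(float a, float b)
{
   return a != b;
}
```

This is why NE is the one comparison left unordered in the patch: with NaN inputs, "not equal" is expected to come out true while every other comparison comes out false.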
[Mesa-dev] [PATCH 2/4] nv50: implement new float comparison instructions
From: Roland Scheidegger srol...@vmware.com untested. --- .../drivers/nv50/codegen/nv50_ir_from_tgsi.cpp | 17 + 1 file changed, 17 insertions(+) diff --git a/src/gallium/drivers/nv50/codegen/nv50_ir_from_tgsi.cpp b/src/gallium/drivers/nv50/codegen/nv50_ir_from_tgsi.cpp index 56eccac..a2ad9f4 100644 --- a/src/gallium/drivers/nv50/codegen/nv50_ir_from_tgsi.cpp +++ b/src/gallium/drivers/nv50/codegen/nv50_ir_from_tgsi.cpp @@ -440,6 +440,11 @@ nv50_ir::DataType Instruction::inferDstType() const switch (getOpcode()) { case TGSI_OPCODE_F2U: return nv50_ir::TYPE_U32; case TGSI_OPCODE_F2I: return nv50_ir::TYPE_S32; + case TGSI_OPCODE_FSEQ: + case TGSI_OPCODE_FSGE: + case TGSI_OPCODE_FSLT: + case TGSI_OPCODE_FSNE: + return nv50_ir::TYPE_U32; case TGSI_OPCODE_I2F: case TGSI_OPCODE_U2F: return nv50_ir::TYPE_F32; @@ -456,19 +461,23 @@ nv50_ir::CondCode Instruction::getSetCond() const case TGSI_OPCODE_SLT: case TGSI_OPCODE_ISLT: case TGSI_OPCODE_USLT: + case TGSI_OPCODE_FSLT: return CC_LT; case TGSI_OPCODE_SLE: return CC_LE; case TGSI_OPCODE_SGE: case TGSI_OPCODE_ISGE: case TGSI_OPCODE_USGE: + case TGSI_OPCODE_FSGE: return CC_GE; case TGSI_OPCODE_SGT: return CC_GT; case TGSI_OPCODE_SEQ: case TGSI_OPCODE_USEQ: + case TGSI_OPCODE_FSEQ: return CC_EQ; case TGSI_OPCODE_SNE: + case TGSI_OPCODE_FSNE: return CC_NEU; case TGSI_OPCODE_USNE: return CC_NE; @@ -556,6 +565,10 @@ static nv50_ir::operation translateOpcode(uint opcode) NV50_IR_OPCODE_CASE(KILL_IF, DISCARD); NV50_IR_OPCODE_CASE(F2I, CVT); + NV50_IR_OPCODE_CASE(FSEQ, SET); + NV50_IR_OPCODE_CASE(FSGE, SET); + NV50_IR_OPCODE_CASE(FSLT, SET); + NV50_IR_OPCODE_CASE(FSNE, SET); NV50_IR_OPCODE_CASE(IDIV, DIV); NV50_IR_OPCODE_CASE(IMAX, MAX); NV50_IR_OPCODE_CASE(IMIN, MIN); @@ -2354,6 +2367,10 @@ Converter::handleInstruction(const struct tgsi_full_instruction *insn) case TGSI_OPCODE_SLE: case TGSI_OPCODE_SNE: case TGSI_OPCODE_STR: + case TGSI_OPCODE_FSEQ: + case TGSI_OPCODE_FSGE: + case TGSI_OPCODE_FSLT: + case TGSI_OPCODE_FSNE: case 
TGSI_OPCODE_ISGE: case TGSI_OPCODE_ISLT: case TGSI_OPCODE_USEQ: -- 1.7.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 4/4] st/mesa: use new float comparison opcodes if native integers are supported
From: Roland Scheidegger srol...@vmware.com Should get rid of some float-to-int conversions (with negation). No piglit regressions (with llvmpipe). --- src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 45 ++-- 1 file changed, 15 insertions(+), 30 deletions(-) diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp index d9b4ed2..65ba449 100644 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp @@ -420,8 +420,6 @@ public: void emit_scalar(ir_instruction *ir, unsigned op, st_dst_reg dst, st_src_reg src0, st_src_reg src1); - void try_emit_float_set(ir_instruction *ir, unsigned op, st_dst_reg dst); - void emit_arl(ir_instruction *ir, st_dst_reg dst, st_src_reg src0); void emit_scs(ir_instruction *ir, unsigned op, @@ -594,9 +592,6 @@ glsl_to_tgsi_visitor::emit(ir_instruction *ir, unsigned op, this-instructions.push_tail(inst); - if (native_integers) - try_emit_float_set(ir, op, dst); - return inst; } @@ -622,25 +617,6 @@ glsl_to_tgsi_visitor::emit(ir_instruction *ir, unsigned op) return emit(ir, op, undef_dst, undef_src, undef_src, undef_src); } - /** - * Emits the code to convert the result of float SET instructions to integers. - */ -void -glsl_to_tgsi_visitor::try_emit_float_set(ir_instruction *ir, unsigned op, -st_dst_reg dst) -{ - if ((op == TGSI_OPCODE_SEQ || -op == TGSI_OPCODE_SNE || -op == TGSI_OPCODE_SGE || -op == TGSI_OPCODE_SLT)) - { - st_src_reg src = st_src_reg(dst); - src.negate = ~src.negate; - dst.type = GLSL_TYPE_FLOAT; - emit(ir, TGSI_OPCODE_F2I, dst, src); - } -} - /** * Determines whether to use an integer, unsigned integer, or float opcode * based on the operands and input opcode, then emits the result. 
@@ -672,6 +648,15 @@ glsl_to_tgsi_visitor::get_opcode(ir_instruction *ir, unsigned op, #define case2fi(f, i) case4(f, f, i, i) #define case2iu(i, u) case4(i, LAST, i, u) +#define casecomp(c, f, i, u) \ + case TGSI_OPCODE_##c: \ + if (type == GLSL_TYPE_INT) op = TGSI_OPCODE_##i; \ + else if (type == GLSL_TYPE_UINT) op = TGSI_OPCODE_##u; \ + else if (native_integers) \ + op = TGSI_OPCODE_##f; \ + else op = TGSI_OPCODE_##c; \ + break; + switch(op) { case2fi(ADD, UADD); case2fi(MUL, UMUL); @@ -680,12 +665,12 @@ glsl_to_tgsi_visitor::get_opcode(ir_instruction *ir, unsigned op, case3(MAX, IMAX, UMAX); case3(MIN, IMIN, UMIN); case2iu(MOD, UMOD); - - case2fi(SEQ, USEQ); - case2fi(SNE, USNE); - case3(SGE, ISGE, USGE); - case3(SLT, ISLT, USLT); - + + casecomp(SEQ, FSEQ, USEQ, USEQ); + casecomp(SNE, FSNE, USNE, USNE); + casecomp(SGE, FSGE, ISGE, USGE); + casecomp(SLT, FSLT, ISLT, USLT); + case2iu(ISHR, USHR); case2fi(SSG, ISSG); -- 1.7.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
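The conversion being removed can be illustrated in plain C. Reading the patch, the float SET opcodes produce 1.0f or 0.0f, so with native integers the state tracker had to negate that result and convert it with F2I to obtain an integer boolean (-1 or 0); the new opcodes are meant to produce the integer value directly. The functions below sketch that equivalence with invented names and are not code from the state tracker. The all-ones "true" value is inferred from the removed negate-and-F2I sequence.

```c
#include <stdint.h>

/* Float-style SEQ: yields 1.0f for true, 0.0f for false. */
float seq_float(float a, float b)
{
   return (a == b) ? 1.0f : 0.0f;
}

/* Old path: negate the float result, then convert to integer (F2I). */
int32_t seq_then_f2i(float a, float b)
{
   return (int32_t)(-seq_float(a, b));
}

/* New path: integer-style FSEQ, all bits set for true, zero for false. */
uint32_t fseq(float a, float b)
{
   return (a == b) ? ~(uint32_t)0 : 0;
}
```

Both paths give the same bit pattern, so the patch saves one instruction per float comparison on drivers with native integer support.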
Re: [Mesa-dev] [PATCH 4/4] st/mesa: use new float comparison opcodes if native integers are supported
I tested this for llvmpipe, but it would be good if the respective driver authors could verify it works for their drivers, I have no idea if I got it right there, just guessed how it might work based mostly on how other comparison instructions are handled (and I hope I caught all drivers, those supporting integers), but if not glsl will probably break quite badly I suppose. Roland Am 13.08.2013 19:04, schrieb srol...@vmware.com: From: Roland Scheidegger srol...@vmware.com Should get rid of some float-to-int conversions (with negation). No piglit regressions (with llvmpipe). --- src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 45 ++-- 1 file changed, 15 insertions(+), 30 deletions(-) diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp index d9b4ed2..65ba449 100644 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp @@ -420,8 +420,6 @@ public: void emit_scalar(ir_instruction *ir, unsigned op, st_dst_reg dst, st_src_reg src0, st_src_reg src1); - void try_emit_float_set(ir_instruction *ir, unsigned op, st_dst_reg dst); - void emit_arl(ir_instruction *ir, st_dst_reg dst, st_src_reg src0); void emit_scs(ir_instruction *ir, unsigned op, @@ -594,9 +592,6 @@ glsl_to_tgsi_visitor::emit(ir_instruction *ir, unsigned op, this-instructions.push_tail(inst); - if (native_integers) - try_emit_float_set(ir, op, dst); - return inst; } @@ -622,25 +617,6 @@ glsl_to_tgsi_visitor::emit(ir_instruction *ir, unsigned op) return emit(ir, op, undef_dst, undef_src, undef_src, undef_src); } - /** - * Emits the code to convert the result of float SET instructions to integers. 
- */ -void -glsl_to_tgsi_visitor::try_emit_float_set(ir_instruction *ir, unsigned op, - st_dst_reg dst) -{ - if ((op == TGSI_OPCODE_SEQ || -op == TGSI_OPCODE_SNE || -op == TGSI_OPCODE_SGE || -op == TGSI_OPCODE_SLT)) - { - st_src_reg src = st_src_reg(dst); - src.negate = ~src.negate; - dst.type = GLSL_TYPE_FLOAT; - emit(ir, TGSI_OPCODE_F2I, dst, src); - } -} - /** * Determines whether to use an integer, unsigned integer, or float opcode * based on the operands and input opcode, then emits the result. @@ -672,6 +648,15 @@ glsl_to_tgsi_visitor::get_opcode(ir_instruction *ir, unsigned op, #define case2fi(f, i) case4(f, f, i, i) #define case2iu(i, u) case4(i, LAST, i, u) +#define casecomp(c, f, i, u) \ + case TGSI_OPCODE_##c: \ + if (type == GLSL_TYPE_INT) op = TGSI_OPCODE_##i; \ + else if (type == GLSL_TYPE_UINT) op = TGSI_OPCODE_##u; \ + else if (native_integers) \ + op = TGSI_OPCODE_##f; \ + else op = TGSI_OPCODE_##c; \ + break; + switch(op) { case2fi(ADD, UADD); case2fi(MUL, UMUL); @@ -680,12 +665,12 @@ glsl_to_tgsi_visitor::get_opcode(ir_instruction *ir, unsigned op, case3(MAX, IMAX, UMAX); case3(MIN, IMIN, UMIN); case2iu(MOD, UMOD); - - case2fi(SEQ, USEQ); - case2fi(SNE, USNE); - case3(SGE, ISGE, USGE); - case3(SLT, ISLT, USLT); - + + casecomp(SEQ, FSEQ, USEQ, USEQ); + casecomp(SNE, FSNE, USNE, USNE); + casecomp(SGE, FSGE, ISGE, USGE); + casecomp(SLT, FSLT, ISLT, USLT); + case2iu(ISHR, USHR); case2fi(SSG, ISSG); ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [i965][V2] i965/draw: Move constant formation outside of for loop and use an enum.
On 08/09/2013 12:33 AM, Eric Anholt wrote:
Mark Mueller <markkmuel...@gmail.com> writes:
On Thu, Aug 8, 2013 at 2:19 PM, Eric Anholt <e...@anholt.net> wrote:
Mark Mueller <markkmuel...@gmail.com> writes:

Signed-off-by: Mark Mueller <markkmuel...@gmail.com>
---
 src/mesa/drivers/dri/i965/brw_draw.c | 16 ++++++----------
 1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c
index 6170d07..1b5ed55 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -367,6 +367,12 @@ static bool brw_try_draw_prims( struct gl_context *ctx,
    bool retval = true;
    GLuint i;
    bool fail_next = false;
+   const int estimated_max_prim_size =
+      512 + /* batchbuffer commands */
+      ((BRW_MAX_TEX_UNIT * (sizeof(struct brw_sampler_state) + sizeof(struct gen5_sampler_default_color)))) +
+      1024 + /* gen6 VS push constants */
+      1024 + /* gen6 WM push constants */
+      512; /* misc. pad */

What's the point of this change? Moving loop invariants out of loops is something basic that your compiler does,

Is that universally true for the code as it looked originally (see below)? I've worked on embedded Atom and other systems with heavily dumbed down gcc or other cross compilers. For instance there is a good chance that the compilers from vehicle infotainment systems that I've worked on recently would generate assembly for each line of code below inside the loop.

If your compiler isn't doing that, it's a problem with your compiler, not the code being compiled, and you need to fix that in your build environment.

Sure, yet it's in the company of fail_next with a similar problem. What about keeping the definition inside the for loop but adding the const keyword and adding all of the immediates as one operation?

    for (i = 0; i < nr_prims; i++) {
       const int estimated_max_prim_size =
          512 + /* batchbuffer commands */
          ((BRW_MAX_TEX_UNIT * (sizeof(struct brw_sampler_state) + sizeof(struct gen5_sampler_default_color)))) +
          1024 + /* gen6 VS push constants */
          1024 + /* gen6 WM push constants */
          512; /* misc. pad */

The const keyword doesn't tell the compiler anything except keep the developer from trying to modify this, which just makes things irritating when somebody comes along later to add something like oh, and on gen11 we need to reserve an extra 4k or whatever.

Just a friendly reminder that this number is kind of bullshit anyway:

1. 512 bytes for batchbuffer commands? On what generation?
2. Surface states anybody? That's potentially 1-3k of space not tracked
3. With hardware contexts, we don't always emit the full state anyway...
4. BLORP shares batches with normal drawing now...

As far as I know, this is only used to flush the batch when it's approaching full. If we filled up the last little bit, and then ran out of space, we'd have to start over, wasting CPU time.

If we want per-generation numbers, I think the right solution is to do:

    static const int estimated_max_prim_size[] = { ... }
    ...
    estimated_max_prim_size[brw->gen - 4]
    ...

or a helper function, not if-ladders which +=. But this is all pretty dodgy anyway.

--Ken
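Ken's table-lookup suggestion could look roughly like the sketch below. Everything in it is hypothetical: the function name, the range of generations covered, and especially the byte counts, which are placeholders rather than measured values.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder estimates in bytes, indexed by (gen - 4). Not real numbers. */
static const int estimated_max_prim_size[] = {
   2048, /* gen4 (placeholder) */
   2048, /* gen5 (placeholder) */
   4096, /* gen6 (placeholder) */
   4096, /* gen7 (placeholder) */
};

/* Hypothetical helper: look up the estimate for a hardware generation. */
int
get_estimated_max_prim_size(int gen)
{
   size_t count = sizeof(estimated_max_prim_size) /
                  sizeof(estimated_max_prim_size[0]);

   /* Guard against generations the table does not cover. */
   assert(gen >= 4 && (size_t)(gen - 4) < count);
   return estimated_max_prim_size[gen - 4];
}
```

Compared with an if-ladder that accumulates with +=, a table keeps each generation's budget visible in one place, and adding a new generation is a one-line change.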
Re: [Mesa-dev] Enable GLX TLS by default in Mesa?
On 08/09/2013 12:26 AM, Vedran Rodic wrote:

Hi, I've been burned with the issue of GLX TLS not being enabled by default in Mesa (Dota 2 seems to rely on it). What's the rationale of not enabling it by default?

Thanks, Vedran Rodic

As far as I know, --enable-glx-tls just makes things more efficient. Nothing should *rely* on it, or even be able to detect it...
Re: [Mesa-dev] [PATCH 1/2] glsl: Fix incorrect pattern matching in ir_set_program_inouts
On 08/09/2013 10:27 AM, Paul Berry wrote:

In commit 8fc41df (glsl: Modify ir_set_program_inouts to handle geometry shaders), when attempting to pattern match the "foo" part of expressions such as:

   foo[i][j]
   foo[i]

I incorrectly called as_dereference_variable() on the subexpression foo[i] instead of foo. As a result, the pattern never matched, so ir_set_program_inouts would fall back on marking the entire variable as used, rather than just the portion indexed by the array. This didn't result in incorrect behaviour, but it could have resulted in inefficiency by causing the back-end to allocate resources for unused parts of an input or output array.

Patch 1 is:
Reviewed-by: Kenneth Graunke kenn...@whitecape.org
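To make the pattern match described above concrete, here is a small stand-alone sketch. It does not use the real GLSL IR classes: struct node, as_variable() and match_indexed_variable() are hypothetical stand-ins, with as_variable() playing the role of as_dereference_variable().

```c
#include <stddef.h>

/* Hypothetical, simplified expression tree; not the GLSL IR. */
enum node_kind { NODE_VARIABLE, NODE_ARRAY_DEREF };

struct node {
   enum node_kind kind;
   const char *name;         /* set for NODE_VARIABLE */
   const struct node *array; /* set for NODE_ARRAY_DEREF: the thing indexed */
};

/* Return n if it is a plain variable reference, otherwise NULL. */
static const struct node *
as_variable(const struct node *n)
{
   return n->kind == NODE_VARIABLE ? n : NULL;
}

/* Match foo[i] or foo[i][j] and return the node for foo, or NULL. */
static const struct node *
match_indexed_variable(const struct node *n)
{
   if (n->kind != NODE_ARRAY_DEREF)
      return NULL;

   const struct node *inner = n->array;  /* foo, or foo[i] */
   if (inner->kind == NODE_ARRAY_DEREF)
      inner = inner->array;              /* step from foo[i] down to foo */

   /* Testing foo[i] itself here instead of foo would never match. */
   return as_variable(inner);
}
```

The key step is descending one more level before asking whether the node is a variable; checking the intermediate foo[i] node is the kind of mistake the commit message describes.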
[Mesa-dev] [PATCH v2 2/2] radeonsi: Don't export unused clip distance vectors from vertex shader
From: Michel Dänzer michel.daen...@amd.com

E.g. the Source engine seems to always write to gl_ClipVertex, but normally doesn't enable any GL_CLIP_DISTANCEn states. This change removes some irrelevant parts from the generated vertex shader code in such cases.

Signed-off-by: Michel Dänzer michel.daen...@amd.com
---

v2: Adapt for the possibility to export clip distance vector 1 but not 0.

 src/gallium/drivers/radeonsi/radeonsi_shader.c | 10 +-
 src/gallium/drivers/radeonsi/radeonsi_shader.h | 1 +
 src/gallium/drivers/radeonsi/si_state.c | 4
 3 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/radeonsi_shader.c b/src/gallium/drivers/radeonsi/radeonsi_shader.c
index dd9581d..6bf4b05 100644
--- a/src/gallium/drivers/radeonsi/radeonsi_shader.c
+++ b/src/gallium/drivers/radeonsi/radeonsi_shader.c
@@ -565,6 +565,7 @@ static void si_llvm_emit_clipvertex(struct lp_build_tgsi_context * bld_base,
                                     LLVMValueRef (*pos)[9], unsigned index)
 {
     struct si_shader_context *si_shader_ctx = si_shader_context(bld_base);
+    struct si_pipe_shader *shader = si_shader_ctx->shader;
     struct lp_build_context *base = &bld_base->base;
     struct lp_build_context *uint = &si_shader_ctx->radeon_bld.soa.bld_base.uint_bld;
     unsigned reg_index;
@@ -583,6 +584,11 @@ static void si_llvm_emit_clipvertex(struct lp_build_tgsi_context * bld_base,
     for (reg_index = 0; reg_index < 2; reg_index ++) {
         LLVMValueRef *args = pos[2 + reg_index];

+        if (!(shader->key.vs.ucps_enabled & (1 << reg_index)))
+            continue;
+
+        shader->shader.clip_dist_write |= 0xf << (4 * reg_index);
+
         args[5] =
         args[6] =
         args[7] =
@@ -709,13 +715,15 @@ handle_semantic:
                 }
                 break;
             case TGSI_SEMANTIC_CLIPDIST:
+                if (!(si_shader_ctx->shader->key.vs.ucps_enabled &
+                      (1 << d->Semantic.Index)))
+                    continue;
                 shader->clip_dist_write |=
                     d->Declaration.UsageMask << (d->Semantic.Index << 2);
                 target = V_008DFC_SQ_EXP_POS + 2 + d->Semantic.Index;
                 break;
             case TGSI_SEMANTIC_CLIPVERTEX:
                 si_llvm_emit_clipvertex(bld_base, pos_args, index);
-                shader->clip_dist_write = 0xFF;
                 continue;
             case TGSI_SEMANTIC_FOG:
             case TGSI_SEMANTIC_GENERIC:
diff --git a/src/gallium/drivers/radeonsi/radeonsi_shader.h b/src/gallium/drivers/radeonsi/radeonsi_shader.h
index f28a0ea..2d4468a 100644
--- a/src/gallium/drivers/radeonsi/radeonsi_shader.h
+++ b/src/gallium/drivers/radeonsi/radeonsi_shader.h
@@ -128,6 +128,7 @@ union si_shader_key {
     } ps;
     struct {
         unsigned instance_divisors[PIPE_MAX_ATTRIBS];
+        unsigned ucps_enabled:2;
     } vs;
 };

diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c
index 58e5a56..0fecb1d 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -2040,6 +2040,10 @@ static INLINE void si_shader_selector_key(struct pipe_context *ctx,
         for (i = 0; i < rctx->vertex_elements->count; ++i)
             key->vs.instance_divisors[i] = rctx->vertex_elements->elements[i].instance_divisor;

+        if (rctx->queued.named.rasterizer->clip_plane_enable & 0xf0)
+            key->vs.ucps_enabled |= 0x2;
+        if (rctx->queued.named.rasterizer->clip_plane_enable & 0xf)
+            key->vs.ucps_enabled |= 0x1;
     } else if (sel->type == PIPE_SHADER_FRAGMENT) {
         if (sel->fs_write_all)
             key->ps.nr_cbufs = rctx->framebuffer.nr_cbufs;
--
1.8.4.rc2
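The mask arithmetic in this patch can be tried in isolation. The sketch below restates it as two free functions; the function names are invented for illustration and the code is not taken from the driver, though the bit layout follows the patch (planes 0-3 map to clip distance vector 0, planes 4-7 to vector 1).

```c
/* Collapse the 8-bit clip-plane enable mask into the 2-bit shader key:
 * bit 0 covers planes 0-3, bit 1 covers planes 4-7. */
static unsigned
ucps_enabled_from_planes(unsigned clip_plane_enable)
{
   unsigned ucps_enabled = 0;

   if (clip_plane_enable & 0xf0)
      ucps_enabled |= 0x2;
   if (clip_plane_enable & 0xf)
      ucps_enabled |= 0x1;
   return ucps_enabled;
}

/* Build the clip distance write mask for the gl_ClipVertex path:
 * each enabled vector contributes four component bits. */
static unsigned
clip_dist_write_mask(unsigned ucps_enabled)
{
   unsigned mask = 0;
   unsigned reg_index;

   for (reg_index = 0; reg_index < 2; reg_index++) {
      if (!(ucps_enabled & (1u << reg_index)))
         continue; /* vector unused: nothing to export */
      mask |= 0xfu << (4 * reg_index);
   }
   return mask;
}
```

With no clip planes enabled the mask stays 0, which is the case the commit message has in mind for the Source engine; with only planes 4-7 enabled it becomes 0xf0, matching the v2 note about exporting vector 1 but not 0.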
[Mesa-dev] [PATCH v2 1/2] radeonsi: Don't leave gaps between position exports from vertex shader
From: Michel Dänzer michel.daen...@amd.com

If the vertex shader exports clip distances but not point size, use position exports 1/2 instead of 2/3 for the clip distances. Fixes geometry corruption in that case.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66974
Cc: mesa-sta...@lists.freedesktop.org
Signed-off-by: Michel Dänzer michel.daen...@amd.com
---

v2: No need to export unused position vectors, just export to consecutive position export slots.

 src/gallium/drivers/radeonsi/radeonsi_shader.c | 135 +++--
 src/gallium/drivers/radeonsi/radeonsi_shader.h | 1 +
 src/gallium/drivers/radeonsi/si_state_draw.c | 6 +-
 3 files changed, 83 insertions(+), 59 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/radeonsi_shader.c b/src/gallium/drivers/radeonsi/radeonsi_shader.c
index fee6262..dd9581d 100644
--- a/src/gallium/drivers/radeonsi/radeonsi_shader.c
+++ b/src/gallium/drivers/radeonsi/radeonsi_shader.c
@@ -562,12 +562,11 @@ static void si_alpha_test(struct lp_build_tgsi_context *bld_base,
 }

 static void si_llvm_emit_clipvertex(struct lp_build_tgsi_context * bld_base,
-                                    unsigned index)
+                                    LLVMValueRef (*pos)[9], unsigned index)
 {
     struct si_shader_context *si_shader_ctx = si_shader_context(bld_base);
     struct lp_build_context *base = &bld_base->base;
     struct lp_build_context *uint = &si_shader_ctx->radeon_bld.soa.bld_base.uint_bld;
-    LLVMValueRef args[9];
     unsigned reg_index;
     unsigned chan;
     unsigned const_chan;
@@ -582,6 +581,8 @@ static void si_llvm_emit_clipvertex(struct lp_build_tgsi_context * bld_base,
     }

     for (reg_index = 0; reg_index < 2; reg_index ++) {
+        LLVMValueRef *args = pos[2 + reg_index];
+
         args[5] =
         args[6] =
         args[7] =
@@ -612,10 +613,6 @@ static void si_llvm_emit_clipvertex(struct lp_build_tgsi_context * bld_base,
         args[3] = lp_build_const_int32(base->gallivm,
                                        V_008DFC_SQ_EXP_POS + 2 + reg_index);
         args[4] = uint->zero;
-        lp_build_intrinsic(base->gallivm->builder,
-                           "llvm.SI.export",
-                           LLVMVoidTypeInContext(base->gallivm->context),
-                           args, 9);
     }
 }

@@ -630,17 +627,18 @@ static void si_llvm_emit_epilogue(struct lp_build_tgsi_context * bld_base)
     struct tgsi_parse_context *parse = &si_shader_ctx->parse;
     LLVMValueRef args[9];
     LLVMValueRef last_args[9] = { 0 };
+    LLVMValueRef pos_args[4][9] = { { 0 } };
     unsigned semantic_name;
     unsigned color_count = 0;
     unsigned param_count = 0;
     int depth_index = -1, stencil_index = -1;
+    int i;

     while (!tgsi_parse_end_of_tokens(parse)) {
         struct tgsi_full_declaration *d =
                     &parse->FullToken.FullDeclaration;
         unsigned target;
         unsigned index;
-        int i;

         tgsi_parse_token(parse);

@@ -716,7 +714,7 @@ handle_semantic:
                 target = V_008DFC_SQ_EXP_POS + 2 + d->Semantic.Index;
                 break;
             case TGSI_SEMANTIC_CLIPVERTEX:
-                si_llvm_emit_clipvertex(bld_base, index);
+                si_llvm_emit_clipvertex(bld_base, pos_args, index);
                 shader->clip_dist_write = 0xFF;
                 continue;
             case TGSI_SEMANTIC_FOG:
@@ -734,9 +732,13 @@ handle_semantic:

             si_llvm_init_export_args(bld_base, d, index, target, args);

-            if (si_shader_ctx->type == TGSI_PROCESSOR_VERTEX ?
-                (semantic_name == TGSI_SEMANTIC_POSITION) :
-                (semantic_name == TGSI_SEMANTIC_COLOR)) {
+            if (si_shader_ctx->type == TGSI_PROCESSOR_VERTEX &&
+                target >= V_008DFC_SQ_EXP_POS &&
+                target <= (V_008DFC_SQ_EXP_POS + 3)) {
+                memcpy(pos_args[target - V_008DFC_SQ_EXP_POS],
+                       args, sizeof(args));
+            } else if (si_shader_ctx->type == TGSI_PROCESSOR_FRAGMENT &&
+                       semantic_name == TGSI_SEMANTIC_COLOR) {
                 if (last_args[0]) {
                     lp_build_intrinsic(base->gallivm->builder,
                                        "llvm.SI.export",
@@ -806,66 +808,87 @@ handle_semantic:
         memcpy(last_args, args, sizeof(args));
     }

-    if (!last_args[0]) {
-        assert(si_shader_ctx->type == TGSI_PROCESSOR_FRAGMENT);
-
-
Re: [Mesa-dev] segfault in pstip_bind_sampler_states
On 08/13/2013 11:45 AM, Kevin H. Hobbs wrote:
On 08/13/2013 09:50 AM, Brian Paul wrote:
On 08/12/2013 11:30 AM, Kevin H. Hobbs wrote:

--30166-- VALGRIND INTERNAL ERROR: Valgrind received a signal 11 (SIGSEGV) - exiting

Well, that's not too helpful.

Ha! I can move the segfault all the way back to:

   Program received signal SIGSEGV, Segmentation fault.
   0x7fffe0ba6d43 in osmesa_st_framebuffer_flush_front (stctx=0x1518ee0, stfbi=0x1520560, statt=ST_ATTACHMENT_FRONT_LEFT) at osmesa.c:305
   305   u_box_2d(0, 0, res->width0, res->height0, &box);

if I make the magnification on the vtkRenderLargeImage instance high enough.
Re: [Mesa-dev] [PATCH 3/9] glsl: Emit errors for things that look like default precision statements
On 08/09/2013 04:38 PM, Ian Romanick wrote:

From: Ian Romanick ian.d.roman...@intel.com

Previously we would emit a warning for empty declarations like

   float;

We would also emit the same warning for things like

   highp float;

However, this second case is most likely the application trying to set the default precision. We should instead generate an error.

Fixes piglit precision-05.vert.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
Cc: 9.2 mesa-sta...@lists.freedesktop.org

AMD successfully compiles precision-05.vert, so I think this probably needs to be allowed.
Re: [Mesa-dev] [PATCH 2/4] nv50: implement new float comparison instructions
On 13.08.2013 19:04, srol...@vmware.com wrote:

From: Roland Scheidegger srol...@vmware.com

untested.

Looks like it should work though, thanks. nv50 only supported u32 result all along and on nvc0 both cases are already handled by the rest of the code, too.

---
 .../drivers/nv50/codegen/nv50_ir_from_tgsi.cpp | 17 +
 1 file changed, 17 insertions(+)

diff --git a/src/gallium/drivers/nv50/codegen/nv50_ir_from_tgsi.cpp b/src/gallium/drivers/nv50/codegen/nv50_ir_from_tgsi.cpp
index 56eccac..a2ad9f4 100644
--- a/src/gallium/drivers/nv50/codegen/nv50_ir_from_tgsi.cpp
+++ b/src/gallium/drivers/nv50/codegen/nv50_ir_from_tgsi.cpp
@@ -440,6 +440,11 @@ nv50_ir::DataType Instruction::inferDstType() const
    switch (getOpcode()) {
    case TGSI_OPCODE_F2U: return nv50_ir::TYPE_U32;
    case TGSI_OPCODE_F2I: return nv50_ir::TYPE_S32;
+   case TGSI_OPCODE_FSEQ:
+   case TGSI_OPCODE_FSGE:
+   case TGSI_OPCODE_FSLT:
+   case TGSI_OPCODE_FSNE:
+      return nv50_ir::TYPE_U32;
    case TGSI_OPCODE_I2F:
    case TGSI_OPCODE_U2F:
       return nv50_ir::TYPE_F32;
@@ -456,19 +461,23 @@ nv50_ir::CondCode Instruction::getSetCond() const
    case TGSI_OPCODE_SLT:
    case TGSI_OPCODE_ISLT:
    case TGSI_OPCODE_USLT:
+   case TGSI_OPCODE_FSLT:
       return CC_LT;
    case TGSI_OPCODE_SLE:
       return CC_LE;
    case TGSI_OPCODE_SGE:
    case TGSI_OPCODE_ISGE:
    case TGSI_OPCODE_USGE:
+   case TGSI_OPCODE_FSGE:
       return CC_GE;
    case TGSI_OPCODE_SGT:
       return CC_GT;
    case TGSI_OPCODE_SEQ:
    case TGSI_OPCODE_USEQ:
+   case TGSI_OPCODE_FSEQ:
       return CC_EQ;
    case TGSI_OPCODE_SNE:
+   case TGSI_OPCODE_FSNE:
       return CC_NEU;
    case TGSI_OPCODE_USNE:
       return CC_NE;
@@ -556,6 +565,10 @@ static nv50_ir::operation translateOpcode(uint opcode)
    NV50_IR_OPCODE_CASE(KILL_IF, DISCARD);

    NV50_IR_OPCODE_CASE(F2I, CVT);
+   NV50_IR_OPCODE_CASE(FSEQ, SET);
+   NV50_IR_OPCODE_CASE(FSGE, SET);
+   NV50_IR_OPCODE_CASE(FSLT, SET);
+   NV50_IR_OPCODE_CASE(FSNE, SET);
    NV50_IR_OPCODE_CASE(IDIV, DIV);
    NV50_IR_OPCODE_CASE(IMAX, MAX);
    NV50_IR_OPCODE_CASE(IMIN, MIN);
@@ -2354,6 +2367,10 @@ Converter::handleInstruction(const struct tgsi_full_instruction *insn)
    case TGSI_OPCODE_SLE:
    case TGSI_OPCODE_SNE:
    case TGSI_OPCODE_STR:
+   case TGSI_OPCODE_FSEQ:
+   case TGSI_OPCODE_FSGE:
+   case TGSI_OPCODE_FSLT:
+   case TGSI_OPCODE_FSNE:
    case TGSI_OPCODE_ISGE:
    case TGSI_OPCODE_ISLT:
    case TGSI_OPCODE_USEQ:
Re: [Mesa-dev] Enable GLX TLS by default in Mesa?
On Tue, Aug 13, 2013 at 7:19 PM, Kenneth Graunke kenn...@whitecape.org wrote:

As far as I know, --enable-glx-tls just makes things more efficient. Nothing should *rely* on it, or even be able to detect it...

Dota 2 crashes without that option when loading the actual game map. I assumed it adds thread safety.
[Mesa-dev] Another Take on the S3TC issue
Hi,

I have read about the issue of implementing the S3TC extension in Mesa: http://dri.freedesktop.org/wiki/S3TC/

As I understand it, the problem is that encoding and decoding S3TC in software is covered by patents, while passing S3TC compressed data to the GPU is still OK.

AS NOW: If force_s3tc_enable is enabled in Mesa3D, uploading an S3TC encoded texture works if format==internalFormat is true. If format!=internalFormat is true, it would fail (as far as I know).

SO MY PROPOSAL: If 'format' is one of the S3TC types, and format!=internalFormat is true, then set internalFormat:=format. Else, if 'internalFormat' is one of the S3TC types, but the 'format' isn't, set internalFormat:=format (or any other format Mesa3D can encode).
[Mesa-dev] [PATCH 2/2] i965: Add Gen7 depth stall flushes before disabling depth in BLORP.
We emit these before configuring depth in the normal path, or actually using the depth buffer in BLORP - we just failed to emit them when disabling depth altogether.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/gen7_blorp.cpp | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/gen7_blorp.cpp b/src/mesa/drivers/dri/i965/gen7_blorp.cpp
index 518d7f5..44e7578 100644
--- a/src/mesa/drivers/dri/i965/gen7_blorp.cpp
+++ b/src/mesa/drivers/dri/i965/gen7_blorp.cpp
@@ -756,6 +756,8 @@ static void
 gen7_blorp_emit_depth_disable(struct brw_context *brw,
                               const brw_blorp_params *params)
 {
+   intel_emit_depth_stall_flushes(brw);
+
    BEGIN_BATCH(7);
    OUT_BATCH(GEN7_3DSTATE_DEPTH_BUFFER << 16 | (7 - 2));
    OUT_BATCH(BRW_DEPTHFORMAT_D32_FLOAT << 18 | (BRW_SURFACE_NULL << 29));
--
1.8.3.4
[Mesa-dev] [PATCH 1/2] i965: Add Gen6 depth stall flushes before disabling depth in BLORP.
We emit these before configuring depth in the normal path, or actually using the depth buffer in BLORP - we just failed to emit them when disabling depth altogether.

On Sandybridge, this also requires the post_sync_nonzero flush.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/gen6_blorp.cpp | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/gen6_blorp.cpp b/src/mesa/drivers/dri/i965/gen6_blorp.cpp
index a4a9081..129c113 100644
--- a/src/mesa/drivers/dri/i965/gen6_blorp.cpp
+++ b/src/mesa/drivers/dri/i965/gen6_blorp.cpp
@@ -914,6 +914,9 @@ static void
 gen6_blorp_emit_depth_disable(struct brw_context *brw,
                               const brw_blorp_params *params)
 {
+   intel_emit_post_sync_nonzero_flush(brw);
+   intel_emit_depth_stall_flushes(brw);
+
    BEGIN_BATCH(7);
    OUT_BATCH(_3DSTATE_DEPTH_BUFFER << 16 | (7 - 2));
    OUT_BATCH((BRW_DEPTHFORMAT_D32_FLOAT << 18) |
--
1.8.3.4
Re: [Mesa-dev] Another Take on the S3TC issue
I've been hanging on this list for a while, and this isn't the first time this has been suggested. The general thing that is repeated is basically this: if you make an API (e.g. OpenGL) that supports S3TC without a license, you're in trouble, even if it is a passthrough to the hardware, which also required a license to produce in the first place.

I think the assumption most people make is that if the hardware vendor paid a license to implement S3TC in an ASIC, then surely simply passing through data is OK. After all, it is being done without any knowledge of the algorithm, etc. From a common sense standpoint, I would agree.

However, the note in the S3TC extension itself[1] mentions explicitly to be wary of such assumptions in the IP Status section, and notes that *a license for one API is not a license for another*. This implies that for an API to make use of S3TC, it requires a license, which Mesa in general does not have, while a hardware vendor might.

All of this is theoretical as far as I've read; I don't think anyone has legally challenged this for open source drivers and posted the results on this mailing list -- people have mostly stayed away from it with prejudice. I think the patent was granted in 1999, so at least in the USA, hopefully we don't have too many more years of this garbage.

Patrick

[1] http://www.opengl.org/registry/specs/EXT/texture_compression_s3tc.txt

On Tue, Aug 13, 2013 at 1:53 PM, Uwe Schmidt simon.schm...@cs-systemberatung.de wrote:

Hi, I have read about the issue of implementing the S3TC Extension in Mesa: http://dri.freedesktop.org/wiki/S3TC/

As I understood, the problem is, that encoding and decoding S3TC in software is covered by patents, while passing S3TC compressed data to the GPU is still ok.

AS NOW: If force_s3tc_enable is enabled in Mesa3D, uploading a S3TC encoded texture works if format==internalFormat is true. If format!=internalFormat is true, it would fail (as i know).

SO MY PROPOSAL: If 'format' is one of the S3TC types, and format!=internalFormat is true, then set internalFormat:=format. Else, if 'internalFormat' is one of the S3TC types, but the 'format' isn't, set internalFormat:=format (or any other format, Mesa3D can encode).
Re: [Mesa-dev] Another Take on the S3TC issue
On 08/13/2013 11:53 AM, Uwe Schmidt wrote:

Hi, I have read about the issue of implementing the S3TC Extension in Mesa: http://dri.freedesktop.org/wiki/S3TC/ As I understood, the problem is, that encoding and decoding S3TC in software is covered by patents, while passing S3TC compressed data to the GPU is still ok.

It's all patented. Some hardware vendors have licenses for things their hardware does. There's a thing called contributory infringement, too. Please don't play arm-chair IP attorney.

AS NOW: If force_s3tc_enable is enabled in Mesa3D, uploading a S3TC encoded texture works if format==internalFormat is true. If format!=internalFormat is true, it would fail (as i know).

It doesn't fail. Mesa just leaves the data uncompressed. The only failure occurs if the application calls glGetCompressedTexImage to get the compressed data back.

SO MY PROPOSAL: If 'format' is one of the S3TC types, and format!=internalFormat is true, then set internalFormat:=format.

'format' cannot be a compressed type. Compressed data can only be supplied using glCompressedTexImage2D, and that function only has an internalFormat parameter.

Else, if 'internalFormat' is one of the S3TC types, but the 'format' isn't, set internalFormat:=format (or any other format, Mesa3D can encode).

The only format that Mesa can encode is FXT1. Only Intel hardware supports FXT1, and the quality (of Mesa's compressor) is not very good. Picking that format would result in bug reports of "game XYZ looks horrible on Intel graphix you suck." So that leaves us with the only option of leaving the data uncompressed.

Until S3 grants its IP to OIN or the patents expire, this is going to be the situation. We've been through this mental exercise over the last 5 years more times than I can count, and we always come back to the same place.
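The fallback described in this reply (an S3TC internalFormat request ends up stored uncompressed when no encoder is available) can be sketched as follows. This is not Mesa's code: is_s3tc_format() and choose_stored_format() are hypothetical helpers, and the choice of GL_RGB or GL_RGBA as the uncompressed substitute is an assumption made for the example. Only the enum values are taken from the GL headers.

```c
#include <stdbool.h>

/* Enum values as defined by core GL and EXT_texture_compression_s3tc. */
#define GL_RGB                            0x1907
#define GL_RGBA                           0x1908
#define GL_COMPRESSED_RGB_S3TC_DXT1_EXT   0x83F0
#define GL_COMPRESSED_RGBA_S3TC_DXT1_EXT  0x83F1
#define GL_COMPRESSED_RGBA_S3TC_DXT3_EXT  0x83F2
#define GL_COMPRESSED_RGBA_S3TC_DXT5_EXT  0x83F3

static bool
is_s3tc_format(unsigned internal_format)
{
   return internal_format >= GL_COMPRESSED_RGB_S3TC_DXT1_EXT &&
          internal_format <= GL_COMPRESSED_RGBA_S3TC_DXT5_EXT;
}

/* Hypothetical helper: decide what is stored when an application uploads
 * uncompressed texels and asks for internal_format. Without an encoder,
 * an S3TC request falls back to an uncompressed format. */
static unsigned
choose_stored_format(unsigned internal_format, bool have_s3tc_encoder)
{
   if (!is_s3tc_format(internal_format) || have_s3tc_encoder)
      return internal_format;

   /* Only DXT1 RGB has no alpha; the other variants carry alpha. */
   return internal_format == GL_COMPRESSED_RGB_S3TC_DXT1_EXT ? GL_RGB
                                                              : GL_RGBA;
}
```

Note that the helper only looks at internalFormat: as the reply points out, the 'format' argument of the uncompressed upload path cannot name a compressed type at all.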
Re: [Mesa-dev] OpenGL ES only configuration (without desktop OpenGL support)
On 08/13/2013 07:49 AM, Siarhei Siamashka wrote:
On Thu, 08 Aug 2013 16:19:28 -0700 Ian Romanick i...@freedesktop.org wrote:
On 08/06/2013 02:13 PM, Siarhei Siamashka wrote:

Hello,

Some months ago, the commit "configure.ac: Allow OpenGL ES1 and ES2 only with enabled OpenGL" dropped support for the OpenGL-free configuration.

http://lists.freedesktop.org/archives/mesa-dev/2013-February/033909.html
http://lists.freedesktop.org/archives/mesa-commit/2013-February/041708.html

Could this be possibly reverted to allow me to continue shooting myself in the foot? The support for OpenGL ES is pretty horrible in the open source software. One nice exception is Qt5 which is doing pretty well. But the rest of the software does not generally work out of the box without patches or tweaks. You can also hardly find a problem-free OpenGL ES compatible open source game (other than Quake3).

I have an open feature request for Gentoo, which is a very configurable Linux distribution and should not have any troubles working either with or without OpenGL (the choice is up to the user): https://bugs.gentoo.org/show_bug.cgi?id=476524

But if upstream Mesa treats this configuration as unsupported, then I also don't see it progressing anywhere in Gentoo. So could you please re-consider this decision?

We've removed all of the #ifdef code inside Mesa that would have made any difference. It was a nightmare to maintain, and we almost always got it wrong... because nobody was testing that configuration.

I believe this can be changed :-) That's a bit of a chicken/egg problem. The OpenGL ES support in free software applications and libraries is so broken, that it's currently a big pain to try this configuration for anything practical. And the applications/libraries can't be fixed without having a non-OpenGL environment for development and testing.

What does that have to do with building Mesa without desktop GL? Build Mesa *with* ES, and develop your software.

The needed tweaks for Mesa are really trivial. Maybe one could also just compile everything, but delete GL headers, gl.pc and libGL.so after compilation and before installing Mesa to the system. Still it is a bit ugly to have the configure script claim that OpenGL ES is not supported without OpenGL, while in fact it works.

It's ugly once at package-time instead of ugly continuously at development time.

The only thing this is possibly going to gain you is a trivial amount of build time (by not building libGL, etc.).

The compilation time is irrelevant. But it is very useful to be able to install Mesa without OpenGL headers and without libGL.so, so that the problematic software just fails at compile time instead of exhibiting hard to debug problems at runtime. It seems to be a rather common failure scenario when some big bloatware application loads both libGL.so (provided by Mesa) and libGLESv2.so (provided by some proprietary OpenGL ES driver on ARM hardware) into the same process via indirect library dependencies. These shared libraries are providing overlapping function names, but are backed by totally different implementations. And everything blows up as a result when the application is run, or maybe it even mostly works if you are lucky.

What's the point installing both Mesa and the proprietary OpenGL ES drivers on the same system? I would surely love to have open source hardware accelerated OpenGL ES drivers on ARM systems today. But they are not quite here yet. And even assuming that we get perfectly functional free software OpenGL ES drivers for embedded hardware, the current buggy applications are not going to be magically fixed themselves. Somebody still needs to debug and fix the OpenGL ES compatibility problems.

This all sounds like a packaging problem. It should be fixed in the packaging, not in the upstream project.

The easiest way forward seems to be just allowing to compile Mesa without desktop OpenGL. It is going to provide:

1. On x86 desktop systems - the development environment for testing OpenGL ES applications.
2. On ARM hardware via softpipe/llvmpipe - some reference fallback implementation.
3. Have both the existing proprietary drivers and Mesa installed on ARM hardware (with the ability to switch between them at any time) - the applications can run at full speed and be profiled/benchmarked.

Somebody may argue that I'm exaggerating and OpenGL ES support seems to be not so bad. There were many OpenGL ES related news and announcements. Also there exists Linaro/Ubuntu distribution and some videos on youtube showing how it successfully runs something in 3D on ARM. Still the problem is that in many applications the said OpenGL ES support is either in the work-in-progress state, or it possibly has been contributed by somebody some time ago and has already bitrotten. Also Linaro bundles a bunch of OpenGL ES hacks, which don't seem to be actively pushed upstream. This all is less than perfect and needs to be improved. That
Re: [Mesa-dev] Another Take on the S3TC issue
On Tue, Aug 13, 2013 at 10:20 PM, Ian Romanick i...@freedesktop.org wrote:
On 08/13/2013 11:53 AM, Uwe Schmidt wrote:

[snip]

SO MY PROPOSAL: If 'format' is one of the S3TC types, and format!=internalFormat is true, then set internalFormat:=format.

'format' cannot be a compressed type. Compressed data can only be supplied using glCompressedTexImage2D, and that function only has an internalFormat parameter.

Else, if 'internalFormat' is one of the S3TC types, but the 'format' isn't, set internalFormat:=format (or any other format, Mesa3D can encode).

The only format that Mesa can encode is FXT1. Only Intel hardware supports FXT1, and the quality (of Mesa's compressor) is not very good. Picking that format would result in bug reports of "game XYZ looks horrible on Intel graphix you suck." So that leaves us with the only option of leaving the data uncompressed.

Until S3 grants its IP to OIN or the patents expire, this is going to be the situation. We've been through this mental exercise of the last 5 years more times than I can count, and we always come back to the same place.

Please don't hit me with a stick if this has been asked |powl(INT64_MAX, INT64_MAX)| times... but... erm... adding the code (sample implementation, ... do not use without a license from S3 and the ritual scarification of at least one software engineer...) but having it off in the default build won't work... right ?

Bye,
Roland

--
  __ .  . __
 (o.\ \/ /.o) roland.ma...@nrubsig.org
  \__\/\/__/  MPEG specialist, C&&JAVA&&Sun&&Unix programmer
  /O /==\ O\  TEL +49 641 3992797
 (;O/ \/ \O;)
Re: [Mesa-dev] segfault in pstip_bind_sampler_states
On 08/13/2013 01:41 PM, Kevin H. Hobbs wrote:

Ha! I can move the segfault all the way back to:

   Program received signal SIGSEGV, Segmentation fault.
   0x7fffe0ba6d43 in osmesa_st_framebuffer_flush_front (stctx=0x1518ee0, stfbi=0x1520560, statt=ST_ATTACHMENT_FRONT_LEFT) at osmesa.c:305
   305   u_box_2d(0, 0, res->width0, res->height0, &box);

if I make the magnification on the vtkRenderLargeImage instance high enough.

This post was all wet, ignore it.
Re: [Mesa-dev] Another Take on the S3TC issue
On 08/13/2013 01:27 PM, Roland Mainz wrote:
On Tue, Aug 13, 2013 at 10:20 PM, Ian Romanick i...@freedesktop.org wrote:
On 08/13/2013 11:53 AM, Uwe Schmidt wrote:

[snip]

SO MY PROPOSAL: If 'format' is one of the S3TC types, and format!=internalFormat is true, then set internalFormat:=format.

'format' cannot be a compressed type. Compressed data can only be supplied using glCompressedTexImage2D, and that function only has an internalFormat parameter.

Else, if 'internalFormat' is one of the S3TC types, but the 'format' isn't, set internalFormat:=format (or any other format, Mesa3D can encode).

The only format that Mesa can encode is FXT1. Only Intel hardware supports FXT1, and the quality (of Mesa's compressor) is not very good. Picking that format would result in bug reports of "game XYZ looks horrible on Intel graphix you suck." So that leaves us with the only option of leaving the data uncompressed.

Until S3 grants its IP to OIN or the patents expire, this is going to be the situation. We've been through this mental exercise of the last 5 years more times than I can count, and we always come back to the same place.

Please don't hit me with a stick if this has been asked |powl(INT64_MAX, INT64_MAX)| times... but... erm... adding the code (sample implementation, ... do not use without a license from S3 and the ritual scarification of at least one software engineer...) but having it off in the default build won't work... right ?

That is more difficult for end-users than the current situation. In the current situation, your distro can build Mesa (no S3TC), and, if you live in a country without software patents, you can just drop in the libtxc_dxtn library to get compression. Putting it in Mesa, along with making the distros really uncomfortable, would mean you'd have to rebuild Mesa.

Did I mention that we've been through this mental exercise a few times?

Bye,
Roland
[Mesa-dev] [PATCH] glsl: Emit better warnings for things that look like default precision statements
From: Ian Romanick ian.d.roman...@intel.com

Previously we would emit a warning for empty declarations like

   float;

We would also emit the same warning for things like

   highp float;

However, this second case is most likely the application trying to set the default precision. This makes the compiler generate a stronger warning with some suggestion of a fix.

It really seems like this should be an error. I'll bet that 100% of the time someone writes 'highp float;' they actually meant 'precision highp float;'. Alas, both AMD and NVIDIA accept this syntax, and the spec doesn't explicitly forbid it.

This makes piglit's precision-05.vert generate the following warnings:

   0:12(11): warning: empty declaration with precision qualifier, to set the default precision, use `precision lowp float;'
   0:13(12): warning: empty declaration with precision qualifier, to set the default precision, use `precision mediump int;'

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
Cc: Kenneth Graunke kenn...@whitecape.org
Cc: 9.2 mesa-sta...@lists.freedesktop.org
---
 src/glsl/ast_to_hir.cpp | 43 ++-
 1 file changed, 30 insertions(+), 13 deletions(-)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 40992fb..f96b64b 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -2719,6 +2719,10 @@ ast_declarator_list::hir(exec_list *instructions,
        *   name of a known structure type. This is both invalid and weird.
        *   Emit an error.
        *
+       * - The program text contained something like 'mediump float;'
+       *   when the programmer probably meant 'precision mediump
+       *   float;' Emit an error.
+       *
        * Note that if decl_type is NULL and there is a structure involved,
        * there must have been some sort of error with the structure. In this
        * case we assume that an error was already generated on this line of
@@ -2727,20 +2731,33 @@ ast_declarator_list::hir(exec_list *instructions,
        */
       assert(this->type->specifier->structure == NULL || decl_type != NULL
              || state->error);
-      if (this->type->specifier->structure == NULL) {
-         if (decl_type != NULL) {
-            _mesa_glsl_warning(&loc, state, "empty declaration");
-         } else {
-            _mesa_glsl_error(&loc, state,
-                             "invalid type `%s' in empty declaration",
-                             type_name);
-         }
-      }

-      if (this->type->qualifier.precision != ast_precision_none &&
-          this->type->specifier->structure != NULL) {
-         _mesa_glsl_error(&loc, state, "precision qualifiers can't be applied "
-                          "to structures");
+      if (decl_type == NULL) {
+         _mesa_glsl_error(&loc, state,
+                          "invalid type `%s' in empty declaration",
+                          type_name);
+      } else if (this->type->qualifier.precision != ast_precision_none) {
+         if (this->type->specifier->structure != NULL)
+            _mesa_glsl_error(&loc, state,
+                             "precision qualifiers can't be applied "
+                             "to structures");
+         else {
+            static const char *const precision_names[] = {
+               "highp",
+               "highp",
+               "mediump",
+               "lowp"
+            };
+
+            _mesa_glsl_warning(&loc, state,
+                               "empty declaration with precision qualifier, "
+                               "to set the default precision, use "
+                               "`precision %s %s;'",
+                               precision_names[this->type->qualifier.precision],
+                               type_name);
+         }
+      } else {
+         _mesa_glsl_warning(&loc, state, "empty declaration");
       }
    }

--
1.8.1.4
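For readers who want to see how the suggested-fix text is assembled, here is a stand-alone sketch of the message formatting. The enum, its ordering (none, high, medium, low) and the function format_empty_decl_warning() are assumptions for illustration; only the precision_names table and the message text come from the patch.

```c
#include <stdio.h>
#include <stddef.h>

/* Assumed ordering, chosen to line up with the patch's lookup table. */
enum precision {
   PRECISION_NONE,
   PRECISION_HIGH,
   PRECISION_MEDIUM,
   PRECISION_LOW
};

/* Same table as in the patch; index 0 is a filler entry because the
 * "no precision" case never reaches the lookup. */
static const char *const precision_names[] = {
   "highp",
   "highp",
   "mediump",
   "lowp"
};

/* Format the warning for an empty declaration into buf.
 * Returns the snprintf() result. */
static int
format_empty_decl_warning(char *buf, size_t size,
                          enum precision p, const char *type_name)
{
   if (p == PRECISION_NONE)
      return snprintf(buf, size, "empty declaration");

   return snprintf(buf, size,
                   "empty declaration with precision qualifier, "
                   "to set the default precision, use "
                   "`precision %s %s;'",
                   precision_names[p], type_name);
}
```

Calling it with PRECISION_LOW and "float" yields the text of the first warning quoted in the commit message, minus the location prefix.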
Re: [Mesa-dev] Enable GLX TLS by default in Mesa?
On 08/13/2013 12:04 PM, Vedran Rodic wrote:
> On Tue, Aug 13, 2013 at 7:19 PM, Kenneth Graunke kenn...@whitecape.org wrote:
>> As far as I know, --enable-glx-tls just makes things more efficient. Nothing should *rely* on it, or even be able to detect it...
>
> Dota 2 crashes without that option when loading the actual game map. I assumed it adds thread safety.

Ian explained to me once that Mesa and the X server's GLX must be using compatible TLS options, otherwise disaster occurs. I no longer recall his explanation, or which combinations were the dangerous ones.
Re: [Mesa-dev] Another Take on the S3TC issue
On Tue, Aug 13, 2013 at 10:33 PM, Ian Romanick i...@freedesktop.org wrote:
> On 08/13/2013 01:27 PM, Roland Mainz wrote:
>> On Tue, Aug 13, 2013 at 10:20 PM, Ian Romanick i...@freedesktop.org wrote:
>>> On 08/13/2013 11:53 AM, Uwe Schmidt wrote:
>> [snip]
>>> Until S3 grants its IP to OIN or the patents expire, this is going to be the situation. We've been through this mental exercise over the last 5 years more times than I can count, and we always come back to the same place.
>>
>> Please don't hit me with a stick if this has been asked |powl(INT64_MAX, INT64_MAX)| times... but... erm... adding the code (sample implementation, ... do not use without a license from S3 and the ritual scarification of at least one software engineer...) but having it off in the default build won't work... right ?
>
> That is more difficult for end-users than the current situation. In the current situation, your distro can build Mesa (no S3TC), and, if you live in a country without software patents, you can just drop in the libtxc_dxtn library to get compression. Putting it in Mesa, along with making the distros really uncomfortable, would mean you'd have to rebuild Mesa.

Sounds reasonable to me...

> Did I mention that we've been through this mental exercise a few times?

Erm... I'm wondering... why does the S3TC issue come up every few months out of its grave and haunt the list (and your nerves) ?

Bye,
Roland

-- 
  __ .  . __
 (o.\ \/ /.o) roland.ma...@nrubsig.org
  \__\/\/__/  MPEG specialist, CJAVASunUnix programmer
  /O /==\ O\  TEL +49 641 3992797
 (;O/ \/ \O;)
Re: [Mesa-dev] [PATCH 2/2] i965: Add Gen7 depth stall flushes before disabling depth in BLORP.
The series is
Reviewed-by: Chad Versace chad.vers...@linux.intel.com

Yet another instance of the reason to unify blorp with normal draw.
Re: [Mesa-dev] Another Take on the S3TC issue
On 08/13/2013 01:42 PM, Roland Mainz wrote:
> On Tue, Aug 13, 2013 at 10:33 PM, Ian Romanick i...@freedesktop.org wrote:
>> On 08/13/2013 01:27 PM, Roland Mainz wrote:
>>> On Tue, Aug 13, 2013 at 10:20 PM, Ian Romanick i...@freedesktop.org wrote:
>>>> On 08/13/2013 11:53 AM, Uwe Schmidt wrote:
>>> [snip]
>>>> Until S3 grants its IP to OIN or the patents expire, this is going to be the situation. We've been through this mental exercise over the last 5 years more times than I can count, and we always come back to the same place.
>>>
>>> Please don't hit me with a stick if this has been asked |powl(INT64_MAX, INT64_MAX)| times... but... erm... adding the code (sample implementation, ... do not use without a license from S3 and the ritual scarification of at least one software engineer...) but having it off in the default build won't work... right ?
>>
>> That is more difficult for end-users than the current situation. In the current situation, your distro can build Mesa (no S3TC), and, if you live in a country without software patents, you can just drop in the libtxc_dxtn library to get compression. Putting it in Mesa, along with making the distros really uncomfortable, would mean you'd have to rebuild Mesa.
>
> Sounds reasonable to me...
>
>> Did I mention that we've been through this mental exercise a few times?
>
> Erm... I'm wondering... why does the S3TC issue come up every few months out of its grave and haunt the list (and your nerves) ?

I didn't bring it up. Don't ask me! :)

> Bye,
> Roland
Re: [Mesa-dev] Enable GLX TLS by default in Mesa?
On 08/13/2013 12:04 PM, Vedran Rodic wrote:
> On Tue, Aug 13, 2013 at 7:19 PM, Kenneth Graunke kenn...@whitecape.org wrote:
>> As far as I know, --enable-glx-tls just makes things more efficient. Nothing should *rely* on it, or even be able to detect it...
>
> Dota 2 crashes without that option when loading the actual game map. I assumed it adds thread safety.

With TLS the context pointer and the dispatch pointer are stored in thread local storage. Looking them up (which happens on every GL call) is fast.

Without TLS the context pointer and the dispatch pointer are stored using pthread_setspecific / pthread_getspecific. Looking them up is hella slow.
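A minimal sketch of the two lookup paths described above, assuming a GCC/Clang toolchain. The struct and function names are made up for illustration and are not Mesa's actual symbols; only `pthread_key_create`, `pthread_once`, `pthread_setspecific` and `pthread_getspecific` are real POSIX calls.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a GL context; not Mesa's real struct. */
struct fake_context { int id; };

/* Path 1: compiler-supported TLS (__thread is a GCC/Clang extension).
 * Reading the variable is an ordinary load relative to the thread pointer. */
static __thread struct fake_context *tls_ctx;

static void set_ctx_tls(struct fake_context *c) { tls_ctx = c; }
static struct fake_context *get_ctx_tls(void) { return tls_ctx; }

/* Path 2: POSIX thread-specific data. Every lookup goes through a
 * library call keyed by a pthread_key_t. */
static pthread_key_t ctx_key;
static pthread_once_t ctx_key_once = PTHREAD_ONCE_INIT;

static void make_ctx_key(void) { pthread_key_create(&ctx_key, NULL); }

static void set_ctx_tsd(struct fake_context *c)
{
    pthread_once(&ctx_key_once, make_ctx_key);
    pthread_setspecific(ctx_key, c);
}

static struct fake_context *get_ctx_tsd(void)
{
    pthread_once(&ctx_key_once, make_ctx_key);
    return pthread_getspecific(ctx_key);
}
```

Both paths return the same per-thread pointer; the difference is only in what each lookup costs, which matters because the lookup sits in front of every GL entry point.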
Re: [Mesa-dev] Another Take on the S3TC issue
> Erm... I'm wondering... why does the S3TC issue come up every few months out of its grave and haunt the list (and your nerves) ?

I think it is because the issue looks deceptively simple. Hardware is hardware, right? ASICs do the decompression, not software. Surely blindly copying bits from one device to another *can't* be patent infringement. Surely, right? :\

Patrick
Re: [Mesa-dev] [PATCH 4/4] st/mesa: use new float comparison opcodes if native integers are supported
On 08/13/2013 11:04 AM, srol...@vmware.com wrote:
> From: Roland Scheidegger srol...@vmware.com
>
> Should get rid of some float-to-int conversions (with negation).
> No piglit regressions (with llvmpipe).
> ---
>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 45 ++--
>  1 file changed, 15 insertions(+), 30 deletions(-)
>
> diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> index d9b4ed2..65ba449 100644
> --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> @@ -420,8 +420,6 @@ public:
>     void emit_scalar(ir_instruction *ir, unsigned op,
>                      st_dst_reg dst, st_src_reg src0, st_src_reg src1);
>
> -   void try_emit_float_set(ir_instruction *ir, unsigned op, st_dst_reg dst);
> -
>     void emit_arl(ir_instruction *ir, st_dst_reg dst, st_src_reg src0);
>
>     void emit_scs(ir_instruction *ir, unsigned op,
> @@ -594,9 +592,6 @@ glsl_to_tgsi_visitor::emit(ir_instruction *ir, unsigned op,
>
>     this->instructions.push_tail(inst);
>
> -   if (native_integers)
> -      try_emit_float_set(ir, op, dst);
> -
>     return inst;
>  }
>
> @@ -622,25 +617,6 @@ glsl_to_tgsi_visitor::emit(ir_instruction *ir, unsigned op)
>     return emit(ir, op, undef_dst, undef_src, undef_src, undef_src);
>  }
>
> -/**
> - * Emits the code to convert the result of float SET instructions to integers.
> - */
> -void
> -glsl_to_tgsi_visitor::try_emit_float_set(ir_instruction *ir, unsigned op,
> -                                         st_dst_reg dst)
> -{
> -   if ((op == TGSI_OPCODE_SEQ ||
> -        op == TGSI_OPCODE_SNE ||
> -        op == TGSI_OPCODE_SGE ||
> -        op == TGSI_OPCODE_SLT))
> -   {
> -      st_src_reg src = st_src_reg(dst);
> -      src.negate = ~src.negate;
> -      dst.type = GLSL_TYPE_FLOAT;
> -      emit(ir, TGSI_OPCODE_F2I, dst, src);
> -   }
> -}
> -
>  /**
>   * Determines whether to use an integer, unsigned integer, or float opcode
>   * based on the operands and input opcode, then emits the result.
> @@ -672,6 +648,15 @@ glsl_to_tgsi_visitor::get_opcode(ir_instruction *ir, unsigned op,
>  #define case2fi(f, i)  case4(f, f, i, i)
>  #define case2iu(i, u)  case4(i, LAST, i, u)
>
> +#define casecomp(c, f, i, u) \
> +   case TGSI_OPCODE_##c: \
> +      if (type == GLSL_TYPE_INT) op = TGSI_OPCODE_##i; \
> +      else if (type == GLSL_TYPE_UINT) op = TGSI_OPCODE_##u; \
> +      else if (native_integers) \
> +         op = TGSI_OPCODE_##f; \
> +      else op = TGSI_OPCODE_##c; \
> +      break;
> +

Would you mind cleaning up the formatting of that macro...

   case x:
      if (type == GLSL_TYPE_INT)
         op = ...
      else if (type == GLSL_TYPE_UINT)
         op = ...
      else if (native_integers)
         op = ...
      else
         op = ...
      break;

>     switch(op) {
>        case2fi(ADD, UADD);
>        case2fi(MUL, UMUL);
> @@ -680,12 +665,12 @@ glsl_to_tgsi_visitor::get_opcode(ir_instruction *ir, unsigned op,
>        case3(MAX, IMAX, UMAX);
>        case3(MIN, IMIN, UMIN);
>        case2iu(MOD, UMOD);
> -
> -      case2fi(SEQ, USEQ);
> -      case2fi(SNE, USNE);
> -      case3(SGE, ISGE, USGE);
> -      case3(SLT, ISLT, USLT);
> -
> +
> +      casecomp(SEQ, FSEQ, USEQ, USEQ);
> +      casecomp(SNE, FSNE, USNE, USNE);
> +      casecomp(SGE, FSGE, ISGE, USGE);
> +      casecomp(SLT, FSLT, ISLT, USLT);
> +
>        case2iu(ISHR, USHR);
>
>        case2fi(SSG, ISSG);

Reviewed-by: Brian Paul bri...@vmware.com
Re: [Mesa-dev] [PATCH 4/4] st/mesa: use new float comparison opcodes if native integers are supported
On 13.08.2013 23:38, Brian Paul wrote:
> On 08/13/2013 11:04 AM, srol...@vmware.com wrote:
>> From: Roland Scheidegger srol...@vmware.com
>>
>> Should get rid of some float-to-int conversions (with negation).
>> No piglit regressions (with llvmpipe).
>> ---
>>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 45 ++--
>>  1 file changed, 15 insertions(+), 30 deletions(-)
>>
>> diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>> index d9b4ed2..65ba449 100644
>> --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>> +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>> @@ -420,8 +420,6 @@ public:
>>     void emit_scalar(ir_instruction *ir, unsigned op,
>>                      st_dst_reg dst, st_src_reg src0, st_src_reg src1);
>>
>> -   void try_emit_float_set(ir_instruction *ir, unsigned op, st_dst_reg dst);
>> -
>>     void emit_arl(ir_instruction *ir, st_dst_reg dst, st_src_reg src0);
>>
>>     void emit_scs(ir_instruction *ir, unsigned op,
>> @@ -594,9 +592,6 @@ glsl_to_tgsi_visitor::emit(ir_instruction *ir, unsigned op,
>>
>>     this->instructions.push_tail(inst);
>>
>> -   if (native_integers)
>> -      try_emit_float_set(ir, op, dst);
>> -
>>     return inst;
>>  }
>>
>> @@ -622,25 +617,6 @@ glsl_to_tgsi_visitor::emit(ir_instruction *ir, unsigned op)
>>     return emit(ir, op, undef_dst, undef_src, undef_src, undef_src);
>>  }
>>
>> -/**
>> - * Emits the code to convert the result of float SET instructions to integers.
>> - */
>> -void
>> -glsl_to_tgsi_visitor::try_emit_float_set(ir_instruction *ir, unsigned op,
>> -                                         st_dst_reg dst)
>> -{
>> -   if ((op == TGSI_OPCODE_SEQ ||
>> -        op == TGSI_OPCODE_SNE ||
>> -        op == TGSI_OPCODE_SGE ||
>> -        op == TGSI_OPCODE_SLT))
>> -   {
>> -      st_src_reg src = st_src_reg(dst);
>> -      src.negate = ~src.negate;
>> -      dst.type = GLSL_TYPE_FLOAT;
>> -      emit(ir, TGSI_OPCODE_F2I, dst, src);
>> -   }
>> -}
>> -
>>  /**
>>   * Determines whether to use an integer, unsigned integer, or float opcode
>>   * based on the operands and input opcode, then emits the result.
>> @@ -672,6 +648,15 @@ glsl_to_tgsi_visitor::get_opcode(ir_instruction *ir, unsigned op,
>>  #define case2fi(f, i)  case4(f, f, i, i)
>>  #define case2iu(i, u)  case4(i, LAST, i, u)
>>
>> +#define casecomp(c, f, i, u) \
>> +   case TGSI_OPCODE_##c: \
>> +      if (type == GLSL_TYPE_INT) op = TGSI_OPCODE_##i; \
>> +      else if (type == GLSL_TYPE_UINT) op = TGSI_OPCODE_##u; \
>> +      else if (native_integers) \
>> +         op = TGSI_OPCODE_##f; \
>> +      else op = TGSI_OPCODE_##c; \
>> +      break;
>> +
>
> Would you mind cleaning up the formatting of that macro...
>
>    case x:
>       if (type == GLSL_TYPE_INT)
>          op = ...
>       else if (type == GLSL_TYPE_UINT)
>          op = ...
>       else if (native_integers)
>          op = ...
>       else
>          op = ...
>       break;

Ok. I copied it from the case4 macro right above it, which is why only one case (the new one) has any indentation :-).

Roland

>>     switch(op) {
>>        case2fi(ADD, UADD);
>>        case2fi(MUL, UMUL);
>> @@ -680,12 +665,12 @@ glsl_to_tgsi_visitor::get_opcode(ir_instruction *ir, unsigned op,
>>        case3(MAX, IMAX, UMAX);
>>        case3(MIN, IMIN, UMIN);
>>        case2iu(MOD, UMOD);
>> -
>> -      case2fi(SEQ, USEQ);
>> -      case2fi(SNE, USNE);
>> -      case3(SGE, ISGE, USGE);
>> -      case3(SLT, ISLT, USLT);
>> -
>> +
>> +      casecomp(SEQ, FSEQ, USEQ, USEQ);
>> +      casecomp(SNE, FSNE, USNE, USNE);
>> +      casecomp(SGE, FSGE, ISGE, USGE);
>> +      casecomp(SLT, FSLT, ISLT, USLT);
>> +
>>        case2iu(ISHR, USHR);
>>
>>        case2fi(SSG, ISSG);
>
> Reviewed-by: Brian Paul bri...@vmware.com
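The selection logic that the casecomp macro expands to can be sketched as a plain function, here for the SGE comparison only. This is a rough illustration: the enum names are hypothetical stand-ins, not the real TGSI_OPCODE_* or GLSL_TYPE_* values.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for GLSL base types and TGSI opcodes. */
enum base_type { TYPE_FLOAT, TYPE_INT, TYPE_UINT };
enum opcode { OP_SGE, OP_FSGE, OP_ISGE, OP_USGE };

/* Mirrors one expansion of casecomp(SGE, FSGE, ISGE, USGE):
 * integer operands pick the integer compare, float operands pick the
 * new float compare when native integers are supported, and otherwise
 * fall back to the legacy float SET opcode. */
static enum opcode pick_sge(enum base_type type, bool native_integers)
{
    if (type == TYPE_INT)
        return OP_ISGE;
    if (type == TYPE_UINT)
        return OP_USGE;
    if (native_integers)
        return OP_FSGE;
    return OP_SGE;
}
```

Per the commit message, choosing the float comparison opcode directly is what lets the patch drop the extra float-to-int conversion that try_emit_float_set() used to append.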
Re: [Mesa-dev] Another Take on the S3TC issue
I don't think this is our problem. If a distro wants S3TC support, its maintainers can package libtxc_dxtn. Some distros really do it. If a distro doesn't want S3TC support, there is nothing we can do about it.

Marek

On Tue, Aug 13, 2013 at 8:53 PM, Uwe Schmidt simon.schm...@cs-systemberatung.de wrote:
> Hi,
>
> I have read about the issue of implementing the S3TC extension in Mesa: http://dri.freedesktop.org/wiki/S3TC/
>
> As I understand it, the problem is that encoding and decoding S3TC in software is covered by patents, while passing S3TC compressed data to the GPU is still ok.
>
> AS NOW: If force_s3tc_enable is enabled in Mesa3D, uploading an S3TC-encoded texture works if format==internalFormat is true. If format!=internalFormat is true, it would fail (as far as I know).
>
> SO MY PROPOSAL: If 'format' is one of the S3TC types, and format!=internalFormat is true, then set internalFormat:=format. Else, if 'internalFormat' is one of the S3TC types, but the 'format' isn't, set internalFormat:=format (or any other format Mesa3D can encode).
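The rule in the quoted proposal can be written down as a small function. This is only a sketch of the proposal as worded, not anything Mesa implements; the format enum and helper names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical format tags for illustration only. */
enum fmt { FMT_RGB8, FMT_RGBA8, FMT_DXT1, FMT_DXT3, FMT_DXT5 };

static bool is_s3tc(enum fmt f)
{
    return f == FMT_DXT1 || f == FMT_DXT3 || f == FMT_DXT5;
}

/* Returns the internal format the proposal would end up using, so that
 * no software S3TC encode or decode is ever required. */
static enum fmt resolve_internal_format(enum fmt format, enum fmt internal)
{
    if (is_s3tc(format) && format != internal)
        return format;   /* keep the already-compressed data as is */
    if (is_s3tc(internal) && !is_s3tc(format))
        return format;   /* cannot encode, so store uncompressed */
    return internal;     /* nothing S3TC-related to adjust */
}
```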
Re: [Mesa-dev] [Patch] Sharing flags should disable tiling
On Mon, Aug 12, 2013 at 3:05 PM, Marek Olšák mar...@gmail.com wrote:
> On Mon, Aug 12, 2013 at 11:36 PM, Stéphane Marchesin stephane.marche...@gmail.com wrote:
>>> Other than hybrid systems (of which there are none with i915 graphics), is there any case where __DRI_IMAGE_USE_SHARE can occur?
>>
>> You could do interesting things like cross-process sharing with it. I think it's worth doing it, no matter what. It's easy to pick up now, and hard to fix up later.
>
> Cross-process sharing is mandatory already and exposed via resource_from_handle and resource_get_handle. I don't think this is useful for cross-process sharing anyway, because it disables tiling.

No, we need a different flag for this. I can't speak to the gallium flag, but the __DRI_IMAGE_USE_SHARE flag is used for same-gpu cross-process sharing under wayland, either using GEM names or Prime fd passing. We can't drop tiling in this case; there's an obvious performance penalty.

We need a flag to indicate that a buffer will be used on multiple GPUs. The next level up in the stack (src/egl/drivers/dri2/platform_wayland.c in case of Wayland) needs to know whether or not a buffer will be used in this way and pass the flag when applicable.

Kristian
Re: [Mesa-dev] [PATCH v2] gbm: fix linking
Armin K wrote:
> Link to internal libwayland-drm library if Wayland EGL platform is enabled. The library needs to be built before gbm.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67962

In going through the wayland manual build directions[1] the other day, I hit this bug when trying to build weston against current git mesa. This patch fixed the build breakage. So...

Tested-by: Bryce Harrington b.harring...@samsung.com

Bryce

1: http://wayland.freedesktop.org/building.html
Re: [Mesa-dev] [PATCH 2/3] i965/gen7: Set MOCS L3 cacheability for IVB/BYT
On Mon, Aug 12, 2013 at 3:07 PM, ville.syrj...@linux.intel.com wrote:
> From: Ville Syrjälä ville.syrj...@linux.intel.com
>
> IVB/BYT also has the same L3 cacheability control in MOCS as HSW, so let's make use of it.

According to the discussion we had on #intel-gfx a few weeks ago, on IVB all Mesa memory is already marked as cached in DRM allocated PTEs. So this should not have any effect. Or I'm misunderstanding something.

As I understand, marking everything uncacheable and then marking just certain things cacheable could make a difference (since AFAIK, you can't mark select regions as uncacheable after you mark PTEs as cacheable on IVB).

Can somebody more knowledgeable comment?
[Mesa-dev] [PATCH] i965: Force X-tiling for 128 bpp formats on Sandybridge.
128 bpp formats are not allowed to be Y-tiled on any architectures except Gen7.

+11 Piglits on Sandybridge (mostly regression fixes since the switch to Y-tiling).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63867
Cc: Topi Pohjolainen topi.pohjolai...@intel.com
Cc: Chad Versace chad.vers...@linux.intel.com
Cc: Paul Berry stereotype...@gmail.com
Cc: 9.2 mesa-sta...@lists.freedesktop.org
Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index d6643ca..86a2d53 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -468,6 +468,15 @@ intel_miptree_choose_tiling(struct brw_context *brw,
    if (brw->gen < 6)
       return I915_TILING_X;
 
+   /* From the Sandybridge PRM, Volume 1, Part 2, page 32:
+    * "NOTE: 128BPE Format Color Buffer ( render target ) MUST be either TileX
+    *  or Linear."
+    * 128 bits per pixel translates to 16 bytes per pixel.  This is necessary
+    * all the way back to 965, but is explicitly permitted on Gen7.
+    */
+   if (brw->gen != 7 && mt->cpp >= 16)
+      return I915_TILING_X;
+
    return I915_TILING_Y | I915_TILING_X;
 }
 
-- 
1.8.3.4
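As a rough standalone illustration of the tiling decision in this patch, assuming the new check reads `gen != 7 && cpp >= 16` and the earlier one reads `gen < 6`. The flag values and the function name are invented for the sketch; the real constants come from i915_drm.h.

```c
/* Illustrative bit flags; the real I915_TILING_* values differ. */
enum { TILING_NONE = 0, TILING_X = 1 << 0, TILING_Y = 1 << 1 };

/* Returns the set of tilings a miptree may use, given the hardware
 * generation and the format's size in bytes per pixel (cpp).
 * 128 bits per pixel is 16 bytes per pixel. */
static unsigned choose_tiling(int gen, unsigned cpp)
{
    if (gen < 6)
        return TILING_X;                /* pre-Gen6: X-tiling only */
    if (gen != 7 && cpp >= 16)
        return TILING_X;                /* 128 bpp may be Y-tiled only on Gen7 */
    return TILING_Y | TILING_X;         /* either tiling is acceptable */
}
```

With these assumptions, `choose_tiling(6, 16)` yields X-tiling only, while `choose_tiling(7, 16)` still allows Y-tiling.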
Re: [Mesa-dev] [PATCH] glsl: Emit better warnings for things that look like default precision statements
On Tue, Aug 13, 2013 at 1:35 PM, Ian Romanick i...@freedesktop.org wrote:
> and the spec doesn't explicitly forbid it.

I was surprised by this, so I verified it. In the GLSL ES 3.0 spec:

   single_declaration
      fully_specified_type
         type_specifier
            precision_qualifier type_specifier_no_prec

   precision_qualifier
      highp, mediump, lowp

   type_specifier_no_prec
      type_specifier_nonarray
         expands to list of built-in types

Seems weird, but legitimate.

Have we actually seen 'highp float;' in the wild (outside of piglit)?

Assuming that the two instances of highp in precision_names is intentional (or was not, but is fixed)

Reviewed-by: Matt Turner matts...@gmail.com
Re: [Mesa-dev] [PATCH 8/9] glsl: Merge precision qualifiers too
On Fri, Aug 9, 2013 at 4:38 PM, Ian Romanick i...@freedesktop.org wrote:
> From: Ian Romanick ian.d.roman...@intel.com
>
> We never noticed this before because we previously didn't enforce GLSL ES fragment shader requirements that precision be defined. There may also have been some interaction here with the addition of GL_ARB_shading_language_420pack, but it doesn't appear to me that it added any new bugs (just perhaps uncovered some old ones).
>
> Signed-off-by: Ian Romanick ian.d.roman...@intel.com
> Cc: Matt Turner matts...@gmail.com
> Cc: 9.2 mesa-sta...@lists.freedesktop.org
> ---

Reviewed-by: Matt Turner matts...@gmail.com
[Mesa-dev] [PATCH] i965/fs: Fix Sandybridge regressions from SEL optimization.
Sandybridge is the only platform that supports an IF instruction with an embedded comparison. In this case, we need to emit a CMP to go along with the SEL.

Fixes regressions in Piglit's glsl-fs-atan-3, fs-unpackHalf2x16, fs-faceforward-float-float-float, isinf-and-isnan fs_basic, and isinf-and-isnan fs_fbo.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 17 +
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index a36c248..984b08a 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -1911,10 +1911,19 @@ fs_visitor::try_replace_with_sel()
          emit(MOV(src0, then_mov->src[0]));
       }
 
-      fs_inst *sel = emit(BRW_OPCODE_SEL, then_mov->dst, src0, else_mov->src[0]);
-      sel->predicate = if_inst->predicate;
-      sel->predicate_inverse = if_inst->predicate_inverse;
-      sel->conditional_mod = if_inst->conditional_mod;
+      fs_inst *sel;
+      if (if_inst->conditional_mod) {
+         /* Sandybridge-specific IF with embedded comparison */
+         emit(CMP(reg_null_d, if_inst->src[0], if_inst->src[1],
+                  if_inst->conditional_mod));
+         sel = emit(BRW_OPCODE_SEL, then_mov->dst, src0, else_mov->src[0]);
+         sel->predicate = BRW_PREDICATE_NORMAL;
+      } else {
+         /* Separate CMP and IF instructions */
+         sel = emit(BRW_OPCODE_SEL, then_mov->dst, src0, else_mov->src[0]);
+         sel->predicate = if_inst->predicate;
+         sel->predicate_inverse = if_inst->predicate_inverse;
+      }
    }
 }
 
-- 
1.8.3.4
Re: [Mesa-dev] Another Take on the S3TC issue
Maybe it's time to let S3TC go away and think about something else. Like BPTC. I've started something. Currently, this is very dirty. Please take a quick look at this: https://docs.google.com/file/d/0B1BiksMm0x0GVjZPZHgyNm0xLVE/edit?usp=sharing

Yeah, I know. Ugly. But it basically works. I can nearly recognize this beautiful flower after the original image passes through an encoding and a decoding pass. I preferred to start it as a standalone codec to be sure I'm concentrating on it (and not the surrounding code of the mesa tree!). Currently I've just worked on the BC7 parts, not BC6H, but I may start soon.

I'll make the encoder-decoder available on github once it is in better shape. I'm going to improve things before the end of this month. When it is more acceptable, I will start porting the standalone codec into the mesa code (on github again, possibly with many other things regarding mesa softpipe).
[Mesa-dev] [PATCH] gallivm: fix border color with normalized texture formats
From: Roland Scheidegger srol...@vmware.com

We need to put border color into texture format color space which essentially means clamping for non-float, normalized formats (not entirely sure if we're also meant to quantize the float but it's probably ok not to do it thankfully).
For OpenGL we could do this easily outside generated code due to the 1:1 sampler/texture correspondence but not for d3d10 which is terrible (as we recalculate a constant over and over again per shader invocation). Fortunately border color should be rare enough that we don't care THAT much.
---
 src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c | 66 +
 1 file changed, 53 insertions(+), 13 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
index 65d6e7b..2a4462b 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
@@ -179,24 +179,64 @@ lp_build_sample_texel_soa(struct lp_build_sample_context *bld,
     */
 
    if (use_border) {
-      /* select texel color or border color depending on use_border */
-      LLVMValueRef border_color_ptr =
+      /* select texel color or border color depending on use_border. */
+      LLVMValueRef border_color_ptr =
          bld->dynamic_state->border_color(bld->dynamic_state,
                                           bld->gallivm, sampler_unit);
+      const struct util_format_description *format_desc;
       int chan;
+      format_desc = util_format_description(bld->static_texture_state->format);
+      /*
+       * Only replace channels which are actually present. The others should
+       * get optimized away eventually by sampler_view swizzle anyway but it's
+       * easier too as we'd need some extra logic for channels where we can't
+       * determine the format directly otherwise.
+       */
       for (chan = 0; chan < 4; chan++) {
-         LLVMValueRef border_chan =
-            lp_build_array_get(bld->gallivm, border_color_ptr,
-                               lp_build_const_int32(bld->gallivm, chan));
-         LLVMValueRef border_chan_vec =
-            lp_build_broadcast_scalar(&bld->float_vec_bld, border_chan);
-
-         if (!bld->texel_type.floating) {
-            border_chan_vec = LLVMBuildBitCast(builder, border_chan_vec,
-                                               bld->texel_bld.vec_type, "");
+         unsigned chan_s;
+         /* reverse-map channel... */
+         for (chan_s = 0; chan_s < 4; chan_s++) {
+            if (chan_s == format_desc->swizzle[chan]) {
+               break;
+            }
+         }
+         if (chan_s <= 3) {
+            LLVMValueRef border_chan =
+               lp_build_array_get(bld->gallivm, border_color_ptr,
+                                  lp_build_const_int32(bld->gallivm, chan));
+            LLVMValueRef border_chan_vec =
+               lp_build_broadcast_scalar(&bld->float_vec_bld, border_chan);
+
+            if (!bld->texel_type.floating) {
+               border_chan_vec = LLVMBuildBitCast(builder, border_chan_vec,
+                                                  bld->texel_bld.vec_type, "");
+            }
+            else {
+               /*
+                * For normalized format need to clamp border color (technically
+                * probably should also quantize the data). Really sucks doing this
+                * here but can't avoid at least for now since this is part of
+                * sampler state and texture format is part of sampler_view state.
+                */
+               unsigned chan_type = format_desc->channel[chan_s].type;
+               unsigned chan_norm = format_desc->channel[chan_s].normalized;
+               if (chan_type == UTIL_FORMAT_TYPE_SIGNED && chan_norm) {
+                  LLVMValueRef clamp_min;
+                  clamp_min = lp_build_const_vec(bld->gallivm, bld->texel_type, -1.0F);
+                  border_chan_vec = lp_build_clamp(&bld->texel_bld, border_chan_vec,
+                                                   clamp_min,
+                                                   bld->texel_bld.one);
+               }
+               else if (chan_type == UTIL_FORMAT_TYPE_UNSIGNED && chan_norm) {
+                  border_chan_vec = lp_build_clamp(&bld->texel_bld, border_chan_vec,
+                                                   bld->texel_bld.zero,
+                                                   bld->texel_bld.one);
+               }
+               /* not exactly sure about all others but I think should be ok? */
+            }
+            texel_out[chan] = lp_build_select(&bld->texel_bld, use_border,
+                                              border_chan_vec, texel_out[chan]);
          }
-         texel_out[chan] = lp_build_select(&bld->texel_bld, use_border,
-                                           border_chan_vec, texel_out[chan]);
       }
    }
 }
-- 
1.7.9.5
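A scalar sketch of the clamping rule this patch applies to each border color channel. The real code builds LLVM IR over vectors; the enum and function below are hypothetical names used only to show the ranges involved.

```c
/* Hypothetical channel classification for illustration. */
enum chan_kind { CHAN_FLOAT, CHAN_UNORM, CHAN_SNORM };

/* Clamps a border color channel into the range the texture format can
 * represent: [0, 1] for unsigned normalized channels, [-1, 1] for signed
 * normalized channels. Other channel kinds pass through unchanged. */
static float clamp_border(float v, enum chan_kind kind)
{
    float lo;
    const float hi = 1.0f;

    switch (kind) {
    case CHAN_UNORM: lo = 0.0f;  break;
    case CHAN_SNORM: lo = -1.0f; break;
    default:         return v;
    }
    if (v < lo)
        return lo;
    if (v > hi)
        return hi;
    return v;
}
```

As the commit message notes, quantizing the value to the format's precision is deliberately left out here as well.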
Re: [Mesa-dev] [PATCH] glsl: Emit better warnings for things that look like default precision statements
On 08/13/2013 03:50 PM, Matt Turner wrote:
> On Tue, Aug 13, 2013 at 1:35 PM, Ian Romanick i...@freedesktop.org wrote:
>> and the spec doesn't explicitly forbid it.
>
> I was surprised by this, so I verified it. In the GLSL ES 3.0 spec:
>
>    single_declaration
>       fully_specified_type
>          type_specifier
>             precision_qualifier type_specifier_no_prec
>
>    precision_qualifier
>       highp, mediump, lowp
>
>    type_specifier_no_prec
>       type_specifier_nonarray
>          expands to list of built-in types
>
> Seems weird, but legitimate.

C allows empty declarations too. I believe it's a side-effect of function prototypes without formal parameter names. If you can do

    int foo(int, float, struct S *);

it's easy to end up with a parser that can also do

    int;
    float;
    struct S *;

It's actually more work to reject those (or generate a warning).

> Have we actually seen 'highp float;' in the wild (outside of piglit)?

Not that I know of.

> Assuming that the two instances of highp in precision_names is intentional (or was not, but is fixed)

I had to put something in the ast_precision_none slot, and that seemed as good a choice as any. didn't seem too good. :)

> Reviewed-by: Matt Turner matts...@gmail.com
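The table lookup under discussion can be sketched as below. The enum is a hypothetical stand-in that assumes the "none" value comes first, followed by high, medium and low, matching the order of the strings in the patch; the function names are invented.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-in for the AST precision enum; assumed order. */
enum precision { PRECISION_NONE = 0, PRECISION_HIGH, PRECISION_MEDIUM, PRECISION_LOW };

/* Slot 0 corresponds to "no precision qualifier". It is never printed by
 * the warning path, so it simply repeats "highp" as a placeholder. */
static const char *precision_name(enum precision p)
{
    static const char *const names[] = { "highp", "highp", "mediump", "lowp" };
    return names[p];
}

/* Builds the suggested fix, e.g. "precision lowp float;".
 * Returns the number of characters written, as snprintf does. */
static int format_hint(char *buf, size_t n, enum precision p, const char *type)
{
    return snprintf(buf, n, "precision %s %s;", precision_name(p), type);
}
```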
[Mesa-dev] [PATCH] i965/hsw: Populate MOCS for STATE_BASE_ADDRESS (v2)
From: Ville Syrjälä ville.syrj...@linux.intel.com

Just spotted these unpopulated MOCS fields when comparing the code against BSpec. Set the MOCS to the same as everywhere else in Haswell: L3-cacheable.

v2: Annotate state packet fields (chadv).

Signed-off-by: Ville Syrjälä ville.syrj...@linux.intel.com
Reviewed-by: Chad Versace chad.vers...@linux.intel.com
---

Ville, I added comments to explain what new fields get set. If this looks good to you, then I'll commit it.

 src/mesa/drivers/dri/i965/brw_misc_state.c | 7 +--
 src/mesa/drivers/dri/i965/gen6_blorp.cpp | 7 ++-
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_misc_state.c b/src/mesa/drivers/dri/i965/brw_misc_state.c
index 3bf37b9..9854e26 100644
--- a/src/mesa/drivers/dri/i965/brw_misc_state.c
+++ b/src/mesa/drivers/dri/i965/brw_misc_state.c
@@ -1038,13 +1038,16 @@ static void upload_state_base_address( struct brw_context *brw )
     */
 
    if (brw->gen >= 6) {
+      uint8_t mocs = brw->is_haswell ? GEN7_MOCS_L3 : 0;
+
       if (brw->gen == 6)
          intel_emit_post_sync_nonzero_flush(brw);
 
       BEGIN_BATCH(10);
       OUT_BATCH(CMD_STATE_BASE_ADDRESS << 16 | (10 - 2));
-      /* General state base address: stateless DP read/write requests */
-      OUT_BATCH(1);
+      OUT_BATCH(mocs << 8 | /* General State Memory Object Control State */
+                mocs << 4 | /* Stateless Data Port Access Memory Object Control State */
+                1); /* General State Base Address Modif Enable */
       /* Surface state base address:
        * BINDING_TABLE_STATE
        * SURFACE_STATE
diff --git a/src/mesa/drivers/dri/i965/gen6_blorp.cpp b/src/mesa/drivers/dri/i965/gen6_blorp.cpp
index a4a9081..b82323d 100644
--- a/src/mesa/drivers/dri/i965/gen6_blorp.cpp
+++ b/src/mesa/drivers/dri/i965/gen6_blorp.cpp
@@ -74,9 +74,14 @@ void
 gen6_blorp_emit_state_base_address(struct brw_context *brw,
                                    const brw_blorp_params *params)
 {
+   uint8_t mocs = brw->is_haswell ? GEN7_MOCS_L3 : 0;
+
    BEGIN_BATCH(10);
    OUT_BATCH(CMD_STATE_BASE_ADDRESS << 16 | (10 - 2));
-   OUT_BATCH(1); /* GeneralStateBaseAddressModifyEnable */
+   OUT_BATCH(mocs << 8 | /* GeneralStateMemoryObjectControlState */
+             mocs << 4 | /* StatelessDataPortAccessMemoryObjectControlState */
+             1); /* GeneralStateBaseAddressModifEnable */
+
    /* SurfaceStateBaseAddress */
    OUT_RELOC(brw->batch.bo, I915_GEM_DOMAIN_SAMPLER, 0, 1);
    /* DynamicStateBaseAddress */
-- 
1.8.3.1
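The dword that this patch writes can be sketched as a small packing helper, assuming the stripped operators are left shifts (`mocs << 8 | mocs << 4 | 1`). The constant's value and the helper name are assumptions made for the example, not taken from the driver headers.

```c
#include <stdint.h>

/* Assumed value, for illustration only. */
#define SKETCH_MOCS_L3 1u

/* Packs the second dword of STATE_BASE_ADDRESS as the patch does:
 *   bits 11:8  general state MOCS
 *   bits  7:4  stateless data port access MOCS
 *   bit   0    general state base address modify enable */
static uint32_t pack_sba_dw1(uint8_t mocs)
{
    return (uint32_t)mocs << 8 | (uint32_t)mocs << 4 | 1u;
}
```

Under these assumptions the non-Haswell case (`mocs == 0`) still produces the old value of 1, so only Haswell sees a change.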
Re: [Mesa-dev] [PATCH] i965: Force X-tiling for 128 bpp formats on Sandybridge.
On 08/13/2013 03:37 PM, Kenneth Graunke wrote:
> 128 bpp formats are not allowed to be Y-tiled on any architectures except Gen7.
>
> +11 Piglits on Sandybridge (mostly regression fixes since the switch to Y-tiling).
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63867

Also https://bugs.freedesktop.org/show_bug.cgi?id=64261?

> Cc: Topi Pohjolainen topi.pohjolai...@intel.com
> Cc: Chad Versace chad.vers...@linux.intel.com
> Cc: Paul Berry stereotype...@gmail.com
> Cc: 9.2 mesa-sta...@lists.freedesktop.org
> Signed-off-by: Kenneth Graunke kenn...@whitecape.org
> ---
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 9 +
>  1 file changed, 9 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index d6643ca..86a2d53 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -468,6 +468,15 @@ intel_miptree_choose_tiling(struct brw_context *brw,
>     if (brw->gen < 6)
>        return I915_TILING_X;
>
> +   /* From the Sandybridge PRM, Volume 1, Part 2, page 32:
> +    * "NOTE: 128BPE Format Color Buffer ( render target ) MUST be either TileX
> +    *  or Linear."
> +    * 128 bits per pixel translates to 16 bytes per pixel.  This is necessary
> +    * all the way back to 965, but is explicitly permitted on Gen7.
> +    */
> +   if (brw->gen != 7 && mt->cpp >= 16)
> +      return I915_TILING_X;

brw->gen < 7? It seems reasonable to expect future hardware to not re-introduce this restriction, right?

> +
>     return I915_TILING_Y | I915_TILING_X;
>  }
Re: [Mesa-dev] [PATCH 2/3] i965/gen7: Set MOCS L3 cacheability for IVB/BYT
On 08/13/2013 03:31 PM, Vedran Rodic wrote:
> On Mon, Aug 12, 2013 at 3:07 PM, ville.syrj...@linux.intel.com wrote:
>> From: Ville Syrjälä ville.syrj...@linux.intel.com
>>
>> IVB/BYT also has the same L3 cacheability control in MOCS as HSW, so let's make use of it.
>
> According to the discussion we had on #intel-gfx a few weeks ago, on IVB all Mesa memory is already marked as cached in DRM allocated PTEs. So this should not have any effect. Or I'm misunderstanding something.
>
> As I understand, marking everything uncacheable and then marking just certain things cacheable could make a difference (since AFAIK, you can't mark select regions as uncacheable after you mark PTEs as cacheable on IVB).
>
> Can somebody more knowledgeable comment?

On Ivybridge, the PTEs mark only contexts as LLC+L3 cacheable. Everything else is marked as cacheable in LLC, but not L3. So, Ville's patches will give a perf boost to Mesa running on any kernel that continues that caching policy.
Re: [Mesa-dev] [PATCH] i965: Force X-tiling for 128 bpp formats on Sandybridge.
On 08/13/2013 05:45 PM, Ian Romanick wrote:
> On 08/13/2013 03:37 PM, Kenneth Graunke wrote:
>> 128 bpp formats are not allowed to be Y-tiled on any architectures
>> except Gen7.
>>
>> +11 Piglits on Sandybridge (mostly regression fixes since the switch
>> to Y-tiling).
>>
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63867
>
> Also https://bugs.freedesktop.org/show_bug.cgi?id=64261?
>
>> Cc: Topi Pohjolainen topi.pohjolai...@intel.com
>> Cc: Chad Versace chad.vers...@linux.intel.com
>> Cc: Paul Berry stereotype...@gmail.com
>> Cc: 9.2 mesa-sta...@lists.freedesktop.org
>> Signed-off-by: Kenneth Graunke kenn...@whitecape.org
>> ---
>>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 9 +
>>  1 file changed, 9 insertions(+)
>>
>> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
>> index d6643ca..86a2d53 100644
>> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
>> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
>> @@ -468,6 +468,15 @@ intel_miptree_choose_tiling(struct brw_context *brw,
>>     if (brw->gen < 6)
>>        return I915_TILING_X;
>>
>> +   /* From the Sandybridge PRM, Volume 1, Part 2, page 32:
>> +    * "NOTE: 128BPE Format Color Buffer ( render target ) MUST be either TileX
>> +    *  or Linear."
>> +    * 128 bits per pixel translates to 16 bytes per pixel.  This is necessary
>> +    * all the way back to 965, but is explicitly permitted on Gen7.
>> +    */
>> +   if (brw->gen != 7 && mt->cpp >= 16)
>> +      return I915_TILING_X;
>
> brw->gen < 7?  It seems reasonable to expect future hardware to not
> re-introduce this restriction, right?

Future hardware does re-introduce it.

Reviewed-by: Chad Versace chad.vers...@linux.intel.com
Re: [Mesa-dev] [PATCH] i965/fs: Fix Sandybridge regressions from SEL optimization.
On 08/13/2013 04:47 PM, Kenneth Graunke wrote:
> Sandybridge is the only platform that supports an IF instruction with
> an embedded comparison.  In this case, we need to emit a CMP to go
> along with the SEL.
>
> Fixes regressions in Piglit's glsl-fs-atan-3, fs-unpackHalf2x16,
> fs-faceforward-float-float-float, isinf-and-isnan fs_basic, and
> isinf-and-isnan fs_fbo.
>
> Signed-off-by: Kenneth Graunke kenn...@whitecape.org
> ---
>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 17 +
>  1 file changed, 13 insertions(+), 4 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> index a36c248..984b08a 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> @@ -1911,10 +1911,19 @@ fs_visitor::try_replace_with_sel()
>              emit(MOV(src0, then_mov->src[0]));
>           }
>
> -         fs_inst *sel = emit(BRW_OPCODE_SEL, then_mov->dst, src0, else_mov->src[0]);
> -         sel->predicate = if_inst->predicate;
> -         sel->predicate_inverse = if_inst->predicate_inverse;
> -         sel->conditional_mod = if_inst->conditional_mod;
> +         fs_inst *sel;
> +         if (if_inst->conditional_mod) {
> +            /* Sandybridge-specific IF with embedded comparison */

This doesn't appear to be SNB-specific code.  Can you explain this?  Is
if_inst->conditional_mod only set on SNB?  I really need to learn more
about the back end...

> +            emit(CMP(reg_null_d, if_inst->src[0], if_inst->src[1],
> +                     if_inst->conditional_mod));
> +            sel = emit(BRW_OPCODE_SEL, then_mov->dst, src0, else_mov->src[0]);
> +            sel->predicate = BRW_PREDICATE_NORMAL;
> +         } else {
> +            /* Separate CMP and IF instructions */
> +            sel = emit(BRW_OPCODE_SEL, then_mov->dst, src0, else_mov->src[0]);
> +            sel->predicate = if_inst->predicate;
> +            sel->predicate_inverse = if_inst->predicate_inverse;
> +         }
>        }
>     }
Re: [Mesa-dev] [PATCH] i965: Force X-tiling for 128 bpp formats on Sandybridge.
On 08/13/2013 05:47 PM, Chad Versace wrote:
> On 08/13/2013 05:45 PM, Ian Romanick wrote:
>> On 08/13/2013 03:37 PM, Kenneth Graunke wrote:
>>> 128 bpp formats are not allowed to be Y-tiled on any architectures
>>> except Gen7.
>>>
>>> +11 Piglits on Sandybridge (mostly regression fixes since the switch
>>> to Y-tiling).
>>>
>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63867
>>
>> Also https://bugs.freedesktop.org/show_bug.cgi?id=64261?
>>
>>> Cc: Topi Pohjolainen topi.pohjolai...@intel.com
>>> Cc: Chad Versace chad.vers...@linux.intel.com
>>> Cc: Paul Berry stereotype...@gmail.com
>>> Cc: 9.2 mesa-sta...@lists.freedesktop.org
>>> Signed-off-by: Kenneth Graunke kenn...@whitecape.org
>>> ---
>>>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 9 +
>>>  1 file changed, 9 insertions(+)
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
>>> index d6643ca..86a2d53 100644
>>> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
>>> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
>>> @@ -468,6 +468,15 @@ intel_miptree_choose_tiling(struct brw_context *brw,
>>>     if (brw->gen < 6)
>>>        return I915_TILING_X;
>>>
>>> +   /* From the Sandybridge PRM, Volume 1, Part 2, page 32:
>>> +    * "NOTE: 128BPE Format Color Buffer ( render target ) MUST be either TileX
>>> +    *  or Linear."
>>> +    * 128 bits per pixel translates to 16 bytes per pixel.  This is necessary
>>> +    * all the way back to 965, but is explicitly permitted on Gen7.
>>> +    */
>>> +   if (brw->gen != 7 && mt->cpp >= 16)
>>> +      return I915_TILING_X;
>>
>> brw->gen < 7?  It seems reasonable to expect future hardware to not
>> re-introduce this restriction, right?
>
> Future hardware does re-introduce it.

Of course.  Lol.

Reviewed-by: Ian Romanick ian.d.roman...@intel.com

> Reviewed-by: Chad Versace chad.vers...@linux.intel.com
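For readers skimming the thread, the rule that was agreed on can be sketched in isolation. This is only a simplified stand-in, not the driver function: `sketch_choose_tiling` and the `SKETCH_TILING_*` constants are hypothetical names, and only the two checks visible in the hunk are modeled.

```c
/* Hypothetical stand-ins for the I915_TILING_* flags used in the patch;
 * the values here are arbitrary bit flags chosen for the sketch. */
enum sketch_tiling {
   SKETCH_TILING_X = 1 << 0,
   SKETCH_TILING_Y = 1 << 1
};

/* gen: hardware generation; cpp: bytes per pixel of the format.
 * Returns the set of tilings the caller may pick from. */
static unsigned
sketch_choose_tiling(int gen, unsigned cpp)
{
   /* First check visible in the hunk: before Gen6 only X-tiling is used. */
   if (gen < 6)
      return SKETCH_TILING_X;

   /* 128 bpp is 16 bytes per pixel.  Y-tiling such formats is only
    * permitted on Gen7.  The test is "!= 7" rather than "< 7" because,
    * per the review, later hardware brings the restriction back. */
   if (gen != 7 && cpp >= 16)
      return SKETCH_TILING_X;

   return SKETCH_TILING_Y | SKETCH_TILING_X;
}
```

With `!= 7`, a hypothetical Gen8 part with a 16-byte format falls back to X-tiling, which is the behaviour Chad's reply asks for; `< 7` would have let it through.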
Re: [Mesa-dev] [PATCH 2/3] i965/gen7: Set MOCS L3 cacheability for IVB/BYT
On 08/12/2013 06:07 AM, ville.syrj...@linux.intel.com wrote:
> From: Ville Syrjälä ville.syrj...@linux.intel.com
>
> IVB/BYT also has the same L3 cacheability control in MOCS as HSW,
> so let's make use of it.
>
> pts/xonotic and pts/reaction @ 1920x1080 gain ~4% on my IVB GT2.
> Most other things show less gains/no regressions, except furmark
> which loses some 10 points.
>
> I didn't have a BYT at hand for testing.
>
> Signed-off-by: Ville Syrjälä ville.syrj...@linux.intel.com
> ---
>  src/mesa/drivers/dri/i965/brw_draw_upload.c       | 2 +-
>  src/mesa/drivers/dri/i965/brw_misc_state.c        | 2 +-
>  src/mesa/drivers/dri/i965/gen6_blorp.cpp          | 4 ++--
>  src/mesa/drivers/dri/i965/gen7_blorp.cpp          | 6 +++---
>  src/mesa/drivers/dri/i965/gen7_misc_state.c       | 2 +-
>  src/mesa/drivers/dri/i965/gen7_vs_state.c         | 2 +-
>  src/mesa/drivers/dri/i965/gen7_wm_state.c         | 2 +-
>  src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 4 ++--
>  8 files changed, 12 insertions(+), 12 deletions(-)

Conceptually, the patch looks good.  The (intel->gen == 7) checks should
be removed from the changes in the gen7 files.
Re: [Mesa-dev] [PATCH] i965/fs: Fix Sandybridge regressions from SEL optimization.
On 08/13/2013 05:49 PM, Ian Romanick wrote:
> On 08/13/2013 04:47 PM, Kenneth Graunke wrote:
>> Sandybridge is the only platform that supports an IF instruction with
>> an embedded comparison.  In this case, we need to emit a CMP to go
>> along with the SEL.
>>
>> Fixes regressions in Piglit's glsl-fs-atan-3, fs-unpackHalf2x16,
>> fs-faceforward-float-float-float, isinf-and-isnan fs_basic, and
>> isinf-and-isnan fs_fbo.
>>
>> Signed-off-by: Kenneth Graunke kenn...@whitecape.org
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 17 +
>>  1 file changed, 13 insertions(+), 4 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>> index a36c248..984b08a 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>> @@ -1911,10 +1911,19 @@ fs_visitor::try_replace_with_sel()
>>              emit(MOV(src0, then_mov->src[0]));
>>           }
>>
>> -         fs_inst *sel = emit(BRW_OPCODE_SEL, then_mov->dst, src0, else_mov->src[0]);
>> -         sel->predicate = if_inst->predicate;
>> -         sel->predicate_inverse = if_inst->predicate_inverse;
>> -         sel->conditional_mod = if_inst->conditional_mod;
>> +         fs_inst *sel;
>> +         if (if_inst->conditional_mod) {
>> +            /* Sandybridge-specific IF with embedded comparison */
>
> This doesn't appear to be SNB-specific code.  Can you explain this?  Is
> if_inst->conditional_mod only set on SNB?  I really need to learn more
> about the back end...

Normally, control flow looks like:

    cmp.l.f0(8) null g5<8,8,1>F 0F
    (+f0) if(8)

For Sandybridge, the hardware designers extended IF to support built-in
comparisons, so you can simply do:

    if.l(8) g5<8,8,1>F 0F

They immediately dropped this with Ivybridge; it's not been present on
any other platform.

fs_inst::conditional_mod represents that conditional modifier (always =
0, never, less, equal, lequal, greater, notequal, gequal).
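The two cases Kenneth describes can be modeled outside the compiler. The following is a rough sketch under stated assumptions, not the i965 back end: every `sketch_*` and `SKETCH_*` name is hypothetical, instructions are reduced to a few fields, and the only thing modeled is which instructions replace the IF.

```c
#include <stddef.h>

/* Hypothetical, reduced model of the instruction fields discussed above. */
enum sketch_cmod { SKETCH_CMOD_NONE = 0, SKETCH_CMOD_L, SKETCH_CMOD_GE };
enum sketch_pred { SKETCH_PRED_NONE = 0, SKETCH_PRED_NORMAL };
enum sketch_op   { SKETCH_OP_CMP, SKETCH_OP_SEL };

struct sketch_inst {
   enum sketch_op op;
   enum sketch_cmod conditional_mod;
   enum sketch_pred predicate;
   int predicate_inverse;
};

/* Writes the replacement for an IF into out[] and returns how many
 * instructions were produced (1 or 2). */
static size_t
sketch_replace_if_with_sel(const struct sketch_inst *if_inst,
                           struct sketch_inst out[2])
{
   size_t n = 0;

   if (if_inst->conditional_mod != SKETCH_CMOD_NONE) {
      /* The IF carried its own comparison (the Sandybridge form), so no
       * flag has been written yet: emit an explicit CMP first, then a
       * SEL that is predicated on that flag. */
      out[n++] = (struct sketch_inst){ SKETCH_OP_CMP,
                                       if_inst->conditional_mod,
                                       SKETCH_PRED_NONE, 0 };
      out[n++] = (struct sketch_inst){ SKETCH_OP_SEL, SKETCH_CMOD_NONE,
                                       SKETCH_PRED_NORMAL, 0 };
   } else {
      /* A separate CMP already set the flag; the SEL just inherits the
       * IF's predication. */
      out[n++] = (struct sketch_inst){ SKETCH_OP_SEL, SKETCH_CMOD_NONE,
                                       if_inst->predicate,
                                       if_inst->predicate_inverse };
   }
   return n;
}
```

The point of the split is that copying `conditional_mod` onto the SEL, as the pre-patch code did, does not evaluate the comparison the IF was carrying; the comparison has to be emitted as its own CMP and the SEL predicated on the result.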
Re: [Mesa-dev] [PATCH 3/3] i965/gen7: Don't use L3$ for render targets
On 08/12/2013 06:07 AM, ville.syrj...@linux.intel.com wrote:
> From: Ville Syrjälä ville.syrj...@linux.intel.com
>
> According to HSW Bspec L3$ evictions may land in LLC regardless of
> LLC MOCS/PTE settings. That means we shouldn't set scanout buffers as
> L3 cacheable when writing to them.
>
> So far I've been unable to observe this phenomenon on my IVB, but
> better safe than sorry. Especially since this doesn't appear to hurt
> performance.
>
> Ideally this should be limited to scanout buffers, but that
> information is not available to Mesa. Limiting it to winsys buffers
> might be a reasonable compromise, but MOCS setup appears to be done
> at a lower layer where that information is already lost, and I was
> too lazy to start passing that information down.

Let's try harder to add that plumbing.  I'll try to think of something
tomorrow.
[Mesa-dev] [PATCH 15/20] radeonsi: add basic infrastructure for atom-based states
It's the same as in r600g. I can't merge si_atom with si_pm4_state,
because the latter is too big and isn't even driven by the dirty flag.
Also I'm gonna share the whole streamout state handling with r600g
(just one atom though) and therefore I need the same interface.

The advantage is that almost no streamout code will be needed in
radeonsi and the old code can be removed.

We will also need to port r600_flush_emit and the associated code from
r600g, so that cache flushing takes place before state emission (I
think this will be required for streamout).
---
 src/gallium/drivers/radeonsi/r600_hw_context.c |  8 
 src/gallium/drivers/radeonsi/radeonsi_pipe.h   | 10 ++
 src/gallium/drivers/radeonsi/si_state.h        |  8 
 src/gallium/drivers/radeonsi/si_state_draw.c   |  8 +++-
 4 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/r600_hw_context.c b/src/gallium/drivers/radeonsi/r600_hw_context.c
index 19e9d1c..c9a613b 100644
--- a/src/gallium/drivers/radeonsi/r600_hw_context.c
+++ b/src/gallium/drivers/radeonsi/r600_hw_context.c
@@ -114,9 +114,17 @@ err:
 void si_need_cs_space(struct r600_context *ctx, unsigned num_dw,
                       boolean count_draw_in)
 {
+    int i;
+
     /* The number of dwords we already used in the CS so far. */
     num_dw += ctx->cs->cdw;
 
+    for (i = 0; i < SI_NUM_ATOMS(ctx); i++) {
+        if (ctx->atoms.array[i]->dirty) {
+            num_dw += ctx->atoms.array[i]->num_dw;
+        }
+    }
+
     if (count_draw_in) {
         /* The number of dwords all the dirty states would take. */
         num_dw += ctx->pm4_dirty_cdwords;
diff --git a/src/gallium/drivers/radeonsi/radeonsi_pipe.h b/src/gallium/drivers/radeonsi/radeonsi_pipe.h
index e370149..b4a6e0c 100644
--- a/src/gallium/drivers/radeonsi/radeonsi_pipe.h
+++ b/src/gallium/drivers/radeonsi/radeonsi_pipe.h
@@ -132,6 +132,8 @@ struct r600_constbuf_state
     uint32_t                        dirty_mask;
 };
 
+#define SI_NUM_ATOMS(rctx) (sizeof((rctx)->atoms)/sizeof((rctx)->atoms.array[0]))
+
 struct r600_context {
     struct pipe_context             context;
     struct blitter_context          *blitter;
@@ -145,6 +147,14 @@ struct r600_context {
     void                            *custom_blend_decompress;
     struct r600_screen              *screen;
     struct radeon_winsys            *ws;
+
+    union {
+        struct {
+            /* Place atoms here. */
+        };
+        struct si_atom *array[0];
+    } atoms;
+
     struct si_vertex_element        *vertex_elements;
     struct pipe_framebuffer_state   framebuffer;
     unsigned                        fb_log_samples;
diff --git a/src/gallium/drivers/radeonsi/si_state.h b/src/gallium/drivers/radeonsi/si_state.h
index b01fbf2..09ef56e 100644
--- a/src/gallium/drivers/radeonsi/si_state.h
+++ b/src/gallium/drivers/radeonsi/si_state.h
@@ -29,6 +29,14 @@
 
 #include "radeonsi_pm4.h"
 
+/* This encapsulates a state or an operation which can be emitted into the GPU
+ * command stream. */
+struct si_atom {
+    void (*emit)(struct r600_context *ctx, struct si_atom *state);
+    unsigned        num_dw;
+    bool            dirty;
+};
+
 struct si_state_blend {
     struct si_pm4_state     pm4;
     uint32_t                cb_target_mask;
diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi/si_state_draw.c
index 2007dc4..b951a39 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -665,7 +665,7 @@ void si_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info *info)
 {
     struct r600_context *rctx = (struct r600_context *)ctx;
     struct pipe_index_buffer ib = {};
-    uint32_t cp_coher_cntl;
+    uint32_t cp_coher_cntl, i;
 
     if (!info->count && (info->indexed || !info->count_from_stream_output))
         return;
@@ -729,6 +729,12 @@ void si_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info *info)
 
     si_need_cs_space(rctx, 0, TRUE);
 
+    for (i = 0; i < SI_NUM_ATOMS(rctx); i++) {
+        if (rctx->atoms.array[i]->dirty) {
+            rctx->atoms.array[i]->emit(rctx, rctx->atoms.array[i]);
+        }
+    }
+
     si_pm4_emit_dirty(rctx);
     rctx->pm4_dirty_cdwords = 0;
 
-- 
1.8.1.2
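The atom pattern in this patch (reserve space for every dirty atom, then call each dirty atom's emit hook) can be tried out on its own. Below is a minimal sketch, assuming a fixed two-entry atom table instead of the patch's union with a zero-length array; all `sketch_*` names are hypothetical, and clearing `dirty` inside the emit callback is an assumption of the sketch, since the hunks above do not show where the flag is reset.

```c
#include <stdbool.h>
#include <stddef.h>

struct sketch_ctx;

/* Same three fields as si_atom in the patch. */
struct sketch_atom {
    void (*emit)(struct sketch_ctx *ctx, struct sketch_atom *atom);
    unsigned num_dw;   /* dwords this atom writes when emitted */
    bool dirty;        /* set when the atom must be re-emitted */
};

#define SKETCH_NUM_ATOMS 2

struct sketch_ctx {
    unsigned cdw;                                /* dwords written so far */
    struct sketch_atom *atoms[SKETCH_NUM_ATOMS]; /* hypothetical fixed table */
};

/* Mirrors the loop added to si_need_cs_space: how much room the dirty
 * atoms will need in the command stream. */
static unsigned sketch_dirty_dwords(const struct sketch_ctx *ctx)
{
    unsigned num_dw = 0;
    for (size_t i = 0; i < SKETCH_NUM_ATOMS; i++) {
        if (ctx->atoms[i]->dirty)
            num_dw += ctx->atoms[i]->num_dw;
    }
    return num_dw;
}

/* Mirrors the loop added to si_draw_vbo: emit every dirty atom. */
static void sketch_emit_dirty(struct sketch_ctx *ctx)
{
    for (size_t i = 0; i < SKETCH_NUM_ATOMS; i++) {
        if (ctx->atoms[i]->dirty)
            ctx->atoms[i]->emit(ctx, ctx->atoms[i]);
    }
}

/* Hypothetical emit hook: accounts for the dwords and clears the flag. */
static void sketch_emit_counted(struct sketch_ctx *ctx, struct sketch_atom *atom)
{
    ctx->cdw += atom->num_dw;
    atom->dirty = false;
}
```

Both loops dereference every table entry, so each slot has to point at a valid atom before the first draw. In the patch as posted the anonymous struct is still empty, which leaves nothing for the loops to visit until atoms are added.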
Re: [Mesa-dev] [PATCH] r600g/llvm: Add missing %s format string to fprintf.
On Sun, Aug 11, 2013 at 07:37:01PM +0200, Jon Severinsson wrote:
> This fixes a compilation warning with -Wformat-security.
>
> CC: 9.2 mesa-sta...@lists.freedesktop.org

Reviewed-by: Tom Stellard thomas.stell...@amd.com

I've pushed this patch, thanks.

-Tom

> ---
>  src/gallium/drivers/radeon/radeon_llvm_emit.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/radeon/radeon_llvm_emit.c b/src/gallium/drivers/radeon/radeon_llvm_emit.c
> index 1a4d4fdd..2dd7bf7b 100644
> --- a/src/gallium/drivers/radeon/radeon_llvm_emit.c
> +++ b/src/gallium/drivers/radeon/radeon_llvm_emit.c
> @@ -124,7 +124,7 @@ unsigned radeon_llvm_compile(LLVMModuleRef M, struct radeon_llvm_binary *binary,
>      r = LLVMTargetMachineEmitToMemoryBuffer(tm, M, LLVMObjectFile, &err, &out_buffer);
>      if (r) {
> -        fprintf(stderr, err);
> +        fprintf(stderr, "%s", err);
>          FREE(err);
>          return 1;
>      }
> --
> 1.7.10.4
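The reason `-Wformat-security` complains is that the error text is data, not a format. A small illustration, assuming a hypothetical helper `sketch_report_error` that writes into a buffer with `snprintf` rather than to `stderr` with `fprintf`, only so that the result can be inspected:

```c
#include <stddef.h>
#include <stdio.h>

/* Copies an error message into buf.  The message is passed as an argument
 * to a constant "%s" format, so any '%' it contains is reproduced literally
 * instead of being parsed as a conversion specifier.
 * Returns what snprintf returns: the length the full message needs. */
static int sketch_report_error(char *buf, size_t size, const char *err)
{
    return snprintf(buf, size, "%s", err);
}
```

Had the message been passed as the format itself, a string such as `"bad operand %s"` would make the printf family read a variadic argument that was never supplied, which is undefined behaviour.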
Re: [Mesa-dev] [PATCH] radeon/llvm: fix compile error with -Werror=format-security
On Tue, Aug 13, 2013 at 06:25:28PM +0200, Maarten Lankhorst wrote:
> Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com

An identical patch was sent to the list a few days ago, and I've just
pushed it now.

-Tom

> ---
> diff --git a/src/gallium/drivers/radeon/radeon_llvm_emit.c b/src/gallium/drivers/radeon/radeon_llvm_emit.c
> index 1a4d4fd..2dd7bf7 100644
> --- a/src/gallium/drivers/radeon/radeon_llvm_emit.c
> +++ b/src/gallium/drivers/radeon/radeon_llvm_emit.c
> @@ -124,7 +124,7 @@ unsigned radeon_llvm_compile(LLVMModuleRef M, struct radeon_llvm_binary *binary,
>      r = LLVMTargetMachineEmitToMemoryBuffer(tm, M, LLVMObjectFile, &err, &out_buffer);
>      if (r) {
> -        fprintf(stderr, err);
> +        fprintf(stderr, "%s", err);
>          FREE(err);
>          return 1;
>      }
Re: [Mesa-dev] [PATCH] configure: link against -lLLVM to determine build type
On Tue, Aug 13, 2013 at 05:53:52PM +0200, Maarten Lankhorst wrote:
> Fixes a build failure of 9.2 on ubuntu, because libLLVM-3.3.so is not
> present in /usr/lib/llvm-3.2/lib.

I'm trying to understand the problem here, could you give a little more
information about how Ubuntu packages LLVM?  Where are the LLVM
libraries installed and what does llvm-config --libdir --ldflags
report?

-Tom

> Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com
> ---
> diff --git a/configure.ac b/configure.ac
> index 35f6797..579d8d4 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -1870,7 +1870,18 @@ if test x$MESA_LLVM != x0; then
>      if test x$with_llvm_shared_libs = xyes; then
>          dnl We can't use $LLVM_VERSION because it has 'svn' stripped out,
>          LLVM_SO_NAME=LLVM-`$LLVM_CONFIG --version`
> -        AS_IF([test -f $LLVM_LIBDIR/lib$LLVM_SO_NAME.so], [llvm_have_one_so=yes])
> +
> +        AC_MSG_CHECKING([whether $LLVM_SO_NAME is a monolithic blob])
> +        save_LIBS=$LIBS
> +        save_LDFLAGS=$LDFLAGS
> +        LDFLAGS="$LDFLAGS $LLVM_LDFLAGS"
> +        LIBS="$LIBS -l$LLVM_SO_NAME"
> +
> +        AC_LINK_IFELSE([AC_LANG_CALL([], [LLVMInitializeCore])],
> +                       [llvm_have_one_so=yes], [llvm_have_one_so=no])
> +        LIBS=$save_LIBS
> +        LDFLAGS=$save_LDFLAGS
> +        AC_MSG_RESULT([$llvm_have_one_so])
>
>          if test x$llvm_have_one_so = xyes; then
>              dnl LLVM was built using auto*, so there is only one shared object.
Re: [Mesa-dev] [PATCH 3/4] r600/radeonsi: implement new float comparison instructions
On Tue, Aug 13, 2013 at 07:04:56PM +0200, srol...@vmware.com wrote:
> From: Roland Scheidegger srol...@vmware.com
>
> Also use ordered comparisons for old cmp instructions.
> Untested.

This patch looks good to me, but I would like to do a piglit run on
radeonsi before you commit.  I will try to do this tomorrow.

-Tom

> ---
>  src/gallium/drivers/r600/r600_shader.c          | 18 ---
>  .../drivers/radeon/radeon_setup_tgsi_llvm.c     | 49 
>  2 files changed, 48 insertions(+), 19 deletions(-)
>
> diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c
> index 37298cc..fb766c4 100644
> --- a/src/gallium/drivers/r600/r600_shader.c
> +++ b/src/gallium/drivers/r600/r600_shader.c
> @@ -5743,11 +5743,10 @@ static struct r600_shader_tgsi_instruction r600_shader_tgsi_instruction[] = {
>      {105,                 0, ALU_OP0_NOP, tgsi_unsupported},
>      {106,                 0, ALU_OP0_NOP, tgsi_unsupported},
>      {TGSI_OPCODE_NOP,     0, ALU_OP0_NOP, tgsi_unsupported},
> -    /* gap */
> -    {108,                 0, ALU_OP0_NOP, tgsi_unsupported},
> -    {109,                 0, ALU_OP0_NOP, tgsi_unsupported},
> -    {110,                 0, ALU_OP0_NOP, tgsi_unsupported},
> -    {111,                 0, ALU_OP0_NOP, tgsi_unsupported},
> +    {TGSI_OPCODE_FSEQ,    0, ALU_OP2_SETE_DX10, tgsi_op2},
> +    {TGSI_OPCODE_FSGE,    0, ALU_OP2_SETGE_DX10, tgsi_op2},
> +    {TGSI_OPCODE_FSLT,    0, ALU_OP2_SETGT_DX10, tgsi_op2_swap},
> +    {TGSI_OPCODE_FSNE,    0, ALU_OP2_SETNE_DX10, tgsi_op2_swap},
>      {TGSI_OPCODE_NRM4,    0, ALU_OP0_NOP, tgsi_unsupported},
>      {TGSI_OPCODE_CALLNZ,  0, ALU_OP0_NOP, tgsi_unsupported},
>      /* gap */
> @@ -5936,11 +5935,10 @@ static struct r600_shader_tgsi_instruction eg_shader_tgsi_instruction[] = {
>      {105,                 0, ALU_OP0_NOP, tgsi_unsupported},
>      {106,                 0, ALU_OP0_NOP, tgsi_unsupported},
>      {TGSI_OPCODE_NOP,     0, ALU_OP0_NOP, tgsi_unsupported},
> -    /* gap */
> -    {108,                 0, ALU_OP0_NOP, tgsi_unsupported},
> -    {109,                 0, ALU_OP0_NOP, tgsi_unsupported},
> -    {110,                 0, ALU_OP0_NOP, tgsi_unsupported},
> -    {111,                 0, ALU_OP0_NOP, tgsi_unsupported},
> +    {TGSI_OPCODE_FSEQ,    0, ALU_OP2_SETE_DX10, tgsi_op2},
> +    {TGSI_OPCODE_FSGE,    0, ALU_OP2_SETGE_DX10, tgsi_op2},
> +    {TGSI_OPCODE_FSLT,    0, ALU_OP2_SETGT_DX10, tgsi_op2_swap},
> +    {TGSI_OPCODE_FSNE,    0, ALU_OP2_SETNE_DX10, tgsi_op2_swap},
>      {TGSI_OPCODE_NRM4,    0, ALU_OP0_NOP, tgsi_unsupported},
>      {TGSI_OPCODE_CALLNZ,  0, ALU_OP0_NOP, tgsi_unsupported},
>      /* gap */
> diff --git a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
> index 7a47746..8ff9abd 100644
> --- a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
> +++ b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
> @@ -850,18 +850,16 @@ static void emit_cmp(
>      LLVMRealPredicate pred;
>      LLVMValueRef cond;
>
> -    /* XXX I'm not sure whether to do unordered or ordered comparisons,
> -     * but llvmpipe uses unordered comparisons, so for consistency we use
> -     * unordered.  (The authors of llvmpipe aren't sure about using
> -     * unordered vs ordered comparisons either.
> +    /* Use ordered for everything but NE (which is usual for
> +     * float comparisons)
>       */
>      switch (emit_data->inst->Instruction.Opcode) {
> -    case TGSI_OPCODE_SGE: pred = LLVMRealUGE; break;
> -    case TGSI_OPCODE_SEQ: pred = LLVMRealUEQ; break;
> -    case TGSI_OPCODE_SLE: pred = LLVMRealULE; break;
> -    case TGSI_OPCODE_SLT: pred = LLVMRealULT; break;
> +    case TGSI_OPCODE_SGE: pred = LLVMRealOGE; break;
> +    case TGSI_OPCODE_SEQ: pred = LLVMRealOEQ; break;
> +    case TGSI_OPCODE_SLE: pred = LLVMRealOLE; break;
> +    case TGSI_OPCODE_SLT: pred = LLVMRealOLT; break;
>      case TGSI_OPCODE_SNE: pred = LLVMRealUNE; break;
> -    case TGSI_OPCODE_SGT: pred = LLVMRealUGT; break;
> +    case TGSI_OPCODE_SGT: pred = LLVMRealOGT; break;
>      default: assert(!"unknown instruction"); pred = 0; break;
>      }
>
> @@ -872,6 +870,35 @@ static void emit_cmp(
>          cond, bld_base->base.one, bld_base->base.zero, "");
>  }
>
> +static void emit_fcmp(
> +    const struct lp_build_tgsi_action *action,
> +    struct lp_build_tgsi_context * bld_base,
> +    struct lp_build_emit_data * emit_data)
> +{
> +    LLVMBuilderRef builder = bld_base->base.gallivm->builder;
> +    LLVMContextRef context = bld_base->base.gallivm->context;
> +    LLVMRealPredicate pred;
> +
> +    /* Use ordered for everything but NE (which is usual for
> +     * float comparisons)
> +     */
> +    switch (emit_data->inst->Instruction.Opcode) {
> +    case TGSI_OPCODE_FSEQ: pred = LLVMRealOEQ; break;
> +    case TGSI_OPCODE_FSGE: pred = LLVMRealOGE; break;
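The ordered/unordered distinction in this patch only matters when an operand is NaN. A short sketch in plain C, under the assumption that the `sketch_*` helpers below are faithful stand-ins for the LLVM predicates named in their comments; they are not LLVM API calls.

```c
#include <math.h>
#include <stdbool.h>

/* Ordered predicates: false whenever either operand is NaN.  C's own
 * ==, <, >= operators already behave this way. */
static bool sketch_oeq(float a, float b) { return a == b; }  /* like OEQ */
static bool sketch_olt(float a, float b) { return a < b; }   /* like OLT */
static bool sketch_oge(float a, float b) { return a >= b; }  /* like OGE */

/* Unordered predicates: true whenever either operand is NaN, otherwise
 * the plain relation. */
static bool sketch_ueq(float a, float b) { return isunordered(a, b) || a == b; }  /* like UEQ */
static bool sketch_ult(float a, float b) { return isunordered(a, b) || a < b; }   /* like ULT */

/* C's != is true when either operand is NaN, i.e. it is the unordered
 * not-equal, which is why NE is the one case the patch leaves as UNE. */
static bool sketch_une(float a, float b) { return a != b; }  /* like UNE */
```

So after the change, a comparison such as SLT with a NaN operand selects the "false" value, matching what the C `<` operator would do, while SNE keeps reporting NaN as not equal to everything, including itself.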
Re: [Mesa-dev] [PATCH v2 1/2] radeonsi: Don't leave gaps between position exports from vertex shader
On Tue, Aug 13, 2013 at 07:39:10PM +0200, Michel Dänzer wrote:
> From: Michel Dänzer michel.daen...@amd.com
>
> If the vertex shader exports clip distances but not point size, use
> position exports 1/2 instead of 2/3 for the clip distances. Fixes
> geometry corruption in that case.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66974
> Cc: mesa-sta...@lists.freedesktop.org
> Signed-off-by: Michel Dänzer michel.daen...@amd.com

I took a look through the LLVM calls in these patches and they look OK
to me.

Reviewed-by: Tom Stellard thomas.stell...@amd.com

> ---
>
> v2: No need to export unused position vectors, just export to
>     consecutive position export slots.
>
>  src/gallium/drivers/radeonsi/radeonsi_shader.c | 135 +++--
>  src/gallium/drivers/radeonsi/radeonsi_shader.h |   1 +
>  src/gallium/drivers/radeonsi/si_state_draw.c   |   6 +-
>  3 files changed, 83 insertions(+), 59 deletions(-)
>
> diff --git a/src/gallium/drivers/radeonsi/radeonsi_shader.c b/src/gallium/drivers/radeonsi/radeonsi_shader.c
> index fee6262..dd9581d 100644
> --- a/src/gallium/drivers/radeonsi/radeonsi_shader.c
> +++ b/src/gallium/drivers/radeonsi/radeonsi_shader.c
> @@ -562,12 +562,11 @@ static void si_alpha_test(struct lp_build_tgsi_context *bld_base,
>  }
>
>  static void si_llvm_emit_clipvertex(struct lp_build_tgsi_context * bld_base,
> -                                    unsigned index)
> +                                    LLVMValueRef (*pos)[9], unsigned index)
>  {
>      struct si_shader_context *si_shader_ctx = si_shader_context(bld_base);
>      struct lp_build_context *base = &bld_base->base;
>      struct lp_build_context *uint = &si_shader_ctx->radeon_bld.soa.bld_base.uint_bld;
> -    LLVMValueRef args[9];
>      unsigned reg_index;
>      unsigned chan;
>      unsigned const_chan;
> @@ -582,6 +581,8 @@ static void si_llvm_emit_clipvertex(struct lp_build_tgsi_context * bld_base,
>      }
>
>      for (reg_index = 0; reg_index < 2; reg_index ++) {
> +        LLVMValueRef *args = pos[2 + reg_index];
> +
>          args[5] =
>          args[6] =
>          args[7] =
> @@ -612,10 +613,6 @@ static void si_llvm_emit_clipvertex(struct lp_build_tgsi_context * bld_base,
>          args[3] = lp_build_const_int32(base->gallivm,
>                                         V_008DFC_SQ_EXP_POS + 2 + reg_index);
>          args[4] = uint->zero;
> -        lp_build_intrinsic(base->gallivm->builder,
> -                           "llvm.SI.export",
> -                           LLVMVoidTypeInContext(base->gallivm->context),
> -                           args, 9);
>      }
>  }
>
> @@ -630,17 +627,18 @@ static void si_llvm_emit_epilogue(struct lp_build_tgsi_context * bld_base)
>      struct tgsi_parse_context *parse = &si_shader_ctx->parse;
>      LLVMValueRef args[9];
>      LLVMValueRef last_args[9] = { 0 };
> +    LLVMValueRef pos_args[4][9] = { { 0 } };
>      unsigned semantic_name;
>      unsigned color_count = 0;
>      unsigned param_count = 0;
>      int depth_index = -1, stencil_index = -1;
> +    int i;
>
>      while (!tgsi_parse_end_of_tokens(parse)) {
>          struct tgsi_full_declaration *d =
>              &parse->FullToken.FullDeclaration;
>          unsigned target;
>          unsigned index;
> -        int i;
>
>          tgsi_parse_token(parse);
>
> @@ -716,7 +714,7 @@ handle_semantic:
>                  target = V_008DFC_SQ_EXP_POS + 2 + d->Semantic.Index;
>                  break;
>              case TGSI_SEMANTIC_CLIPVERTEX:
> -                si_llvm_emit_clipvertex(bld_base, index);
> +                si_llvm_emit_clipvertex(bld_base, pos_args, index);
>                  shader->clip_dist_write = 0xFF;
>                  continue;
>              case TGSI_SEMANTIC_FOG:
> @@ -734,9 +732,13 @@ handle_semantic:
>
>              si_llvm_init_export_args(bld_base, d, index, target, args);
>
> -            if (si_shader_ctx->type == TGSI_PROCESSOR_VERTEX ?
> -                (semantic_name == TGSI_SEMANTIC_POSITION) :
> -                (semantic_name == TGSI_SEMANTIC_COLOR)) {
> +            if (si_shader_ctx->type == TGSI_PROCESSOR_VERTEX &&
> +                target >= V_008DFC_SQ_EXP_POS &&
> +                target <= (V_008DFC_SQ_EXP_POS + 3)) {
> +                memcpy(pos_args[target - V_008DFC_SQ_EXP_POS],
> +                       args, sizeof(args));
> +            } else if (si_shader_ctx->type == TGSI_PROCESSOR_FRAGMENT &&
> +                       semantic_name == TGSI_SEMANTIC_COLOR) {
>                  if (last_args[0]) {
>                      lp_build_intrinsic(base->gallivm->builder,
>                                         "llvm.SI.export",
> @@ -806,66 +808,87 @@
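The rest of this patch is cut off in the archive, so the following is not its code. It is only a sketch of the idea stated in the v2 note, "export to consecutive position export slots", with a hypothetical helper `sketch_pack_pos_exports` and a hypothetical `used[]` array recording which of the four position vectors the shader actually wrote.

```c
#include <stdbool.h>

#define SKETCH_NUM_POS 4

/* used[i]   - whether logical position vector i was written
 *             (0 = position, 1 = point size etc., 2/3 = clip distances).
 * target[i] - receives the export slot for vector i, or -1 if unused.
 * Returns the number of position exports that will be emitted. */
static unsigned
sketch_pack_pos_exports(const bool used[SKETCH_NUM_POS],
                        int target[SKETCH_NUM_POS])
{
    unsigned next = 0;

    for (int i = 0; i < SKETCH_NUM_POS; i++) {
        /* Hand out slots in order, skipping vectors that were never
         * written, so the emitted slots have no gaps between them. */
        target[i] = used[i] ? (int)next++ : -1;
    }
    return next;
}
```

For a shader that writes position and both clip-distance vectors but no point size, this yields slots 0, 1 and 2, which is the "1/2 instead of 2/3" case from the commit message.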