[Mesa-dev] RFH: Android support in Apitrace
Hi,

Another area of Apitrace I'm struggling with is Android support, because:

- I have no feel for whether / how much it actually matters to users
- I have no idea whether it even works or not

I'm afraid the status quo is unbearable: Android support is officially advertised, but users keep complaining about issues building/using it, and I can't help them nor even tell them whether it's supposed to work. Finally, there are alternatives for Android out there. In particular Google seems to be actively working on https://github.com/google/gapid

So my plan is to:

1) start advertising that Android is no longer supported (still allow the build, but place loud warnings)
2) yank out support completely after some time (a month at most)

If by any chance Android still matters and works (or can easily be made to work), then I'm counting on people to step up and give a hand to make supporting it viable. To halt (or revert) the Android deprecation process I'll need the following:

1) up-to-date instructions on how to build and use with recent Android versions / SDKs
2) a Travis-CI build for Android (if somebody helps with item 1 I can help with this item 2)
3) a very simple sanity test case using an Android emulator (hopefully a test that can be run from a Travis-CI job too, but that can wait -- as long as any Android user can easily run that test locally and verify things work minimally, I'm happy)

I don't have experience with Android, nor the need/time to learn now, so I'll be relying totally on the community for this one too. I don't think these requests are too much to ask if Android truly matters.

Jose

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] texobj: add verbose api trace messages to several routines
On Sat, Jun 15, 2013 at 8:05 PM, Carl Worth cwo...@cworth.org wrote:

> José Fonseca jfons...@vmware.com writes:
>> FYI, with https://github.com/apitrace/apitrace/commit/7700f74f294a28e57860487b917c8807156b3ad1 apitrace no longer crashes with Steam overlay.
>
> Thanks, José! This will be quite handy. And it's a fix I'd been hoping to make for apitrace, but hadn't gotten around to.
>
> It's also nice to see that you got __libc_dlsym working. When I wrote a similar dlsym wrapper for fips I think I tried that but didn't get it to work for some reason. (I ended up with something that feels more fragile by calling dlvsym to find dlsym.) So I'll revisit this with the apitrace code in mind.
>
>> But the overlay's rendering is not shown/captured when apitrace uses LD_PRELOAD.
>
> Do you know why this is?

No, I don't have a clear understanding. There are many ways multiple LD_PRELOAD objects can interfere with each other (beyond the problem found). If all preloaded .so objects did dlsym(RTLD_NEXT, "foo") then they would probably nest safely. But we need to intercept dlsym and/or dlopen, plus functions like glXGetProcAddress. And even if they nested properly, depending on the ordering of apitrace vs. the overlay one might get the overlay working but not actually recorded in the file. But I get the feeling that something is triggering the Steam overlay to not activate at all.

> I'd definitely be interested in making apitrace work more transparently here.

Me too. But there are so many apps that only work well with LD_LIBRARY_PATH that it makes me think the best way forward is not to keep adding hacks for LD_PRELOAD to work, but rather making apitrace seamlessly use LD_LIBRARY_PATH out of the box.

Jose
Re: [Mesa-dev] [PATCH] texobj: add verbose api trace messages to several routines
On Tue, Mar 5, 2013 at 9:11 AM, Alexander Monakov amona...@ispras.ru wrote:

> On Mon, 4 Mar 2013, Carl Worth wrote:
>> I don't think it should be too hard to get this to work (though it may require a source change to the Steam overlay). I'll do some more experiments tomorrow and see if I can't make a concrete recommendation to be able to give to Valve as a bug report. And, obviously, if there's something we can do to make apitrace more robust here, I'll submit that change as well.
>
> I faced a similar issue in my lightweight OpenGL offloading via GLX forking project, primus [1], and I'm successfully using the following approach:
>
> 1) The interposition is based on LD_LIBRARY_PATH augmentation; LD_PRELOAD is not employed.
>    Pro: no ugly tricks with dlopen/dlsym interposition.
>    Contra: the interposer needs to know the path to the real OpenGL library. For APITrace, /usr/$LIB/libGL.so.1 will usually be a working default choice (NB: $LIB is a special token expanded by the dynamic loader, not the shell).
>
> 2) Specifically in order to work with the Steam overlay, the interposer *protects itself from dlsym interposition* by using real_dlsym = dlsym(dlopen("libdl.so.2"), "dlsym"). Similar protection against dlopen interposition would also make sense.
>
> Hope that helps,
> Alexander
>
> [1] https://github.com/amonakov/primus
>
> ___
> apitrace mailing list
> apitr...@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/apitrace

FYI, with https://github.com/apitrace/apitrace/commit/7700f74f294a28e57860487b917c8807156b3ad1 apitrace no longer crashes with the Steam overlay. But the overlay's rendering is not shown/captured when apitrace uses LD_PRELOAD. Using the LD_LIBRARY_PATH mechanism captures everything just fine. Detailed instructions on https://github.com/apitrace/apitrace/wiki/Steam

Jose
Re: [Mesa-dev] and a random apitrace/gallium question..
Thanks. That sounds handy.

Another potentially very useful improvement along those lines would be for apitrace to replay GL_KHR_debug's glObjectLabel() calls, and then use those application-supplied labels instead of numbers when dumping state (in qapitrace). This would enable the app developer (or GL implementer) to see e.g. "rock texture" instead of "texture 99". And it would be relatively easy to implement on top of what you probably have.

Of course, this would require GL apps and vendors to use/implement GL_KHR_debug. But given that it is part of the OpenGL 4.3 standard, and is a small superset of ARB_debug_output, that shouldn't be an obstacle.

Jose

On Mon, Jun 3, 2013 at 5:29 PM, Peter Lohrmann pet...@valvesoftware.com wrote:

> I have a local change which extends ApiTrace to better support the various debug extensions, and have used it to mark up my traces. I don't track the whole log or handle the registering of callback functions, but do allow glDebugMessageInsertARB(..) and a few others to be included in the trace (and they will get replayed if the host driver supports the extension). This allows those messages to also be used by other 3rd-party debugging tools.
>
> At some point I hope to extend qapitrace to utilize these markers in the list of api calls to give a hierarchical view, but that's a ways off.
>
> I'll try to isolate that change and create a branch today to make it more widely available.
>
> - Peter
>
> -----Original Message-----
> From: apitrace-bounces+peterl=valvesoftware@lists.freedesktop.org [mailto:apitrace-bounces+peterl=valvesoftware@lists.freedesktop.org] On Behalf Of Rob Clark
> Sent: Monday, June 03, 2013 8:59 AM
> To: Jose Fonseca
> Cc: mesa-dev@lists.freedesktop.org; apitr...@lists.freedesktop.org
> Subject: Re: [Mesa-dev] and a random apitrace/gallium question..
On Mon, Jun 3, 2013 at 11:56 AM, Jose Fonseca jfons...@vmware.com wrote:

- Original Message -

>> On Fri, May 31, 2013 at 10:18 AM, José Fonseca jose.r.fons...@gmail.com wrote:
>>> I'd support such a change. Be it through GL_GREMEDY_string_marker, or ARB_debug_output's glDebugMessageInsertARB(DEBUG_SOURCE_THIRD_PARTY_ARB, ...), or KHR_debug's glPushDebugGroup(). A Gallium interface change would be necessary to pass these annotations to the drivers. This discussion would be more appropriate on the mesa-dev mailing list though.
>
> I looked at the relevant specs (KHR_debug, ARB_debug_output), and I believe the most natural / standard-compliant way of implementing this would be to rely on glDebugMessageInsertARB(GL_DEBUG_SOURCE_THIRD_PARTY_ARB).

hmm, these look more about letting the gl driver send log msgs to the app..

> Far from it. The spec is crystal clear on that regard, from http://www.opengl.org/registry/specs/KHR/debug.txt :
>
> [...] This extension also defines debug markers, a mechanism for the OpenGL application to annotate the command stream with markers for discrete events.

ahh, my bad.. I stopped reading too soon :-P

yeah, then it sounds like a good fit

BR,
-R

> [...]
>
> 5.5.1 - Debug Messages
>
> A debug message is uniquely identified by the source that generated it, a type within that source, and an unsigned integer ID identifying the message within that type. The message source is one of the symbolic constants listed in Table 5.3. The message type is one of the symbolic constants listed in Table 5.4.
>   Debug Output Message Source         Messages Generated by
>   ----------------------------------  --------------------------------------
>   DEBUG_SOURCE_API_ARB                The GL
>   DEBUG_SOURCE_SHADER_COMPILER_ARB    The GLSL shader compiler or compilers
>                                       for other extension-provided languages
>   DEBUG_SOURCE_WINDOW_SYSTEM_ARB      The window system, such as WGL or GLX
>   DEBUG_SOURCE_THIRD_PARTY_ARB        External debuggers or third-party
>                                       middleware libraries
>   DEBUG_SOURCE_APPLICATION_ARB        The application
>   DEBUG_SOURCE_OTHER_ARB              Sources that do not fit to any of the
>                                       ones listed above
>
>   Table 5.3: Sources of debug output messages. Each message must originate
>   from a source listed in this table.

although maybe it is the best extension we have?

> It seems to fit our needs quite well AFAICT. KHR_debug is pretty much a superset of everything out there, plus it is part of core OpenGL 4.3. And there is even a source for our needs -- DEBUG_SOURCE_THIRD_PARTY_ARB -- "External debuggers..."
>
> Jose
Re: [Mesa-dev] mesa-8.0.5: LLVM-3.2 patchset and request for cherry-picking
(Switching from the dead mesa3d-...@sf.net ML to mesa-...@fdo.org)

On Thu, Jan 17, 2013 at 11:23 PM, Sedat Dilek sedat.di...@gmail.com wrote:

> Hi,
>
> with the following patchset (13 patches) I was able to build mesa-8.0.5 with LLVM v3.2. There is one big fat patch called "gallivm,draw,llvmpipe: Support wider native registers." [1] which makes backporting hard. Jose?

I don't understand the exact question you're asking me. Regardless of whether you backport this particular patch or not, you will get an entirely untested tree, and I can't advise you without spending considerable time looking into the code myself -- time which I really don't have. So I'm afraid you'll need to use your best judgment, and maybe do some piglit runs with llvmpipe if you want to be safe rather than sorry.

Jose

> Regards,
> - Sedat -
>
> [1] http://cgit.freedesktop.org/mesa/mesa/commit/?id=3469715
>
> P.S.: Patchset fixing build of mesa-8.0.5 with LLVM/CLANG v3.2
>
> [ gallium-auxiliary-fixes-for-8-0-5 (PENDING) ]
> 4b7b71a rtti was removed from more llvm libraries.
>         Thanks to d0k for the hint via IRC #llvm on irc.oftc.net
>         For more details see [1] and followup [2] discussion (Thanks Johannes Obermayr again)!
>         [1] http://lists.freedesktop.org/archives/mesa-dev/2012-October/029167.html
>         [2] http://lists.freedesktop.org/archives/mesa-dev/2012-October/029184.html
>
> [ gallivm-fixes-for-8-0-5 (CHERRY-PICKED) ]
> 920a940 gallivm: Fix createOProfileJITEventListener namespace with llvm-3.1.
> d998daf gallivm: Add constructor for raw_debug_ostream.
> af1f68a gallivm: Add MCRegisterInfo.h to silence benign warnings about missing implementation.
> ad88aac gallivm: Pass in a MCInstrInfo to createMCInstPrinter on llvm-3.1.
> 395c791 gallivm: Fix method overriding in raw_debug_ostream.
> 557632f gallivm: Use InitializeNativeTargetDisassembler().
> 6c0144a gallivm: Pass in a MCRegisterInfo to MCInstPrinter on llvm-3.1.
> 1bb5b0d gallivm: Initialize x86 disassembler on x86_64 too.
> 4d25e57 gallivm: Replace architecture test with PIPE_ARCH_*
> 192859a gallivm: Fix LLVM-2.7 build.
> 2dfd7e5 Initialize only native LLVM Disassembler.
>
> [ dri-nouveau-fixes-for-8-0-5 (CHERRY-PICKED) ]
> abd8713 dri/nouveau: don't use nested functions
>
> - EOT -
Re: [Mesa-dev] [ANNOUNCE] apitrace 3.0
On Sun, Mar 11, 2012 at 7:00 PM, José Fonseca jose.r.fons...@gmail.com wrote:

> On Sun, Mar 11, 2012 at 6:59 PM, Dave Airlie airl...@gmail.com wrote:
>> (resend to include list, I only sent it to Jose by accident).
>>
>> On Fri, Mar 9, 2012 at 8:22 PM, José Fonseca jose.r.fons...@gmail.com wrote:
>>> There are several new features in apitrace that deserve an announcement:
>>
>> I had an idea for a feature the other day but no idea how sane or useful it would actually be. I thought about trying to integrate callgrind and apitrace somehow, so we could instrument the CPU usage for a single frame of an application. I think we'd have to run the app under callgrind with instrumentation disabled (still slow) then enable it for the single frame. Just wondering if anyone else thinks it's a good idea or knows how to implement it.
>
> IIUC, you're looking for the ability to profile the app (and not the GL implementation), for a particular frame, is that right?
>
> There is definitely interest in more profiling on apitrace:
>
> - there have been some patches for GPU/CPU profiling while tracing (see the timing-trace branch) but I believe it is more useful to profile while retracing
> - there have been some patches for GPU profiling while retracing (timing-retrace branch) but they need some good cleanup
> - Ryan Gordon committed patches for CPU profiling while retracing (committed, glretrace -p option)
>
> Concerning callgrind integration, I personally don't have much trust in callgrind as a profiling tool -- it emulates a CPU instead of measuring one. It does so using JIT binary translation, which is why it is so slow. Lately I've been using Linux perf with good results.
>
> Either way, I don't know if there's any interface to start/stop measuring or emit some marker from within the app being profiled. A full OpenGL wrapper like apitrace is not necessary -- an LD_PRELOAD library with just glXSwapBuffers is sufficient -- all that's necessary is to start/stop/reset the profiler on that call.
>
> I think the hard thing here is really starting/stopping profiling programmatically.
>
> Jose

Actually it looks like Linux perf has a facility to emit custom profiling events from within the app: http://lwn.net/Articles/415839/

A capability I saw in this interesting Intel presentation, which also used it with apitrace: https://events.linuxfoundation.org/slides/2011/linuxcon-japan/lcj2011_linming.pdf

It should be possible to trim the perf events with perf script somehow.

Jose
[Mesa-dev] [ANNOUNCE] apitrace 3.0
There are several new features in apitrace that deserve an announcement:

* Top-level `apitrace` command which greatly simplifies using apitrace from the command line (Carl Worth from Intel Inc.)
* Trace and retrace support for EGL, GLES1, and GLES2 APIs on Linux (Chia-I Wu from LunarG Inc., plus portability enhancements from several other interested parties)
* Basic ability to trim traces (also from Carl Worth)
* Many bug fixes.

A tarball is available from https://github.com/apitrace/apitrace/tags and Windows binaries from https://github.com/apitrace/apitrace/downloads, but I do recommend building from the master branch's source whenever possible, as new enhancements are bound to land soon.

Feel free to report any issues on https://github.com/apitrace/apitrace/issues or ask questions on http://lists.freedesktop.org/mailman/listinfo/apitrace .

Jose
[Mesa-dev] [PATCH] ralloc: Use _vscprintf on MinGW.
MinGW uses MSVC's runtime DLLs for most of the C runtime's functions, and therefore has the same semantics for vsnprintf. Not sure how this worked until now -- maybe one of the internal vsnprintf implementations was taking precedence.
---
 src/glsl/ralloc.c | 8 +++++++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/src/glsl/ralloc.c b/src/glsl/ralloc.c
index fb48a91..b4486f8 100644
--- a/src/glsl/ralloc.c
+++ b/src/glsl/ralloc.c
@@ -33,6 +33,12 @@
 #include <limits.h>
 #endif

+/* Some versions of MinGW are missing _vscprintf's declaration, although they
+ * still provide the symbol in the import library. */
+#ifdef __MINGW32__
+_CRTIMP int _vscprintf(const char *format, va_list argptr);
+#endif
+
 #include "ralloc.h"

 #ifdef __GNUC__
@@ -397,7 +403,7 @@ printf_length(const char *fmt, va_list untouched_args)
    va_list args;

    va_copy(args, untouched_args);

-#ifdef _MSC_VER
+#ifdef _WIN32
    /* We need to use _vscprintf to calculate the size as vsnprintf returns -1
     * if the number of characters to write is greater than count. */
[Mesa-dev] [ANNOUNCE] apitrace 2.0
There were a bunch of features accumulated on apitrace's master branch, so I've tagged version 2.0:

* Flush/sync the trace file only when there is an uncaught signal/exception, yielding a 5x speed-up while tracing.
* Employ the Google Snappy compression library instead of zlib, yielding a further 2x speed-up while tracing.
* Better GUI performance with very large traces, by loading frames from disk on demand.
* Implement and advertise the `GL_GREMEDY_string_marker` and `GL_GREMEDY_frame_terminator` extensions.
* Mac OS X support.
* Support for up to OpenGL 4.2 calls.

Tarball available from https://github.com/apitrace/apitrace/tarball/2.0

Feel free to report any issues on https://github.com/apitrace/apitrace/issues or ask questions on http://lists.freedesktop.org/mailman/listinfo/apitrace

Jose
[Mesa-dev] apitrace 1.0 (was: glretrace)
FYI, over these past months I've been continuing to improve glretrace in my spare time, and lifted several of the previous limitations: added support for DrawArrays/Elements with user pointers; Map/UnmapBuffer; more extensions; several bug fixes. To the point that a great deal of GL apps trace and retrace flawlessly (tested e.g. Quake3, Unigine Heaven, Cinebench R11, Autodesk Maya).

And recently Zack Rusin offered to write a Qt-based GUI for glretrace, also on his personal time. Warning: this GUI is not the cherry on top of the cake, it's the whole sugar coating, and it will make you salivate like a rabid dog! It allows you to view the traced calls, view the state (parameters, shaders, textures, etc.) at any call, edit the calls, and more. Pretty amazing stuff, useful both for GL users and implementers, and I'm sure Zack will write up about it very soon.

We thought the work so far was worthy of a 1.0 release -- not 1.0 as in it's-all-done-and-finished, but more in the sense of being ready to be used by a wider audience, who will hopefully contribute stuff back to it (as the TODO is actually bigger than before, since for every item we strike out, we get ideas for two more).

You can get the source from https://github.com/apitrace/apitrace/tree/1.0 . Build/usage instructions are in the README.

Jose

- Original Message -
FYI, I've extended my apitrace project to be able not only to show a list of GL calls, but also replay them. The idea is to facilitate reproduction of bugs, help finding regressions, and do it across different platforms.

Code is in http://cgit.freedesktop.org/~jrfonseca/apitrace/log/?h=retrace

There is a sample trace on http://people.freedesktop.org/~jrfonseca/traces/ which can be replayed with:

    glretrace -db topogun-1.06-orc-84k.trace

The GL API is massive, and despite code generation being used, there are many holes. The biggest is non-immediate non-VBO arrays/elements drawing (i.e., glXxPointer with a pointer to vertices instead of offsets). And there's much more listed in the TODO file. There are also many places where the size of arrays depends on another parameter and an auxiliary function is needed. Covering the whole GL API is a herculean task, so help would be very much welcome.

Jose
Re: [Mesa-dev] per-vertex array max_index
On 04/15/2011 07:20 PM, Christoph Bumiller wrote:

> On 15.04.2011 18:04, Brian Paul wrote:
>> Hi Marek,
>>
>> Back on Jan 10 you removed the per-vertex array max_index field (commit cdca3c58aa2d9549f5188910e2a77b438516714f). I believe this was a mistake. I noticed today that the piglit draw-instanced-divisor test is failing. A bisection points to Jose's commit 17bbc1f0425b3768e26473eccea5f2570dcb26d3. But that's a red herring. If I disable the SSE code paths, the regression goes back to the Jan 10 change.
>>
>> With the GL_ARB_instanced_arrays extension, vertex array indexing changes quite a bit. Specifically, some arrays (those with divisors != 0) are now indexed by instance ID, not the primitive's vertex index. The draw_info::max_index field doesn't let us take this into account. I believe that we really need a per-array max_index value.
>>
>> As an example, suppose we're drawing a star field with 1 million instanced stars, each star drawn as a 4-vertex quad. We might use a vertex array indexed by the instance ID to color the stars. The draw call would look like:
>>
>>    glDrawArraysInstanced(GL_QUADS, 0, 4, 1000*1000);
>>
>> In this case we'd have two vertex arrays. The first array is the quad vertex positions with four elements. The second array is the star colors with 1 million elements. As it is now, we're setting draw_info::max_index = 4 and we errantly clamp the index into the second array to 4 instead of 1 million.
>>
>> As a temporary workaround we can disable clamping of array indexes for instance arrays. But I think we need to revert the Jan 10 commit and then rework some of Jose's related changes.
>>
>> -Brian
>
> You know which vertex elements are per-instance, you know the divisor, and you know the max instance index - that should be all the information you need. You just have to clamp it to startInstance + instanceCount for them instead.

draw_info::max_index is an optimization hint. In theory it could always be 0xffffffff and the pipe driver should still cope with it. Provided it is an upper bound of the true max index in the index buffer, it should cause no visible difference.

My u_draw.c code already ignores instanced elements when computing max_index. And I believe the translate module doesn't clamp to max_index when fetching instanced elements, but I'll have to double-check.

I didn't look too closely at the current st_draw.c code yet. But it appears the bug is in st_draw.c, as there is no need to artificially limit the max_index passed by the app.

Jose
Re: [Mesa-dev] per-vertex array max_index
On 04/15/2011 08:51 PM, Brian Paul wrote:

> On 04/15/2011 01:41 PM, José Fonseca wrote:
>> [...]
>>
>> draw_info::max_index is an optimization hint. In theory it could always be 0xffffffff and the pipe driver should still cope with it. Provided it is an upper bound of the true max index in the index buffer, it should cause no visible difference.
>>
>> My u_draw.c code already ignores instanced elements when computing max_index. And I believe the translate module doesn't clamp to max_index when fetching instanced elements, but I'll have to double-check.
>>
>> I didn't look too closely at the current st_draw.c code yet. But it appears the bug is in st_draw.c, as there is no need to artificially limit the max_index passed by the app.
>
> I thought we were using max_index to prevent out-of-bounds buffer reads. Are we not doing that anywhere?

Inside the draw module there is a max_index with that meaning: draw_context->pt.max_index.

This is all very subtle, but min_index/max_index is really a property of the index buffer alone, independent of vertex buffer sizes and/or any index_bias.

src/gallium/docs/source/context.rst could be a bit better... This is what it has currently:

  All vertex indices must fall inside the range given by ``min_index`` and
  ``max_index``. In case of non-indexed draw, ``min_index`` should be set to
  ``start`` and ``max_index`` should be set to ``start`` + ``count`` - 1.

  ``index_bias`` is a value added to every vertex index before fetching vertex
  attributes. It does not affect ``min_index`` and ``max_index``.

  If there is an index buffer bound, and the ``indexed`` field is true, all
  vertex indices will be looked up in the index buffer. ``min_index``,
  ``max_index``, and ``index_bias`` apply after index lookup.

This would be a better description:

  In indexed draw, all vertex indices in the vertex buffer should fall inside
  the range given by ``min_index`` and ``max_index``, but the driver should
  not crash or hang if not. It is merely an optimization hint so the driver
  knows which range of vertices will be referenced without having to scan the
  index buffer with the CPU. A state tracker could always set ``min_index``
  and ``max_index`` to 0 and 0xffffffff respectively, and the only prejudice
  would be performance, not stability, which means that the driver should
  internally guarantee there will be no out-of-bounds accesses.

  In case of non-indexed draw, ``min_index`` should be set to ``start`` and
  ``max_index`` should be set to ``start`` + ``count`` - 1.

  ``index_bias`` is a value added to every vertex index from the index buffer
  before fetching vertex attributes. It does not affect ``min_index`` and
  ``max_index``.

  If there is an index buffer bound, and the ``indexed`` field is true, all
  vertex indices will be looked up in the index buffer.

Any suggestions for better English welcome.

Jose
Re: [Mesa-dev] [PATCH] draw-robustness: Test robustness for out-of-bounds vertex fetches.
On 03/31/2011 09:38 PM, Eric Anholt wrote:

> On Thu, 31 Mar 2011 14:46:32 +0100, jfons...@vmware.com wrote:
>> +/* Test whether out-of-bounds vertex buffer object fetches cause termination.
>> + *
>> + * Note that the original ARB_vertex_buffer_object extension explicitly states
>> + * program termination is allowed when out-of-bounds vertex buffer object
>> + * fetches occur. The ARB_robustness extension does provide an enable to
>> + * guarantee that out-of-bounds buffer object accesses by the GPU will have
>> + * deterministic behavior and preclude application instability or termination
>> + * due to an incorrect buffer access. But regardless of ARB_robustness
>> + * extension support it is a good idea not to crash. For example, viewperf
>> + * doesn't properly detect NV_primitive_restart and emits 0xffffffff indices
>> + * which can result in crashes.
>> + *
>> + * TODO:
>> + * - test out-of-bounds index buffer object access
>> + * - test more vertex/element formats
>> + * - test non-aligned offsets
>> + * - provide a command line option to actually enable ARB_robustness
>> + */
>
> Instead of a single draw-robustness test to (eventually) test all these things, a draw-robustness-* collection of tests for each would be way more useful.

Sounds good. Will do.

Jose
Re: [Mesa-dev] [PATCH] draw-robustness: Test robustness for out-of-bounds vertex fetches.
On 03/31/2011 09:45 PM, Ian Romanick wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 03/31/2011 06:46 AM, jfons...@vmware.com wrote: From: José Fonsecajfons...@vmware.com Not added to the standard test lists given that ARB_vertex_buffer_object allows program termination out-of-bounds vertex buffer object fetches occur. In anticipation of making real GL_ARB_robustness tests, I'd suggest putting this in tests/spec/ARB_robustness. We can add it to the general test list once it has ARB_robustness support, which I see listed as a todo item. OK. There are a couple comments below. --- tests/general/CMakeLists.gl.txt |1 + tests/general/draw-robustness.c | 201 +++ 2 files changed, 202 insertions(+), 0 deletions(-) create mode 100644 tests/general/draw-robustness.c diff --git a/tests/general/CMakeLists.gl.txt b/tests/general/CMakeLists.gl.txt index bbe6507..d373e35 100644 --- a/tests/general/CMakeLists.gl.txt +++ b/tests/general/CMakeLists.gl.txt @@ -36,6 +36,7 @@ ENDIF (UNIX) add_executable (draw-elements-vs-inputs draw-elements-vs-inputs.c) add_executable (draw-instanced draw-instanced.c) add_executable (draw-instanced-divisor draw-instanced-divisor.c) +add_executable (draw-robustness draw-robustness.c) add_executable (draw-vertices draw-vertices.c) add_executable (draw-vertices-half-float draw-vertices-half-float.c) add_executable (fog-modes fog-modes.c) diff --git a/tests/general/draw-robustness.c b/tests/general/draw-robustness.c new file mode 100644 index 000..a13f568 --- /dev/null +++ b/tests/general/draw-robustness.c @@ -0,0 +1,201 @@ +/* + * Copyright (C) 2011 VMware, Inc. 
+ * Copyright (C) 2010 Marek Olšákmar...@gmail.com + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER + * DEALINGS IN THE SOFTWARE. + * + * Authors: + * Jose Fonsecajfons...@vmware.com + * Based on code from Marek Olšákmar...@gmail.com + */ + +/* Test whether out-of-bounds vertex buffer object fetches cause termination. + * + * Note that the original ARB_vertex_buffer_object extension explicitly states + * program termination is allowed when out-of-bounds vertex buffer object + * fetches occur. The ARB_robustness extension provides an enable to + * guarantee that out-of-bounds buffer object accesses by the GPU will have + * deterministic behavior and preclude application instability or termination + * due to an incorrect buffer access. But regardless of ARB_robustness + * extension support it is a good idea not to crash. For example, viewperf + * doesn't properly detect NV_primitive_restart and emits 0xffffffff indices + * which can result in crashes. 
+ * + * TODO: + * - test out-of-bound index buffer object access + * - test more vertex/element formats + * - test non-aligned offsets + * - provide a command line option to actually enable ARB_robustness + */ + +#include "piglit-util.h" + +int piglit_width = 320, piglit_height = 320; +int piglit_window_mode = GLUT_RGB; + +void piglit_init(int argc, char **argv) +{ +piglit_ortho_projection(piglit_width, piglit_height, GL_FALSE); + +if (!GLEW_VERSION_1_5) { +printf("Requires OpenGL 1.5\n"); +piglit_report_result(PIGLIT_SKIP); +} + +glShadeModel(GL_FLAT); +glClearColor(0.2, 0.2, 0.2, 1.0); +} + +static void +random_vertices(GLsizei offset, GLsizei stride, GLsizei count) +{ +GLsizei size = offset + (count - 1)*stride + 2 * sizeof(GLfloat); +GLubyte *vertices; +GLsizei i; + +assert(offset % sizeof(GLfloat) == 0); +assert(stride % sizeof(GLfloat) == 0); + +vertices = malloc(size); +assert(vertices); +if (!vertices) { +return; +} Since this is with a VBO, why not use MapBuffer/UnmapBuffer instead of malloc/free? OK. +if (0) { +fprintf(stderr, "vertex_offset = %i\n", vertex_offset); +fprintf(stderr, "vertex_stride = %i\n", vertex_stride); +fprintf(stderr,
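The sizing arithmetic at the heart of this test can be isolated in plain C. This is not piglit code -- `required_vbo_size` and `fetch_out_of_bounds` are hypothetical helper names mirroring the computation in `random_vertices()` above, showing when a draw call's vertex count implies fetches past the end of the allocated buffer:

```c
#include <assert.h>
#include <stddef.h>

/* Minimal bytes a VBO must hold so that `count` two-float vertices,
 * starting at `offset` with the given `stride`, stay in bounds.
 * Mirrors: size = offset + (count - 1)*stride + 2 * sizeof(GLfloat). */
static size_t required_vbo_size(size_t offset, size_t stride, size_t count)
{
    return offset + (count - 1) * stride + 2 * sizeof(float);
}

/* A fetch is out of bounds when the draw call's count implies a
 * larger buffer than was actually allocated. */
static int fetch_out_of_bounds(size_t vbo_size, size_t offset,
                               size_t stride, size_t count)
{
    return required_vbo_size(offset, stride, count) > vbo_size;
}
```

A robust implementation must handle the out-of-bounds case without terminating the program, whether or not ARB_robustness is enabled.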
Re: [Mesa-dev] [PATCH] gallivm: Fix build with llvm-2.9
On 03/27/2011 04:11 PM, Tobias Droste wrote: In llvm-2.9 Target->createMCInstPrinter() takes different arguments Signed-off-by: Tobias Drostetdro...@gmx.de --- src/gallium/auxiliary/gallivm/lp_bld_debug.cpp | 12 +--- 1 files changed, 9 insertions(+), 3 deletions(-) diff --git a/src/gallium/auxiliary/gallivm/lp_bld_debug.cpp b/src/gallium/auxiliary/gallivm/lp_bld_debug.cpp index 1f24cb6..76d63ce 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_debug.cpp +++ b/src/gallium/auxiliary/gallivm/lp_bld_debug.cpp @@ -207,9 +207,17 @@ lp_disassemble(const void* func) } raw_debug_ostream Out; + TargetMachine *TM = T->createTargetMachine(Triple, ""); +#if HAVE_LLVM >= 0x0209 + unsigned int AsmPrinterVariant = AsmInfo->getAssemblerDialect(); +#else int AsmPrinterVariant = AsmInfo->getAssemblerDialect(); -#if HAVE_LLVM >= 0x0208 +#endif +#if HAVE_LLVM >= 0x0209 + OwningPtr<MCInstPrinter> Printer( + T->createMCInstPrinter(*TM, AsmPrinterVariant, *AsmInfo)); +#elif HAVE_LLVM >= 0x0208 OwningPtr<MCInstPrinter> Printer( T->createMCInstPrinter(AsmPrinterVariant, *AsmInfo)); #else @@ -221,8 +229,6 @@ lp_disassemble(const void* func) return; } - TargetMachine *TM = T->createTargetMachine(Triple, ""); - const TargetInstrInfo *TII = TM->getInstrInfo(); /* Applied. Thanks. Jose ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
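The patch gates on Mesa's `HAVE_LLVM` macro, which packs the LLVM version as hex `0xMMmm` (so 2.9 becomes 0x0209). A minimal self-contained sketch of that version-gating idiom follows; the `HAVE_LLVM` value is hard-coded here purely for illustration, whereas in the real tree it comes from the build system:

```c
#include <assert.h>

/* Hypothetical hard-coded value; normally supplied by configure/make. */
#define HAVE_LLVM 0x0209

/* Pack (major, minor) the same way: major in the high byte. */
#define LLVM_VERSION(major, minor) (((major) << 8) | (minor))

/* Returns which API variant the preprocessor selected:
 * 1 for the llvm-2.9+ path, 0 for the older path. */
static int selected_api_variant(void)
{
#if HAVE_LLVM >= 0x0209
    return 1;   /* e.g. unsigned AsmPrinterVariant, new createMCInstPrinter */
#else
    return 0;   /* e.g. int AsmPrinterVariant, old createMCInstPrinter */
#endif
}
```

Because the check happens at preprocessing time, each build only ever compiles the code path that matches the LLVM headers it is built against.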
Re: [Mesa-dev] Truncated extensions string
On Sat, 2011-03-12 at 01:22 -0800, Kenneth Graunke wrote: On Friday, March 11, 2011 01:23:12 PM Patrick Baggett wrote: I feel like there is some kind of underlying lesson that we, OpenGL app programmers, should be getting out of this... Yes. Don't blindly copy arbitrary amounts of data into a fixed-size buffer. :) I hate to be trite, but that -is- the entire problem: a classic buffer overflow, the kind we warn people about in early programming courses. There is no buffer overflow, just truncation, e.g., as if strncpy is used. The crash I saw is most likely because the truncated extensions string either has an inconsistent set of extensions or it is missing a very basic extension. As Ian pointed out, it's absolutely trivial to do this correctly: if you're going to copy it, just malloc a buffer that's the correct size. Indeed. Jose ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
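The two copying patterns under discussion can be contrasted in a few lines of plain C. `copy_truncating` is the buggy fixed-buffer pattern that apps like the Quake3 demo appear to use; `copy_exact` is the trivially correct approach Ian describes. The helper names are illustrative, not taken from any application's actual code:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Buggy pattern: copy into a fixed-size buffer, forcibly NUL-terminating,
 * which silently truncates a long extension string. */
static void copy_truncating(char *dst, size_t dst_size, const char *ext)
{
    strncpy(dst, ext, dst_size - 1);  /* may drop the tail... */
    dst[dst_size - 1] = '\0';         /* ...and hides it with a NUL */
}

/* Correct pattern: size the destination from the string itself. */
static char *copy_exact(const char *ext)
{
    char *dst = malloc(strlen(ext) + 1);
    if (dst)
        strcpy(dst, ext);
    return dst;
}

/* Helpers so the behaviors can be checked without GL. */
static int truncated_len(const char *ext, size_t dst_size)
{
    char buf[64];
    copy_truncating(buf, dst_size, ext);
    return (int)strlen(buf);
}

static int exact_roundtrip(const char *ext)
{
    char *p = copy_exact(ext);
    int ok = p != NULL && strcmp(p, ext) == 0;
    free(p);
    return ok;
}
```

With the truncating pattern, an extension name can be cut mid-word, which is exactly how an app ends up with an "inconsistent" extension set and trips over itself.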
Re: [Mesa-dev] Truncated extensions string
On Tue, 2011-03-15 at 13:26 -0700, Ian Romanick wrote: On 03/15/2011 08:29 AM, José Fonseca wrote: Attached is a new version of the patch that keeps the extensions sorted alphabetically in the source code, and sorts chronologically at runtime. (Note the patch doesn't apply against the tip of master, but a version a few weeks old -- I'll update before committing to master.) I don't see how this fixes the problem, so perhaps I'm misunderstanding the problem. Don't these buggy apps still get an extension string that's longer than they know how to handle? That's why Ken mentioned the use of an environment variable so that the app only ever sees, for example, the extensions that existed before 2003. After all, the apps that aren't smart enough to dynamically allocate a buffer probably also aren't smart enough to use strncmp *and* add a NUL terminator to the string. Right? Not really: the app actually appears to truncate (i.e., strncpy then zero the last char). One can even see the truncated extension string in the quake console. I didn't go to the trouble of disassembling the code, but putting the old extensions first has always fixed the issue, avoiding the bad performance and crashes. The crash that happens due to no sorting is probably because the truncated extension string is inconsistent, or a very basic extension is missing, causing the app to trip on itself. Jose ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] GSoC : Thoughts about an OpenGL 4.1 state tracker
I also agree with Marek FWIW. If we want a cleaner / more agile code base, then we could fork off the old mesa drivers which aren't being actively maintained/tested into a separate branch, put them in just-bugfixes/no-new-features life support mode; therefore allowing greater freedom to refactor support for the active/maintained drivers, without the hassle of updating old drivers and its associated risks. Jose On Sun, 2011-03-13 at 18:42 -0700, Marek Olšák wrote: Hi Denis, I don't think forking the current Mesa codebase and making a Core-profile-only state tracker is a good idea. The OpenGL 4.1 Core and Compatibility specifications have 518 and 678 pages, respectively, which means more than 70% of the codebase would be the same. The Core profile is not so popular and we still will have to maintain all the deprecated OpenGL 1.x features for the existing applications to work, which is the majority. The deprecated features are not such a big pain and usually they don't get in the way of new, more modern features. Also what if someone is using the OpenGL 4.1 Compatibility profile? We'd be screwed. Note that NVIDIA have no plans to promote the Core profile, they claim the deprecated features run at full speed on their driver and they advise their users to use the Compatibility profile. Anyway this is more like a decade project than a summer project to me. Look, there is this GL3 to-do list over here: http://cgit.freedesktop.org/mesa/mesa/tree/docs/GL3.txt New features are being implemented at a quite slow pace. A complete rewrite would freeze the feature set for a year or two, because already-implemented features would be worked on again. Also you don't wanna develop a new OpenGL state tracker without Intel on board. I guess that's obvious. A much, much more useful project would be to work on the features listed in the to-do list, and in my opinion this would be the most useful GSoC project for OpenGL users. 
If the main motivation to start from scratch is to clean up the current stack, there is an easier way. The eventual plan could be as follows. First, any necessary work to remove the Mesa IR should be done; after that, TGSI could be replaced by the GLSL IR. This is a crucial step for Intel to even consider moving over to Gallium, so it's not like we have any other choice, at least I don't see one. Then the only classic driver would be Gallium itself, at which point the cleanup would begin, potentially leading to a merge of mesa/main and st/mesa. I am especially interested in the performance improvements the simplified code (mainly state management) would bring. I'd be interested in other people's ideas too. Marek On Sun, Mar 13, 2011 at 12:09 PM, Denis Steckelmacher steckde...@yahoo.fr wrote: Hello, I'm a young Belgian student and I have followed Mesa's development for nearly two years now. I'm just 18 years old and it's the first year I'm eligible for the Google Summer of Code. I originally planned to work on KDE, but I find Mesa more interesting. The project on which I would work is a pure OpenGL 4.1 Core state tracker for Gallium 3D. I know that it isn't the easiest task to do during a summer, but I have already started thinking about it and I have plenty of free time from now to months after the summer (I will begin IT studies, but I have been programming for nearly eight years now (yes, half of my life), so I hope it will not be too difficult and time-consuming). I've read parts of the OpenGL 4.1 Core specification, the OpenGL 4.1 Core summary card, and many examples. Code-wise, I have already dived into the Gallium3D codebase and I understand it. My nearly two years of following Mesa's development taught me how a graphics card works. I especially followed the development of the r300g driver (I use an ATI Radeon X1270 graphics card) and the llvmpipe driver. I'll keep it short about me, but feel free to ask me more questions. 
The purpose of this mail is to ask some questions and to share some ideas about this OpenGL 4.1 state tracker. Keep in mind that I'm young and not a professional in the computer graphics world (and also that my native language is French, so sorry for my bad English and the lack of precision in what I say). Here are now some thoughts about this state tracker. Replacing Mesa? I have read the Google Summer of Code Ideas page on freedesktop.org, and there is an OpenGL 3.1 state tracker idea on it. The summary of this idea says that the student should start by copying Mesa and the Mesa state tracker and then remove the unneeded stuff. I don't know which way could be the easiest, but I think that starting this state tracker from scratch could be more elegant. My idea is to create a new directory under src/gallium/state_trackers, and to put in it most of the state tracker. There are two alternatives:
Re: [Mesa-dev] GSoC : Thoughts about an OpenGL 4.1 state tracker
On Mon, 2011-03-14 at 10:06 -0700, Matt Turner wrote: On Mon, Mar 14, 2011 at 4:52 PM, José Fonseca jfons...@vmware.com wrote: If we want a cleaner / more agile code base, then we could fork off the old mesa drivers which aren't being actively maintained/tested into a separate branch, put them in just-bugfixes/no-new-features life support mode; therefore allowing greater freedom to refactor support for the active/maintained drivers, without the hassle of updating old drivers and its associated risks. What drivers are you talking about? I'm talking about drivers that: - have no fragment shader - have no active maintainers Perhaps these: dri/tdfx dri/mga dri/i810 dri/sis dri/unichrome dri/mach64 dri/r128 dri/r200 dri/savage windows/icd windows/gldirect windows/gldirect/mesasw windows/gldirect/dx9 windows/gldirect/dx7 windows/gldirect/dx8 windows/gdi windows/fx I'm not sure about the status of these: - dri/radeon - dri/nouveau A quick glance tells me that old drivers like tdfx and savage were only modified 7 and 8 times respectively in 2010, so I don't see old drivers slowing any development down. I see that as a clear indication of trouble: - probably nobody is testing those drivers as we change Mesa common code - we can't remove old features from the Mesa DDI because those drivers need them What would splitting these drivers out of the Mesa codebase allow? Intel's still using the classic infrastructure, so. Suppose we want to: - eliminate fixed function from the Mesa DDI - replace Mesa IR and TGSI (in the Mesa classic driver interface and Gallium respectively) with GLSL2 IR. Is there a point in writing a reverse shader -> fixed-function translation for older drivers, or rewriting the shader translations for those drivers? Basically, I'm arguing that support for old-fashioned hardware be done in a separate tree, that acts as a time vault. 
I think it is a disservice both for developers and users trying to cover for so many generations of hardware, some very different from one another, in a single source tree. This is a mere suggestion -- I think we could continue the current approach of adding layers (some very old, others newer, with some doing the translation in between) indefinitely -- I'm simply arguing that it would probably be easier if we could shed some of the legacy stuff and keep the code a bit leaner. Jose ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] ir_to_mesa: do not check the number of uniforms against hw limits
On Mon, 2011-03-14 at 11:23 -0700, Brian Paul wrote: On 03/12/2011 07:44 PM, Marek Olšák wrote: The r300 compiler can eliminate unused uniforms and remap uniform locations if their number surpasses hardware limits, so the limit is actually NumParameters + NumUnusedParameters. This is important for some apps under Wine to run. Wine sometimes declares a uniform array of 256 vec4's and some Wine-specific constants on top of that, so in total there are more uniforms than r300 can handle. This was the main motivation for implementing the elimination of unused constants. We should allow drivers to implement fail-recovery paths where it makes sense, so giving up too early, especially when it comes to uniforms, is not such a good idea, though I agree there should be some hard limit for all drivers. I added the check_resources() code to fix an issue with the SVGA driver. If we can't do resource checking against the ctx->Const.Vertex/FragmentProgram limits we need something else. In Gallium we have the PIPE_SHADER_CAP_MAX_x queries. Are you saying we shouldn't check shaders against those limits either? If we were to push all the shader resource checking down into the Gallium drivers we'll need a new way to propagate error messages back to the user (we can only return NULL from create_X_state() now). Another problem would be instituting consistent error reporting across all the drivers. We've kind of tiptoed around this issue in the past. It's probably time to come up with some real solutions. Not only are some drivers able to optimize away declared-yet-unused registers, but other drivers may actually need to add extra temps/const regs to implement certain opcodes/state. Both issues make it difficult to make guarantees around PIPE_SHADER_CAP_MAX_x, as we can easily end up advertising too little or too much. 
It looks like there's not much alternative to mimicking GLSL here, i.e., advertise these as limits but allow concrete shaders to pass/fail to compile on a case-by-case basis. I'm not sure what's the best way to convey errors from the drivers: to return pipe_error and extend it to include things like PIPE_ERROR_TOO_MANY_CONSTS, PIPE_ERROR_TOO_MANY_TEMPS, PIPE_ERROR_TOO_MANY_INSTRUCTIONS, etc., and translate them in the state tracker; or to use message strings. At a glance, it looks like an error enum would be expressive enough for most stuff. Jose ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
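A rough sketch of the error-enum option discussed above. All names are hypothetical (a `_SKETCH` suffix is used to avoid implying these are real Gallium definitions): the driver returns a specific code from its shader checks, and the state tracker translates it into a user-visible message:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical extension of pipe_error with per-resource failure codes. */
enum pipe_error_sketch {
    PIPE_OK_SKETCH = 0,
    PIPE_ERROR_TOO_MANY_CONSTS_SKETCH = -10,
    PIPE_ERROR_TOO_MANY_TEMPS_SKETCH = -11,
    PIPE_ERROR_TOO_MANY_INSTRUCTIONS_SKETCH = -12,
};

/* Driver side: advertise limits, but judge each concrete shader
 * case by case once real register usage is known. */
static enum pipe_error_sketch
check_shader(unsigned num_consts, unsigned num_temps,
             unsigned max_consts, unsigned max_temps)
{
    if (num_consts > max_consts)
        return PIPE_ERROR_TOO_MANY_CONSTS_SKETCH;
    if (num_temps > max_temps)
        return PIPE_ERROR_TOO_MANY_TEMPS_SKETCH;
    return PIPE_OK_SKETCH;
}

/* State-tracker side: translate the enum into a message for the user. */
static const char *error_string(enum pipe_error_sketch err)
{
    switch (err) {
    case PIPE_ERROR_TOO_MANY_CONSTS_SKETCH:       return "too many constants";
    case PIPE_ERROR_TOO_MANY_TEMPS_SKETCH:        return "too many temporaries";
    case PIPE_ERROR_TOO_MANY_INSTRUCTIONS_SKETCH: return "too many instructions";
    default:                                      return "ok";
    }
}
```

An enum like this keeps the driver interface narrow (a single return code) while still letting the state tracker produce a reasonably specific diagnostic; free-form message strings would be more expressive but harder to keep consistent across drivers.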
Re: [Mesa-dev] Bug: could not compile mesa because of lacking component
On Mon, 2011-03-14 at 10:55 -0700, Gustaw Smolarczyk wrote: The commit 110f5e2056f80d0b87f2a4388bc35727070ba6d5 was meant to fix this build error, but it only adds x86disassembler. The macro InitializeAllDisassemblers() (from llvm/Target/TargetSelect.h) initializes *all* disassemblers built with llvm (in my case: X86 and ARM; this is a precompiled package from ubuntu 11.04), but we only link with the X86 one. The attached patch fixes it by initializing only the X86 disassembler. I understand that this is the only backend we support now, right? It's the only one we test, but I'd like to avoid tying the code exclusively to X86. Could you try the attached patch instead? Jose diff --git a/configs/linux-llvm b/configs/linux-llvm index dde40a3..359bee2 100644 --- a/configs/linux-llvm +++ b/configs/linux-llvm @@ -31,9 +31,9 @@ endif ifeq ($(MESA_LLVM),1) LLVM_CFLAGS=`llvm-config --cppflags` - LLVM_CXXFLAGS=`llvm-config --cxxflags backend bitreader engine ipo interpreter instrumentation` -Wno-long-long - LLVM_LDFLAGS = $(shell llvm-config --ldflags backend bitreader engine ipo interpreter instrumentation) - LLVM_LIBS = $(shell llvm-config --libs backend bitwriter bitreader engine ipo interpreter instrumentation) + LLVM_CXXFLAGS=`llvm-config --cxxflags` -Wno-long-long + LLVM_LDFLAGS = $(shell llvm-config --ldflags) + LLVM_LIBS = $(shell llvm-config --libs) MKLIB_OPTIONS=-cplusplus else LLVM_CFLAGS= diff --git a/configure.ac b/configure.ac index 70380ff..b510151 100644 --- a/configure.ac +++ b/configure.ac @@ -1649,7 +1649,7 @@ if test x$enable_gallium_llvm = xyes; then if test x$LLVM_CONFIG != xno; then LLVM_VERSION=`$LLVM_CONFIG --version` LLVM_CFLAGS=`$LLVM_CONFIG --cppflags` - LLVM_LIBS=`$LLVM_CONFIG --libs jit interpreter nativecodegen bitwriter x86disassembler` -lstdc++ + LLVM_LIBS=`$LLVM_CONFIG --libs` -lstdc++ LLVM_LDFLAGS=`$LLVM_CONFIG --ldflags` GALLIUM_DRIVERS_DIRS=$GALLIUM_DRIVERS_DIRS llvmpipe ___ mesa-dev mailing list 
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] Truncated extensions string
The problem from http://www.mail-archive.com/mesa3d-dev@lists.sourceforge.net/msg12493.html is back, and now a bit worse -- it causes the Quake3 arena demo to crash (at least the Windows version). The full version works fine. I'm not sure what other applications are hit by this. See the above thread for more background. There are two major approaches: 1) sort extensions chronologically instead of alphabetically. See attached patch for that - for those who prefer to see extensions sorted alphabetically in glxinfo, we could modify glxinfo to sort them before displaying 2) detect broken applications (i.e., by process name), and only sort extensions strings chronologically then Personally I think that varying behavior based on process name is an ugly and brittle hack, so I'd prefer 1), but I just want to put this behind me above all, so whatever works is also fine by me. Jose From 33509095b91f20ce0be63b5e9f44a4635acf3ce1 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Jos=C3=A9=20Fonseca?= jfons...@vmware.com Date: Thu, 10 Mar 2011 14:39:08 + Subject: [PATCH] mesa: Sort extensions in extension string by year. The years were obtained automatically by scraping the first year from the spec text file. They are approximate. --- src/mesa/main/extensions.c | 429 ++-- 1 files changed, 214 insertions(+), 215 deletions(-) diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c index 7504b8a..26eee7d 100644 --- a/src/mesa/main/extensions.c +++ b/src/mesa/main/extensions.c @@ -69,230 +69,229 @@ struct extension { /** * \brief Table of supported OpenGL extensions for all API's. * + * The table is sorted in chronological order because certain old applications (e.g., + * the Quake3 demo) store the extension list in a static-size buffer; chronological + * order ensures that the extensions such applications expect will fit into + * that buffer. + * * Note: The GL_MESAX_* extensions are placeholders for future ARB extensions. 
*/ static const struct extension extension_table[] = { - /* ARB Extensions */ - { GL_ARB_ES2_compatibility, o(ARB_ES2_compatibility), GL }, - { GL_ARB_blend_func_extended, o(ARB_blend_func_extended), GL }, - { GL_ARB_copy_buffer, o(ARB_copy_buffer), GL }, - { GL_ARB_depth_buffer_float, o(ARB_depth_buffer_float), GL }, - { GL_ARB_depth_clamp, o(ARB_depth_clamp), GL }, - { GL_ARB_depth_texture, o(ARB_depth_texture), GL }, - { GL_ARB_draw_buffers,o(ARB_draw_buffers),GL }, - { GL_ARB_draw_buffers_blend, o(ARB_draw_buffers_blend), GL }, - { GL_ARB_draw_elements_base_vertex, o(ARB_draw_elements_base_vertex), GL }, - { GL_ARB_draw_instanced, o(ARB_draw_instanced), GL }, - { GL_ARB_explicit_attrib_location,o(ARB_explicit_attrib_location),GL }, - { GL_ARB_fragment_coord_conventions, o(ARB_fragment_coord_conventions), GL }, - { GL_ARB_fragment_program,o(ARB_fragment_program),GL }, - { GL_ARB_fragment_program_shadow, o(ARB_fragment_program_shadow), GL }, - { GL_ARB_fragment_shader, o(ARB_fragment_shader), GL }, - { GL_ARB_framebuffer_object, o(ARB_framebuffer_object), GL }, - { GL_ARB_framebuffer_sRGB,o(EXT_framebuffer_sRGB),GL }, - { GL_ARB_half_float_pixel,o(ARB_half_float_pixel),GL }, - { GL_ARB_half_float_vertex, o(ARB_half_float_vertex), GL }, - { GL_ARB_instanced_arrays,o(ARB_instanced_arrays),GL }, - { GL_ARB_map_buffer_range,o(ARB_map_buffer_range),GL }, - { GL_ARB_multisample, o(ARB_multisample), GL }, - { GL_ARB_multitexture,o(ARB_multitexture),GL }, - { GL_ARB_occlusion_query2,o(ARB_occlusion_query2),GL }, - { GL_ARB_occlusion_query, o(ARB_occlusion_query), GL }, - { GL_ARB_pixel_buffer_object, o(EXT_pixel_buffer_object), GL }, -
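The idea behind the patch can be distilled into a few lines of C: tag each extension with the (approximate) year of its spec and sort on that, so the oldest extensions land at the front of the string and survive truncation by legacy apps. The table and years below are illustrative, not the real Mesa table:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical miniature extension table; years are approximate. */
struct ext_entry {
    const char *name;
    int year;
};

static struct ext_entry table[] = {
    { "GL_ARB_ES2_compatibility", 2009 },
    { "GL_ARB_multitexture",      1998 },
    { "GL_ARB_occlusion_query",   2001 },
};

/* Sort by year first so old extensions come first; tie-break by name
 * so the result stays deterministic. */
static int cmp_year(const void *a, const void *b)
{
    const struct ext_entry *ea = a, *eb = b;
    if (ea->year != eb->year)
        return ea->year - eb->year;
    return strcmp(ea->name, eb->name);
}

static const char *first_after_sort(void)
{
    qsort(table, sizeof table / sizeof table[0], sizeof table[0], cmp_year);
    return table[0].name;
}
```

Tools like glxinfo that prefer alphabetical output can simply re-sort the string on their side, which is exactly the division of labor proposed in the mail above.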
Re: [Mesa-dev] [PATCH] gallium/util: Upload manager optimizations
On Fri, 2011-03-11 at 02:06 -0800, Thomas Hellstrom wrote: On 03/10/2011 04:57 PM, José Fonseca wrote: On Thu, 2011-03-10 at 06:01 -0800, Thomas Hellstrom wrote: Make sure that the upload manager doesn't upload data that's not dirty. This speeds up the viewperf test proe-04/1 a factor 5 or so on svga. Sweet! A few comments inline Also introduce an u_upload_unmap() function that can be used instead of u_upload_flush() so that we can pack even more data in upload buffers. With this we can basically reuse the upload buffer across flushes. Signed-off-by: Thomas Hellstromthellst...@vmware.com --- src/gallium/auxiliary/util/u_upload_mgr.c | 41 ++-- src/gallium/auxiliary/util/u_upload_mgr.h | 10 +++ 2 files changed, 42 insertions(+), 9 deletions(-) diff --git a/src/gallium/auxiliary/util/u_upload_mgr.c b/src/gallium/auxiliary/util/u_upload_mgr.c index dcf800a..7768e13 100644 --- a/src/gallium/auxiliary/util/u_upload_mgr.c +++ b/src/gallium/auxiliary/util/u_upload_mgr.c @@ -51,6 +51,7 @@ struct u_upload_mgr { unsigned size; /* Actual size of the upload buffer. */ unsigned offset; /* Aligned offset to the upload buffer, pointing * at the first unused byte. */ + unsigned uploaded_offs; /* Offset below which data is already uploaded */ It's not clear the difference between offset and uploaded_offs. More about this below. }; @@ -72,6 +73,22 @@ struct u_upload_mgr *u_upload_create( struct pipe_context *pipe, return upload; } +void u_upload_unmap( struct u_upload_mgr *upload ) +{ + if (upload-transfer) { + if (upload-size upload-uploaded_offs) { + pipe_buffer_flush_mapped_range(upload-pipe, upload-transfer, +upload-uploaded_offs, +upload-offset - upload-uploaded_offs); + } + pipe_transfer_unmap(upload-pipe, upload-transfer); + pipe_transfer_destroy(upload-pipe, upload-transfer); + upload-transfer = NULL; + upload-uploaded_offs = upload-offset; + upload-map = NULL; + } +} + /* Release old buffer. 
* * This must usually be called prior to firing the command stream @@ -84,17 +101,10 @@ struct u_upload_mgr *u_upload_create( struct pipe_context *pipe, void u_upload_flush( struct u_upload_mgr *upload ) { /* Unmap and unreference the upload buffer. */ - if (upload-transfer) { - if (upload-size) { - pipe_buffer_flush_mapped_range(upload-pipe, upload-transfer, -0, upload-size); - } - pipe_transfer_unmap(upload-pipe, upload-transfer); - pipe_transfer_destroy(upload-pipe, upload-transfer); - upload-transfer = NULL; - } + u_upload_unmap(upload); pipe_resource_reference(upload-buffer, NULL ); upload-size = 0; + upload-uploaded_offs = 0; } @@ -172,6 +182,19 @@ enum pipe_error u_upload_alloc( struct u_upload_mgr *upload, offset = MAX2(upload-offset, alloc_offset); + if (!upload-map) { + upload-map = pipe_buffer_map_range(upload-pipe, upload-buffer, +0, upload-size, +PIPE_TRANSFER_WRITE | +PIPE_TRANSFER_FLUSH_EXPLICIT | +PIPE_TRANSFER_UNSYNCHRONIZED, + upload-transfer); + Instead of the whole range, we should mapping only the area of interest. That is, of [0, upload-size], it should be [upload-uploaded_offs, upload-size - uploaded_offs]. This mean that the driver / kernel does not need to populate the bytes from [0, uploaded_offs] in the case where the contents is already gone to VRAM. I think uploaded_offs should be renamed to mapped_offset, i.e., the offset from which the buffer is currently mapped. Agreed. 
+ assert(offset >= upload->uploaded_offs); + upload->uploaded_offs = offset; + } + assert(offset < upload->buffer->width0); assert(offset + size <= upload->buffer->width0); assert(size); diff --git a/src/gallium/auxiliary/util/u_upload_mgr.h b/src/gallium/auxiliary/util/u_upload_mgr.h index c9a2ffe..02426ea 100644 --- a/src/gallium/auxiliary/util/u_upload_mgr.h +++ b/src/gallium/auxiliary/util/u_upload_mgr.h @@ -67,6 +67,16 @@ void u_upload_destroy( struct u_upload_mgr *upload ); void u_upload_flush( struct u_upload_mgr *upload ); /** + * Unmap upload buffer + * + * \param upload Upload manager + * + * This is like u_upload_flush() except the upload buffer is kept for + * re-use across flushes. Allows us to pack more data into upload buffers. + */ +void u_upload_unmap
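The bookkeeping being discussed -- flush only the dirty range `[uploaded_offs, offset)` on unmap, instead of the whole buffer -- can be simulated in plain C without any GPU. The struct and helpers below are a hypothetical distillation of the `u_upload_mgr` logic, not its real implementation:

```c
#include <assert.h>

/* Two counters, as in the patch: `offset` points at the first unused
 * byte; everything below `uploaded_offs` was already flushed. */
struct upload_sketch {
    unsigned offset;
    unsigned uploaded_offs;
};

static struct upload_sketch demo = { 0, 0 };

/* Suballocate `size` bytes, returning the start of the range
 * (alignment handling omitted for brevity). */
static unsigned alloc_range(struct upload_sketch *u, unsigned size)
{
    unsigned start = u->offset;
    u->offset += size;
    return start;
}

/* Returns how many bytes actually need flushing, then advances
 * uploaded_offs the way u_upload_unmap() does. */
static unsigned unmap_flush_len(struct upload_sketch *u)
{
    unsigned len = u->offset - u->uploaded_offs;
    u->uploaded_offs = u->offset;
    return len;
}
```

After a flush, only bytes written since the previous unmap are transferred again, which is the point of keeping the buffer mapped and reused across flushes.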
Re: [Mesa-dev] Truncated extensions string
On Fri, 2011-03-11 at 09:04 -0800, Eric Anholt wrote: On Fri, 11 Mar 2011 10:33:13 +, José Fonseca jfons...@vmware.com wrote: The problem from http://www.mail-archive.com/mesa3d-dev@lists.sourceforge.net/msg12493.html is back, and now a bit worse -- it causes the Quake3 arena demo to crash (at least the Windows version). The full version works fine. I'm not sure what other applications are hit by this. See the above thread for more background. There are two major approaches: 1) sort extensions chronologically instead of alphabetically. See attached patch for that - for those who prefer to see extensions sorted alphabetically in glxinfo, we could modify glxinfo to sort them before displaying 2) detect broken applications (i.e., by process name), and only sort extensions strings chronologically then Personally I think that varying behavior based on process name is an ugly and brittle hack, so I'd prefer 1), but I just want to put this behind me above all, so whatever works is also fine by me. If this is just a hack for one broken application, and we think that building in a workaround for this particular broken application is important (I don't), I still prefer an obvious hack for that broken application, like feeding it a tiny extension string that it cares about, instead of reordering the extension list. There are many versions of Quake3 out there, some fixed, others not, and others enhanced. This means a tiny string would prevent any Quake3 application from finding newer extensions. So I think that if we go for application name detection then we should present the whole extension string sorted chronologically, instead of giving a tiny string. Jose ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] 7 questions and proposals about changes in the Gallium interface
On Mon, 2011-03-07 at 02:38 -0800, Keith Whitwell wrote: On Sun, 2011-03-06 at 18:42 +0100, Marek Olšák wrote: 2) is_resource_referenced Now that the transfer functions are in pipe_context, is this hook really necessary? Good question. I'd like to see those functions go away as they are round-trips baked into the interface, which is a pain if you try and remote this stuff. I guess you'd still need to catch the write-after-read case within a single context and turn that into a flush. I think others (Jose in particular) should comment, but I suspect that individual drivers could now do this internally and not need to expose the interface in gallium. That's correct. State trackers no longer need this. This interface can be removed, and drivers that need it internally should be updated to use internal private interfaces. 5) Block compression formats naming Would anyone object to cleaning up the names of compression formats? There are (or will be) these formats: DXTn, RGTCn, LATCn, BPTCx. They have many things in common: - All of them have 4x4 pixel blocks. - One block is either 64 or 128 bits large. - RGTC and LATC are equal except for swizzling. - RGTC and LATC are based on DXTn encoding. 
I propose to copy the more consistent D3D11 naming and use the form PIPE_FORMAT_encoding_swizzle_type for all of them: PIPE_FORMAT_BC1_RGB_UNORM // DXT1 = BC1 PIPE_FORMAT_BC1_RGB_SRGB PIPE_FORMAT_BC1_RGBA_UNORM PIPE_FORMAT_BC1_RGBA_SRGB PIPE_FORMAT_BC2_RGBA_UNORM // DXT3 = BC2 PIPE_FORMAT_BC2_RGBA_SRGB PIPE_FORMAT_BC3_RGBA_UNORM // DXT5 = BC3 PIPE_FORMAT_BC3_RGBA_SRGB PIPE_FORMAT_BC4_R_UNORM // RGTC1 = BC4 PIPE_FORMAT_BC4_R_SNORM PIPE_FORMAT_BC4_L_UNORM // LATC1 = BC4 PIPE_FORMAT_BC4_L_SNORM PIPE_FORMAT_BC5_RG_UNORM // RGTC2 = D3D/3DC = BC5 PIPE_FORMAT_BC5_RG_SNORM PIPE_FORMAT_BC5_LA_UNORM // LATC2 = GL/3DC = BC5 PIPE_FORMAT_BC5_LA_SNORM PIPE_FORMAT_BC6_RGB_FLOAT // BPTC (BC6H) PIPE_FORMAT_BC6_RGB_UFLOAT PIPE_FORMAT_BC7_RGBA_UNORM // BPTC PIPE_FORMAT_BC7_RGBA_SRGB The layout for all of them would be UTIL_FORMAT_LAYOUT_BC. UFLOAT is a float without the sign bit. I guess UFLOAT should be used for R11G11B10_FLOAT and R9G9B9E5_FLOAT too. Sounds good again, though this is more Jose's domain than mine. Although I'm all for consistency, I really see no point in a format name which is based on an arbitrary number as used in DX11, instead of a symbolic name with meaning as we have currently Imagine we want to expose a different compressed format, e.g., Khronos' ETC format. Which BC number shall we give? What will happen when Microsoft decides to assign a different number? IMO, renaming compressed formats to PIPE_FORMAT_BC* is pure waste of time. Jose ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] st/mesa: Fix an incorrect user vertex buffer reference
Nice catch. Jose On Mon, 2011-03-07 at 02:24 -0800, Thomas Hellstrom wrote: st->user_vb[attr] was always pointing to the same user vb, regardless of the value of attr. Together with reverting the temporary workaround for bug 34378, and a fix in the svga driver, this fixes googleearth on svga. Signed-off-by: Thomas Hellstrom thellst...@vmware.com --- src/mesa/state_tracker/st_draw.c |6 ++ 1 files changed, 2 insertions(+), 4 deletions(-) diff --git a/src/mesa/state_tracker/st_draw.c b/src/mesa/state_tracker/st_draw.c index d6e67b7..40afa43 100644 --- a/src/mesa/state_tracker/st_draw.c +++ b/src/mesa/state_tracker/st_draw.c @@ -429,7 +429,7 @@ setup_non_interleaved_attribs(struct gl_context *ctx, vbuffer[attr].buffer_offset = 0; /* Track user vertex buffers. */ - pipe_resource_reference(&st->user_vb[attr], vbuffer->buffer); + pipe_resource_reference(&st->user_vb[attr], vbuffer[attr].buffer); st->user_vb_stride[attr] = stride; st->num_user_vbs = MAX2(st->num_user_vbs, attr+1); } @@ -632,10 +632,8 @@ st_draw_vbo(struct gl_context *ctx, struct pipe_index_buffer ibuffer; struct pipe_draw_info info; unsigned i, num_instances = 1; - GLboolean new_array = GL_TRUE; - /* Fix this (Bug 34378): GLboolean new_array = - st->dirty.st || (st->dirty.mesa & (_NEW_ARRAY | _NEW_PROGRAM)) != 0;*/ + st->dirty.st || (st->dirty.mesa & (_NEW_ARRAY | _NEW_PROGRAM)) != 0; /* Mesa core state should have been validated already */ assert(ctx->NewState == 0x0); ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
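The underlying C pitfall is worth isolating: inside a loop over `attr`, `vbuffer->buffer` is equivalent to `vbuffer[0].buffer`, so every attribute ended up referencing the same user buffer. A distilled standalone example (struct and function names are illustrative, not Mesa code):

```c
#include <assert.h>

/* Stand-in for pipe_vertex_buffer with just the field that matters. */
struct vb_sketch { int buffer; };

/* The buggy access pattern: ignores attr, always reads element 0. */
static int pick_buggy(const struct vb_sketch *vbuffer, int attr)
{
    (void)attr;
    return vbuffer->buffer;        /* same as vbuffer[0].buffer */
}

/* The fixed access pattern: indexes the element for this attribute. */
static int pick_fixed(const struct vb_sketch *vbuffer, int attr)
{
    return vbuffer[attr].buffer;
}

static const struct vb_sketch vbs[3] = { { 10 }, { 20 }, { 30 } };
```

The bug is easy to miss in review because `vbuffer->buffer` compiles and "works" whenever attribute 0 happens to be the only user vertex buffer.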
Re: [Mesa-dev] Mesa (master): intel: Try using glCopyTexSubImage2D in _mesa_meta_BlitFramebuffer
On Thu, 2011-02-24 at 08:43 -0800, Chris Wilson wrote: Module: Mesa Branch: master Commit: c0ad70ae31ee5501281b434d56e389fc92b13a3a URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=c0ad70ae31ee5501281b434d56e389fc92b13a3a Author: Neil Roberts n...@linux.intel.com Date: Sat Feb 5 10:21:11 2011 + intel: Try using glCopyTexSubImage2D in _mesa_meta_BlitFramebuffer In the case where glBlitFramebuffer is being used to copy to a texture without scaling it is faster if we can use the hardware to do a blit rather than having to do a texture render. In most of the drivers glCopyTexSubImage2D will use a blit so this patch makes it check for when glBlitFramebuffer is doing a simple copy and then divert to glCopyTexSubImage2D. This was originally proposed as an extension to the common meta-ops. However, it was rejected as using the BLT is only advantageous for Intel hardware.

That doesn't make sense -- as long as it is not disadvantageous for other hardware, it should be perfectly fine to put it in the shared component. Jose ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] gallium: new transfer flag: DISCARD_WHOLE_RESOURCE
On Fri, 2011-02-18 at 08:38 -0800, Brian Paul wrote: On 02/18/2011 09:15 AM, jfons...@vmware.com wrote: From: Keith Whitwell kei...@vmware.com --- src/gallium/include/pipe/p_defines.h | 19 +++++++++++++++++-- 1 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/src/gallium/include/pipe/p_defines.h b/src/gallium/include/pipe/p_defines.h
index f66bbaf..2de707f 100644
--- a/src/gallium/include/pipe/p_defines.h
+++ b/src/gallium/include/pipe/p_defines.h
@@ -229,9 +229,9 @@ enum pipe_transfer_usage {
     *
     * See also:
     * - OpenGL's ARB_map_buffer_range extension, MAP_INVALIDATE_RANGE_BIT flag.
-    * - Direct3D's D3DLOCK_DISCARD flag.
     */
    PIPE_TRANSFER_DISCARD = (1 << 8),
+   PIPE_TRANSFER_DISCARD_RANGE = (1 << 8),

Does the second flag replace the first one? Are we keeping the first one just as a transitory thing? I'd say to either remove the old one or add a comment explaining what's going on.

It's a transitory thing. I'll add a /* DEPRECATED */ comment and then start renaming. Jose ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
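The transitional arrangement discussed above can be sketched as an enum. The SK_ names and the bit position chosen for the WHOLE_RESOURCE flag are illustrative placeholders, not the values Gallium actually uses; only the idea (the old flag kept as a deprecated alias of the range variant, alongside a new whole-resource flag) mirrors the patch:

```c
#include <assert.h>

/* Sketch of the transfer-flag split -- names and the (1 << 12) bit are
 * made up for illustration. */
enum sk_transfer_usage {
   /* Invalidate only the mapped range (ARB_map_buffer_range's
    * MAP_INVALIDATE_RANGE_BIT). */
   SK_TRANSFER_DISCARD_RANGE = (1 << 8),

   /* DEPRECATED: transitory alias kept during the rename. */
   SK_TRANSFER_DISCARD = SK_TRANSFER_DISCARD_RANGE,

   /* Invalidate the whole resource (MAP_INVALIDATE_BUFFER_BIT /
    * D3DLOCK_DISCARD); lets the driver rename the backing storage
    * instead of stalling. */
   SK_TRANSFER_DISCARD_WHOLE_RESOURCE = (1 << 12)
};
```

Keeping the deprecated name as an alias of the new one means existing callers compile unchanged while the tree is converted.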
Re: [Mesa-dev] Building Mesa 7.10 on Windows 7 / Visual Studio 2010
On Mon, 2011-02-14 at 15:13 -0800, Brede Johansen wrote: Hi, I have made VS2008 project and solution files based on the scons files. I have also included generation of necessary source files from python as part of the build. This also works in VS2010. My requirement was to get OpenGL software rendering to work so it's not tested for other configurations. I also had to make small changes to the source code, like casting void pointers to the proper type (malloc). The MSVC project files for Mesa haven't been maintained in a while. They'll be removed in the next Mesa release. Instead, take a look at the instructions for building with scons. This is sad news for me and probably a few others on the Windows platform. Most of us are used to writing code and debugging inside Visual Studio and would be pretty helpless working from the command line. In December I tried to submit the project and solution files to mesa-user as an attachment but the post was too big and I didn't have the time to follow up. Btw. git is new territory for me. No need to use the command line to build. See http://www.scons.org/wiki/IDEIntegration Jose On Mon, Feb 14, 2011 at 3:36 PM, Brian Paul brian.e.p...@gmail.com wrote: On Fri, Feb 11, 2011 at 11:31 AM, Yahya H. Mirza ya...@aurorasoft.net wrote: Hi All, I’ve been trying to build Mesa 7.10 on Windows 7 / Visual Studio 2010 and I have been having some problems. When I opened \windows\VC8\mesa\mesa.sln, it was automatically converted to VS2010. When I tried to build the various projects there were a number of problems including a number of misconfigured build directories. Additionally, when building the mesa project in VS2010, it has trouble finding a path for building “predefined shaders”. Finally, the “glsl_apps_compile” project seems to be out of date. Is there an updated version of this solution file for Mesa7.10 / VS2010, or has anyone successfully built Mesa7.10 with CMAKE? 
Any suggestions on successfully building Mesa7.10 for Windows7 / VS2010 would be greatly appreciated. The MSVC project files for Mesa haven't been maintained in a while. They'll be removed in the next Mesa release. Instead, take a look at the instructions for building with scons. -Brian ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 4/6] gallium: remove pipe_vertex_buffer::max_index
On Mon, 2011-02-14 at 11:04 -0800, Marek Olšák wrote: On Mon, Feb 14, 2011 at 6:58 PM, José Fonseca jfons...@vmware.com wrote: Marek, I'm OK with removing pipe_vertex_buffer::max_index but there is a bit more work involved, as they are not really equivalent in their guarantees. pipe_vertex_buffer::max_index is an attribute of the vertex buffer -- it describes the max index that can be fetched from the buffer without running into a buffer overflow. It is a hard limit -- it must be set accurately by the state tracker or crashes will occur. It can be removed because it can be derived from the vertex element size, vertex element stride, vertex buffer offset, and vertex buffer size. pipe_draw_info::max_index is an attribute of the index buffer: it describes the maximum index in the index buffer. It is a hint -- there may be higher indices in the index buffer, and if so it is OK for the driver to ignore those vertices, but it should not crash with a buffer overflow. Therefore, in order to safely remove pipe_vertex_buffer::max_index, we should compute the max_index inside the draw module / pipe drivers, and ensure vertices with higher indices will never be fetched. There are a few places in this patch where you replace pipe_vertex_buffer::max_index with ~0 or no checks, which means that places which were previously robust to pipe_draw_info::max_index == ~0 and bogus indices will now start crashing. You're right in theory. In practice, pipe_vertex_buffer::max_index was really derived from the value which is put in pipe_draw_info::max_index and was correctly initialized for user buffers only. It was set to ~0 for hardware buffers. Moreover, it could also be computed with: pipe_vertex_buffer::max_index = (pipe_resource::width0 - pipe_vertex_buffer::buffer_offset) / pipe_vertex_buffer::stride - 1 So it was already redundant. 
Basically, pipe_resource::width0 is also redundant for user buffers, because it is actually computed from pipe_draw_info::max_index too. It's all logical because the index bounds are the only info we have about user buffers and we compute all the other properties from them. This is how width0 is computed: pipe_resource::width0 = (pipe_draw_info::max_index + 1) * pipe_vertex_buffer::stride; Now we substitute width0 in the first formula: pipe_vertex_buffer::max_index = ((pipe_draw_info::max_index + 1) * pipe_vertex_buffer::stride - pipe_vertex_buffer::buffer_offset) / pipe_vertex_buffer::stride - 1 If we examine the state tracker code, we'll notice that buffer_offset is always 0 for user buffers. After simplification, we get this: pipe_vertex_buffer::max_index = pipe_draw_info::max_index And that's the whole story. That said, I'd like not to call set_vertex_buffers only to update max_index if I can get the same info from pipe_draw_info. Because pipe_vertex_buffer::max_index was really the maximum index value in the index buffer, we don't need to clamp anything when we fetch vertices; the clamping would basically do nothing. That's why I removed the clamping in Draw and put ~0 in the translate::run parameter and in several other places. Does this explanation justify all of my changes in the patch to you? As I said, I'm perfectly fine with the removal of max_index. The Mesa state tracker may not have been doing the right thing before. It should have been max_index = (pipe_vertex_buffer::size - pipe_vertex_buffer::buffer_offset - util_format_size(vertex_element->format)) / pipe_vertex_buffer::stride + 1 and this is the logic that needs to be reinstated in the draw module to prevent buffer overflows despite bogus indices. But no sweat -- I have a few related draw changes in the pipe and I can easily do this. Jose ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
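Marek's substitution argument can be checked mechanically. The helper names below are made up for illustration and are not Mesa functions; they just encode the two formulas from the mail:

```c
#include <assert.h>

/* Hypothetical helpers mirroring the formulas above.  For user buffers,
 * the state tracker derives width0 from the index bounds: */
static unsigned user_buffer_width0(unsigned draw_max_index, unsigned stride)
{
   return (draw_max_index + 1) * stride;
}

/* ... and the vertex-buffer max_index is derived back from the buffer
 * dimensions (buffer_offset is always 0 for user buffers): */
static unsigned vb_max_index(unsigned width0, unsigned buffer_offset,
                             unsigned stride)
{
   return (width0 - buffer_offset) / stride - 1;
}
```

Composing the two with buffer_offset == 0 collapses to the identity vb_max_index == draw_max_index, which is exactly the redundancy the patch removes.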
Re: [Mesa-dev] [PATCH 2/2] pb_bufmgr_cache: add is_buffer_busy hook and use it instead of non-blocking map
On Sun, 2011-02-13 at 23:58 -0800, Dave Airlie wrote:

   if(buf->base.base.size < size)
      return 0;
@@ -242,13 +240,10 @@ pb_cache_is_buffer_compat(struct pb_cache_buffer *buf,
   if(!pb_check_usage(desc->usage, buf->base.base.usage))
      return 0;
-   map = pb_map(buf->buffer, PB_USAGE_DONTBLOCK, NULL);
-   if (!map) {
-      return -1;
-   }
+   if (buf->mgr->base.is_buffer_busy)
+      if (buf->mgr->base.is_buffer_busy(&buf->mgr->base, buf->buffer))
+         return -1;

Oops, this is wrong. I will locally replace any occurrences of buf->mgr->base(.) with buf->mgr->provider(->), which is how it was meant to be, but the idea remains the same. Please review. Marek, I don't understand what you want to do here: you removed the pb_map, but you left the pb_unmap, and what will happen if is_buffer_busy is not defined? I actually suggested this originally, but Jose I think preferred using the dontblock flag on the buffer mapping. I'd prefer that there is one way of doing this, but I didn't/don't feel strongly about this. IMO, having two ways, PB_USAGE_DONTBLOCK and is_buffer_busy, is not cleaner than just PB_USAGE_DONTBLOCK, even if is_buffer_busy is conceptually cleaner. Marek, Would adding an inline function, pb_is_buffer_busy, that calls pb_map(PB_USAGE_DONTBLOCK)+pb_unmap inside work for you? Another way would be to add is_buffer_busy and have the default implementation do pb_map/pb_unmap. Jose ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
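A minimal model of the pb_is_buffer_busy() helper suggested above. The struct, flag, and pb_map/pb_unmap here are simplified stand-ins, not the real pipebuffer implementation; they only exist so the fallback shape (non-blocking map + unmap answering "is it busy?") is concrete:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the pipebuffer types. */
struct pb_buffer { int busy; int mapped; };

#define PB_USAGE_DONTBLOCK 1

/* A non-blocking map fails (returns NULL) when the buffer is busy. */
static void *pb_map(struct pb_buffer *buf, unsigned flags)
{
   if ((flags & PB_USAGE_DONTBLOCK) && buf->busy)
      return NULL;
   buf->mapped = 1;
   return buf;
}

static void pb_unmap(struct pb_buffer *buf)
{
   buf->mapped = 0;
}

/* The suggested helper: a single entry point for busy queries,
 * implemented on top of the existing map path. */
static int pb_is_buffer_busy(struct pb_buffer *buf)
{
   void *map = pb_map(buf, PB_USAGE_DONTBLOCK);
   if (!map)
      return 1;
   pb_unmap(buf);
   return 0;
}
```

A real pb_manager::is_buffer_busy hook can then override this path where the winsys has a cheaper fence query, which is what the final patch in this thread does.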
Re: [Mesa-dev] [PATCH 4/6] gallium: remove pipe_vertex_buffer::max_index
Marek, I'm OK with removing pipe_vertex_buffer::max_index but there is a bit more work involved, as they are not really equivalent in their guarantees. pipe_vertex_buffer::max_index is an attribute of the vertex buffer -- it describes the max index that can be fetched from the buffer without running into a buffer overflow. It is a hard limit -- it must be set accurately by the state tracker or crashes will occur. It can be removed because it can be derived from the vertex element size, vertex element stride, vertex buffer offset, and vertex buffer size. pipe_draw_info::max_index is an attribute of the index buffer: it describes the maximum index in the index buffer. It is a hint -- there may be higher indices in the index buffer, and if so it is OK for the driver to ignore those vertices, but it should not crash with a buffer overflow. Therefore, in order to safely remove pipe_vertex_buffer::max_index, we should compute the max_index inside the draw module / pipe drivers, and ensure vertices with higher indices will never be fetched. There are a few places in this patch where you replace pipe_vertex_buffer::max_index with ~0 or no checks, which means that places which were previously robust to pipe_draw_info::max_index == ~0 and bogus indices will now start crashing. Jose On Sat, 2011-02-12 at 11:04 -0800, Marek Olšák wrote: This is redundant to pipe_draw_info::max_index and doesn't really fit in the optimizations I plan. 
--- src/gallium/auxiliary/draw/draw_llvm.c | 17 - src/gallium/auxiliary/draw/draw_llvm.h |5 + src/gallium/auxiliary/draw/draw_pt.c |3 +-- src/gallium/auxiliary/draw/draw_pt_fetch.c |4 ++-- src/gallium/auxiliary/draw/draw_pt_fetch_emit.c|2 +- .../auxiliary/draw/draw_pt_fetch_shade_emit.c |2 +- src/gallium/auxiliary/util/u_draw_quad.c |1 - src/gallium/auxiliary/util/u_dump_state.c |1 - src/gallium/docs/d3d11ddi.txt |1 - src/gallium/drivers/svga/svga_state_vs.c |2 +- src/gallium/drivers/trace/tr_dump_state.c |1 - src/gallium/include/pipe/p_state.h |1 - .../state_trackers/d3d1x/dxgi/src/dxgi_native.cpp |1 - .../state_trackers/d3d1x/gd3d11/d3d11_context.h|1 - src/gallium/state_trackers/vega/polygon.c |2 -- src/gallium/tests/graw/fs-test.c |1 - src/gallium/tests/graw/gs-test.c |2 -- src/gallium/tests/graw/quad-tex.c |1 - src/gallium/tests/graw/shader-leak.c |1 - src/gallium/tests/graw/tri-gs.c|1 - src/gallium/tests/graw/tri-instanced.c |2 -- src/gallium/tests/graw/tri.c |1 - src/gallium/tests/graw/vs-test.c |1 - src/mesa/state_tracker/st_draw.c |5 - src/mesa/state_tracker/st_draw_feedback.c |1 - 25 files changed, 11 insertions(+), 49 deletions(-) diff --git a/src/gallium/auxiliary/draw/draw_llvm.c b/src/gallium/auxiliary/draw/draw_llvm.c index a73bdd7..a5217c1 100644 --- a/src/gallium/auxiliary/draw/draw_llvm.c +++ b/src/gallium/auxiliary/draw/draw_llvm.c @@ -214,13 +214,12 @@ static LLVMTypeRef create_jit_vertex_buffer_type(struct gallivm_state *gallivm) { LLVMTargetDataRef target = gallivm-target; - LLVMTypeRef elem_types[4]; + LLVMTypeRef elem_types[3]; LLVMTypeRef vb_type; elem_types[0] = - elem_types[1] = - elem_types[2] = LLVMInt32TypeInContext(gallivm-context); - elem_types[3] = LLVMPointerType(LLVMInt8TypeInContext(gallivm-context), 0); /* vs_constants */ + elem_types[1] = LLVMInt32TypeInContext(gallivm-context); + elem_types[2] = LLVMPointerType(LLVMInt8TypeInContext(gallivm-context), 0); /* vs_constants */ vb_type = LLVMStructTypeInContext(gallivm-context, 
elem_types, Elements(elem_types), 0); @@ -229,10 +228,8 @@ create_jit_vertex_buffer_type(struct gallivm_state *gallivm) LP_CHECK_MEMBER_OFFSET(struct pipe_vertex_buffer, stride, target, vb_type, 0); - LP_CHECK_MEMBER_OFFSET(struct pipe_vertex_buffer, max_index, - target, vb_type, 1); LP_CHECK_MEMBER_OFFSET(struct pipe_vertex_buffer, buffer_offset, - target, vb_type, 2); + target, vb_type, 1); LP_CHECK_STRUCT_SIZE(struct pipe_vertex_buffer, target, vb_type); @@ -513,9 +510,7 @@ generate_fetch(struct gallivm_state *gallivm, LLVMValueRef vbuffer_ptr = LLVMBuildGEP(builder, vbuffers_ptr, indices, 1, ); LLVMValueRef vb_stride = draw_jit_vbuffer_stride(gallivm, vbuf); - LLVMValueRef vb_max_index = draw_jit_vbuffer_max_index(gallivm, vbuf); LLVMValueRef
Re: [Mesa-dev] [PATCH 0/6] Mesa/Gallium vertex array state optimizations
Marek, Apart from some subtleties with removing pipe_vertex_buffer::max_index, I think this looks great. I'm OK with addressing the pipe_vertex_buffer::max_index issues after committing this series, as well-behaved applications should not be affected. Jose On Sat, 2011-02-12 at 11:05 -0800, Marek Olšák wrote: Hi, this patch series optimizes vertex array state changes in Mesa/Gallium. The problem with the vbo module and st/mesa is that they re-bind vertex arrays on every draw operation instead of only when they get changed by the application, and this series aims to address that issue. Some new issues arose during the implementation though: 1) The VBO module didn't notify the underlying driver when it was changing buffer offsets and other vertex array properties. This is fixed in the 1st patch. 2) If we do not re-bind vertex arrays every draw operation, we must ensure that the state is preserved after operations like glBlitFramebuffer. This is resolved in the 3rd patch using cso_cache. 3) Unfortunately, user buffers must be mutable in order to avoid re-binding vertex buffers, because we have no way to know how large they are. Instead, a new context hook has been added to Gallium called 'redefine_user_buffer', which notifies a driver that a subrange of a user buffer should be reuploaded, and also redefines its size. I've only tested softpipe and r300g and there were no regressions. r600g should also work and Christopher told me his Nouveau drivers should be ready for this series too. Please review. Marek Olšák (6): vbo: notify a driver that we change buffer offsets, strides, etc. vbo: bind arrays only when necessary gallium: always save and restore vertex buffers using cso_cache gallium: remove pipe_vertex_buffer::max_index st/mesa: set vertex arrays state only when necessary gallium: notify drivers about possible changes in user buffer contents Best regards Marek ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] pb_bufmgr_cache: add is_buffer_busy hook and use it instead of non-blocking map
On Mon, 2011-02-14 at 10:18 -0800, Marek Olšák wrote: On Mon, Feb 14, 2011 at 6:47 PM, José Fonseca jfons...@vmware.com wrote: On Sun, 2011-02-13 at 23:58 -0800, Dave Airlie wrote:

   if(buf->base.base.size < size)
      return 0;
@@ -242,13 +240,10 @@ pb_cache_is_buffer_compat(struct pb_cache_buffer *buf,
   if(!pb_check_usage(desc->usage, buf->base.base.usage))
      return 0;
-   map = pb_map(buf->buffer, PB_USAGE_DONTBLOCK, NULL);
-   if (!map) {
-      return -1;
-   }
+   if (buf->mgr->base.is_buffer_busy)
+      if (buf->mgr->base.is_buffer_busy(&buf->mgr->base, buf->buffer))
+         return -1;

Oops, this is wrong. I will locally replace any occurrences of buf->mgr->base(.) with buf->mgr->provider(->), which is how it was meant to be, but the idea remains the same. Please review. Marek, I don't understand what you want to do here: you removed the pb_map, but you left the pb_unmap, and what will happen if is_buffer_busy is not defined? I didn't leave the pb_unmap call, it was removed too, I just cut it off in my second email, since it wasn't relevant to the typo. Sorry about that. So there's only one way: is_buffer_busy. I actually suggested this originally, but Jose I think preferred using the dontblock flag on the buffer mapping. I'd prefer that there is one way of doing this, but I didn't/don't feel strongly about this. IMO, having two ways, PB_USAGE_DONTBLOCK and is_buffer_busy, is not cleaner than just PB_USAGE_DONTBLOCK, even if is_buffer_busy is conceptually cleaner. The thing is, mapping a buffer just to know whether it's being used is unnecessary, and the mapping itself may be slower than a simple is_busy query. Marek, Would adding an inline function, pb_is_buffer_busy, that calls pb_map(PB_USAGE_DONTBLOCK)+pb_unmap inside work for you? Another way would be to add is_buffer_busy and have the default implementation do pb_map/pb_unmap. I can add a piece of code that uses pb_map/pb_unmap if the is_buffer_busy hook is not set, so that the original behavior is preserved. 
Would that be ok with you? Here's a new patch: pb_bufmgr_cache: add is_buffer_busy hook and use it instead of non-blocking map This is cleaner and implementing the hook is optional.

diff --git a/src/gallium/auxiliary/pipebuffer/pb_bufmgr.h b/src/gallium/auxiliary/pipebuffer/pb_bufmgr.h
index 2ef0216..960068c 100644
--- a/src/gallium/auxiliary/pipebuffer/pb_bufmgr.h
+++ b/src/gallium/auxiliary/pipebuffer/pb_bufmgr.h
@@ -82,6 +82,10 @@ struct pb_manager
    */
    void
    (*flush)( struct pb_manager *mgr );
+
+   boolean
+   (*is_buffer_busy)( struct pb_manager *mgr,
+                      struct pb_buffer *buf );
 };

diff --git a/src/gallium/auxiliary/pipebuffer/pb_bufmgr_cache.c b/src/gallium/auxiliary/pipebuffer/pb_bufmgr_cache.c
index a6eb403..25accef 100644
--- a/src/gallium/auxiliary/pipebuffer/pb_bufmgr_cache.c
+++ b/src/gallium/auxiliary/pipebuffer/pb_bufmgr_cache.c
@@ -227,8 +227,6 @@ pb_cache_is_buffer_compat(struct pb_cache_buffer *buf,
                           pb_size size,
                           const struct pb_desc *desc)
 {
-   void *map;
-
    if(buf->base.base.size < size)
       return 0;
@@ -242,13 +240,18 @@ pb_cache_is_buffer_compat(struct pb_cache_buffer *buf,
    if(!pb_check_usage(desc->usage, buf->base.base.usage))
       return 0;
-   map = pb_map(buf->buffer, PB_USAGE_DONTBLOCK, NULL);
-   if (!map) {
-      return -1;
+   if (buf->mgr->provider->is_buffer_busy) {
+      if (buf->mgr->provider->is_buffer_busy(buf->mgr->provider, buf->buffer))
+         return -1;
+   } else {
+      void *ptr = pb_map(buf->buffer, PB_USAGE_DONTBLOCK, NULL);
+
+      if (!ptr)
+         return -1;
+
+      pb_unmap(buf->buffer);
    }
-   pb_unmap(buf->buffer);
-
    return 1;
 }

This looks like a better solution in the interim. We can then implement is_buffer_busy everywhere, and replace this fallback with an assert(buf->mgr->provider->is_buffer_busy) at a later stage. Thanks. Jose ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] sRGB in gallium (was: allow rendering to sRGB textures if EXT_fb_srgb is unsupported)
On Wed, 2011-02-09 at 17:55 -0800, Dave Airlie wrote: On Thu, Feb 10, 2011 at 11:27 AM, Marek Olšák mar...@gmail.com wrote: In this case, we always use the corresponding linear format in create_surface, therefore we should check for linear format support as well. Seems sane to me. Dave. The patch looks good to me. But it reminds me of something I've been planning to ask here for some time now: Wouldn't it be easier/cleaner if sRGB sampling/rendering support was done not through formats, but through enable/disable bits in pipe_sampler_view/pipe_rt_blend_state? I know that DX9 has it as a state; DX10 doesn't even have SRGB anywhere -- I suppose it has to be handled in the shaders. How does recent hardware cope with this: different RGB/sRGB formats, sampler/render-target state, or a shader instruction? Jose ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] sRGB in gallium
On Thu, 2011-02-10 at 02:06 -0800, Christoph Bumiller wrote: On 10.02.2011 09:47, José Fonseca wrote: On Wed, 2011-02-09 at 17:55 -0800, Dave Airlie wrote: On Thu, Feb 10, 2011 at 11:27 AM, Marek Olšák mar...@gmail.com wrote: In this case, we always use the corresponding linear format in create_surface, therefore we should check for linear format support as well. Seems sane to me. Dave. The patch looks good to me. But it reminds me of something I've been planning to ask here for some time now: Wouldn't it be easier/cleaner if sRGB sampling/rendering support was not done through formats, but through enable/disable bits in pipe_sampler_view/pipe_rt_blend_state? I know that DX9 has it as a state, DX10 doesn't even have SRGB anywher -- I suppose it has to handed in the shaders. How does the recent hardware cope with this? different RGB/sRGB formats, sampling/rendertarget state, or shader instruction? nv50,nvc0 have both - SRGB framebuffer formats (but only for the A8B8G8R8/A8R8G8B8 formats) and an SRGB switch that can be made part of a cso. But, I'm not sure if there is a difference between using an SRGB RT_FORMAT and using a non-SRGB format but setting the FRAMEBUFFER_SRGB bit. Also, the sampler *views* have an orthogonal SRGB bit. Everybody, Thanks for your replies. I get the impression that it looks doable and net result looks positive but not really something we need to go out of our way now to implement. It's definitely not to be done in a shader though, blending wouldn't work properly that way I think. Good point. D3D9 is a bit lenient about this, but I just checked that EXT_framebuffer_sRGB.txt is not. Jose ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] mesa: Fix the Mesa IR copy propagation to not read past writes to the reg.
Looks good to me FWIW. When dst_reg.reladdr is set we could still restrict the reset to dst_reg.writemask bits, but this was not done before either. Jose

On Fri, 2011-02-04 at 12:50 -0800, Eric Anholt wrote: Fixes glsl-vs-post-increment-01. --- src/mesa/program/ir_to_mesa.cpp | 47 +++++++++++++++++++++++++++++++++-------- 1 files changed, 40 insertions(+), 7 deletions(-)

diff --git a/src/mesa/program/ir_to_mesa.cpp b/src/mesa/program/ir_to_mesa.cpp
index 3794c0d..d0ec23f 100644
--- a/src/mesa/program/ir_to_mesa.cpp
+++ b/src/mesa/program/ir_to_mesa.cpp
@@ -2742,13 +2742,46 @@ ir_to_mesa_visitor::copy_propagate(void)
       /* Continuing the block, clear any written channels from
        * the ACP.
        */
-      if (inst->dst_reg.file == PROGRAM_TEMPORARY) {
-         if (inst->dst_reg.reladdr) {
-            memset(acp, 0, sizeof(*acp) * this->next_temp * 4);
-         } else {
-            for (int i = 0; i < 4; i++) {
-               if (inst->dst_reg.writemask & (1 << i)) {
-                  acp[4 * inst->dst_reg.index + i] = NULL;
+      if (inst->dst_reg.file == PROGRAM_TEMPORARY && inst->dst_reg.reladdr) {
+         /* Any temporary might be written, so no copy propagation
+          * across this instruction.
+          */
+         memset(acp, 0, sizeof(*acp) * this->next_temp * 4);
+      } else if (inst->dst_reg.file == PROGRAM_OUTPUT &&
+                 inst->dst_reg.reladdr) {
+         /* Any output might be written, so no copy propagation
+          * from outputs across this instruction.
+          */
+         for (int r = 0; r < this->next_temp; r++) {
+            for (int c = 0; c < 4; c++) {
+               if (acp[4 * r + c]->src_reg[0].file == PROGRAM_OUTPUT)
+                  acp[4 * r + c] = NULL;
+            }
+         }
+      } else if (inst->dst_reg.file == PROGRAM_TEMPORARY ||
+                 inst->dst_reg.file == PROGRAM_OUTPUT) {
+         /* Clear where it's used as dst. */
+         if (inst->dst_reg.file == PROGRAM_TEMPORARY) {
+            for (int c = 0; c < 4; c++) {
+               if (inst->dst_reg.writemask & (1 << c)) {
+                  acp[4 * inst->dst_reg.index + c] = NULL;
+               }
+            }
+         }
+
+         /* Clear where it's used as src. */
+         for (int r = 0; r < this->next_temp; r++) {
+            for (int c = 0; c < 4; c++) {
+               if (!acp[4 * r + c])
+                  continue;
+
+               int src_chan = GET_SWZ(acp[4 * r + c]->src_reg[0].swizzle, c);
+
+               if (acp[4 * r + c]->src_reg[0].file == inst->dst_reg.file &&
+                   acp[4 * r + c]->src_reg[0].index == inst->dst_reg.index &&
+                   inst->dst_reg.writemask & (1 << src_chan))
+               {
+                  acp[4 * r + c] = NULL;
+               }
             }
          }
       }
___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] glXMakeCurrent crashes (was: Re: How to obtain OpenGL implementation/driver information?)
On Fri, 2011-02-04 at 15:26 -0800, Benoit Jacob wrote: - Original Message - On Fre, 2011-02-04 at 14:21 -0800, Benoit Jacob wrote: - Original Message - Benoit Jacob wrote: - Original Message - On Thu, Feb 3, 2011 at 4:37 PM, Benoit Jacob bja...@mozilla.com wrote: I'm trying to see how to implement selective whitelisting/blacklisting of driver versions on X11 (my use case is to whitelist drivers for Firefox). The naive approach consists in creating an OpenGL context and calling glGetString(), however that is not optimal for me, for these reasons: * This has been enough to trigger crashes in the past. Ideally I want to be able to know the driver name, driver version, Mesa version, and any other thing that you think may be relevant. I need to get that information in a fast and safe way. There is no other way than glGetString if you ever experienced crash with it, it would be because you are doing something terribly wrong like using it without current context. It's not glGetString that's crashing, it's glXMakeCurrent. I forwarded a bug report from a user, though he's not been able to reproduce since: https://bugs.freedesktop.org/show_bug.cgi?id=32238 A search in Mesa's bugzilla confirms that I'm not alone: https://bugs.freedesktop.org/show_bug.cgi?id=30557 This latter bug looks like an i915 driver bug, as opposed to a MakeCurrent bug. Since the glGetString way will at best be slow, especially if we have to XSync and check for errors, could you consider exposing this information as new glXGetServerString / glXGetClientString strings? ? I don't understand the logic here. You're hitting a bug in glXCreateContext or MakeCurrent or something like that. So you'd like to add an entire new way to query the same information a driver already provides, just to provide an alternate path that hopefully doesn't exhibit the bug? Just fix the bug! There's no reason for glX extensions to add new functions here. My point is just that bugs exist. 
Since bugs exist, I am trying to implement a driver blacklist. My problem is that with GLX it's tricky because I can't get an answer to the question "should I avoid creating GL contexts on this driver" without creating a GL context. I proposed to allow handling this in glXQuery(Server|Client)String because these functions are known to be non-crashy. What you're asking for is not possible, because the information you need depends on the context which is current. No shortcuts here I'm afraid. :) We're doing driver blacklists on all platforms, and it tends to be quite easy on other platforms. For example, on Windows, we just read all the driver information from the registry. Couldn't X drivers likewise have some metadata stored on disk, that could be queried via some new API? I proposed GLX because glXQueryServerString already knows about at least the driver vendor. But I don't mind having it exposed elsewhere than in GLX if that makes more sense :) Please take this request seriously: driver blacklists are useful, not limited to Firefox, and certainly not limited to X11. As I say, we blacklist drivers on all platforms, and we'd never have been able to make Firefox 4 releasable without blacklisting many Windows drivers. Zhenyao Mo, in CC, is working on similar features in Chromium. Benoit, I agree that adjusting behavior via blacklists is needed in the real world -- 3D in Firefox/Chrome/IE9 is substantially different from typical 3D games/apps and it will take time for both application and driver developers to iron issues out. I do think that you should make it easy for users to override blacklisting though, so that we can have more testing exposure, and eventually get to a point where neither the driver nor the app has bugs, and the users can enjoy the HW-accelerated eye candy. On Windows Microsoft standardized a global place for settings (a.k.a. the registry). As you well know, on Linux there is no such entity, despite a few attempts. 
For example, there is no single place to set an HTTP proxy and proxy exclusions on Linux, so it is really unrealistic to expect such standardized behavior from graphics drivers. Note, however, that even on Windows the information you can get from the registry is limited. GL-specific information like that returned by glGet* also requires creating a GL context. I think you should either: a) assume that a glXMakeCurrent crash is major driver breakage and somebody else's (i.e., our) problem; b) trap segfaults in glXMakeCurrent; or c) use a separate process as suggested by Brian. Jose ___ mesa-dev mailing list
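Option (c) can be sketched with fork(): run the risky probe in a child process and ship the result back over a pipe, so a crash inside glXMakeCurrent/glGetString only kills the child. The probe callbacks below are fakes standing in for real "create context + glGetString" code; everything else is plain POSIX:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run probe() in a child; copy whatever it writes to the pipe into out.
 * Returns 0 if the probe exited cleanly, -1 if it failed or crashed. */
static int probe_driver(int (*probe)(int write_fd), char *out, size_t out_size)
{
   int fd[2], status;
   pid_t pid;
   ssize_t n;

   if (pipe(fd) != 0)
      return -1;

   pid = fork();
   if (pid < 0)
      return -1;
   if (pid == 0) {                 /* child: do the dangerous probing */
      close(fd[0]);
      _exit(probe(fd[1]) == 0 ? 0 : 1);
   }

   close(fd[1]);                   /* parent: collect the result */
   n = read(fd[0], out, out_size - 1);
   out[n > 0 ? n : 0] = '\0';
   close(fd[0]);

   waitpid(pid, &status, 0);
   /* A crash (signal) or nonzero exit marks the driver as unusable. */
   return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}

/* Fake probes for illustration -- a real one would create a GLX context
 * and write glGetString(GL_RENDERER) to the pipe. */
static int fake_good_probe(int fd)
{
   return write(fd, "softpipe", 8) == 8 ? 0 : 1;
}

static int fake_crashing_probe(int fd)
{
   (void)fd;
   abort();                        /* simulates glXMakeCurrent crashing */
}
```

The parent survives the simulated crash and simply blacklists the driver, which is exactly the isolation property a browser wants.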
Re: [Mesa-dev] glsl: Add a new opt_copy_propagation variant that does it channel-wise.
Eric, This code is causing segmentation faults on cinebench r11: Program received signal SIGSEGV, Segmentation fault. 0x76d3d7fd in exec_node::remove (this=0x1501590) at src/glsl/list.h:125 125 next-prev = prev; (gdb) bt #0 0x76d3d7fd in exec_node::remove (this=0x1501590) at src/glsl/list.h:125 #1 0x76d53d7f in ir_copy_propagation_elements_visitor::kill (this=0x7fffdb60, k=0x1500c20) at src/glsl/opt_copy_propagation_elements.cpp:390 #2 0x76d533a4 in ir_copy_propagation_elements_visitor::visit_leave (this=0x7fffdb60, ir=0x14b5820) at src/glsl/opt_copy_propagation_elements.cpp:167 #3 0x76d3f029 in ir_assignment::accept (this=0x14b5820, v=0x7fffdb60) at src/glsl/ir_hv_accept.cpp:299 #4 0x76d3e4f0 in visit_list_elements (v=0x7fffdb60, l=0x120e180) at src/glsl/ir_hv_accept.cpp:48 #5 0x76d532af in ir_copy_propagation_elements_visitor::visit_enter (this=0x7fffdb60, ir=0x120e130) at src/glsl/opt_copy_propagation_elements.cpp:151 #6 0x76d3e746 in ir_function_signature::accept (this=0x120e130, v=0x7fffdb60) at src/glsl/ir_hv_accept.cpp:112 #7 0x76d3e4f0 in visit_list_elements (v=0x7fffdb60, l=0x158eb10) at src/glsl/ir_hv_accept.cpp:48 #8 0x76d3e82c in ir_function::accept (this=0x158eae0, v=0x7fffdb60) at src/glsl/ir_hv_accept.cpp:132 #9 0x76d3e4f0 in visit_list_elements (v=0x7fffdb60, l=0x14b9c90) at src/glsl/ir_hv_accept.cpp:48 #10 0x76d5404c in do_copy_propagation_elements (instructions=0x14b9c90) at src/glsl/opt_copy_propagation_elements.cpp:455 #11 0x76d2dbb4 in do_common_optimization (ir=0x14b9c90, linked=true, max_unroll_iterations=32) at src/glsl/glsl_parser_extras.cpp:767 #12 0x76d45e72 in link_shaders (ctx=0x7869a0, prog=0x17f8d30) at src/glsl/linker.cpp:1630 #13 0x76c209f5 in _mesa_glsl_link_shader (ctx=0x7869a0, prog=0x17f8d30) at src/mesa/program/ir_to_mesa.cpp:3211 #14 0x76be9122 in link_program (ctx=0x7869a0, program=12) at src/mesa/main/shaderapi.c:885 #15 0x76bea336 in _mesa_LinkProgramARB (programObj=12) at src/mesa/main/shaderapi.c:1448 #16 0x00421907 in 
__glLinkProgram (program=12) at /home/jfonseca/projects/apitrace/build/linux/glproc.hpp:2737 #17 0x0044afad in retrace_glLinkProgram (call=...) at /home/jfonseca/projects/apitrace/build/linux/glretrace.cpp:6100 #18 0x00492e23 in retrace_call (call=...) at /home/jfonseca/projects/apitrace/build/linux/glretrace.cpp:30429 #19 0x004a84f2 in display () at /home/jfonseca/projects/apitrace/build/linux/glretrace.cpp:40611 #20 0x764acde3 in processWindowWorkList (window=0x705b20) at src/glut/glx/glut_event.c:1307 #21 0x764acef1 in __glutProcessWindowWorkLists () at src/glut/glx/glut_event.c:1358 #22 0x764acf61 in glutMainLoop () at src/glut/glx/glut_event.c:1379 #23 0x004a892d in main (argc=3, argv=0x7fffe0c8) at /home/jfonseca/projects/apitrace/build/linux/glretrace.cpp:40705 This can be reproduced by building glretrace [1], downloading [2], and doing glretrace -db -v cinebench-r11-test.trace I tried to look at it but I'm too unfamiliar with the code to venture. If you need any more info let me know. Jose [1] http://cgit.freedesktop.org/~jrfonseca/apitrace/ [2] http://people.freedesktop.org/~jrfonseca/traces/cinebench-r11-test.trace On Fri, 2011-02-04 at 10:45 -0800, Eric Anholt wrote: Module: Mesa Branch: master Commit: e31266ed3e3667c043bc5ad1abd65cfdb0fa7fdb URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=e31266ed3e3667c043bc5ad1abd65cfdb0fa7fdb Author: Eric Anholt e...@anholt.net Date: Tue Jan 25 10:28:13 2011 +1000 glsl: Add a new opt_copy_propagation variant that does it channel-wise. This patch cleans up many of the extra copies in GLSL IR introduced by i965's scalarizing passes. It doesn't result in a statistically significant performance difference on nexuiz high settings (n=3) or my demo (n=10), due to brw_fs.cpp's register coalescing covering most of those extra moves anyway. However, it does make the debug of wine's GLSL shaders much more tractable, and reduces instruction count of glsl-fs-convolution-2 from 376 to 288. 
--- src/glsl/Makefile |1 + src/glsl/glsl_parser_extras.cpp|1 + src/glsl/ir_optimization.h |1 + src/glsl/opt_copy_propagation_elements.cpp | 461 4 files changed, 464 insertions(+), 0 deletions(-) diff --git a/src/glsl/Makefile b/src/glsl/Makefile index 4f30742..ec11c8a 100644 --- a/src/glsl/Makefile +++ b/src/glsl/Makefile @@ -69,6 +69,7 @@ CXX_SOURCES = \ opt_constant_propagation.cpp \ opt_constant_variable.cpp \ opt_copy_propagation.cpp \ + opt_copy_propagation_elements.cpp \ opt_dead_code.cpp \ opt_dead_code_local.cpp \
Re: [Mesa-dev] Mesa (master): glsl: Add using statements for standard library functions.
On Fri, 2011-02-04 at 09:58 -0800, Ian Romanick wrote: On 02/03/2011 10:36 PM, Jose Fonseca wrote: This is very odd behavior from Sun's compiler -- unless we include <cstdio>, printf should be available, not std::printf. I think we might use <cstdio> in some places. I find it odd that Sun's compiler differs from GCC, Visual Studio, ICC, and clang. I'm tempted to say they're doing it wrong. Is there perhaps a command-line option for Sun's compiler that could change this behavior? Not only is having to add extra using statements ugly, but I can guarantee it will be missed in the future. If we include <cstdio> then that would explain it. If <cstdio> is implemented with something like: namespace std { #include <stdio.h> } then #include <cstdio> followed by #include <stdio.h>, and #include <stdio.h> followed by #include <cstdio>, may give different results. Maybe the better answer is to not use <cstdio> and friends at all. I agree. If there is the chance we also use <stdio.h> somewhere (and there is, because we include some Mesa headers), then it's probably better to stay away from <cstdio> and friends. Jose ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
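The underlying portability rule: <cstdio> is only guaranteed to declare std::printf; whether it also injects ::printf into the global namespace is implementation-defined, which is why GCC and Sun's compiler can legitimately disagree. A minimal illustration (the qualified spelling is the only portable one after including <cstdio>):

```cpp
#include <cstdio>   // guarantees std::printf; MAY also provide ::printf

// Portable: the qualified name is always declared by <cstdio>.
// An unqualified printf(...) call here, without <stdio.h>, is the
// construct Sun's compiler rejected while GCC and others accept it.
int demo(void) {
    return std::printf("hello from std::printf\n");
}
```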
Re: [Mesa-dev] Gallium proposal: add a user pointer in pipe_resource
On Mon, 2011-01-31 at 11:48 -0800, Christoph Bumiller wrote: On 31.01.2011 19:46, Marek Olšák wrote: With this manager, the drivers don't have to deal with user buffers when they are bound as vertex buffers. They only get real hardware buffers. Please do *not* take away my user buffers and put user vertex arrays at the mercy of a state tracker ! In the DrawArrays case I usually use util/translate and interleave them, letting it write directly into my command buffer for immediate-mode vertex data submission. Christoph, Is there any reason for not wanting to do the same optimization for non-user buffers? If the buffers are small and used only once, wouldn't you still want to write them directly into the command buffer? Because eliminating user buffers does not imply eliminating these optimization opportunities -- the driver can still know how big a buffer is, and the state tracker can set a flag such as PIPE_USAGE_ONCE to help the pipe driver figure out that this is a fire-and-forget buffer. Perhaps we can have a PIPE_CAP for distinguishing the drivers that can inline small buffers from those that prefer them batched up in big VBOs. And let's not forget that user arrays are a deprecated feature of GL. Applications will have to create a VBO even if all they want to do is draw a textured quad, therefore small VBOs are worthwhile to optimize regardless. I'm not saying we must get rid of user buffers now, but I can't help feeling that it is odd that while recent versions of the GL/DX APIs are eliminating index/vertex buffers in user memory, Gallium is optimizing for them... Jose
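The split Jose suggests, a usage hint from the state tracker plus a size heuristic in the driver, could look roughly like this. This is a sketch only: the flag name (in the spirit of the proposed PIPE_USAGE_ONCE) and the 4 KiB cutoff are hypothetical, not actual gallium API.

```cpp
#include <cstddef>

// Hypothetical usage hint, mirroring the PIPE_USAGE_ONCE idea from the mail.
enum BufferUsage { USAGE_DEFAULT, USAGE_ONCE };

// Driver-side heuristic: inline small fire-and-forget buffers directly
// into the command stream; batch everything else into real VBOs.
bool should_inline_in_cmdbuf(BufferUsage usage, std::size_t size) {
    const std::size_t inline_limit = 4096;  // illustrative cutoff
    return usage == USAGE_ONCE && size <= inline_limit;
}
```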
Re: [Mesa-dev] Mesa (master): util: fix parsing debug options
Looks perfect. Thanks for doing this Marek. Jose On Thu, 2011-01-27 at 11:32 -0800, Marek Olšák wrote: Module: Mesa Branch: master Commit: 387fe8dd475d70f098eabc48a8a3696cf0b72275 URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=387fe8dd475d70f098eabc48a8a3696cf0b72275 Author: Marek Olšák mar...@gmail.com Date: Wed Jan 26 11:46:39 2011 +0100 util: fix parsing debug options So that 'foo' can be found in: OPTION=prefixfoosuffix,foo Also allow that debug options can be separated by a non-alphanumeric characters instead of just commas. --- src/gallium/auxiliary/util/u_debug.c | 44 +++-- 1 files changed, 25 insertions(+), 19 deletions(-) diff --git a/src/gallium/auxiliary/util/u_debug.c b/src/gallium/auxiliary/util/u_debug.c index 8cf7660..36ce4b5 100644 --- a/src/gallium/auxiliary/util/u_debug.c +++ b/src/gallium/auxiliary/util/u_debug.c @@ -44,6 +44,7 @@ #include util/u_surface.h #include limits.h /* CHAR_BIT */ +#include ctype.h /* isalnum */ void _debug_vprintf(const char *format, va_list ap) { @@ -182,36 +183,41 @@ debug_get_num_option(const char *name, long dfault) static boolean str_has_option(const char *str, const char *name) { - const char *substr; + /* Empty string. */ + if (!*str) { + return FALSE; + } /* OPTION=all */ if (!util_strcmp(str, all)) { return TRUE; } - /* OPTION=name */ - if (!util_strcmp(str, name)) { - return TRUE; - } + /* Find 'name' in 'str' surrounded by non-alphanumeric characters. */ + { + const char *start = str; + unsigned name_len = strlen(name); - substr = util_strstr(str, name); + /* 'start' is the beginning of the currently-parsed word, + * we increment 'str' each iteration. + * if we find either the end of string or a non-alphanumeric character, + * we compare 'start' up to 'str-1' with 'name'. */ - if (substr) { - unsigned name_len = strlen(name); + while (1) { + if (!*str || !isalnum(*str)) { +if (str-start == name_len +!memcmp(start, name, name_len)) { + return TRUE; +} - /* OPTION=name,... 
*/ - if (substr == str substr[name_len] == ',') { - return TRUE; - } +if (!*str) { + return FALSE; +} - /* OPTION=...,name */ - if (substr+name_len == str+strlen(str) substr[-1] == ',') { - return TRUE; - } +start = str+1; + } - /* OPTION=...,name,... */ - if (substr[-1] == ',' substr[name_len] == ',') { - return TRUE; + str++; } } ___ mesa-commit mailing list mesa-com...@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-commit ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] util: require debug options to be separated by commas
On Mon, 2011-01-24 at 20:52 -0800, Marek Olšák wrote: Let's assume there are two options with names such that one is a substring of another. Previously, if we only specified the longer one as a debug option, the shorter one would be considered specified as well (because of strstr). This commit fixes it by checking that each option is surrounded by commas. (A regexp would be nicer, but this is not performance-critical code.)

---
 src/gallium/auxiliary/util/u_debug.c | 39 +-
 1 files changed, 38 insertions(+), 1 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_debug.c b/src/gallium/auxiliary/util/u_debug.c
index 59b7613..8cf7660 100644
--- a/src/gallium/auxiliary/util/u_debug.c
+++ b/src/gallium/auxiliary/util/u_debug.c
@@ -180,6 +180,43 @@ debug_get_num_option(const char *name, long dfault)
    return result;
 }

+static boolean str_has_option(const char *str, const char *name)
+{
+   const char *substr;
+
+   /* OPTION=all */
+   if (!util_strcmp(str, "all")) {
+      return TRUE;
+   }
+
+   /* OPTION=name */
+   if (!util_strcmp(str, name)) {
+      return TRUE;
+   }
+
+   substr = util_strstr(str, name);
+
+   if (substr) {
+      unsigned name_len = strlen(name);
+
+      /* OPTION=name,... */
+      if (substr == str && substr[name_len] == ',') {
+         return TRUE;
+      }
+
+      /* OPTION=...,name */
+      if (substr + name_len == str + strlen(str) && substr[-1] == ',') {
+         return TRUE;
+      }
+
+      /* OPTION=...,name,... */
+      if (substr[-1] == ',' && substr[name_len] == ',') {
+         return TRUE;
+      }
+   }
+
+   return FALSE;
+}

Marek, The intention is good -- it would be nice to get options obeyed properly -- but this will fail to find "foo" in OPTION=prefixfoosuffix,foo, so it's replacing a bug with another. I'd prefer we stop using strstr completely, and instead do: 1 - find the first ',' or '\0' in the string 2 - compare the previous characters with the option being searched, and return TRUE if matches 3 - if it was ',' go back to 1, but starting from character after ','.
4 - otherwise return FALSE It should be robust and almost the same amount of code. Jose unsigned long debug_get_flags_option(const char *name, @@ -207,7 +244,7 @@ debug_get_flags_option(const char *name, else { result = 0; while( flags-name ) { - if (!util_strcmp(str, all) || util_strstr(str, flags-name )) + if (str_has_option(str, flags-name)) result |= flags-value; ++flags; } ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
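The four steps above amount to a comma-delimited exact-match scan. A standalone sketch of that algorithm (plain C-style code with bool instead of gallium's boolean/TRUE, and standard strlen/memcmp instead of the util_* wrappers):

```cpp
#include <cstring>
#include <cstddef>

// Return true iff 'name' appears as a complete comma-separated entry in
// 'str': walk entry by entry, comparing each whole entry (never a mere
// substring) against 'name', per the steps outlined in the mail.
bool str_has_option(const char *str, const char *name) {
    const std::size_t name_len = std::strlen(name);
    while (true) {
        // Step 1: find the end of the current entry (',' or '\0').
        const char *end = str;
        while (*end && *end != ',')
            ++end;
        // Step 2: compare the whole entry with 'name'.
        if (static_cast<std::size_t>(end - str) == name_len &&
            std::memcmp(str, name, name_len) == 0)
            return true;
        // Step 4: end of string, no more entries.
        if (!*end)
            return false;
        // Step 3: continue with the character after the ','.
        str = end + 1;
    }
}
```

With this scheme, "foo" is found in OPTION=prefixfoosuffix,foo but not in OPTION=prefixfoosuffix, which is exactly the case the strstr-based version gets wrong.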
Re: [Mesa-dev] Depthbuffer gl_FragDepth / glReadPixels on VMWare
Depth-writing shaders should work. Although it is very hard to know whether they work or not, given that the depth buffer can't be read, it should be possible to verify by checking the values with depth-testing. If it doesn't work then it sounds like a bug. Jose On Wed, 2011-01-26 at 02:25 -0800, stef...@starnberg-mail.de wrote: Hi Jose, thanks for the info, I got it working now: - Instead of reading depth I just render 256 greylevel quads and read back the color buffer (performance is not an issue here, I just need an image that tells me that the depth buffer is ok) - The GLSL depth-write shader doesn't work when the texture format is DEPTH_COMPONENT. I've changed it to LUMINANCE16 when running on Mesa/VMWare. Again, thanks for the help. Stefan Quoting José Fonseca jfons...@vmware.com: The D3D9 API limits blits to/from depth-stencil buffers as well. The API is pretty much designed to ensure that depth-stencil buffers stay in VRAM (probably in a hardware-specific layout) and never get out of there. Several vendors allow binding the depth buffer as a texture, but they implicitly do shadow mapping. It might be possible to read the depth values with certain non-standard depth formats supported by major vendors. Reading the stencil buffer is pretty much impossible AFAICT. Jose On Tue, 2011-01-25 at 06:04 -0800, stef...@starnberg-mail.de wrote: Hi Jose, thanks for the quick reply: I'm using Win7 for both, guest (32bit) and host (64bit). I do the depth buffer reads only for debugging / regression testing. Would a copy depth-to-texture and shader blit to the color channels work ? Reading the color back buffer via glReadPixels is ok. Regards, Stefan Quoting José Fonseca jfons...@vmware.com: On Tue, 2011-01-25 at 01:13 -0800, stef...@starnberg-mail.de wrote: Hi, I'm trying to get one of our testsuites running in VMWare (VMware, Inc. Gallium 0.3 on SVGA3D; build: RELEASE; OGL 2.1 Mesa 7.7.1-DEVEL). With the GDI backend everything works fine (tested in 7.7, 7.8, 7.10).
I have a GLSL shader that writes depth, like: void main() { vec4 v = texture2D(tex, gl_TexCoord[0].st); gl_FragColor = vec4(0,0,0,0); gl_FragDepth = v.x; } which doesn't work when running in VMWare's Mesa. Even a simple clear and readback of the depth buffer doesn't work, like: glClearDepth(0.6f); glClear(GL_COLOR_BUFFER_BIT|GL_DEPTH_BUFFER_BIT); glReadPixels(0,0,m_Dim[0], m_Dim[1], GL_DEPTH_STENCIL_EXT, GL_UNSIGNED_INT_24_8_EXT, tmpBufferInt); or glReadPixels(0,0,m_Dim[0], m_Dim[1], GL_DEPTH_COMPONENT, GL_FLOAT, tmpBuffer); Both reads return all zeros. I don't know if VMWare's Mesa is a different branch and if this is the right place to report those bugs (if it is a bug). Stefan Stefan, What guest OS and host OS are you using? We can only comment here on the open-sourced Linux OpenGL guest drivers. The typical procedure is to file an SR through http://www.vmware.com/support/contacts/ , which ensures the issue will be included in our internal bug database, and then triaged and addressed in an eventual release. That said, I can advance that reading the depth-stencil buffer currently doesn't work on Windows hosts due to limitations that the D3D9 API imposes on locking depth-stencil buffers. But it should work on Linux/MacOSX hosts. Jose
Re: [Mesa-dev] [PATCH 1/2] mesa: support internalFormat=GL_BGRA in TexImage2D
On Sat, 2011-01-22 at 19:18 -0800, Chad Versace wrote: On Sat, Jan 22, 2011 at 17:46, Ian Romanick i...@freedesktop.org wrote: What I want is a way with cmake to build files twice. The first time will be in the existing way. The second time will compile with -DUSE_OPENGL_ES and will generate a .o file with a different name. ... I don't know cmake well enough to do that, and I keep forgetting to ask Chad. I have done just that with cmake recently; that is, building an executable twice, once using GL and again with GLES2. When Monday arrives, I'll take a look at how to best coerce Piglit to do this. It looks like you already know a better way, but just in case: a straightforward way to do this is with out-of-source builds. One configures cmake in two separate build directories (e.g., build/gl and build/gles), enabling GLES on the latter, and then it's simply a matter of doing make -C build/gl and make -C build/gles. Jose
Re: [Mesa-dev] [PATCH] scons: Add support for GLES.
Hi Olv, Looks good to me FWIW. I think we should avoid having opengl32.dll or the ICD loading glapi.dll, but that's not a reason to s given you've made it optional. Implementing EGL on Windows without implementing GL doesn't make much sense, so we could have GLES libraries dynamically loading the ICD or something like that. On Windows CE EGL story would be different though -- but I'm not familiar with the ABIs there. BTW, I'm very happy to see somebody else to serious changes on scons. See also http://www.reddit.com/r/programming/comments/eyl75/i_saw_a_book_entitled_die_gnu_autotools_and_i/ for a laugh. ;-) Jose On Thu, 2011-01-20 at 06:23 -0800, Chia-I Wu wrote: From: Chia-I Wu o...@lunarg.com GLES can be enabled by running scons with $ scons gles=yes When gles=yes is given, the build is changed in three ways. First, libmesa.a will be built with FEATURE_ES1 and FEATURE_ES2. This makes DRI drivers and libEGL support and advertise GLES support. Second, GLES libraries will be created. They are libGLESv1_CM, libGLESv2, and libglapi. Last, libGL or opengl32 will link to libglapi. This change is required as _glapi_* will be declared as __declspec(dllimport) in libmesa.a on windows. libmesa.a expects those symbols to be defined in another DLL. Due to this change to GL, GLES support is marked experimental. Note that GLES requires libxml2-python to generate some of its sources. 
The Windows build is tested with samples from http://code.google.com/p/angleproject/ --- SConstruct |6 ++ common.py |1 + src/SConscript |7 ++ src/gallium/state_trackers/wgl/SConscript |3 + src/gallium/state_trackers/wgl/stw_device.c |3 + src/gallium/targets/egl-static/SConscript | 16 +++- src/gallium/targets/libgl-gdi/SConscript|6 ++ src/gallium/targets/libgl-xlib/SConscript |6 ++ src/mapi/glapi/SConscript |5 + src/mapi/glapi/glapi.h |5 - src/mapi/shared-glapi/SConscript| 116 +++ src/mesa/SConscript | 63 +++ 12 files changed, 228 insertions(+), 9 deletions(-) create mode 100644 src/mapi/shared-glapi/SConscript diff --git a/SConstruct b/SConstruct index 368ad83..a2c2047 100644 --- a/SConstruct +++ b/SConstruct @@ -56,6 +56,12 @@ else: Help(opts.GenerateHelpText(env)) +# fail early for a common error on windows +if env['gles']: +try: +import libxml2 +except ImportError: +raise SCons.Errors.UserError, GLES requires libxml2-python to build ### # Environment setup diff --git a/common.py b/common.py index 76184d5..cbb6162 100644 --- a/common.py +++ b/common.py @@ -90,6 +90,7 @@ def AddOptions(opts): opts.Add(EnumOption('platform', 'target platform', host_platform, allowed_values=('linux', 'cell', 'windows', 'winddk', 'wince', 'darwin', 'embedded', 'cygwin', 'sunos5', 'freebsd8'))) opts.Add('toolchain', 'compiler toolchain', default_toolchain) + opts.Add(BoolOption('gles', 'EXPERIMENTAL: enable OpenGL ES support', 'no')) opts.Add(BoolOption('llvm', 'use LLVM', default_llvm)) opts.Add(BoolOption('debug', 'DEPRECATED: debug build', 'yes')) opts.Add(BoolOption('profile', 'DEPRECATED: profile build', 'no')) diff --git a/src/SConscript b/src/SConscript index 201812c..06c6f94 100644 --- a/src/SConscript +++ b/src/SConscript @@ -8,6 +8,10 @@ else: Export('talloc') SConscript('glsl/SConscript') +# When env['gles'] is set, the targets defined in mapi/glapi/SConscript are not +# used. 
libgl-xlib and libgl-gdi adapt themselves to use the targets defined +# in mapi/glapi-shared/SConscript. mesa/SConscript also adapts itself to +# enable OpenGL ES support. SConscript('mapi/glapi/SConscript') SConscript('mesa/SConscript') @@ -17,5 +21,8 @@ if env['platform'] != 'embedded': SConscript('egl/main/SConscript') SConscript('glut/glx/SConscript') +if env['gles']: +SConscript('mapi/shared-glapi/SConscript') + SConscript('gallium/SConscript') diff --git a/src/gallium/state_trackers/wgl/SConscript b/src/gallium/state_trackers/wgl/SConscript index 1b7597d..7cb953b 100644 --- a/src/gallium/state_trackers/wgl/SConscript +++ b/src/gallium/state_trackers/wgl/SConscript @@ -15,6 +15,9 @@ env.AppendUnique(CPPDEFINES = [ 'BUILD_GL32', # declare gl* as __declspec(dllexport) in Mesa headers 'WIN32_LEAN_AND_MEAN', # http://msdn2.microsoft.com/en-us/library/6dwk3a1z.aspx ]) +if not env['gles']: +# prevent _glapi_* from being declared __declspec(dllimport) +env.Append(CPPDEFINES =
Re: [Mesa-dev] Mesa (master): glsl: Autogenerate builtin_functions.cpp as part of the build process.
I've fixed the getopt.h dependency and talloc linking. MSVC is now building without errors. MinGW cross-compilation is still broken because a different compiler must be used for builtin_functions (ordinary gcc instead of i586-mingw32msvc-gcc) This was also necessary for Michal's glsl frontend, so I'll bring back the scons glue from that time. Jose On Tue, 2011-01-11 at 23:15 -0800, Vinson Lee wrote: Windows MSVC SCons build src\glsl\main.cpp(25) : fatal error C1083: Cannot open include file: 'getopt.h': No such file or directory Linux MinGW SCons build Linking build/windows-x86-debug/glsl/builtin_compiler.exe ... /usr/lib/gcc/i586-mingw32msvc/4.4.4/../../../../i586-mingw32msvc/bin/ld: cannot find -ltalloc From: Kenneth Graunke [kenn...@whitecape.org] Sent: Tuesday, January 11, 2011 9:59 PM To: Vinson Lee Cc: mesa-dev@lists.freedesktop.org Subject: Re: Mesa (master): glsl: Autogenerate builtin_functions.cpp as part of the build process. On Monday, January 10, 2011 11:31:19 pm you wrote: Hi Kenneth, This commit introduced build breakages with the MSVC SCons build and the MinGW SCons build. Vinson (CC'd mesa-dev) Hi Vinson, I assume it's trouble linking with talloc? I'd talked to Jakob about that on IRC - apparently on Linux you need 'talloc' (quoted literal) in the LIBS, while on Windows you need talloc (no ''s - a variable reference). He said to just put 'talloc' and we will fix it. Or, did it break somewhere else...? Aside from reverting it (which I'm loathe to do), I'm not sure how to fix this. Sorry for the trouble. --Kenneth ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glsl: Autogenerate builtin_functions.cpp as part of the build process.
This breaks cross-compilation because builtin_function.cpp needs to be built with the native compiler, instead of the cross compiler. I see there is a python program too. Is this native program unavoidable? Jose On Sat, 2011-01-08 at 21:53 -0800, Kenneth Graunke wrote: Now that this works with both make and SCons, builtin_function.cpp no longer needs to live in the repository. --- src/glsl/.gitignore |1 + src/glsl/Makefile| 16 +- src/glsl/SConscript | 24 +- src/glsl/builtin_function.cpp|13637 -- src/glsl/builtins/tools/generate_builtins.py | 11 +- 7 files changed, 78 insertions(+), 13687 deletions(-) glsl/builtin_function.cpp - deleted glsl/builtins/tools/builtin_function.cpp - renamed as glsl/builtin_stubs.cpp diff --git a/src/glsl/.gitignore b/src/glsl/.gitignore index 4c21231..162ed42 100644 --- a/src/glsl/.gitignore +++ b/src/glsl/.gitignore @@ -1,2 +1,3 @@ glsl_compiler glsl_parser.output +builtin_function.cpp diff --git a/src/glsl/Makefile b/src/glsl/Makefile index 86a577e..93b3bc3 100644 --- a/src/glsl/Makefile +++ b/src/glsl/Makefile @@ -133,10 +133,11 @@ default: depend lib$(LIBNAME).a $(APPS) lib$(LIBNAME).a: $(OBJECTS) Makefile $(TOP)/src/glsl/Makefile.template $(MKLIB) -cplusplus -o $(LIBNAME) -static $(OBJECTS) -depend: $(ALL_SOURCES) Makefile +DEPEND_SOURCES=$(subst builtin_function.cpp,,$(ALL_SOURCES)) +depend: $(DEPEND_SOURCES) Makefile rm -f depend touch depend - $(MKDEP) $(MKDEP_OPTIONS) $(INCLUDES) $(ALL_SOURCES) 2 /dev/null + $(MKDEP) $(MKDEP_OPTIONS) $(INCLUDES) $(DEPEND_SOURCES) 2 /dev/null # Remove .o and backup files clean: @@ -174,13 +175,12 @@ glcpp/glcpp-lex.c: glcpp/glcpp-lex.l glcpp/glcpp-parse.c: glcpp/glcpp-parse.y bison -v -o $@ --defines=glcpp/glcpp-parse.h $ -builtins: builtin_function.cpp builtins/profiles/* builtins/ir/* builtins/tools/generate_builtins.py builtins/tools/texture_builtins.py +BUILTIN_OBJECTS = $(subst builtin_function,builtin_stubs,$(OBJECTS)) $(GLSL2_OBJECTS) +builtin_function.cpp: builtins/profiles/* 
builtins/ir/* builtins/tools/generate_builtins.py builtins/tools/texture_builtins.py $(BUILTIN_OBJECTS) @echo Bootstrapping the compiler... - cp builtins/tools/builtin_function.cpp . - make glsl_compiler + $(APP_CXX) $(INCLUDES) $(CFLAGS) $(LDFLAGS) $(BUILTIN_OBJECTS) $(TALLOC_LIBS) -o builtin_compiler @echo Regenerating builtin_function.cpp... - $(PYTHON2) $(PYTHON_FLAGS) builtins/tools/generate_builtins.py builtin_function.cpp - @echo Rebuilding the real compiler... - make glsl_compiler + $(PYTHON2) $(PYTHON_FLAGS) builtins/tools/generate_builtins.py $(PWD)/builtin_compiler builtin_function.cpp + rm builtin_compiler builtin_stubs.o -include depend diff --git a/src/glsl/SConscript b/src/glsl/SConscript index f179721..5a0d396 100644 --- a/src/glsl/SConscript +++ b/src/glsl/SConscript @@ -2,6 +2,8 @@ import common Import('*') +from sys import executable as python_cmd + env = env.Clone() env.Prepend(CPPPATH = [ @@ -20,7 +22,6 @@ sources = [ 'ast_function.cpp', 'ast_to_hir.cpp', 'ast_type.cpp', -'builtin_function.cpp', 'glsl_lexer.cpp', 'glsl_parser.cpp', 'glsl_parser_extras.cpp', @@ -79,9 +80,28 @@ sources = [ 'strtod.c', ] +env.Prepend(LIBS = ['talloc']) +env.Append(CPPPATH = ['#/src/glsl']) + +builtin_compiler = env.Program( +target = 'builtin_compiler', +source = sources + ['main.cpp', 'builtin_stubs.cpp', +'../mesa/program/hash_table.c', +'../mesa/program/symbol_table.c'], +) + +env.CodeGenerate( +target = 'builtin_function.cpp', +script = 'builtins/tools/generate_builtins.py', +source = builtin_compiler, +command = python_cmd + ' $SCRIPT $SOURCE $TARGET' +) + +env.Depends('builtin_function.cpp', ['builtins/tools/generate_builtins.py', 'builtins/tools/texture_builtins.py'] + Glob('builtins/ir/*')) + glsl = env.ConvenienceLibrary( target = 'glsl', -source = sources, +source = sources + [ 'builtin_function.cpp' ], ) Export('glsl') diff --git a/src/glsl/builtins/tools/generate_builtins.py b/src/glsl/builtins/tools/generate_builtins.py index e2de9db..8b11338 100755 
--- a/src/glsl/builtins/tools/generate_builtins.py +++ b/src/glsl/builtins/tools/generate_builtins.py @@ -5,12 +5,20 @@ import re from glob import glob from os import path from subprocess import Popen, PIPE +from sys import argv # Local module: generator for texture lookup builtins from texture_builtins import generate_texture_functions builtins_dir = path.join(path.dirname(path.abspath(__file__)), ..) +# Get the path to the standalone GLSL compiler +if len(argv) !=
Re: [Mesa-dev] [PATCH] cmake: add build system to some of the egl demos
Commited. Thanks. Jose On Thu, 2011-01-06 at 11:58 -0800, kristof.ralov...@gmail.com wrote: On 2011-01-04, José Fonseca jfons...@vmware.com wrote: Kristof, It looks good overall with the exception that the build will fail if the EGL library is not available. Please guard the egl subdirectories and/or targets with if (EGL_egl_LIBRARY) ... endif (EGL_egl_LIBRARY) and make sure that when EGL library is not available the build will still succeed. Jose Jose, thanks for the feedback. Updated patch is attached. Thanks, Kristof On Mon, 2011-01-03 at 13:15 -0800, kristof.ralov...@gmail.com wrote: From da2803a63896362940f0d36cb6412ae46cfd345a Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?RALOVICH,=20Krist=C3=B3f?= tad...@freemail.hu Date: Mon, 3 Jan 2011 22:13:51 +0100 Subject: [PATCH] cmake: add build system to some of the egl demos --- CMakeLists.txt|2 ++ src/CMakeLists.txt|2 ++ src/egl/CMakeLists.txt|2 ++ src/egl/eglut/CMakeLists.txt |7 +++ src/egl/opengl/CMakeLists.txt | 15 +++ 5 files changed, 28 insertions(+), 0 deletions(-) create mode 100644 src/egl/CMakeLists.txt create mode 100644 src/egl/eglut/CMakeLists.txt create mode 100644 src/egl/opengl/CMakeLists.txt diff --git a/CMakeLists.txt b/CMakeLists.txt index cd84233..7b5dcf9 100644 --- a/CMakeLists.txt +++ b/CMakeLists.txt @@ -6,6 +6,8 @@ find_package (OpenGL REQUIRED) find_package (GLUT REQUIRED) find_package (X11) +find_library(EGL_egl_LIBRARY EGL /usr/lib) + find_library (GLEW_glew_LIBRARY GLEW /usr/lib ) diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt index fa377d1..7f874a3 100644 --- a/src/CMakeLists.txt +++ b/src/CMakeLists.txt @@ -14,6 +14,8 @@ add_subdirectory (vp) add_subdirectory (vpglsl) add_subdirectory (gs) +add_subdirectory(egl) + if (X11_FOUND) add_subdirectory (xdemos) endif (X11_FOUND) diff --git a/src/egl/CMakeLists.txt b/src/egl/CMakeLists.txt new file mode 100644 index 000..0318575 --- /dev/null +++ b/src/egl/CMakeLists.txt @@ -0,0 +1,2 @@ +add_subdirectory(eglut) +add_subdirectory(opengl) \ No 
newline at end of file diff --git a/src/egl/eglut/CMakeLists.txt b/src/egl/eglut/CMakeLists.txt new file mode 100644 index 000..76d48df --- /dev/null +++ b/src/egl/eglut/CMakeLists.txt @@ -0,0 +1,7 @@ +if(X11_FOUND) + add_library(eglut_x11 eglut.h eglut.c eglutint.h eglut_x11.c) + target_link_libraries(eglut_x11 ${OPENGL_gl_LIBRARY}) +endif(X11_FOUND) + +add_library(eglut_screen eglut.h eglut.c eglutint.h eglut_screen.c) +target_link_libraries(eglut_screen ${OPENGL_gl_LIBRARY}) diff --git a/src/egl/opengl/CMakeLists.txt b/src/egl/opengl/CMakeLists.txt new file mode 100644 index 000..ede9ec3 --- /dev/null +++ b/src/egl/opengl/CMakeLists.txt @@ -0,0 +1,15 @@ +include_directories(${EGL_INCLUDE_DIR} + ../eglut + ) + +add_executable(eglinfo eglinfo.c) +target_link_libraries(eglinfo ${EGL_egl_LIBRARY}) + +add_executable(eglgears_screen eglgears.c) +target_link_libraries(eglgears_screen ${EGL_egl_LIBRARY} eglut_screen) + +if(X11_FOUND) + add_executable(eglgears_x11 eglgears.c) + target_link_libraries(eglgears_x11 ${EGL_egl_LIBRARY} eglut_x11) +endif(X11_FOUND) + -- 1.7.1 Python File (no console) attachment (ATT1..txt) ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Uploading PIPE_FORMAT_B8G8R8A8_UNORM to pixmap texture in PIPE_FORMAT_B8G8R8X8_UNORM
Jerome, Hmm. B8G8R8A8 -> B8G8R8X8 should be accepted by util_is_format_compatible. It looks like the util_is_format_compatible arguments are swapped, i.e., it should be assert(util_is_format_compatible(util_format_description(src->format), util_format_description(dst->format))); instead of assert(util_is_format_compatible(util_format_description(dst->format), util_format_description(src->format))); Jose On Wed, 2010-12-15 at 10:55 -0800, Jerome Glisse wrote: Hi, I am facing an issue where I am not sure what the proper thing to do is. The piglit tfp test tries to upload PIPE_FORMAT_B8G8R8A8_UNORM to a texture pixmap which is PIPE_FORMAT_B8G8R8X8_UNORM, and r600g asserts in the blitter util because the formats are not compatible. Should all pixmap textures be considered to have A8? Should I just disable the assert in the case of A->X conversion? Other answer? https://bugs.freedesktop.org/show_bug.cgi?id=32370 Cheers, Jerome
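The reason argument order matters here is that the compatibility check is directional: every channel the destination format exposes must carry valid data from the source, except padding ('X') channels, which may absorb anything. Hence A8 -> X8 passes while X8 -> A8 must not. A toy model of that asymmetry (an illustration only, not the real util_is_format_compatible logic):

```cpp
// Per-channel view of directional format compatibility.
enum Channel { CH_COLOR, CH_ALPHA, CH_PADDING };

// A destination channel that is padding ('X') ignores whatever the source
// wrote, so anything is acceptable there; a real channel must match,
// otherwise the destination would read data the source never produced.
bool channel_compatible(Channel src, Channel dst) {
    return dst == CH_PADDING || src == dst;
}
```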
Re: [Mesa-dev] glretrace
On Mon, 2010-12-06 at 23:27 -0800, tom fogal wrote: Jose Fonseca jfons...@vmware.com writes: I can remove the python2.6 dependency as you suggest, but although this new string formatting method is seldom used it actually makes the code much more readable, and I was hoping to spread its use, so I'd like you to try one thing first before giving it up: I agree that it's a *lot* prettier than the old-style % (a) formatters! I don't know which Debian release you have, but Debian has been shipping a python2.6 for a long time. Only in unstable/testing. Stable is still running 2.5: http://packages.debian.org/stable/python/python Ah. I use Debian a lot, but I always use unstable (which has had 2.6 for a long time) and somehow thought there was a stable release more recently. Apparently not. If you install it and pull the latest apitrace code from git [. . .] then it should pick up the /usr/bin/python2.6 interpreter binary instead of /usr/bin/python, and therefore not fail. I don't have a /usr/bin/python2.6, as you might guess from the above. Does this work for you? You may need to remove the CMakeCache.txt file first. Nah, no 2.6 so of course it can't find it. I was surprised that the aforementioned CMake macro didn't cause an error to pop up though; I would've expected configuration to fail because my python isn't up to snuff, but it doesn't appear to notice. This was really a question of: is it worth it to support 2.5? I can say that it looks like it will be a moot issue for Debian soon: http://www.debian.org/News/2010/20101116b I don't know about other distros... but I do know that Debian's been far too long without a stable =(. So maybe the solution here is just: use a system with a python from the last couple of years. If you're comfortable upgrading your development machine to squeeze then that would be super. I suppose there are many other reasons to upgrade beyond python. There are less disruptive alternatives to upgrading, like chroots.
Apparently it is really straightforward to get a squeeze chroot [1] Jose [1] http://www.codelain.com/forum/index.php?topic=11548.0
Re: [Mesa-dev] [GLSL] defined expressions
On Mon, 2010-12-06 at 17:10 -0800, Carl Worth wrote: On Mon, 06 Dec 2010 20:23:52 +0000, José Fonseca jfons...@vmware.com wrote: I suppose it is possible for space to be significant in a conditional expression, e.g., #if foo () vs #if foo There's no significant whitespace here. A function-like macro can be invoked with or without intervening whitespace between the macro name and the '(' character. (It's only when *defining* the macro that the space is significant.) Meanwhile, a single macro foo cannot exist as both a function-like and an object-like macro at the same time. If there isn't an obvious cleaner solution, would you mind if I commit my original patch: ... +| DEFINED SPACE IDENTIFIER { + int v = hash_table_find (parser->defines, $3) ? 1 : 0; + $$ = _token_create_ival (parser, INTEGER, v); + } | DEFINED '(' IDENTIFIER ')' { int v = hash_table_find (parser->defines, $3) ? 1 : 0; $$ = _token_create_ival (parser, INTEGER, v); This is at least inadequate in that it doesn't also add the case for: DEFINED SPACE '(' IDENTIFIER ')' Good point. But a much cleaner solution would be to simply change the conditional_tokens production to not have space_tokens = 1. Ken and I discussed this over lunch today and he agreed to do that. It doesn't make things any worse, and it would allow us to focus on other problems. Yes, getting this fixed and moving on will be great. :-) Sounds great! Thanks Jose
Re: [Mesa-dev] LLVM_CFLAGS
On Mon, 2010-12-06 at 10:58 -0800, Brian Paul wrote: I'd like to simply try --cppflags for now and see if that fixes the problem for us. There are a few -D flags that we also need; it's not just the -I path. I agree. The scons build has been using --cppflags since the very beginning. I don't recall any problems. On 12/06/2010 11:27 AM, Bob Gleitsmann wrote: There is an option --includefiles that returns the directory with the include files. It isn't prefixed by -I so that would have to be worked out. Is that all that is needed? Seems like it should be. BG On Monday, December 06, 2010 12:58:11 pm Corbin Simpson wrote: Isn't there a --cflags-only-I or something along those lines? Sending from a mobile, pardon the brevity. ~ C. On Dec 6, 2010 9:42 AM, Dan Nicholson dbn.li...@gmail.com wrote: On Mon, Dec 6, 2010 at 9:15 AM, Brian Paul bri...@vmware.com wrote: On 12/05/2010 02:06 AM, Bob Gleitsmann wrote: Hello, Can someone explain the value of including this in mesa/src/gallium/Makefile.template: ifeq ($(MESA_LLVM),1) LIBRARY_DEFINES += $(LLVM_CFLAGS) endif ? This effectively adds the LLVM cflags to gcc compiles if LLVM is enabled. One side-effect of this is to include -O3 optimization no matter what, making debugging very difficult. Removing it seems to have no catastrophic effects (or even detectable ones). But maybe I am missing something. We need some of the LLVM C flags to get the -I path for headers, for example. I think we should use llvm-config --cppflags instead of --cflags. If you're using autoconf, try changing this line: LLVM_CFLAGS=`$LLVM_CONFIG --cflags` in configure.ac to use --cppflags instead. Does that help? I think the question is, what else is in llvm-config --cflags? If all that's needed is the include paths, then --cppflags would be sufficient. However, if there are macro definitions or compiler flags (e.g. -fomit-frame-pointer) in --cflags that are needed to properly use llvm, then I guess you need --cflags. 
-fomit-frame-pointer doesn't cause ABI changes. It's fine to mix code built with/without it. This is really a bug in llvm, but the last time someone here tried to bring it up to the llvm people, it was dismissed. Here's that bug: http://llvm.org/bugs/show_bug.cgi?id=8220 There's some more in the mesa archives, but I can't find the thread now. Sadly, that bug became a discussion about pkg-config which was not what was intended. Jose
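If one did want to start from the --cflags output and keep only the preprocessor-relevant parts, filtering out codegen flags like -O3 and -fomit-frame-pointer is mechanical. A hedged sketch of that filtering (the sample flag string below is made up for illustration, not real llvm-config output):

```python
def cpp_only(flags):
    """Keep include paths (-I) and macro definitions (-D), dropping
    optimization/codegen flags that should not leak into Mesa's build."""
    return [f for f in flags.split() if f.startswith(('-I', '-D'))]

# Hypothetical llvm-config --cflags output:
sample = "-I/usr/include/llvm -D_GNU_SOURCE -D__STDC_LIMIT_MACROS -O3 -fomit-frame-pointer"
assert cpp_only(sample) == ['-I/usr/include/llvm', '-D_GNU_SOURCE', '-D__STDC_LIMIT_MACROS']
```

In practice `llvm-config --cppflags` already emits exactly this subset, which is why the thread converges on using it directly.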
Re: [Mesa-dev] [GLSL] defined expressions
On Fri, 2010-12-03 at 15:37 -0800, Carl Worth wrote: On Fri, 3 Dec 2010 13:34:09 -0800, Kenneth Graunke kenn...@whitecape.org wrote: On Friday 03 December 2010 08:01:06 José Fonseca wrote: parser->space_tokens = 1; statement conditional_tokens stanza -- it causes space tokens to be emitted halfway before reaching the DEFINED token. Indeed. I'm not sure why conditional_tokens sets this. That code was pretty much copy-pasted from pp_tokens, too, so I'm wondering if it can be removed from there as well. Carl, can we safely remove this line from conditional_tokens? What about pp_tokens? The current glcpp test suite (glcpp-test) says it's safe to remove it from conditional_tokens. Take that for what it's worth I guess... It's definitely not safe to remove it from pp_tokens, (unless you also make some other changes). If you just remove that you'll find that something like: #define foo bar baz foo will result in barbaz rather than bar baz as desired. One option here that might simplify the implementation considerably is to not include space tokens in the replacement list like this, but to instead simply insert spaces between all tokens when printing the resulting stream. That would cause some minor expansion of the results. For example input of: (this,that) would become: ( this , that ) But that probably doesn't really harm much. During development of the preprocessor it has been *very* convenient to not do expansion like this so that results could be verified based on direct comparison with output from gcc -E, (and attempts to test with diff but ignoring amounts of whitespace proved error prone). Now that the test suite uses its own files of expected output, this issue probably doesn't exist. I don't quite understand space_tokens' role here, or how it is supposed to work. 
I confess I'm not very familiar with bison/flex inner-workings: I thought that the tokenizer was producing tokens quite ahead of the parser, so the parser couldn't change the behavior of the tokenizer in flight -- but perhaps I'm mixing it up with other lexer/grammar generation tools. The parser can exert some influence over the tokenizer. But doing this reliably is difficult since the tokenizer can do read-ahead. I definitely agree that it would be greatly preferable to have the tokenizer and parser more cleanly separated. But various messy aspects of the C preprocessor language do make this difficult. Apparently it can. I believe Ian's assessment is right. The distinction between #define foo () and #define foo() is a case where the C preprocessor language makes whitespace significant. But the current space_tokens control is not used for this. Instead the lexer is currently handling this distinction itself. So space_tokens is really extra code to try to get really clean output, (without extra space expansion as shown above). If removing space_tokens makes the implementation much cleaner then the slightly less clean output might be very well worth it. I'll leave that for you to decide, Ken. I suppose it is possible for space to be significant in a conditional expression, e.g., #if foo () vs #if foo If there isn't an obvious cleaner solution, would you mind if I commit my original patch: diff --git a/src/glsl/glcpp/glcpp-parse.y b/src/glsl/glcpp/glcpp-parse.y index 8475e08..3b84c06 100644 --- a/src/glsl/glcpp/glcpp-parse.y +++ b/src/glsl/glcpp/glcpp-parse.y @@ -462,6 +462,10 @@ conditional_token: int v = hash_table_find (parser->defines, $2) ? 1 : 0; $$ = _token_create_ival (parser, INTEGER, v); } +| DEFINED SPACE IDENTIFIER { + int v = hash_table_find (parser->defines, $3) ? 1 : 0; + $$ = _token_create_ival (parser, INTEGER, v); + } | DEFINED '(' IDENTIFIER ')' { int v = hash_table_find (parser->defines, $3) ? 
1 : 0; $$ = _token_create_ival (parser, INTEGER, v); It doesn't make things any worse, and it would allow us to focus on other problems. Jose
[Mesa-dev] [GLSL] defined expressions
Hi Ian, The current GLSL preprocessor fails to parse expressions such as #if 1 == 0 || defined UNDEFINED which is used by CINEBENCH r11. The problem is the parser->space_tokens = 1; statement in the conditional_tokens stanza -- it causes space tokens to be emitted halfway before reaching the DEFINED token. The attached patch is a quick bandaid for this problem, but I suspect that other space-related issues may subsist. I don't quite understand space_tokens' role here, or how it is supposed to work. I confess I'm not very familiar with bison/flex inner-workings: I thought that the tokenizer was producing tokens quite ahead of the parser, so the parser couldn't change the behavior of the tokenizer in flight -- but perhaps I'm mixing it up with other lexer/grammar generation tools. Also attached is an extended version of the defined-01.vert piglit glslparsertest test which covers more variations. Jose // [config] // expect_result: pass // glsl_version: 110 // [end config] #define DEF 1 #undef UNDEF #if defined DEF #else #error Wrong #endif #if defined UNDEF #error Wrong #else #endif #if 1 == 0 || defined DEF #else #error Wrong #endif #if 1 == 0 || defined UNDEF #error Wrong #else #endif #if defined(DEF) #else #error Wrong #endif #if defined(UNDEF) #error Wrong #else #endif #if 1 == 0 || defined(DEF) #else #error Wrong #endif #if 1 == 0 || defined(UNDEF) #error Wrong #else #endif void main() { gl_Position = gl_Vertex; } diff --git a/src/glsl/glcpp/glcpp-parse.y b/src/glsl/glcpp/glcpp-parse.y index 8475e08..3b84c06 100644 --- a/src/glsl/glcpp/glcpp-parse.y +++ b/src/glsl/glcpp/glcpp-parse.y @@ -462,6 +462,10 @@ conditional_token: int v = hash_table_find (parser->defines, $2) ? 1 : 0; $$ = _token_create_ival (parser, INTEGER, v); } +| DEFINED SPACE IDENTIFIER { + int v = hash_table_find (parser->defines, $3) ? 1 : 0; + $$ = _token_create_ival (parser, INTEGER, v); + } | DEFINED '(' IDENTIFIER ')' { int v = hash_table_find (parser->defines, $3) ? 
1 : 0; $$ = _token_create_ival (parser, INTEGER, v);
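The forms the grammar has to accept (`defined NAME`, `defined(NAME)`, and `defined (NAME)` with space before the parenthesis) can be mimicked outside bison. A toy Python evaluator of just the `defined` operator, purely to illustrate what each conditional_token production reduces to (an INTEGER token of 1 or 0):

```python
import re

# Matches 'defined(NAME)', 'defined (NAME)', and 'defined NAME'.
DEFINED_RE = re.compile(r'defined\s*(?:\(\s*(\w+)\s*\)|(\w+))')

def eval_defined(expr, defines):
    """Replace every defined-expression with 1 or 0, the way the glcpp
    conditional_token production produces an INTEGER token."""
    def repl(m):
        name = m.group(1) or m.group(2)
        return '1' if name in defines else '0'
    return DEFINED_RE.sub(repl, expr)

defines = {'DEF'}
assert eval_defined('defined DEF', defines) == '1'
assert eval_defined('defined(DEF)', defines) == '1'
assert eval_defined('defined (UNDEF)', defines) == '0'
assert eval_defined('1 == 0 || defined UNDEF', defines) == '1 == 0 || 0'
```

The last assertion is exactly the CINEBENCH-style expression that the unpatched preprocessor fails to parse.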
[Mesa-dev] glretrace
FYI, I've extended my apitrace project to be able not only to show a list of GL calls, but also to replay them. The idea is to facilitate reproduction of bugs, help find regressions, and do it across different platforms. Code is in http://cgit.freedesktop.org/~jrfonseca/apitrace/log/?h=retrace There is a sample trace on http://people.freedesktop.org/~jrfonseca/traces/ which can be replayed with glretrace -db topogun-1.06-orc-84k.trace The GL API is massive, and despite code generation being used, there are many holes. The biggest is non-immediate non-vbo arrays/elements drawing (i.e., glXxPointer with a pointer to vertices instead of offsets). And there's much more listed in the TODO file. There are also many places where the size of arrays depends on another parameter and an auxiliary function is needed. Covering the whole GL API is a herculean task, so help would be very much welcome. Jose
Re: [Mesa-dev] Mesa (master): glsl: fix assorted MSVC warnings
On Tue, 2010-11-16 at 11:55 -0800, Ian Romanick wrote: On 11/15/2010 05:48 PM, Brian Paul wrote: case ir_unop_b2f: assert(op[0]->type->base_type == GLSL_TYPE_BOOL); for (unsigned c = 0; c < op[0]->type->components(); c++) { -data.f[c] = op[0]->value.b[c] ? 1.0 : 0.0; +data.f[c] = op[0]->value.b[c] ? 1.0F : 0.0F; Please don't do this. This particular MSVC warning should just be disabled. If this warning were generated for non-literals and for literals that actually did lose precision being stored to a float, it might have a chance at having some value. Instead, it's just noise. Individual warnings can be disabled with a pragma, and this one should probably be disabled in mesa/compiler.h: #pragma warning(disable: 4244) There may be a way to do it from the command line, but I don't know what it is. It's -wd4244. The F suffixes on constants are also worthless, and they make the code ugly. I had the impression it was more than a warning, namely that the compilers would use double precision intermediates instead of single precision floats when constants don't have the 'f' suffix. GCC does it. 
Take for example: float foo(float x) { return 1.0 / x + 5.0; } float foof(float x) { return 1.0f / x + 5.0f; } If you compile it on x64 with gcc -g0 -O3 -S -o - test.c you'll get .file "foo.c" .text .p2align 4,,15 .globl foo .type foo, @function foo: .LFB0: .cfi_startproc unpcklps %xmm0, %xmm0 cvtps2pd %xmm0, %xmm1 movsd .LC0(%rip), %xmm0 divsd %xmm1, %xmm0 addsd .LC1(%rip), %xmm0 unpcklpd %xmm0, %xmm0 cvtpd2ps %xmm0, %xmm0 ret .cfi_endproc .LFE0: .size foo, .-foo .p2align 4,,15 .globl foof .type foof, @function foof: .LFB1: .cfi_startproc movaps %xmm0, %xmm1 movss .LC2(%rip), %xmm0 divss %xmm1, %xmm0 addss .LC3(%rip), %xmm0 ret .cfi_endproc .LFE1: .size foof, .-foof .section .rodata.cst8,"aM",@progbits,8 .align 8 .LC0: .long 0 .long 1072693248 .align 8 .LC1: .long 0 .long 1075052544 .section .rodata.cst4,"aM",@progbits,4 .align 4 .LC2: .long 1065353216 .align 4 .LC3: .long 1084227584 .ident "GCC: (Debian 4.4.5-6) 4.4.5" .section .note.GNU-stack,"",@progbits And as you can see, one function uses double precision, and the other uses single precision. Code quality is much better in the latter. Expecting that they will be added everywhere when no other compiler generates this warning is a losing battle. I really think this is a battle everybody should fight. Perhaps the condition ? 1.0 : 0.0 is something that a compiler should eliminate, but "single precision expressions should use the 'f' suffix on constants" seems to be a good rule of thumb to follow. Jose
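The precision difference the two code paths compute is observable directly, not just in the generated assembly. Python's floats are C doubles, but the struct module can round-trip a value through 32-bit storage to show what the single-precision path loses -- a small sketch:

```python
import struct

def as_float32(x):
    """Round a Python double to the nearest 32-bit float and back."""
    return struct.unpack('f', struct.pack('f', x))[0]

third = 1.0 / 3.0
# The single-precision value differs from the double-precision one,
# which is why promoting float expressions to double (by writing 1.0
# instead of 1.0f) changes both the instructions and, potentially,
# the results.
assert as_float32(third) != third
assert abs(as_float32(third) - third) < 1e-7
```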
[Mesa-dev] mesademos: SCons - Cmake
SCons is very powerful (the only cross-platform tool with convenience library support that I know of), but it requires a lot of customization (i.e., all the scons/*.py glue). It's worth the time investment for Mesa/Gallium drivers, but I don't feel like maintaining all that glue everywhere. For simple stuff like GL demos and test applications CMake does everything needed out of the box. So I'm replacing all SCons build infrastructure in mesademos (which was broken on Windows since the repository split) with CMake. This is now done in the mesa/demos cmake branch. Linux, cross MinGW, and MSVC (both 32-bit and 64-bit) build perfectly. I've made a package of 32-bit and 64-bit MSVC binaries and uploaded it to http://people.freedesktop.org/~jrfonseca/mesademos/mesademos-20101110-windows.7z I'd like to merge the branch and remove SCons. On MSVC CMake generates a Visual Studio project, so we could remove the existing (and broken) project files too. Let me know if there are any objections/suggestions. Jose
Re: [Mesa-dev] [PATCH] gallium: add CAPs for indirect addressing and lower it in st/mesa when needed
On Wed, 2010-11-10 at 12:29 -0800, Marek Olšák wrote: Required because ATI and NVIDIA DX9 GPUs do not support indirect addressing of temps, inputs, outputs, and consts (FS-only) or the hw support is so limited that we cannot use it. This should make r300g and possibly nvfx more feature complete. Signed-off-by: Marek Olšák mar...@gmail.com --- src/gallium/include/pipe/p_defines.h |5 + src/mesa/state_tracker/st_extensions.c |9 + 2 files changed, 14 insertions(+), 0 deletions(-) diff --git a/src/gallium/include/pipe/p_defines.h b/src/gallium/include/pipe/p_defines.h index 53f7b60..6cca301 100644 --- a/src/gallium/include/pipe/p_defines.h +++ b/src/gallium/include/pipe/p_defines.h @@ -483,7 +483,12 @@ enum pipe_shader_cap PIPE_SHADER_CAP_MAX_TEMPS, PIPE_SHADER_CAP_MAX_ADDRS, PIPE_SHADER_CAP_MAX_PREDS, + /* boolean caps */ PIPE_SHADER_CAP_TGSI_CONT_SUPPORTED, + PIPE_SHADER_CAP_INDIRECT_INPUT_ADDR, + PIPE_SHADER_CAP_INDIRECT_OUTPUT_ADDR, + PIPE_SHADER_CAP_INDIRECT_TEMP_ADDR, + PIPE_SHADER_CAP_INDIRECT_CONST_ADDR, }; /** diff --git a/src/mesa/state_tracker/st_extensions.c b/src/mesa/state_tracker/st_extensions.c index 2720f44..1327491 100644 --- a/src/mesa/state_tracker/st_extensions.c +++ b/src/mesa/state_tracker/st_extensions.c @@ -175,6 +175,15 @@ void st_init_limits(struct st_context *st) options->EmitNoCont = !screen->get_shader_param(screen, i, PIPE_SHADER_CAP_TGSI_CONT_SUPPORTED); + options->EmitNoIndirectInput = !screen->get_shader_param(screen, i, +PIPE_SHADER_CAP_INDIRECT_INPUT_ADDR); + options->EmitNoIndirectOutput = !screen->get_shader_param(screen, i, + PIPE_SHADER_CAP_INDIRECT_OUTPUT_ADDR); + options->EmitNoIndirectTemp = !screen->get_shader_param(screen, i, +PIPE_SHADER_CAP_INDIRECT_TEMP_ADDR); + options->EmitNoIndirectUniform = !screen->get_shader_param(screen, i, +PIPE_SHADER_CAP_INDIRECT_CONST_ADDR); + if(options->EmitNoLoops) options->MaxUnrollIterations = MIN2(screen->get_shader_param(screen, i, PIPE_SHADER_CAP_MAX_INSTRUCTIONS), 65536); } Marek, I don't 
object to the caps per se -- they seem unavoidable -- but this will change the current behavior for all existing drivers. So either this change is immediately followed by one that handles these new caps in all pipe drivers (it's OK even if you don't build them), or the caps' semantics should be negated, e.g., PIPE_SHADER_CAP_NO_INDIRECT_INPUT_ADDR. I don't feel strongly either way, but it has happened too often that drivers nobody is currently looking at become broken due to new caps. Jose
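The reason negated caps are the safer default: a driver that does not know about a new cap typically returns 0 for it, so a positively-named cap silently changes behavior for every unported driver, while a negated cap makes "driver wasn't updated" mean "keep the current behavior". A small sketch of the two conventions (the cap names follow the patch; the lambda stands in for get_shader_param of a driver written before either cap existed):

```python
def emit_no_indirect_positive(get_cap):
    """Positive cap: 'supports indirect addressing'. A driver unaware of
    the cap returns 0, so lowering kicks in and behavior changes."""
    return not get_cap('INDIRECT_INPUT_ADDR')

def emit_no_indirect_negated(get_cap):
    """Negated cap: 'cannot do indirect addressing'. An unaware driver
    returns 0 and keeps its current behavior."""
    return bool(get_cap('NO_INDIRECT_INPUT_ADDR'))

old_driver = lambda name: 0  # driver predating both caps

assert emit_no_indirect_positive(old_driver) is True   # behavior changes
assert emit_no_indirect_negated(old_driver) is False   # behavior preserved
```

This is exactly the trade-off in the email: positive caps read better but require touching every driver in the same series; negated caps degrade gracefully.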
Re: [Mesa-dev] [PATCH] gallium: add CAPs for indirect addressing and lower it in st/mesa when needed
On Wed, 2010-11-10 at 13:24 -0800, Marek Olšák wrote: On Wed, Nov 10, 2010 at 9:48 PM, José Fonseca jfons...@vmware.com wrote: On Wed, 2010-11-10 at 12:29 -0800, Marek Olšák wrote: Required because ATI and NVIDIA DX9 GPUs do not support indirect addressing of temps, inputs, outputs, and consts (FS-only) or the hw support is so limited that we cannot use it. This should make r300g and possibly nvfx more feature complete. Signed-off-by: Marek Olšák mar...@gmail.com --- src/gallium/include/pipe/p_defines.h |5 + src/mesa/state_tracker/st_extensions.c |9 + 2 files changed, 14 insertions(+), 0 deletions(-) diff --git a/src/gallium/include/pipe/p_defines.h b/src/gallium/include/pipe/p_defines.h index 53f7b60..6cca301 100644 --- a/src/gallium/include/pipe/p_defines.h +++ b/src/gallium/include/pipe/p_defines.h @@ -483,7 +483,12 @@ enum pipe_shader_cap PIPE_SHADER_CAP_MAX_TEMPS, PIPE_SHADER_CAP_MAX_ADDRS, PIPE_SHADER_CAP_MAX_PREDS, + /* boolean caps */ PIPE_SHADER_CAP_TGSI_CONT_SUPPORTED, + PIPE_SHADER_CAP_INDIRECT_INPUT_ADDR, + PIPE_SHADER_CAP_INDIRECT_OUTPUT_ADDR, + PIPE_SHADER_CAP_INDIRECT_TEMP_ADDR, + PIPE_SHADER_CAP_INDIRECT_CONST_ADDR, }; /** diff --git a/src/mesa/state_tracker/st_extensions.c b/src/mesa/state_tracker/st_extensions.c index 2720f44..1327491 100644 --- a/src/mesa/state_tracker/st_extensions.c +++ b/src/mesa/state_tracker/st_extensions.c @@ -175,6 +175,15 @@ void st_init_limits(struct st_context *st) options->EmitNoCont = !screen->get_shader_param(screen, i, PIPE_SHADER_CAP_TGSI_CONT_SUPPORTED); + options->EmitNoIndirectInput = !screen->get_shader_param(screen, i, + PIPE_SHADER_CAP_INDIRECT_INPUT_ADDR); + options->EmitNoIndirectOutput = !screen->get_shader_param(screen, i, + PIPE_SHADER_CAP_INDIRECT_OUTPUT_ADDR); + options->EmitNoIndirectTemp = !screen->get_shader_param(screen, i, + PIPE_SHADER_CAP_INDIRECT_TEMP_ADDR); + options->EmitNoIndirectUniform = !screen->get_shader_param(screen, i, + 
PIPE_SHADER_CAP_INDIRECT_CONST_ADDR); + if(options->EmitNoLoops) options->MaxUnrollIterations = MIN2(screen->get_shader_param(screen, i, PIPE_SHADER_CAP_MAX_INSTRUCTIONS), 65536); } Marek, I don't object to the caps per se -- they seem unavoidable -- but this will change the current behavior for all existing drivers. So either this change is immediately followed by one that handles these new caps in all pipe drivers (it's OK even if you don't build them), or the caps' semantics should be negated, e.g., PIPE_SHADER_CAP_NO_INDIRECT_INPUT_ADDR. I don't feel strongly either way, but it has happened too often that drivers nobody is currently looking at become broken due to new caps. I was going to fix get_shader_param in all the drivers afterwards to match the current behavior. I am well aware of the consequences the patch would have. Great. Sorry for stating the obvious. Looks good to me FWIW. Jose
Re: [Mesa-dev] [PATCH] os: Allow file streams to be open in binary mode.
Michal, I think we can just use binary all the time. Jose On Thu, 2010-11-04 at 09:29 -0700, Michal Krol wrote: From efd52ac32547c80d1d8317fe2934a6742968a394 Mon Sep 17 00:00:00 2001 From: Michal Krol mic...@vmware.com Date: Thu, 4 Nov 2010 17:29:01 +0100 Subject: [PATCH] os: Allow file streams to be open in binary mode. Explicitly request binary file mode by adding OS_STREAM_CREATE_BINARY flag to os_file_stream_create(). Without that files created on windows will be garbled. --- src/gallium/auxiliary/os/os_stream.h| 10 -- src/gallium/auxiliary/os/os_stream_stdc.c |6 -- src/gallium/auxiliary/util/u_debug.c|2 +- src/gallium/auxiliary/util/u_debug_refcnt.c |2 +- src/gallium/drivers/trace/tr_dump.c |2 +- 5 files changed, 15 insertions(+), 7 deletions(-) diff --git a/src/gallium/auxiliary/os/os_stream.h b/src/gallium/auxiliary/os/os_stream.h index 6c6050b..0e9acfa 100644 --- a/src/gallium/auxiliary/os/os_stream.h +++ b/src/gallium/auxiliary/os/os_stream.h @@ -55,6 +55,11 @@ struct os_stream (*vprintf)(struct os_stream *stream, const char* format, va_list ap); }; +/** + * OS stream creation flags. + */ +#define OS_STREAM_CREATE_BINARY 0x1 + static INLINE void os_stream_close(struct os_stream *stream) @@ -116,7 +121,8 @@ os_stream_printf (struct os_stream* stream, const char *format, ...) 
} struct os_stream * -os_file_stream_create(const char *filename); +os_file_stream_create(const char *filename, + uint creation_flags); struct os_stream * @@ -139,7 +145,7 @@ os_str_stream_get_and_close(struct os_stream *stream); #if defined(PIPE_SUBSYSTEM_WINDOWS_DISPLAY) -#define os_file_stream_create(_filename) os_null_stream_create() +#define os_file_stream_create(_filename, _creation_flags) os_null_stream_create() #endif #endif /* _OS_STREAM_H_ */ diff --git a/src/gallium/auxiliary/os/os_stream_stdc.c b/src/gallium/auxiliary/os/os_stream_stdc.c index 37e7d06..adc2be4 100644 --- a/src/gallium/auxiliary/os/os_stream_stdc.c +++ b/src/gallium/auxiliary/os/os_stream_stdc.c @@ -93,7 +93,8 @@ os_stdc_stream_vprintf (struct os_stream* _stream, const char *format, va_list a struct os_stream * -os_file_stream_create(const char *filename) +os_file_stream_create(const char *filename, + uint creation_flags) { struct os_stdc_stream *stream; @@ -106,7 +107,8 @@ os_file_stream_create(const char *filename) stream->base.flush = os_stdc_stream_flush; stream->base.vprintf = os_stdc_stream_vprintf; - stream->file = fopen(filename, "w"); + stream->file = fopen(filename, +(creation_flags & OS_STREAM_CREATE_BINARY) ? 
"wb" : "w"); if(!stream->file) goto no_file; diff --git a/src/gallium/auxiliary/util/u_debug.c b/src/gallium/auxiliary/util/u_debug.c index 504e6d2..bb1680a 100644 --- a/src/gallium/auxiliary/util/u_debug.c +++ b/src/gallium/auxiliary/util/u_debug.c @@ -632,7 +632,7 @@ debug_dump_float_rgba_bmp(const char *filename, bmih.biClrUsed = 0; bmih.biClrImportant = 0; - stream = os_file_stream_create(filename); + stream = os_file_stream_create(filename, OS_STREAM_CREATE_BINARY); if(!stream) goto error1; diff --git a/src/gallium/auxiliary/util/u_debug_refcnt.c b/src/gallium/auxiliary/util/u_debug_refcnt.c index 40a26c9..cee1615 100644 --- a/src/gallium/auxiliary/util/u_debug_refcnt.c +++ b/src/gallium/auxiliary/util/u_debug_refcnt.c @@ -119,7 +119,7 @@ void debug_reference_slowpath(const struct pipe_reference* p, debug_reference_de { const char* filename = debug_get_option("GALLIUM_REFCNT_LOG", NULL); if(filename && filename[0]) - stream = os_file_stream_create(filename); + stream = os_file_stream_create(filename, 0); if(stream) debug_refcnt_state = 1; diff --git a/src/gallium/drivers/trace/tr_dump.c b/src/gallium/drivers/trace/tr_dump.c index 51a4ea9..dcc1a47 100644 --- a/src/gallium/drivers/trace/tr_dump.c +++ b/src/gallium/drivers/trace/tr_dump.c @@ -251,7 +251,7 @@ boolean trace_dump_trace_begin() if(!stream) { - stream = os_file_stream_create(filename); + stream = os_file_stream_create(filename, 0); if(!stream) return FALSE;
Re: [Mesa-dev] [PATCH] os: Allow file streams to be open in binary mode.
On Thu, 2010-11-04 at 09:58 -0700, Michal Krol wrote: On 2010-11-04 17:56, José Fonseca wrote: Michal, I think we can just use binary all the time. Really? There are some consumers of this function that produce text files, like the XML output of trace. Though I think we should be fine with those having Linux newlines... Different newlines in XML are not a problem. Binary is fine, really. Jose
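What "files created on windows will be garbled" means concretely is newline translation in text mode: the C runtime expands '\n' to '\r\n', which corrupts binary trace data. Python 3 can emulate that Windows behavior on any platform via the newline parameter of open(), so the effect is easy to demonstrate:

```python
import os
import tempfile

def write_and_read_bytes(data, newline):
    """Write text with the given newline translation, return raw bytes."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, 'w', newline=newline) as f:
            f.write(data)
        with open(path, 'rb') as f:
            return f.read()
    finally:
        os.remove(path)

# Windows text mode expands '\n' to '\r\n' -- fatal for binary streams.
assert write_and_read_bytes('a\nb', '\r\n') == b'a\r\nb'
# Binary-style output (no translation) keeps the bytes intact.
assert write_and_read_bytes('a\nb', '') == b'a\nb'
```

XML consumers tolerate either newline convention, which is why opening everything in binary mode, as suggested above, is the simplest fix.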
[Mesa-dev] Revamping how to specify targets to build with scons
I've just pushed a long-requested improvement to the scons build system. Now it is simply a matter of naming what to build. For example: scons libgl-xlib scons libgl-gdi scons graw-progs scons llvmpipe and so on. And there is still the possibility of specifying subdirs, e.g. scons src/gallium/drivers If nothing is specified then everything will be built. There might be some rough corners over the next days. Please bear with me. To create new command-line targets use the env.Alias method. For example: mytarget = env.SharedLibrary() env.Alias('mytarget', mytarget) and you'll be able to build it by passing mytarget to scons. Jose
Re: [Mesa-dev] Proposal for a long-term shader compiler (and IR) architecture
On Mon, 2010-10-18 at 10:52 -0700, Keith Whitwell wrote: On Mon, Oct 18, 2010 at 9:18 AM, Jerome Glisse j.gli...@gmail.com wrote: On Fri, Oct 15, 2010 at 7:44 PM, John Kessenich jo...@lunarg.com wrote: Hi, LunarG has decided to work on an open source, long-term, highly-functional, and modular shader and kernel compiler stack. Attached is our high-level proposal for this compiler architecture (LunarGLASS). We would like to solicit feedback from the open source community on doing this. I have read several posts here where it seems the time has come for something like this, and in that spirit, I hope this is consistent with the desire and direction many contributors to this list have already alluded to. Perhaps the biggest point of the proposal is to standardize on LLVM as an intermediate representation. This is actually done at two levels within the proposal; one at a high-level IR close to the source language and one at a low-level IR close to the target architecture. The full picture is in the attached document. Based on feedback to this proposal, our next step is to more precisely define the two forms of LLVM IR. Please let me know if you have any trouble reading the attached, or any questions, or any feedback regarding the proposal. Thanks, JohnK Just a quick reply (I won't have carefully read through this proposition for a couple of weeks), but last time I checked, LLVM didn't seem to fit the bill for GPUs. Newer GPUs can be seen as close to scalar but not completely: there are restrictions on instruction packing and on the amount of data a GPU's computation units can access per cycle. Also, register allocation is different from a normal CPU; you don't want to do register peeling on a GPU. So from my POV instruction scheduling/packing and register allocation are interlaced processes (where you store a variable impacts instruction packing). Also, on newer GPUs it makes sense to use a mixed scalar/vector representation to preserve things like dot products. 
LLVM has always been able to represent both scalars and vectors. Although the dot product is not natively represented in the IR, one can perfectly well define a dot product intrinsic which takes two vectors and returns a scalar. I haven't looked at the backends, but I believe the same applies. Lastly, loops, jumps, and functions have kinds of unusual restrictions unlike any CPU (though I haven't got broad CPU knowledge). Bottom line is I don't think LLVM is anywhere near what would help us. I think this is the big question mark with this proposal -- basically, can it be done? I also think there are indeed challenges translating LLVM IR to something like TGSI or Mesa IR; and I was skeptical about standardizing on LLVM IR for quite some time, but lately I've been reaching the conclusion that there's so much momentum behind LLVM that the benefits/synergy one gets by leveraging it will most likely exceed the pitfalls. But I never felt much skepticism about GPU code generation. There is, e.g., an LLVM PTX backend already out there. And if it's not easy to make an LLVM backend for a particular GPU, then it should be at the very least possible to implement an LLVM backend that generates code in a representation very close to the GPU code, and do the final steps (e.g., register allocation, scheduling, etc.) in a custom pass, therefore benefiting from all the high-level optimizations that happened before. If it can't be done, we'll find out quickly; if it can, then we can stop debating whether or not it's possible. Indeed. Jose
Re: [Mesa-dev] What I'm working on
On Mon, 2010-10-11 at 14:49 -0700, Ian Romanick wrote: As has been discussed numerous times, the assembly-like IRs in Mesa (classic Mesa IR and TGSI) are completely useless for generating code for GPUs. llvmpipe has shown that they are also completely useless for generating code for CPUs. Ian, I don't fully understand the above. I'm not sure if there's actually a disagreement or just different terminology. TGSI is indeed inadequate for optimization (as in TGSI -> TGSI transformations), but it was never intended for that role: there was always the assumption it would be fed into an optimizing compiler (i.e., TGSI -> xxx -> GPU code). The role of TGSI was to be API agnostic (i.e., support xxx -> TGSI, yyy -> TGSI). I admit this is not much of an advantage while the only frontend is Mesa, but TGSI proved to be very useful when used in conjunction with other APIs beyond GL. It also makes hashing easy. FWIW, I'm fine with TGSI being replaced with something else (LLVM IR, mesa/src/glsl's), provided that IR is a superset of TGSI. Also, for the record, I don't think the problem is resemblance to assembly. Take LLVM IR for example: it is an assembly-like IR in SSA form, as every instruction consists of an opcode plus operands, and operands can't be complex expressions. What makes TGSI and Mesa IR inadequate for compilation is not their assembly resemblance, but the fact that they lack any other auxiliary structure (a tree form or SSA form would be advisable, but they are merely a one-directional list), and are therefore very difficult to walk, analyze, or rewrite. Jose
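The structural complaint can be made concrete: in a flat instruction list, answering "who reads this register?" means rescanning the whole stream, whereas SSA-style IRs carry def-use links alongside each definition. A toy sketch of building that missing auxiliary structure over a linear, TGSI-like list (the tuple encoding and opcode names are illustrative, not TGSI's actual representation):

```python
def build_def_use(instructions):
    """instructions: list of (dest, opcode, sources) tuples.
    Returns a map from each defined register to the indices of the
    instructions that read it -- the kind of auxiliary structure a
    flat one-directional IR lacks and an optimizer has to rebuild."""
    uses = {}
    for i, (dest, _op, srcs) in enumerate(instructions):
        uses.setdefault(dest, [])
        for s in srcs:
            if s in uses:
                uses[s].append(i)
    return uses

prog = [
    ('t0', 'MUL', ['in0', 'in1']),
    ('t1', 'ADD', ['t0', 'in2']),
    ('out', 'MOV', ['t1']),
]
chains = build_def_use(prog)
assert chains['t0'] == [1]   # t0 is read only by the ADD
assert chains['t1'] == [2]   # t1 feeds the final MOV
```

An IR that maintains these links natively (as LLVM IR does) makes walking, analyzing, and rewriting cheap, which is the gap being described above.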
Re: [Mesa-dev] Anonymous unions (Was: [Bug 30789] mesa git fails to build)
On Tue, 2010-10-12 at 07:58 -0700, Brian Paul wrote: On 10/12/2010 02:06 AM, José Fonseca wrote: What do you guys feel about anonymous unions? I happened to commit some code with anonymous unions, but it caused gcc to choke when the -std=c99 option is specified, which is only specified with automake but not scons. After some searching, it looks like anonymous unions are not part of C99, but are part of C++ and will reportedly be part of C1X [1]. I think all major compilers support them. I hear they are also often used together with bit fields to describe hardware registers. But for this to work with gcc we need to remove -std=c99, replace it with -std=gnu99, or pass -fms-extensions together with -std=c99. I don't care much either way. I'd just like to hear what the general opinion on this is, to avoid ping-ponging on this matter. Jose [1] http://gcc.gnu.org/onlinedocs/gcc/Unnamed-Fields.html#Unnamed-Fields When I have a choice, I prefer to go with what is more portable. I think this is especially important for core Mesa/gallium to maximize portability to new compilers/platforms. You never know what's going to come along. On Tue, 2010-10-12 at 09:47 -0700, tom fogal wrote: I'd vote against it. I remember hitting an issue porting some of my own code that used anonymous unions. Further, a downstream app I work on needs to support AIX's native build environment; I haven't checked, but that toolchain always gives me grief, so I doubt it supports anonymous unions. OK. It's gone now. Brian, Tom, thanks for the feedback. Jose
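For reference, the register-description idiom mentioned above (an anonymous union paired with bit fields, so the same word can be accessed whole or per-field) can be modeled with Python's ctypes. The register layout below is invented for illustration, and bit placement assumes a little-endian host:

```python
import ctypes

class RegBits(ctypes.Structure):
    # Bit fields; first field occupies the low bits on a little-endian host.
    _fields_ = [("enable", ctypes.c_uint32, 1),
                ("mode",   ctypes.c_uint32, 3),
                ("value",  ctypes.c_uint32, 28)]

class Reg(ctypes.Union):
    _anonymous_ = ("bits",)        # the anonymous-union idiom: fields of
    _fields_ = [("bits", RegBits), # `bits` are reachable directly on Reg
                ("raw",  ctypes.c_uint32)]

reg = Reg()
reg.raw = 0x0000000B               # write the whole register word...
flag, mode = reg.enable, reg.mode  # ...read fields without naming the union member
```

This is exactly the convenience the C feature buys: `reg.enable` instead of `reg.bits.enable`, at the cost of portability to stricter toolchains.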
Re: [Mesa-dev] [PATCH] llvmpipe: add initial support to double opcodes in llvmpipe.
Igor,

How can I test your patches? They don't apply against the gallium-double-opcode branch. They apply cleanly against master, but master doesn't have the double opcodes. Could you please make it easy for me to test what you're suggesting without wasting too much time? In detail:
- update the gallium-double-opcode branch
- apply your suggested llvmpipe change
- commit your modified tri.py under another name, e.g., tri-dmul.py

Jose

On Wed, 2010-09-29 at 08:08 -0700, Igor Oliveira wrote: Yep, see below. Basically the output is not smooth because OUT[0].x is the msb from the double and OUT[0].y is the lsb. Igor

diff --git a/src/gallium/tests/python/samples/tri.py b/src/gallium/tests/python/samples/tri.py
index 6d17c88..5b74c16 100644
--- a/src/gallium/tests/python/samples/tri.py
+++ b/src/gallium/tests/python/samples/tri.py
@@ -154,8 +154,8 @@ def test(dev):
 rgba = FloatArray(4);
 rgba[0] = 0.0
 rgba[1] = 0.0
-rgba[2] = 0.0
-rgba[3] = 0.0
+rgba[2] = 1.0
+rgba[3] = 1.0
 ctx.clear(PIPE_CLEAR_COLOR | PIPE_CLEAR_DEPTHSTENCIL, rgba, 1.0, 0xff)
 # vertex shader
@@ -176,8 +176,10 @@ def test(dev):
 FRAG
 DCL IN[0], COLOR, LINEAR
 DCL OUT[0], COLOR, CONSTANT
-0:MOV OUT[0], IN[0]
-1:END
+DCL TEMP[0]
+0:DMUL TEMP[0].xy, IN[0], IN[0]
+1:MOV OUT[0], TEMP[0]
+2:END
 ''')
 ctx.set_fragment_shader(fs)
@@ -189,25 +191,30 @@ def test(dev):
 verts[ 1] = 0.8 # y1
 verts[ 2] = 0.2 # z1
 verts[ 3] = 1.0 # w1
+
 verts[ 4] = 1.0 # r1
-verts[ 5] = 0.0 # g1
-verts[ 6] = 0.0 # b1
-verts[ 7] = 1.0 # a1
+verts[ 5] = 1.0 # g1
+verts[ 6] = 0.6 # b1
+verts[ 7] = 0.6 # a1
+
 verts[ 8] = -0.8 # x2
 verts[ 9] = -0.8 # y2
 verts[10] = 0.5 # z2
 verts[11] = 1.0 # w2
-verts[12] = 0.0 # r2
-verts[13] = 1.0 # g2
-verts[14] = 0.0 # b2
-verts[15] = 1.0 # a2
+
+verts[12] = 0.6 # r2
+verts[13] = 0.6 # g2
+verts[14] = 0.6 # b2
+verts[15] = 0.6 # a2
+
 verts[16] = 0.8 # x3
 verts[17] = -0.8 # y3
 verts[18] = 0.8 # z3
 verts[19] = 1.0 # w3
-verts[20] = 0.0 # r3
-verts[21] = 0.0 # g3
-verts[22] = 1.0 # b3
+
+verts[20] = 1.0 # r3
+verts[21] = 1.0 # g3
+verts[22] = 0.6 # b3
 verts[23] = 1.0 # a3
 ctx.draw_vertices(PIPE_PRIM_TRIANGLES,

On Wed, Sep 29, 2010 at 10:57 AM, José Fonseca jfons...@vmware.com wrote: Could you please send me the modified tri.py as well. Thanks Jose

On Wed, 2010-09-29 at 07:41 -0700, Igor Oliveira wrote: Hi Jose, I updated the patch (see below). I am using the python samples, modifying tri.py to use it, because the regression tests are outdated. Now we can get the msb and the lsb values from double operations.

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
index 441aeba..70330dc 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
@@ -963,6 +963,68 @@ emit_kil(
    lp_build_mask_update(bld->mask, mask);
 }
+static LLVMValueRef
+lp_cast_to_double(struct lp_build_context *bld,
+                  LLVMValueRef a,
+                  LLVMValueRef b)
+{
+   struct lp_type type;
+   LLVMValueRef res;
+   LLVMTypeRef vec_type;
+   LLVMTypeRef vec_double_type;
+
+   type = lp_type_uint_vec(64);
+   vec_type = lp_build_vec_type(type);
+
+   a = LLVMBuildBitCast(bld->builder, a, vec_type, "");
+   b = LLVMBuildBitCast(bld->builder, b, vec_type, "");
+
+   res = LLVMBuildShl(bld->builder, a, lp_build_const_int_vec(type, 32), "");
+   res = LLVMBuildOr(bld->builder, res, b, "");
+
+   a = LLVMBuildBitCast(bld->builder, a, bld->vec_type, "");
+   b = LLVMBuildBitCast(bld->builder, b, bld->vec_type, "");
+
+   type = lp_type_float_vec(64);
+   bld->type = type;
+   vec_double_type = lp_build_vec_type(type);
+   res = LLVMBuildBitCast(bld->builder, res, vec_double_type, "");
+
+   return res;
+}
+
+static LLVMValueRef
+lp_cast_from_double_msb(struct lp_build_context *bld,
+                        LLVMValueRef double_value)
+{
+   LLVMTypeRef double_type;
+   LLVMValueRef res;
+   struct lp_type type = lp_type_uint_vec(64);
+
+   double_type = lp_build_vec_type(type);
+   res = LLVMBuildBitCast(bld->builder, double_value, double_type, "");
+   res = LLVMBuildLShr(bld->builder, res, lp_build_const_int_vec(type, 32), "");
+   res = LLVMBuildBitCast(bld->builder, res, bld->vec_type, "");
+
+   return res;
+}
+
+
+static LLVMValueRef
+lp_cast_from_double_lsb(struct lp_build_context *bld,
+                        LLVMValueRef double_value)
+{
+   LLVMTypeRef double_type;
+   LLVMValueRef res;
+   struct lp_type
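The bit manipulation the patch performs — fusing two 32-bit halves into one IEEE double, and splitting a double back into its msb/lsb words — can be sanity-checked on the CPU side with a few lines of Python, where the `struct` module plays the role of the LLVM bitcasts:

```python
import struct

def pack_double(msb, lsb):
    # (msb << 32) | lsb, then reinterpret the 64 bits as an IEEE double,
    # mirroring the Shl/Or/BitCast sequence in lp_cast_to_double().
    bits = ((msb & 0xFFFFFFFF) << 32) | (lsb & 0xFFFFFFFF)
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

def double_msb(d):
    # Reinterpret the double as u64 and shift, as in lp_cast_from_double_msb().
    return struct.unpack("<Q", struct.pack("<d", d))[0] >> 32

def double_lsb(d):
    return struct.unpack("<Q", struct.pack("<d", d))[0] & 0xFFFFFFFF
```

For example, 1.0 has the bit pattern 0x3FF0000000000000, so its msb word is 0x3FF00000 and its lsb word is 0 — handy reference values when eyeballing shader output channels.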
Re: [Mesa-dev] [PATCH] Implement x86_64 atomics for compilers w/o intrinsics.
I didn't test but looks fine to me. Please commit.

Jose

On Wed, 2010-09-29 at 12:37 -0700, tom fogal wrote: ping? -tom

tom fogal tfo...@sci.utah.edu writes: A friend of mine had trouble building 7.8.2 on an old gcc 3.3 system (no gcc intrinsics). I put together this patch and he said his build worked. Our software doesn't thread so it's not really verified otherwise. I was a bit surprised we didn't need int64_t variants for x86_64. Maybe that's needed in general, but just not in swrast / under Mesa w/ our particular option set? Anyway, okay for 7.8 and master? -tom

From cc32ff741c5d32a66531a586b1f9268b94846c58 Mon Sep 17 00:00:00 2001
From: Tom Fogal tfo...@alumni.unh.edu
Date: Sun, 26 Sep 2010 18:57:59 -0600
Subject: [PATCH] Implement x86_64 atomics for compilers w/o intrinsics.

Really old gcc's (3.3, at least) don't have support for the intrinsics we need. This implements a fallback for that case.
---
 src/gallium/auxiliary/util/u_atomic.h | 47 +++++++++++++++++++++++++++++
 1 files changed, 47 insertions(+), 0 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_atomic.h b/src/gallium/auxiliary/util/u_atomic.h
index a156823..8434491 100644
--- a/src/gallium/auxiliary/util/u_atomic.h
+++ b/src/gallium/auxiliary/util/u_atomic.h
@@ -29,6 +29,8 @@
 #define PIPE_ATOMIC_ASM_MSVC_X86
 #elif (defined(PIPE_CC_GCC) && defined(PIPE_ARCH_X86))
 #define PIPE_ATOMIC_ASM_GCC_X86
+#elif (defined(PIPE_CC_GCC) && defined(PIPE_ARCH_X86_64))
+#define PIPE_ATOMIC_ASM_GCC_X86_64
 #elif defined(PIPE_CC_GCC) && (PIPE_CC_GCC_VERSION >= 401)
 #define PIPE_ATOMIC_GCC_INTRINSIC
 #else
@@ -36,6 +38,51 @@
 #endif


+#if defined(PIPE_ATOMIC_ASM_GCC_X86_64)
+#define PIPE_ATOMIC "GCC x86_64 assembly"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define p_atomic_set(_v, _i) (*(_v) = (_i))
+#define p_atomic_read(_v) (*(_v))
+
+static INLINE boolean
+p_atomic_dec_zero(int32_t *v)
+{
+   unsigned char c;
+
+   __asm__ __volatile__("lock; decl %0; sete %1":"+m"(*v), "=qm"(c)
+                        ::"memory");
+
+   return c != 0;
+}
+
+static INLINE void
+p_atomic_inc(int32_t *v)
+{
+   __asm__ __volatile__("lock; incl %0":"+m"(*v));
+}
+
+static INLINE void
+p_atomic_dec(int32_t *v)
+{
+   __asm__ __volatile__("lock; decl %0":"+m"(*v));
+}
+
+static INLINE int32_t
+p_atomic_cmpxchg(int32_t *v, int32_t old, int32_t _new)
+{
+   return __sync_val_compare_and_swap(v, old, _new);
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* PIPE_ATOMIC_ASM_GCC_X86_64 */
+
 #if defined(PIPE_ATOMIC_ASM_GCC_X86)
--
1.7.0.2

___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
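For clarity, the contract these primitives must honor is small and easy to state. A lock-based Python model of the operations in the patch (a semantic sketch only — the real code exists precisely to avoid locks):

```python
import threading

class Atomic32:
    """Models the p_atomic_* semantics from u_atomic.h with a mutex."""

    def __init__(self, value=0):
        self._v = value
        self._lock = threading.Lock()

    def dec_zero(self):
        # p_atomic_dec_zero: decrement, report whether the result hit zero
        with self._lock:
            self._v -= 1
            return self._v == 0

    def inc(self):
        with self._lock:
            self._v += 1

    def cmpxchg(self, old, new):
        # p_atomic_cmpxchg: return the value observed; swap only on a match
        with self._lock:
            seen = self._v
            if seen == old:
                self._v = new
            return seen

a = Atomic32(1)
hit_zero = a.dec_zero()   # 1 -> 0, so True
```

Any assembly or intrinsic implementation (the `lock; decl` / `sete` pair above, or `__sync_val_compare_and_swap`) must be observationally equivalent to this under concurrent use.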
Re: [Mesa-dev] first attempt at shader stencil export
On Thu, 2010-09-30 at 07:32 -0700, Brian Paul wrote: On 09/30/2010 12:41 AM, Dave Airlie wrote: some background: so on r600g, the only way to directly write to the stencil is via the shader, using a transfer would require an extra step to flush the DS buffer out via the pipe again to make it useable by the hw as a DS buffer. So using the current gallium stencil draw would be a major overhead, compared to just doing it properly. So to do it properly I decided to expose the GL_ARB_shader_stencil_export type functionality. http://cgit.freedesktop.org/~airlied/mesa/log/?h=gallium-stencil-export 7 commits in here: two simple one liners, add the cap to gallium headers, and add FRAG_RESULT_STENCIL to mesa. then a fix to the mesa texstore so it will store S8 in an 8/24 so we can put the stencil values in a texture. mesa state tracker support to use the new capability to hw accel drawpixels on the stencil (outputs to Y of FRAG_RESULT_STENCIL). r600g support for the capability take the second TGSI_SEMANTIC_POSITION output and use its Y value as stencil (looking at this now, I should probably be taking the X value really). very initial mesa/GLSL support for the extension (completely untested). initial softpipe support for the capability a demonstration. TODO: probably stop using Y (this was just because the hw uses Y), the GLSL extension just defines an integer output for the fragprog. fix the 24/8 texstore variant. write some test code for piglit and test the GL extension/GLSL bits. I'm a lot more interested in the non-GL extension bits as it allows stencil writes to work properly on r600g. A few comments. Instead of using semantic name=TGSI_SEMANTIC_POSITION, index=1 to indicate the stencil output, how about just defining a new TGSI_SEMANTIC_STENCIL label instead? I think that'd be clearer. I think the fragment stencil result is supposed to be an integer. But in the softpipe changes, it looks like you're doing a [0,1]-[0,255] scaling at one point. 
If we want to use this feature for implementing glDraw/CopyPixels functionality, the stencil values will be coming from a texture. How do we sample that texture? That is, which components of the texture image return the Z and/or stencil components? Basically, we need to add a new row to the table in the tgsi.rst file for Z/stencil textures. Also, we need to clarify that fetching a stencil value returns an integer value, not a float normalized to [0,1]. The DirectX 10 documentation concerning depth textures is quite peculiar; they suggest using typecasts: http://msdn.microsoft.com/en-us/library/bb205074(v=VS.85).aspx#reading_the_depth-stencil_buffer_as_a_texture But they don't specifically mention reading z24s8 stencil values this way, although it should certainly be possible, by reading the texture as i32. Jose
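The distinction drawn above — stencil fetched as a raw integer, depth as a float normalized to [0,1] — is easy to illustrate on a packed Z24S8-style texel. The bit layout below (stencil in the low byte, 24-bit depth above it) is an assumption for illustration; actual hardware layouts vary:

```python
def unpack_z24s8(texel):
    """Split a packed 32-bit depth/stencil texel.
    Assumed layout: depth in bits 8..31, stencil in bits 0..7."""
    stencil = texel & 0xFF                 # raw integer -- NOT normalized
    z_bits = (texel >> 8) & 0xFFFFFF
    depth = z_bits / float(0xFFFFFF)       # unorm depth mapped to [0, 1]
    return depth, stencil

depth, stencil = unpack_z24s8((0xFFFFFF << 8) | 0x80)
```

Sampling "as i32" in the D3D typecast style would simply hand the shader the raw `texel` word and leave this split to shader arithmetic.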
Re: [Mesa-dev] depth writing and missing something
On Thu, 2010-09-30 at 07:57 -0700, Brian Paul wrote: On 09/29/2010 07:34 PM, Dave Airlie wrote: So I've been playing with stencil writing on r600, and it led me to examine the depth writing. So at the moment in gallium, if we are writing to depth via DrawPixels, we construct a fragment program that samples from a texture: TEX OUT[0].z, IN[1], SAMP[0], 2D. Now from what I can see, the format we pick to sample from is Z24_UNORM_S8_USCALED, which from the u_format.csv file seems to say it will put the results into the X and Y components. I think what the u_format.csv file says isn't totally consistent with what the table at the bottom of tgsi.rst says. We never did nail down the Z/Gallium cell in the latter table. To be consistent with what the u_format.csv file says, I think the tgsi.rst table needs to be updated to read:

  Texture   Comps          Gallium          OpenGL          Direct3D 9
  +---------+---------------+----------------+---------------+
  | Z       | (Z, ?, ?, ?)  | (z, z, z, 1)*  | (0, z, 0, 1)  |
  +---------+---------------+----------------+---------------+
  | ZS      | (Z, S, ?, ?)  | (z, z, z, 1)*  | tbd           |
  +---------+---------------+----------------+---------------+
  | S       | (?, S, ?, ?)  | undefined      | tbd           |
  +---------+---------------+----------------+---------------+

What do people think about that? Sounds good to me. I think the last two cells underneath Direct3D 9 should be (0, z, 0, 1) and undefined. Now if we sample from the X and Y components and the texture dest writemask is Z, I can't see how any value arrives in the right place. It seems to work, but I'm a bit lost as to where the magic is. I'd have expected there to be swizzles in the sampler, but I can't see them being set to anything special either. I think we've just been getting lucky by the TEX instructions returning (Z,Z,Z,?). If we go with the table changes above, we need to either specify a sampler swizzle or do something like this in the shader:

  TEX TEMP[0].x, IN[1], SAMP[0], 2D
  MOV OUT[0].z, TEMP[0]

Yes. I thought the state trackers were already doing that, but apparently not. Jose
Re: [Mesa-dev] Mesa (d3d1x): d3d1x: add new Direct3D 10/11 COM state tracker for Gallium
On Tue, 2010-09-21 at 05:00 -0700, Henri Verbeet wrote: On 21 September 2010 13:13, Luca Barbieri l...@luca-barbieri.com wrote: Why are you claiming this? I assume it's because of the comment in tpf.h, which states that it has been written according to Microsoft's documentation, which is available solely from reading the d3d11TokenizedProgramFormat.h header in the DDK. Using the header, which is documented in the DDK documentation as the place to look for documentation of the format, as reference, doesn't seem to me unusual or problematic. ... Could you please explain your concerns in more detail? The basic rule Wine uses is that if you've ever seen Microsoft code, you can't work on similar code for Wine. That may or may not be more strict than would legally be required in a given jurisdiction, but Wine simply doesn't want to take that risk. I think Bridgman expressed the issue fairly well at some point as Stay well back from the abyss. The only reason we can look at e.g. PSDK headers at all is because you can't copyright facts like constant values or structure layouts. Even then, we have to be very careful to only take the facts from the header and not anything that would be considered expression. Non-trivial macros would probably be considered code. Wine certainly wouldn't go through all the trouble of figuring out the formats from scratch if we considered looking at the DDK implementation an acceptable risk. Also, at least a couple of years ago the license for the DDK explicitly restricted its use to Windows development. That's again something that may or may not hold up, but we're not going to try. Note that I've looked at neither the DDK nor your actual implementation. The fact that apparently you looked at the DDK while writing the code makes me cautious to avoid it. 
Mesa never had such tight requirements: there are many contributors, including full-time professionals, who have seen a lot of stuff under NDA, not to mention publicly available documents and reference code. Nothing changed with Luca's contribution -- either you've been tainted before, or you won't be tainted now. But, even assuming for the sake of argument that Luca saw Microsoft reference code, if his contribution is not derived code, then you haven't seen any Microsoft code, and nothing prevents you from contributing to whatever you want. WINE may refuse to integrate Luca's code directly, but that's a different matter. Jose
Re: [Mesa-dev] Mesa (d3d1x): d3d1x: add new Direct3D 10/11 COM state tracker for Gallium
On Tue, 2010-09-21 at 02:14 -0700, Jakob Bornecrantz wrote: On Tue, Sep 21, 2010 at 10:42 AM, Keith Whitwell kei...@vmware.com wrote: On Mon, 2010-09-20 at 16:28 -0700, Luca Barbieri wrote: A couple of questions - it looks like this is a drop-in for the d3d10/11 runtime, rather than an implementation of the DDI. Yes. I think that makes sense, but it could also be possible to split it into two pieces implementing either side of the d3d10 DDI interface. Any thoughts on whether that's interesting to you? I wrote it this way first of all because it's clearly easier to just write the code to support one interface, rather than writing two pieces, and it avoids unnecessary reliance on Microsoft interfaces, which often tend to be imperfectly documented. Not going through the DDI also clearly reduces CPU overhead and keeps the codebase simpler. I think a DDI implementation over Gallium could just live along as a sibling to the COM implementation, sharing common code, which is already split out into modules such as d3d1xshader and d3d1xstutil. The shader parser and translator can be fully shared, and several conversions (e.g., DXGI_FORMAT → pipe_format) are already separate from the main code, although perhaps more could be factored out. Instead, layering the COM API over the DDI API doesn't necessarily seem to be a win, especially because Gallium is so close to the D3D10/11 interfaces that it's not even clear that using the DDI is much easier than just using Gallium directly. I don't think I'll do it myself as a hobby project, though. Sounds good Luca, just interested in your plans for this. I don't see any reason not to merge this to master straight away -- this is all self-contained in its own directory and doesn't seem like it will regress anything else... Maybe not the binaries, but it looks like you pushed already anyway. Is there any reason for the binaries to be committed?
Jose
Re: [Mesa-dev] Mesa (d3d1x): d3d1x: add new Direct3D 10/11 COM state tracker for Gallium
On Tue, 2010-09-21 at 07:13 -0700, Luca Barbieri wrote: Is there any reason for the binaries to be committed? The idea is to allow people to test the (to be added) Wine support, or run the demos in Windows to see what they are supposed to do, without needing to set up a Windows build environment. Sure. I don't doubt they are useful. But why have they been checked in? Building doesn't depend on them, or does it? So a tarball on people.freedesktop.org or mesa3d.org should be enough. Jose
Re: [Mesa-dev] [PATCH] python/tests: Fixed tri.py for API and TGSI syntax changes.
Tilman,

Feel free to push this and future fixes to the python scripts. They haven't received any love for some time, especially after graw.

Jose

On Sun, 2010-09-19 at 12:57 -0700, Tilman Sauerbeck wrote: Signed-off-by: Tilman Sauerbeck til...@code-monkey.de --- The same fix needs to be applied to a bunch of other Python scripts, but tri.py seems to be a good starting point.

 src/gallium/tests/python/samples/tri.py | 8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/gallium/tests/python/samples/tri.py b/src/gallium/tests/python/samples/tri.py
index fed929d..6d17c88 100644
--- a/src/gallium/tests/python/samples/tri.py
+++ b/src/gallium/tests/python/samples/tri.py
@@ -88,8 +88,8 @@ def test(dev):
 # rasterizer
 rasterizer = Rasterizer()
-rasterizer.front_winding = PIPE_WINDING_CW
-rasterizer.cull_mode = PIPE_WINDING_NONE
+rasterizer.front_ccw = False
+rasterizer.cull_face = PIPE_FACE_NONE
 rasterizer.scissor = 1
 ctx.set_rasterizer(rasterizer)
@@ -161,8 +161,8 @@ def test(dev):
 # vertex shader
 vs = Shader('''
 VERT
-DCL IN[0], POSITION, CONSTANT
-DCL IN[1], COLOR, CONSTANT
+DCL IN[0]
+DCL IN[1]
 DCL OUT[0], POSITION, CONSTANT
 DCL OUT[1], COLOR, CONSTANT
 0:MOV OUT[0], IN[0]
Re: [Mesa-dev] [PATCH] tgsi: Actually care what check_soa_dependencies says
Hi Jakob, On Sat, 2010-09-18 at 07:30 -0700, Jakob Bornecrantz wrote: On Sat, Sep 18, 2010 at 4:26 PM, Jakob Bornecrantz wallbra...@gmail.com wrote: Looking over some of the piglit failures that Vinson has posted running on softpipe (we are down to 3005/3048). At first I was just going to make the output not turn into a warn, but looking at the function it looks like it actually should return the status and fail. This fixes both. This is a good idea IMO. Given the existence of gallivm and tgsi_exec.c, both being pretty correct, and the former very fast, I'm not sure what use tgsi_sse2 is catering for anymore. So I think that at the very least it should not make things worse. A few cleanups: MOV is being whitelisted twice -- the second one can be removed, and the /* XXX: ... */ comment moved to the top of the function. Oops, looks like I ctrl-z'd the tests this was happening in. It's warning on shaders/fp-abs-01 and the output is: Warning: src/dst aliasing in instruction is not handled: 1: COS TEMP[1], TEMP[1]. For those who were interested in it. There are a bunch of opcodes which tgsi_sse2 actually doesn't trip over but which are not whitelisted yet, like COS. I went through the code and added all I could find. Also, tgsi_check_soa_dependencies ignores indirect registers. Patch attached.
Jose

diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c b/src/gallium/auxiliary/tgsi/tgsi_exec.c
index 298f3d0..2609ea0 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_exec.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_exec.c
@@ -605,8 +605,10 @@ tgsi_check_soa_dependencies(const struct tgsi_full_instruction *inst)
    for (i = 0; i < inst->Instruction.NumSrcRegs; i++) {
       if ((inst->Src[i].Register.File == inst->Dst[0].Register.File) &&
-          (inst->Src[i].Register.Index ==
-           inst->Dst[0].Register.Index)) {
+          ((inst->Src[i].Register.Index ==
+            inst->Dst[0].Register.Index) ||
+           inst->Src[i].Register.Indirect ||
+           inst->Dst[0].Register.Indirect)) {
          /* loop over dest channels */
          uint channelsWritten = 0x0;
          FOR_EACH_ENABLED_CHANNEL(*inst, chan) {

diff --git a/src/gallium/auxiliary/tgsi/tgsi_sse2.c b/src/gallium/auxiliary/tgsi/tgsi_sse2.c
index 785a9fb..379771c 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_sse2.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_sse2.c
@@ -2835,6 +2835,26 @@ check_soa_dependencies(const struct tgsi_full_instruction *inst)
    case TGSI_OPCODE_MOV:
    case TGSI_OPCODE_MUL:
    case TGSI_OPCODE_XPD:
+   case TGSI_OPCODE_RCP:
+   case TGSI_OPCODE_RSQ:
+   case TGSI_OPCODE_EXP:
+   case TGSI_OPCODE_LOG:
+   case TGSI_OPCODE_DP3:
+   case TGSI_OPCODE_DP4:
+   case TGSI_OPCODE_DP2A:
+   case TGSI_OPCODE_EX2:
+   case TGSI_OPCODE_LG2:
+   case TGSI_OPCODE_POW:
+   case TGSI_OPCODE_XPD:
+   case TGSI_OPCODE_DPH:
+   case TGSI_OPCODE_COS:
+   case TGSI_OPCODE_SIN:
+   case TGSI_OPCODE_TEX:
+   case TGSI_OPCODE_TXB:
+   case TGSI_OPCODE_TXP:
+   case TGSI_OPCODE_NRM:
+   case TGSI_OPCODE_NRM4:
+   case TGSI_OPCODE_DP2:
       /* OK - these opcodes correctly handle SOA dependencies */
       break;
    default:
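As a paraphrase of the rule tgsi_check_soa_dependencies enforces: when an instruction's source aliases its destination, a hazard exists if some enabled destination channel reads (through its swizzle) a channel that an earlier destination channel has already overwritten. A simplified model, with register files reduced to plain names and swizzles to strings:

```python
def soa_hazard(dst_reg, dst_mask, src_reg, src_swizzle):
    """dst_mask: enabled dest channels, e.g. "xy".
    src_swizzle: source channel read for each of x,y,z,w, e.g. "yxzw"."""
    if dst_reg != src_reg:
        return False                      # no aliasing, no hazard
    order = "xyzw"
    written = set()
    for chan in order:                    # channels are written in order
        if chan not in dst_mask:
            continue
        if src_swizzle[order.index(chan)] in written:
            return True                   # reads a channel already clobbered
        written.add(chan)
    return False
```

So `MOV t0.xy, t0.yx` is a hazard (the y-channel read of x happens after x was written), while `MOV t0.xy, t0.xy` is safe — matching the warnings tgsi_exec.c emits.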
Re: [Mesa-dev] Mesa (master): glsl: Fix ' format not a string literal and no format arguments' warning.
On Thu, 2010-09-16 at 11:44 -0700, Vinson Lee wrote: g++ -o build/linux-x86-debug/glsl/loop_controls.os -c -O0 -g3 -m32 -mstackrealign -mmmx -msse -msse2 -Wall -Wno-long-long -ffast-math -fmessage-length=0 -Wmissing-field-initializers -Werror=pointer-arith -fPIC -DHAVE_LLVM=0x0207 -DDEBUG -DNDEBUG -D_GNU_SOURCE -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -D_POSIX_SOURCE -D_POSIX_C_SOURCE=199309L -D_SVID_SOURCE -D_BSD_SOURCE -D_GNU_SOURCE -DPTHREADS -DHAVE_POSIX_MEMALIGN -Isrc/mapi -Isrc/mesa -Iinclude -Isrc/gallium/include -Isrc/gallium/auxiliary -Isrc/gallium/drivers -Isrc/gallium/winsys src/glsl/loop_controls.cpp src/glsl/loop_controls.cpp: In function ‘int calculate_iterations(ir_rvalue*, ir_rvalue*, ir_rvalue*, ir_expression_operation)’: src/glsl/loop_controls.cpp:88: warning: format not a string literal and no format arguments Could you repeat the command line, but with g++ -o build/linux-x86-debug/glsl/loop_controls.I -E ... instead of g++ -o build/linux-x86-debug/glsl/loop_controls.os -c and look for the code in question in loop_controls.I and see how it looks? Jose
Re: [Mesa-dev] [PATCH 08/10] gallium: introduce get_shader_param (ALL DRIVERS CHANGED)
FWIW, I think this is a worthwhile interface cleanup. Your commit comment says it all. I did think of this when introducing the current shader caps enums, but informal feedback I got was that it was better to stick with the current get_param system. Now that the number of shader types is growing perhaps this becomes more appealing to everyone. Jose On Sun, 2010-09-05 at 18:30 -0700, Luca Barbieri wrote: Currently each shader cap has FS and VS versions. However, we want a version of them for geometry, tessellation control, and tessellation evaluation shaders, and want to be able to easily query a given cap type for a given shader stage. Since having 5 duplicates of each shader cap is unmanageable, add a new get_shader_param function that takes both a shader cap from a new enum and a shader stage. Drivers with non-unified shaders will first switch on the shader and, within each case, switch on the cap. Drivers with unified shaders instead first check whether the shader is supported, and then switch on the cap. MAX_CONST_BUFFERS is now per-stage. The geometry shader cap is removed in favor of checking whether the limit of geometry shader instructions is greater than 0, which is also used for tessellation shaders. WARNING: all drivers changed and compiled but only nvfx tested
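The dispatch structure being proposed is easy to sketch. For a driver with non-unified shaders: switch on stage first, then on the cap; an unsupported stage simply reports zero instructions. Stage and cap names below are placeholders, not the final enums:

```python
VERTEX, FRAGMENT, GEOMETRY = range(3)

# Hypothetical per-stage capability table for a non-unified driver.
_CAPS = {
    VERTEX:   {"max_instructions": 512,  "max_const_buffers": 1},
    FRAGMENT: {"max_instructions": 1024, "max_const_buffers": 2},
    GEOMETRY: {"max_instructions": 0,    "max_const_buffers": 0},  # unsupported
}

def get_shader_param(stage, cap):
    """Per the proposal, max_instructions == 0 signals an unsupported
    shader stage; unknown stages/caps also report 0."""
    return _CAPS.get(stage, {}).get(cap, 0)
```

A unified-shader driver would instead collapse the outer switch and return one shared cap table for every supported stage.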
Re: [Mesa-dev] [PATCH 1/3] Replace reference to tgsi-instruction-set.txt.
Submitted. Thanks.

Jose

On Tue, 2010-09-07 at 02:59 -0700, Tilman Sauerbeck wrote: That file has been replaced by tgsi.rst. ---

 src/gallium/include/pipe/p_shader_tokens.h | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/src/gallium/include/pipe/p_shader_tokens.h b/src/gallium/include/pipe/p_shader_tokens.h
index c4bd17e..74488de 100644
--- a/src/gallium/include/pipe/p_shader_tokens.h
+++ b/src/gallium/include/pipe/p_shader_tokens.h
@@ -199,7 +199,7 @@ struct tgsi_property_data {
  *
  * For more information on semantics of opcodes and
  * which APIs are known to use which opcodes, see
- * auxiliary/tgsi/tgsi-instruction-set.txt
+ * gallium/docs/source/tgsi.rst
  */
 #define TGSI_OPCODE_ARL 0
 #define TGSI_OPCODE_MOV 1
Re: [Mesa-dev] [PATCH 06/10] gallium: swap TGSI_PROCESSOR_VERTEX and TGSI_PROCESSOR_FRAGMENT
Luca,

I think this is not a worthwhile change, at least not by itself. First, this will cause the binary TGSI representation to change, and can break some code which expects this particular ordering. For example, you forgot to update tgsi_dump.c's processor_type_names[] array. And there might be more. (And we'll have to review all private branches and update them for this.) Furthermore, if the idea is to make PIPE_PROCESSOR_* and TGSI_PROCESSOR_* match, then the first thing that comes to mind is: why do we need two equally named enums? If we're going through the trouble of changing this, then IMO the change should be to delete one of these enums.

Jose

On Sun, 2010-09-05 at 18:30 -0700, Luca Barbieri wrote: These didn't match PIPE_SHADER_*, and it seems much better to make all such indices match. Vertex is first because cards with vertex shaders came first. ---

 src/gallium/include/pipe/p_shader_tokens.h | 4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/include/pipe/p_shader_tokens.h b/src/gallium/include/pipe/p_shader_tokens.h
index c4bd17e..636e7f0 100644
--- a/src/gallium/include/pipe/p_shader_tokens.h
+++ b/src/gallium/include/pipe/p_shader_tokens.h
@@ -40,8 +40,8 @@ struct tgsi_header
 unsigned BodySize : 24;
 };

-#define TGSI_PROCESSOR_FRAGMENT 0
-#define TGSI_PROCESSOR_VERTEX   1
+#define TGSI_PROCESSOR_VERTEX   0
+#define TGSI_PROCESSOR_FRAGMENT 1
 #define TGSI_PROCESSOR_GEOMETRY 2

 struct tgsi_processor
Re: [Mesa-dev] [RFC] [BRANCH] Floating point textures and rendering for Mesa, softpipe and llvmpipe
On Fri, 2010-08-27 at 05:49 -0700, Luca Barbieri wrote: I created a new branch called floating which includes an apparently successful attempt at full support of floating-point textures and render targets. I believe it is fundamentally correct, but should be considered a prototype and almost surely contains some oversights. Specifically, the following is included: [...] 10. llvmpipe is converted to use float tiles instead of uint8_t tiles 11. For llvmpipe, blending is reworked to work with both fixed and floating point target, and fragment clamping is properly supported [...] * LLVMpipe LLVMpipe has the fundamental problem that it uses an intermediate tiled format with byte components for all surface formats. To resolve this, this patchset switches it to use float components for all formats. Note that this is still a bad choice, and the same format of the surface should be used if natively supported by the CPU, or otherwise a suitable more precise format should be chosen. After this drastic change, the modifications imitate the changes to softpipe. Note that this code must not be merged as is, since using floats will likely degrade performance and memory usage unacceptably. The tile code should instead be rewritten to support all possible tile formats. Also, while this patch blends in floating point, it should blend in the destination format, since that gets correct clamping for free and is faster, since the destination format is often smaller. * Unsolved stuff [...] * Merging this branch To merge this, the following would be necessary: [...] 4. Rewrite the tiling code in llvmpipe to support arbitrary formats Hi Luca, It's an impressive amount of work you did here. I'll comment only on the llvmpipe of the changes for now. First, thanks for going the extra length of updating llvmpipe. Admittedly, always using a floating point is not ideal. 
A better solution would be to choose a swizzled data type (unorm8, fixed point, float, etc.) that matched the color buffer format. But we've been seeing some results which suggest that the whole color-buffer-swizzling idea might be overrated: it increases memory bandwidth usage substantially, and for many shaders where derivatives aren't necessary or can be determined from inputs, there's no need to process 2x2 quads of pixels or even to do computations in SoA. That is, we might end up eliminating swizzling from llvmpipe altogether in the medium term. So instead of going through a lot of work to support multiple swizzled types, I'd prefer to keep the current simplistic (always 8-bit unorm) swizzled type, and simply ignore errors in the clamping/precision loss when rendering to formats with a higher precision dynamic range. When we add support to render directly to the unswizzled formats, support to render these other types will come as well. In summary, apart from your fragment clamping changes, I'd prefer to keep the rest of llvmpipe unchanged (and inaccurate) for the time being. Jose
Re: [Mesa-dev] swizzling in llvmpipe [was: other stuff]
On Wed, 2010-09-01 at 09:24 -0700, Luca Barbieri wrote: It's an impressive amount of work you did here. I'll comment only on the llvmpipe part of the changes for now. Thanks for your feedback! Admittedly, always using floating point is not ideal. A better solution would be to choose a swizzled data type (unorm8, fixed point, float, etc.) that matched the color buffer format. Exactly, also because we'll want unnormalized formats and 64-bit formats too. But we've been seeing some results which suggest that the whole color buffer swizzling idea might be overrated: it increases memory bandwidth usage substantially, Why? It should decrease it due to a lower number of cache misses, thanks to having a 2D instead of a 1D neighborhood of a pixel in the cache. We're talking about render-target swizzling, and not texture swizzling (llvmpipe doesn't do texture swizzling yet -- all textures must be linear). Operating in tiles already lowers the cache misses when writing/blending to the render target. Having those tiles swizzled internally only makes computing derivatives and SoA computations easier. Nothing more. The only increase is the one due to unswizzling for transfer or presentation, but that should be much lower than the rendering load on complex applications. It can surely be application dependent -- we haven't done enough prototyping and testing to be sure. It is also somewhat unintuitive -- a lot of what I'm writing here is a rationalization of data points we gathered (i.e., an interpretation), and not the final answer. I suspect one thing that doesn't help currently is that the tiles are too big to fit in the L1 cache. It's possible that with smaller tiles the swizzled+unswizzled versions both fit, there is less thrashing, and render-to-swizzled + unswizzle becomes as competitive as render-to-unswizzled. and for many shaders where derivatives aren't necessary or can be determined from inputs there's no need to process 2x2 quads of pixels, or even to do computations in SoA.
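The "2D instead of a 1D neighborhood" argument in the exchange above can be sketched with a toy address calculation. The tile geometry here (4x4 tiles, a 16-pixel-wide surface) is an illustrative assumption, not llvmpipe's actual layout:

```c
/* Toy linear vs. swizzled (4x4-tiled) addressing.  In the tiled layout a
 * 2x2 quad of pixels stays inside one 16-pixel tile; in the linear layout
 * its two rows are a full surface stride apart. */
#define TILE  4
#define WIDTH 16   /* surface width in pixels, assumed a multiple of TILE */

static unsigned linear_offset(unsigned x, unsigned y)
{
   return y * WIDTH + x;
}

static unsigned tiled_offset(unsigned x, unsigned y)
{
   unsigned tiles_per_row = WIDTH / TILE;
   unsigned tile = (y / TILE) * tiles_per_row + (x / TILE);
   return tile * TILE * TILE + (y % TILE) * TILE + (x % TILE);
}
```

For the quad (0,0)–(1,1), the tiled offsets are 0, 1, 4, 5 while the linear offsets are 0, 1, 16, 17 — the 2D cache neighborhood Luca describes. Whether that wins overall depends on the unswizzling cost and tile size discussed in the thread.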
That is, we might end up eliminating swizzling from llvmpipe altogether in the medium term. Even with AoS, there is still the cache-locality advantage of an AoS-but-swizzled layout. The increased cost in address computation may well be much worse than the cache benefits though, unless some smart way of doing it can be found. It still sounds like you're referring to sampling from a texture and not rendering/blending to it. Of course they are related (we only want one swizzled layout at most), but the current swizzled layout was chosen to make blending easy, not to make texture sampling easy. No SoA swizzled layout makes texture sampling easier or more efficient. I think vertex shaders might be the first place to look at for switching to AoS, since they don't have derivatives, and the input is much harder to keep in SoA form. Were LLVM vertex shaders done as SoA just because the existing llvmpipe code did that, or because it's actually better? The former: because the TGSI-to-SoA code was there and already robust. So instead of going through a lot of work to support multiple swizzled types I'd prefer to keep the current simplistic (always 8-bit unorm) swizzled type, and simply ignore the clamping errors / precision loss when rendering to formats with higher precision or dynamic range. In summary, apart from your fragment clamping changes, I'd prefer to keep the rest of llvmpipe unchanged (and inaccurate) for the time being. Note that this totally destroys the ability to use llvmpipe for high dynamic range rendering, which fundamentally depends on the ability to store unclamped and relatively high precision values. Using llvmpipe is important because softpipe can be so slow on advanced applications (e.g. the Unigine demos) as to be unusable even for testing. I'm not giving a death sentence -- simply prioritization. I'm hopeful we'll get there before too long. Perhaps a middle ground could be to have the option to choose the swizzled tile format at compilation time.
Applying my patch or not will have this effect, but it requires the patch to be maintained, and it will obviously conflict a lot with other changes. Also, some of the work will be required anyway to support rendering directly to textures of any format, if one goes that route. More than the floating point issue per se, I'm actually more concerned with keeping this part of llvmpipe simple, to allow experimentation and rethinking the role of swizzled formats. Jose
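For readers unfamiliar with the SoA/AoS terminology used throughout this thread, here is a minimal sketch. The four-pixel group size and struct names are invented for illustration:

```c
/* AoS (array of structures) keeps each pixel's RGBA channels contiguous;
 * SoA (structure of arrays) groups the same channel of several pixels so
 * one SIMD operation can process that channel for all of them at once. */
struct aos4 { float px[4][4]; };                /* RGBA RGBA RGBA RGBA */
struct soa4 { float r[4], g[4], b[4], a[4]; };  /* RRRR GGGG BBBB AAAA */

/* Transposing AoS pixels into SoA channel vectors. */
static void aos_to_soa(const struct aos4 *in, struct soa4 *out)
{
   for (int i = 0; i < 4; i++) {
      out->r[i] = in->px[i][0];
      out->g[i] = in->px[i][1];
      out->b[i] = in->px[i][2];
      out->a[i] = in->px[i][3];
   }
}
```

The thread's point about vertex shaders follows from this picture: vertex inputs arrive naturally in AoS form (one vertex's attributes together), so keeping them in SoA requires transposes like the one above.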
Re: [Mesa-dev] [PATCH] st/mesa: set the MaxVarying GLSL constant
It makes sense now. I think you should replace the comment in the code with the paragraph you just wrote in your reply and then push it. Jose On Sun, 2010-08-29 at 07:55 -0700, Marek Olšák wrote: PIPE_CAP_MAX_FS_INPUTS specifies the number of COLORn + GENERICn inputs and is set in MaxNativeAttribs. It's always 2 colors + N generic attributes. The GLSL compiler never uses COLORn for varyings, so I subtract the 2 colors to get the maximum number of varyings (generic attributes) supported by a driver. Marek On Sun, Aug 29, 2010 at 12:06 PM, Jose Fonseca jfons...@vmware.com wrote: Marek, Could you elaborate why 2 color attributes are being subtracted? Jose From: mesa-dev-bounces +jfonseca=vmware@lists.freedesktop.org [mesa-dev-bounces +jfonseca=vmware@lists.freedesktop.org] On Behalf Of Marek Olšák [mar...@gmail.com] Sent: Saturday, August 28, 2010 18:08 To: mesa-dev@lists.freedesktop.org; Brian Paul Subject: Re: [Mesa-dev] [PATCH] st/mesa: set the MaxVarying GLSL constant May I push this? It makes glsl-max-varyings pass with r300g. Marek On Wed, Aug 25, 2010 at 5:27 AM, Marek Olšák mar...@gmail.com wrote: --- src/mesa/state_tracker/st_extensions.c | 3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/src/mesa/state_tracker/st_extensions.c b/src/mesa/state_tracker/st_extensions.c index 90e7867..dacba2c 100644 --- a/src/mesa/state_tracker/st_extensions.c +++ b/src/mesa/state_tracker/st_extensions.c @@ -161,6 +161,9 @@ void st_init_limits(struct st_context *st) pc->MaxNativeTemps = screen->get_param(screen, PIPE_CAP_MAX_VS_TEMPS); pc->MaxNativeAddressRegs = screen->get_param(screen, PIPE_CAP_MAX_VS_ADDRS); pc->MaxNativeParameters = screen->get_param(screen, PIPE_CAP_MAX_VS_CONSTS); + + /* Subtract 2 color attributes. */ + c->MaxVarying = screen->get_param(screen, PIPE_CAP_MAX_FS_INPUTS) - 2; } -- 1.7.0.4
Re: [Mesa-dev] [PATCHES] clang compatibility
On Sun, 2010-08-22 at 02:35 -0700, nobled wrote: The first three attached patches make it possible to compile Mesa with LLVM/Clang: 1. Add -lstdc++ when linking glsl_compiler and glcpp 2. Move -lstdc++ from the Gallium-specific Makefile.dri to DRI_LIB_DEPS in configure (fixes linking classic Mesa drivers) 3. Since autoconf gives GCC=yes even when using clang (since it just tests for the __GNUC__ macro), don't check for a minimum version of 3.3 if $(CC) points to a clang executable. (Unfortunately I'm not sure how to properly detect clang, short of test-compiling a file that contains #ifdef __clang__. I.e. if $(CC) = 'cc', and 'cc' is an alternatives symlink to llvm-clang, this doesn't detect that case.) The rest are just fixes for compiler warnings: 4. dri: Fix implicit declaration 5. program: Fix struct/class confusion 6. dr/radeon: Fix printf format 7. llvmpipe: Fix memory leak With the first three patches, I can compile Mesa with clang 2.7 in Ubuntu Lucid if I export three variables before configure: export CC=llvm-clang export CXX=llvm-clang export CPPFLAGS=/usr/lib/clang/1.1/include ./configure (Yeah, the third one is really prone to breakage with new versions and I'm still trying to figure out how to not need it; it should also get passed as part of MKDEP_OPTIONS in configure.ac, TBH.) The llvmpipe patch looks correct to me, as draw_destroy doesn't destroy vbuf. But if so, then the interface looks bad to me -- what's the point of putting a destroy callback in the interface if it's never used by the other side of the interface? Perhaps Keith can make the call on what's the right thing to do here. Jose
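On the clang-detection problem raised in item 3: since the only reliable signal is the __clang__ predefined macro, a configure-time probe has to ask the compiler itself rather than inspect $(CC)'s name. A minimal sketch of the idea in C (the function name is invented; a real configure script would compile a probe like this, or a `#ifndef __clang__ / #error` variant, and check the exit status):

```c
/* Returns 1 when this translation unit was compiled by clang, 0 otherwise.
 * Testing __clang__ works even when $(CC) is an alternatives symlink such
 * as cc -> llvm-clang, which name-based checks miss. */
static int built_with_clang(void)
{
#ifdef __clang__
   return 1;
#else
   return 0;
#endif
}
```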
Re: [Mesa-dev] [PATCH] gallivm: Fix and re-enable MMX-disabling code
On Mon, 2010-08-16 at 10:30 -0700, nobled wrote: I think this might be the problem that kept -disable-mmx from working; can somebody test it? (attached) @@ -152,15 +151,13 @@ lp_set_target_options(void) * See also: * - http://llvm.org/bugs/show_bug.cgi?id=3287 * - http://l4.me.uk/post/2009/06/07/llvm-wrinkle-3-configuration-what-configuration/ -* -* XXX: Unfortunately this is not working. */ - static boolean first = FALSE; + static boolean first = TRUE; if (first) { static const char* options[] = { "prog", "-disable-mmx" }; llvm::cl::ParseCommandLineOptions(2, const_cast<char**>(options)); first = FALSE; } Lol. Quite likely! Jose
Re: [Mesa-dev] [PATCH] rtasm/translate_sse: support Win64
On Sun, 2010-08-15 at 09:22 -0700, tom fogal wrote: Luca Barbieri l...@luca-barbieri.com writes: I just discovered that Microsoft wisely decided to use their own calling convention on Win64... Yep. This hasn't actually been tested on Win64 though. Does mingw or cygwin support win64? I don't know if it's just MS that uses a new calling convention, or all win64 code, but if the former this should probably be conditionalized on _MSC_VER as well. MinGW at least does. There are even 64-bit cross compilers packaged in Debian. I haven't had the opportunity to play with it yet, though. Jose
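For context on the two conventions being contrasted: the register assignments below come from the public ABI documents, and the detection sketch is illustrative (the enum and function names are invented). It also answers the _MSC_VER question: the Microsoft ABI is a property of the platform, not of the compiler, so MinGW-w64 uses it too and keying on _MSC_VER alone would be wrong.

```c
/* Microsoft x64 passes the first four integer arguments in RCX, RDX, R8, R9
 * and requires the caller to reserve 32 bytes of "shadow space"; the System V
 * x86-64 ABI (Linux, BSD, etc.) passes the first six in RDI, RSI, RDX, RCX,
 * R8, R9.  A runtime assembler like translate_sse must pick one set when
 * generating code. */
enum call_abi { CALL_ABI_SYSV64, CALL_ABI_MS64 };

static enum call_abi host_call_abi(void)
{
#if defined(_WIN64)
   return CALL_ABI_MS64;    /* defined by both MSVC and MinGW-w64 on Win64 */
#else
   return CALL_ABI_SYSV64;  /* assumed default for this sketch */
#endif
}
```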
Re: [Mesa-dev] Merge of glsl2 branch to master
On Fri, 2010-08-13 at 20:40 -0700, Ian Romanick wrote: Ian Romanick wrote: I propose that we merge master to glsl2 on *Friday, August 13th* (one week from today). Barring unforeseen issues, I propose that we merge glsl2 to master on *Monday, August 16th*. The master-to-glsl2 merge is complete. There don't appear to be any regressions in the glsl2 branch caused by the merge. My plan is to merge glsl2 into master on Monday evening, Pacific time. There are still three build issues with MSVC. All three either have patches or a proposed fix. I've been working on this for almost 12 hours today, so I'm not going to post the combinations of test results that I usually post. Sorry. Ian, Thanks for all your diligence with Windows portability. There are still some issues with MSVC but it's pretty minor stuff (e.g. log2f and exp2f). I can get to it in the next couple of days, so feel free to merge as far as I'm concerned. Jose
Re: [Mesa-dev] talloc (Was: Merge criteria for glsl2 branch)
On Thu, 2010-08-12 at 14:46 -0700, Ian Romanick wrote: José Fonseca wrote: OK. What about this: For GLUT, GLEW, LLVM and all other dependencies I'll just make an SDK with the binaries, with debug and release, 32- and 64-bit, MinGW and MSVC versions. One seldom needs to modify the source anyway, and they have active upstream development. But I perceive talloc as different from all of the above: it's a very low-level and lightweight library, providing very basic functionality, and upstream never showed interest in Windows portability. I'd really prefer to see the talloc source bundled (and only compiled on Windows), as a quick way to have glsl2 merged without causing Windows build failures. This seems like a reasonable compromise. Is this something that you and / or Aras can tackle? I don't have a Windows build system set up, so I wouldn't be able to test any build system changes that I made. I've pushed a new branch glsl2-win32 that includes Aras' patch, and all necessary fixes to get at least MinGW building successfully. I had to rename some tokens in order to avoid collisions with windows.h defines. Aras didn't mention this problem before. Perhaps the indirect windows.h include can be avoided, or you may prefer to handle this some other way. Jose
Re: [Mesa-dev] [PATCH 1/3] u_cpu_detect: make arch and little_endian constants, add abi and x86-64
On Fri, 2010-08-13 at 06:47 -0700, Luca Barbieri wrote: A few related changes: 1. Make x86-64 its own architecture (nothing was using so util_cpu_caps.arch, so nothing can be affected) Just remove util_cpu_caps.arch. It's there simply due to its historical ancestry. We have PIPE_ARCH already. 2. Turn the CPU arch and endianness into macros, so that the compiler can evaluate that at constant time and eliminate dead code Ditto. We have PIPE_ENDIAN or something already. 3. Add util_cpu_abi to know about non-standard ABIs like Win64 That's not really prescribed by the CPU. We have PIPE_OS_* already. There's no merit in duplicating in util_caps what's already provided by p_config.h / p_compiler.h Jose --- src/gallium/auxiliary/gallivm/lp_bld_pack.c |2 +- src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c |2 +- src/gallium/auxiliary/util/u_cpu_detect.c | 19 +- src/gallium/auxiliary/util/u_cpu_detect.h | 39 ++-- 4 files changed, 38 insertions(+), 24 deletions(-) diff --git a/src/gallium/auxiliary/gallivm/lp_bld_pack.c b/src/gallium/auxiliary/gallivm/lp_bld_pack.c index ecfb13a..8ab742a 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_pack.c +++ b/src/gallium/auxiliary/gallivm/lp_bld_pack.c @@ -171,7 +171,7 @@ lp_build_unpack2(LLVMBuilderRef builder, msb = lp_build_zero(src_type); /* Interleave bits */ - if(util_cpu_caps.little_endian) { + if(util_cpu_little_endian) { *dst_lo = lp_build_interleave2(builder, src_type, src, msb, 0); *dst_hi = lp_build_interleave2(builder, src_type, src, msb, 1); } diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c index 3075065..d4b8b4f 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c +++ b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c @@ -1840,7 +1840,7 @@ lp_build_sample_2d_linear_aos(struct lp_build_sample_context *bld, unsigned i, j; for(j = 0; j h16.type.length; j += 4) { - unsigned subindex = util_cpu_caps.little_endian ? 
0 : 1; + unsigned subindex = util_cpu_little_endian ? 0 : 1; LLVMValueRef index; index = LLVMConstInt(elem_type, j/2 + subindex, 0); diff --git a/src/gallium/auxiliary/util/u_cpu_detect.c b/src/gallium/auxiliary/util/u_cpu_detect.c index b1a8c75..73ce146 100644 --- a/src/gallium/auxiliary/util/u_cpu_detect.c +++ b/src/gallium/auxiliary/util/u_cpu_detect.c @@ -391,23 +391,6 @@ util_cpu_detect(void) memset(util_cpu_caps, 0, sizeof util_cpu_caps); - /* Check for arch type */ -#if defined(PIPE_ARCH_MIPS) - util_cpu_caps.arch = UTIL_CPU_ARCH_MIPS; -#elif defined(PIPE_ARCH_ALPHA) - util_cpu_caps.arch = UTIL_CPU_ARCH_ALPHA; -#elif defined(PIPE_ARCH_SPARC) - util_cpu_caps.arch = UTIL_CPU_ARCH_SPARC; -#elif defined(PIPE_ARCH_X86) || defined(PIPE_ARCH_X86_64) - util_cpu_caps.arch = UTIL_CPU_ARCH_X86; - util_cpu_caps.little_endian = 1; -#elif defined(PIPE_ARCH_PPC) - util_cpu_caps.arch = UTIL_CPU_ARCH_POWERPC; - util_cpu_caps.little_endian = 0; -#else - util_cpu_caps.arch = UTIL_CPU_ARCH_UNKNOWN; -#endif - /* Count the number of CPUs in system */ #if defined(PIPE_OS_WINDOWS) { @@ -504,7 +487,7 @@ util_cpu_detect(void) #ifdef DEBUG if (debug_get_option_dump_cpu()) { - debug_printf(util_cpu_caps.arch = %i\n, util_cpu_caps.arch); + debug_printf(util_cpu_caps.arch = %i\n, util_cpu_arch); debug_printf(util_cpu_caps.nr_cpus = %u\n, util_cpu_caps.nr_cpus); debug_printf(util_cpu_caps.x86_cpu_type = %u\n, util_cpu_caps.x86_cpu_type); diff --git a/src/gallium/auxiliary/util/u_cpu_detect.h b/src/gallium/auxiliary/util/u_cpu_detect.h index 4b3dc39..e81e4b5 100644 --- a/src/gallium/auxiliary/util/u_cpu_detect.h +++ b/src/gallium/auxiliary/util/u_cpu_detect.h @@ -36,6 +36,7 @@ #define _UTIL_CPU_DETECT_H #include pipe/p_compiler.h +#include pipe/p_config.h enum util_cpu_arch { UTIL_CPU_ARCH_UNKNOWN = 0, @@ -43,19 +44,49 @@ enum util_cpu_arch { UTIL_CPU_ARCH_ALPHA, UTIL_CPU_ARCH_SPARC, UTIL_CPU_ARCH_X86, - UTIL_CPU_ARCH_POWERPC + UTIL_CPU_ARCH_X86_64, + UTIL_CPU_ARCH_POWERPC, + + /* 
non-standard ABIs, only used in util_cpu_abi */ + UTIL_CPU_ABI_WIN64 }; +/* Check for arch type */ +#if defined(PIPE_ARCH_MIPS) +#define util_cpu_arch UTIL_CPU_ARCH_MIPS +#elif defined(PIPE_ARCH_ALPHA) +#define util_cpu_arch UTIL_CPU_ARCH_ALPHA +#elif defined(PIPE_ARCH_SPARC) +#define util_cpu_arch UTIL_CPU_ARCH_SPARC +#elif defined(PIPE_ARCH_X86) +#define util_cpu_arch UTIL_CPU_ARCH_X86 +#elif defined(PIPE_ARCH_X86_64) +#define util_cpu_arch UTIL_CPU_ARCH_X86_64 +#elif defined(PIPE_ARCH_PPC) +#define util_cpu_arch UTIL_CPU_ARCH_POWERPC +#else +#define util_cpu_arch UTIL_CPU_ARCH_UNKNOWN +#endif + +#ifdef WIN64 +#define
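Setting the truncated diff aside, the patch's core idea — turning endianness into a compile-time constant so the compiler can delete the dead branch — can be sketched as follows. The macro name and the predicate set are simplified assumptions for this sketch, not the patch's exact code:

```c
/* Compile-time endianness constant: with a macro instead of a runtime
 * struct field, an "x = UTIL_CPU_LITTLE_ENDIAN ? a : b" expression folds
 * to a constant and the dead alternative is eliminated entirely. */
#if defined(__x86_64__) || defined(__i386__) || defined(_M_X64) || defined(_M_IX86)
#define UTIL_CPU_LITTLE_ENDIAN 1
#elif defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
#define UTIL_CPU_LITTLE_ENDIAN 1
#elif defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
#define UTIL_CPU_LITTLE_ENDIAN 0
#else
#error "unknown endianness for this sketch"
#endif

/* Mirrors the lp_build_sample_soa.c hunk in the patch: the subindex
 * becomes a compile-time constant. */
static unsigned interleave_subindex(void)
{
   return UTIL_CPU_LITTLE_ENDIAN ? 0 : 1;
}
```

As José notes in the reply, Gallium already exposes the same information as PIPE_ARCH_LITTLE_ENDIAN / PIPE_ARCH_BIG_ENDIAN in p_config.h, which is why the patch's new macros were considered redundant.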
Re: [Mesa-dev] [PATCH 2/3] rtasm: add minimal x86-64 support and new instructions (v2)
Luca, This is great stuff. But one request: if Win64 is untested, please make sure it is disabled by default until somebody had opportunity to test it. Unfortunately I'm really busy with other stuff ATM and don't have the time. Jose On Fri, 2010-08-13 at 06:47 -0700, Luca Barbieri wrote: Changes in v2: - Win64 support (untested) - Use u_cpu_detect.h constants instead of #ifs This commit adds minimal x86-64 support: only movs between registers are supported for r8-r15, and x64_rexw() must be used to ask for 64-bit operations. It also adds several new instructions for the new translate_sse code. --- src/gallium/auxiliary/rtasm/rtasm_cpu.c|6 +- src/gallium/auxiliary/rtasm/rtasm_x86sse.c | 455 ++-- src/gallium/auxiliary/rtasm/rtasm_x86sse.h | 69 - 3 files changed, 493 insertions(+), 37 deletions(-) diff --git a/src/gallium/auxiliary/rtasm/rtasm_cpu.c b/src/gallium/auxiliary/rtasm/rtasm_cpu.c index 2e15751..0461c81 100644 --- a/src/gallium/auxiliary/rtasm/rtasm_cpu.c +++ b/src/gallium/auxiliary/rtasm/rtasm_cpu.c @@ -30,7 +30,7 @@ #include rtasm_cpu.h -#if defined(PIPE_ARCH_X86) +#if defined(PIPE_ARCH_X86) || defined(PIPE_ARCH_X86_64) static boolean rtasm_sse_enabled(void) { static boolean firsttime = 1; @@ -49,7 +49,7 @@ static boolean rtasm_sse_enabled(void) int rtasm_cpu_has_sse(void) { /* FIXME: actually detect this at run-time */ -#if defined(PIPE_ARCH_X86) +#if defined(PIPE_ARCH_X86) || defined(PIPE_ARCH_X86_64) return rtasm_sse_enabled(); #else return 0; @@ -59,7 +59,7 @@ int rtasm_cpu_has_sse(void) int rtasm_cpu_has_sse2(void) { /* FIXME: actually detect this at run-time */ -#if defined(PIPE_ARCH_X86) +#if defined(PIPE_ARCH_X86) || defined(PIPE_ARCH_X86_64) return rtasm_sse_enabled(); #else return 0; diff --git a/src/gallium/auxiliary/rtasm/rtasm_x86sse.c b/src/gallium/auxiliary/rtasm/rtasm_x86sse.c index 63007c1..88b182b 100644 --- a/src/gallium/auxiliary/rtasm/rtasm_x86sse.c +++ b/src/gallium/auxiliary/rtasm/rtasm_x86sse.c @@ -22,8 +22,9 @@ **/ #include 
pipe/p_config.h +#include util/u_cpu_detect.h -#if defined(PIPE_ARCH_X86) +#if defined(PIPE_ARCH_X86) || defined(PIPE_ARCH_X86_64) #include pipe/p_compiler.h #include util/u_debug.h @@ -231,6 +232,10 @@ static void emit_modrm( struct x86_function *p, assert(reg.mod == mod_REG); + /* TODO: support extended x86-64 registers */ + assert(reg.idx 8); + assert(regmem.idx 8); + val |= regmem.mod 6; /* mod field */ val |= reg.idx 3;/* reg field */ val |= regmem.idx; /* r/m field */ @@ -363,6 +368,12 @@ int x86_get_label( struct x86_function *p ) */ +void x64_rexw(struct x86_function *p) +{ + if(util_cpu_arch == UTIL_CPU_ARCH_X86_64) + emit_1ub(p, 0x48); +} + void x86_jcc( struct x86_function *p, enum x86_cc cc, int label ) @@ -449,6 +460,52 @@ void x86_mov_reg_imm( struct x86_function *p, struct x86_reg dst, int imm ) emit_1i(p, imm); } +void x86_mov_imm( struct x86_function *p, struct x86_reg dst, int imm ) +{ + DUMP_RI( dst, imm ); + if(dst.mod == mod_REG) + x86_mov_reg_imm(p, dst, imm); + else + { + emit_1ub(p, 0xc7); + emit_modrm_noreg(p, 0, dst); + emit_1i(p, imm); + } +} + +void x86_mov16_imm( struct x86_function *p, struct x86_reg dst, uint16_t imm ) +{ + DUMP_RI( dst, imm ); + emit_1ub(p, 0x66); + if(dst.mod == mod_REG) + { + emit_1ub(p, 0xb8 + dst.idx); + emit_2ub(p, imm 0xff, imm 8); + } + else + { + emit_1ub(p, 0xc7); + emit_modrm_noreg(p, 0, dst); + emit_2ub(p, imm 0xff, imm 8); + } +} + +void x86_mov8_imm( struct x86_function *p, struct x86_reg dst, uint8_t imm ) +{ + DUMP_RI( dst, imm ); + if(dst.mod == mod_REG) + { + emit_1ub(p, 0xb0 + dst.idx); + emit_1ub(p, imm); + } + else + { + emit_1ub(p, 0xc6); + emit_modrm_noreg(p, 0, dst); + emit_1ub(p, imm); + } +} + /** * Immediate group 1 instructions. 
*/ @@ -520,7 +577,7 @@ void x86_push( struct x86_function *p, } - p-stack_offset += 4; + p-stack_offset += sizeof(void*); } void x86_push_imm32( struct x86_function *p, @@ -530,7 +587,7 @@ void x86_push_imm32( struct x86_function *p, emit_1ub(p, 0x68); emit_1i(p, imm32); - p-stack_offset += 4; + p-stack_offset += sizeof(void*); } @@ -540,23 +597,33 @@ void x86_pop( struct x86_function *p, DUMP_R( reg ); assert(reg.mod == mod_REG); emit_1ub(p, 0x58 + reg.idx); - p-stack_offset -= 4; + p-stack_offset -= sizeof(void*); } void x86_inc( struct x86_function *p, struct x86_reg reg ) { DUMP_R( reg ); - assert(reg.mod ==
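For context on what the patch's x64_rexw() helper emits: x86-64 instructions reach 64-bit operand sizes and the extended r8–r15 registers through a REX prefix byte. A sketch of the encoding (the helper name is invented; the bit layout is from the public Intel/AMD instruction-set manuals):

```c
#include <stdint.h>

/* REX prefix: 0100WRXB.  W selects a 64-bit operand size; R, X and B are
 * the high bits extending the modrm reg field, the SIB index, and the
 * modrm r/m (or opcode register) field to address r8-r15.  The patch's
 * x64_rexw() emits the W-only form, 0x48, before 64-bit moves. */
static uint8_t rex_prefix(int w, int r, int x, int b)
{
   return (uint8_t)(0x40 | (w << 3) | (r << 2) | (x << 1) | b);
}
```

This also explains the `assert(reg.idx < 8)` TODOs in the patch: encoding r8–r15 as operands requires setting the R/X/B bits, which the minimal version does not yet do.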
Re: [Mesa-dev] [PATCH 1/3] u_cpu_detect: make arch and little_endian constants, add abi and x86-64
On Fri, 2010-08-13 at 21:41 +0100, José Fonseca wrote: On Fri, 2010-08-13 at 06:47 -0700, Luca Barbieri wrote: A few related changes: 1. Make x86-64 its own architecture (nothing was using so util_cpu_caps.arch, so nothing can be affected) Just remove util_cpu_caps.arch. It's there simply due to its historical ancestry. We have PIPE_ARCH already. 2. Turn the CPU arch and endianness into macros, so that the compiler can evaluate that at constant time and eliminate dead code Ditto. We have PIPE_ENDIAN or something already. From p_config.h: /* * Endian detection. */ #if defined(PIPE_ARCH_X86) || defined(PIPE_ARCH_X86_64) #define PIPE_ARCH_LITTLE_ENDIAN #elif defined(PIPE_ARCH_PPC) || defined(PIPE_ARCH_PPC_64) #define PIPE_ARCH_BIG_ENDIAN #else #define PIPE_ARCH_UNKNOWN_ENDIAN #endif Basically, in my perspective, util_cpu_caps should *only* have the stuff that can vary at run time. Everything else should be macros in p_config.h/p_compiler.h. The rest of the patches in the series look OK to me. Jose 3. Add util_cpu_abi to know about non-standard ABIs like Win64 That's not really prescribed by the CPU. We have PIPE_OS_* already. 
There's no merit in duplicating in util_caps what's already provided by p_config.h / p_compiler.h Jose --- src/gallium/auxiliary/gallivm/lp_bld_pack.c |2 +- src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c |2 +- src/gallium/auxiliary/util/u_cpu_detect.c | 19 +- src/gallium/auxiliary/util/u_cpu_detect.h | 39 ++-- 4 files changed, 38 insertions(+), 24 deletions(-) diff --git a/src/gallium/auxiliary/gallivm/lp_bld_pack.c b/src/gallium/auxiliary/gallivm/lp_bld_pack.c index ecfb13a..8ab742a 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_pack.c +++ b/src/gallium/auxiliary/gallivm/lp_bld_pack.c @@ -171,7 +171,7 @@ lp_build_unpack2(LLVMBuilderRef builder, msb = lp_build_zero(src_type); /* Interleave bits */ - if(util_cpu_caps.little_endian) { + if(util_cpu_little_endian) { *dst_lo = lp_build_interleave2(builder, src_type, src, msb, 0); *dst_hi = lp_build_interleave2(builder, src_type, src, msb, 1); } diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c index 3075065..d4b8b4f 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c +++ b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c @@ -1840,7 +1840,7 @@ lp_build_sample_2d_linear_aos(struct lp_build_sample_context *bld, unsigned i, j; for(j = 0; j h16.type.length; j += 4) { - unsigned subindex = util_cpu_caps.little_endian ? 0 : 1; + unsigned subindex = util_cpu_little_endian ? 
0 : 1; LLVMValueRef index; index = LLVMConstInt(elem_type, j/2 + subindex, 0); diff --git a/src/gallium/auxiliary/util/u_cpu_detect.c b/src/gallium/auxiliary/util/u_cpu_detect.c index b1a8c75..73ce146 100644 --- a/src/gallium/auxiliary/util/u_cpu_detect.c +++ b/src/gallium/auxiliary/util/u_cpu_detect.c @@ -391,23 +391,6 @@ util_cpu_detect(void) memset(util_cpu_caps, 0, sizeof util_cpu_caps); - /* Check for arch type */ -#if defined(PIPE_ARCH_MIPS) - util_cpu_caps.arch = UTIL_CPU_ARCH_MIPS; -#elif defined(PIPE_ARCH_ALPHA) - util_cpu_caps.arch = UTIL_CPU_ARCH_ALPHA; -#elif defined(PIPE_ARCH_SPARC) - util_cpu_caps.arch = UTIL_CPU_ARCH_SPARC; -#elif defined(PIPE_ARCH_X86) || defined(PIPE_ARCH_X86_64) - util_cpu_caps.arch = UTIL_CPU_ARCH_X86; - util_cpu_caps.little_endian = 1; -#elif defined(PIPE_ARCH_PPC) - util_cpu_caps.arch = UTIL_CPU_ARCH_POWERPC; - util_cpu_caps.little_endian = 0; -#else - util_cpu_caps.arch = UTIL_CPU_ARCH_UNKNOWN; -#endif - /* Count the number of CPUs in system */ #if defined(PIPE_OS_WINDOWS) { @@ -504,7 +487,7 @@ util_cpu_detect(void) #ifdef DEBUG if (debug_get_option_dump_cpu()) { - debug_printf(util_cpu_caps.arch = %i\n, util_cpu_caps.arch); + debug_printf(util_cpu_caps.arch = %i\n, util_cpu_arch); debug_printf(util_cpu_caps.nr_cpus = %u\n, util_cpu_caps.nr_cpus); debug_printf(util_cpu_caps.x86_cpu_type = %u\n, util_cpu_caps.x86_cpu_type); diff --git a/src/gallium/auxiliary/util/u_cpu_detect.h b/src/gallium/auxiliary/util/u_cpu_detect.h index 4b3dc39..e81e4b5 100644 --- a/src/gallium/auxiliary/util/u_cpu_detect.h +++ b/src/gallium/auxiliary/util/u_cpu_detect.h @@ -36,6 +36,7 @@ #define _UTIL_CPU_DETECT_H #include pipe/p_compiler.h +#include pipe/p_config.h enum util_cpu_arch { UTIL_CPU_ARCH_UNKNOWN = 0, @@ -43,19 +44,49 @@ enum util_cpu_arch { UTIL_CPU_ARCH_ALPHA, UTIL_CPU_ARCH_SPARC, UTIL_CPU_ARCH_X86, - UTIL_CPU_ARCH_POWERPC + UTIL_CPU_ARCH_X86_64, + UTIL_CPU_ARCH_POWERPC
Re: [Mesa-dev] talloc (Was: Merge criteria for glsl2 branch)
On Wed, 2010-08-11 at 17:40 -0700, Dave Airlie wrote: On Thu, Aug 12, 2010 at 9:42 AM, José Fonseca jfons...@vmware.com wrote: On Wed, 2010-08-11 at 12:52 -0700, Ian Romanick wrote: José Fonseca wrote: Could then Aras Pranckevicius's talloc port to Windows be merged into the glsl2 branch before glsl2 is merged into master? I think we learned our lesson with GLEW. Trying to keep a copy of an external dependency in our tree only leads to sadness. I have no intention to repeat that mistake. Having the GLEW source was quite useful for Windows. I guess the lesson would not be to stop shipping the source, but to only build it if necessary. If the talloc source is not needed for Linux then we can just build it when absent. You can even ignore it from automake. Can't you guys set up a separate Windows deps repo? That you gather all the prereqs for building on Windows in one place and recommend people check it out before Mesa? Unfortunately there is no standard way to install headers and libraries on Windows -- there's no /usr/include or /usr/lib, and there might be multiple MSVC versions. There are ways around it, sure. Might be worth giving it a shot eventually. Really optimising for the wrong bunch of people here by dragging this stuff into mesa git. Many projects do this: they include the source of other projects, to make it easier to build without having to build all the dependencies. If this is still bothersome, I can make sure it's only built on non-Unix platforms. We can even put it in src/thirdparty/... or somewhere else out of the way. I really can't see how this causes inconvenience to anybody. Furthermore, I really don't know any wrong bunch of people here. But I do know at least one prejudiced person with delusions of grandeur. Jose
Re: [Mesa-dev] talloc (Was: Merge criteria for glsl2 branch)
On Thu, 2010-08-12 at 04:10 -0700, Dave Airlie wrote: On Thu, Aug 12, 2010 at 9:00 PM, José Fonseca jfons...@vmware.com wrote: On Wed, 2010-08-11 at 17:40 -0700, Dave Airlie wrote: On Thu, Aug 12, 2010 at 9:42 AM, José Fonseca jfons...@vmware.com wrote: On Wed, 2010-08-11 at 12:52 -0700, Ian Romanick wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 José Fonseca wrote: Could then Aras Pranckevicius's talloc port to windows be merged into glsl2 branch before glsl2 is merged into master? I think we learned our lesson with GLEW. Trying to keep a copy of an external dependency in our tree only leads to sadness. I have no intention to repeat that mistake. Having the GLEW source was quite useful for windows. I guess the lesson would not be not ship source, but only build it if necessary. If talloc source is not needed for Linux then we can just build it when absent. You can even ignore it from automake. Can't you guys setup a separate windows deps repo? that you gather all the prereqs for building on Windows in one place and recommend people check it out before mesa? Unfortunately there is no standard way to install headers and libraries in windows -- there's no /usr/include or /usr/lib, and there might be multiple MSVC versions. There are ways around it, sure. Might be worth giving a shot eventually. Really optimising for the wrong bunch of people here by dragging this stuff into mesa git. Many projects do this: they include the source of other projects, to make it easier to build without having to build all dependencies. We spend a lot of time unbundling stuff in projects because its a really bad model, esp for things like security updates. What happens if the version of talloc we ship ends up a with a problem and nobody is tracking upstream and notices the issue, this doesn't scale, at some point you have to draw a line, again you have 0 experience with packaging mesa for its main purpose and use case, so excuse me if I dismiss the rest of your pithy comments. 
Good points, but I still don't see how my proposal of not building them on Linux does not address them. talloc is a bit different than glew, I can accept that, since glew isn't a mesa build req, but should we start shipping bison, flex etc, is it only build reqs or runtime reqs? no. but that's why we commit the bison and flex output in the repository, for example. it would be nice if instead of being an ass you perhaps helped draw up some guidelines with all the other Windows folks. my feelings exactly when I read the last sentence of your previous email. in hindsight perhaps you meant distributions + end users vs developers, but I understood it as windows developers == bunch of wrong people, and linux developers == bunch of good people. who cares. there isn't a wrong bunch of people regardless. Jose
Re: [Mesa-dev] talloc (Was: Merge criteria for glsl2 branch)
On Thu, 2010-08-12 at 07:58 -0700, Matt Turner wrote: On Thu, Aug 12, 2010 at 7:00 AM, José Fonseca jfons...@vmware.com wrote: Really optimising for the wrong bunch of people here by dragging this stuff into mesa git. Many projects do this: they include the source of other projects, to make it easier to build without having to build all the dependencies. This is true, but it's also quite annoying. Take ioquake3 for example: in order to make it simpler for Windows and Mac OS developers, they include in SVN things like the libSDL headers, OpenAL headers, static libraries for OpenAL, libSDL, and libcurl, and the sources for libspeex, libcurl, and zlib. So the Linux packagers have to jump through hoops trying to untangle this mess. What Dave's saying with "optimizing for the wrong people" is that we're including lots of things in the Mesa code that most people (most users of Mesa are Linux/BSD/UNIX users) don't need, and by including these things with Mesa, we're making it more difficult to package Mesa for most people. Frankly, it's also a bit hard to empathize with any "this makes Windows development harder" arguments when we don't have the code that you're building on Windows. But I digress... How come? All the code to build softpipe/llvmpipe on Windows is checked into Mesa (at least until the talloc dependency is added). Read http://www.mesa3d.org/install.html#scons for softpipe instructions, and http://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/llvmpipe/README for llvmpipe. It gives you a software implementation of OpenGL way better than the one that comes bundled with Windows. Surely moving external build dependencies to another git repository would satisfy everyone. On another note, Dave might sound annoyed in these emails, but if he's like me, it's because he's read lots of emails about problems caused by Windows that, again, most people don't care about, and has held back on saying anything until now.
It is frustrating to see things like merging glsl2 held up because of Windows. That's short sighted. Windows has a lot of weight economically, and Gallium is only where it is today because it tapped into that. So you focus on the negative aspects of supporting Windows, but you ignore all the goodness that came from (in particular all the QA and bugfixing done with windows apps, that are far more complex than most stuff available for linux, in particular games CAD apps). Anyway, if you turn this into a Windows vs Linux battle I'm bound to loose here. Especially because I don't even use it personally -- for me it's no different than another embedded platform for which I write drivers and debug remotely from my Linux development machine. It's just that I don't like to shit where I eat. Jose ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] talloc (Was: Merge criteria for glsl2 branch)
On Thu, 2010-08-12 at 08:33 -0700, Matt Turner wrote:
> On Thu, Aug 12, 2010 at 11:27 AM, José Fonseca <jfons...@vmware.com> wrote:
> > [...]
>
> That's not really my intention. I think we can easily end the whole
> discussion just by moving the build dependencies that are used for
> Windows to a separate repository. It shouldn't make things more
> difficult for the VMware guys (I wouldn't think, at least) and it would
> make things cleaner for the Linux guys.

OK. What about this: for GLUT, GLEW, LLVM, and all the other
dependencies I'll just make an SDK with the binaries, in debug and
release, 32- and 64-bit, MinGW and MSVC versions. One seldom needs to
modify their source anyway, and they have active upstream development.

But I see talloc as different from all of the above: it's a very
low-level, lightweight library providing very basic functionality, and
upstream has never shown interest in Windows portability.
I'd really prefer to see the talloc source bundled (and only compiled on
Windows), as a quick way to have glsl2 merged without causing Windows
build failures. (Hopefully in the future we could have Jakob Bornecrantz
or Aras Pranckevičius re-implement a BSD-licensed version of it, thereby
eliminating all the licensing concerns.)

Jose
Re: [Mesa-dev] talloc (Was: Merge criteria for glsl2 branch)
On Tue, 2010-08-10 at 22:57 -0700, Aras Pranckevicius wrote:
> > > No, it's missing most of the API that talloc provides. Also,
> > > http://github.com/aras-p/glsl-optimizer/ ported it to windows.
> >
> > Could then Aras Pranckevicius's talloc port to windows be merged into
> > the glsl2 branch before glsl2 is merged into master?
>
> First things first: I needed to make it compile and work quickly, so I
> just dropped talloc.h and talloc.c into my project
> (http://github.com/aras-p/glsl-optimizer/tree/glsl2/src/glsl/msvc/talloc/)
> and made some fixes. So I wouldn't call it a full port, but apparently
> it was enough, without taking everything that is in the official talloc
> tarball. However, I'm not really sure if that is legal license-wise
> (talloc is LGPLv3).

I'm not a lawyer, but my understanding is that if you copy the license
files too, then it should be all right.

I'd personally prefer src/talloc instead of src/glsl/msvc/talloc/.

> That said, the only fixes I had to do are Visual C++ specific (I'm
> using VS2008):
>
> * Compile talloc.c as C++ (otherwise 'inline' etc. are errors on
>   MSVC). This required wrapping talloc.h and talloc.c in extern "C"
>   blocks:
>   http://github.com/aras-p/glsl-optimizer/commit/ceee99ebe0c606de6ed093c2aec20f8ecae5b673

Using C++ is overkill. The solution is simply to use the MSVC inline
keyword (__inline, or something similar; see p_compiler.h).

> * vsnprintf can't be used to determine the output buffer size; under
>   MSVC _vscprintf has to be used instead.
>   http://github.com/aras-p/glsl-optimizer/commit/56bb0c7e7cedefcd2d149011a0b644551e080b9a
>
> * Replaced usage of MIN with TALLOC_MIN, and defined TALLOC_MIN:
>   http://github.com/aras-p/glsl-optimizer/commit/db96499fbf874582b81dabedebc835c950520211
>   (there's an accidental line of unrelated change in that commit)
>
> Compiling on Mac with Xcode 3.2 (using gcc 4.0) required the MIN to
> TALLOC_MIN change and compiling as C++ with extern "C" blocks as well.

Could you please prepare a patch?
Jose
Re: [Mesa-dev] Mesa (master): translate_generic: return NULL instead of assert(0) if format not supported
On Wed, 2010-08-11 at 07:59 -0700, Luca Barbieri wrote:
> I assumed that the fact it would crash on a debug build would make the
> behavior safe to change. Even if it doesn't crash, it is anyway
> working incorrectly.

Assert failures mean that something the developers didn't anticipate did
happen. But release code should still continue past the assert and do
something sensible, like ignoring the unsupported case. That is, it
should be resilient against the unexpected. At least this is my
perspective.

> I'll revert it until further notice.

Thanks.

Jose
Re: [Mesa-dev] talloc (Was: Merge criteria for glsl2 branch)
On Wed, 2010-08-11 at 12:52 -0700, Ian Romanick wrote:
> José Fonseca wrote:
> > Could then Aras Pranckevicius's talloc port to windows be merged into
> > the glsl2 branch before glsl2 is merged into master?
>
> I think we learned our lesson with GLEW. Trying to keep a copy of an
> external dependency in our tree only leads to sadness. I have no
> intention to repeat that mistake.

Having the GLEW source was quite useful for Windows. I'd say the lesson
is not "don't ship the source", but "only build it when necessary". If
the talloc source is not needed for Linux, then we can just build it
when it is absent from the system. You can even ignore it from automake.

> I suspect there may also be some issue with including a piece of
> software with such a different license in our tree. I'm not a lawyer,
> so I may be unnecessarily paranoid here. *shrug*

I'm not a lawyer either, but this is paranoid IMHO. How the source of
multiple open-source works is laid out or packaged is seldom prescribed
by open-source licenses. It's all about how multiple works are linked
into binary form and distributed. LGPLv3 is quite clear on this matter
too, per http://www.gnu.org/licenses/lgpl.html, point 4, "Combined
Works", and point 5, "Combined Libraries".

If LGPLv3 does indeed put us in muddy waters, then the talloc dependency
should never be added; keeping its source out of Mesa alters nothing. So
let's be consistent here. FWIW, the talloc dependency is not worth its
salt.

Jose
Re: [Mesa-dev] talloc (Was: Merge criteria for glsl2 branch)
On Wed, 2010-08-11 at 13:18 -0700, Aras Pranckevicius wrote:
> > > Could then Aras Pranckevicius's talloc port to windows be merged
> > > into the glsl2 branch before glsl2 is merged into master?
> >
> > I think we learned our lesson with GLEW. Trying to keep a copy of an
> > external dependency in our tree only leads to sadness. I have no
> > intention to repeat that mistake. I suspect there may also be some
> > issue with including a piece of software with such a different
> > license in our tree. I'm not a lawyer, so I may be unnecessarily
> > paranoid here. *shrug*
>
> Another option I was considering (mostly for my own needs; I need to
> use the GLSL2 fork in a closed-source product) is a from-scratch
> implementation of talloc that keeps the same interface. Similar to
> what the Mono folks did with glib (they wrote their own eglib that
> matched the license and was much smaller as a result).

That's a good idea. Jakob was also considering adding the missing bits
to http://swapped.cc/halloc/ to make it compatible with talloc at the
source level.

> In my case, talloc's LGPL is quite a hassle because I have to build
> talloc dlls/dylibs, which complicates deployment, packaging, etc. I
> haven't had time to do that yet, and probably won't have in the next
> month or two though :( talloc is not very large; it looks like just
> taking one .h and one .c file is enough. And there are quite a few
> functions that GLSL2 never uses.

I really can't condone the glsl2 branch merge until we have some
consensus on how the talloc dependency should be handled, Windows or
not.

Jose
Re: [Mesa-dev] talloc (Was: Merge criteria for glsl2 branch)
On Sun, 2010-08-01 at 10:19 -0700, Eric Anholt wrote:
> On Tue, 27 Jul 2010 21:32:57 +0100, José Fonseca <jfons...@vmware.com> wrote:
> > On Wed, 2010-07-21 at 18:53 -0700, Ian Romanick wrote:
> > > As everyone knows, a group of us at Intel have been rewriting
> > > Mesa's GLSL compiler. The work started out-of-tree as a stand-alone
> > > compiler. We moved all of our work to the glsl2 branch in the Mesa
> > > tree as soon as we had some actual code being generated. This was
> > > about a month ago. Since that time we have implemented quite a bit
> > > more code generation and a complete linker. The compiler is not
> > > done, but it gets closer every day.
> > >
> > > I think now is the time to start discussing merge criteria. It is
> > > well known that the Intel graphics team favors quarterly releases.
> > > In order to align with other people's release schedules, we'd
> > > really like to have the new compiler in a Mesa release at the end
> > > of Q3 (end of September / beginning of October). That's just over
> > > two months from now. In order to have a credible release, the
> > > compiler needs to be merged to master before then. A reasonable
> > > estimate puts the end of August as the latest possible merge time.
> > > Given how far the compiler has come in the last month, I have a lot
> > > of faith in being able to hit that target.
> > >
> > > We have developed our own set of merge requirements, and these are
> > > listed below. Since this is such a large subsystem, we want to
> > > solicit input from the other stakeholders.
> > >
> > > * No piglit regressions, except draw_buffers-05.vert, compared to
> > >   master in swrast, i965, or i915.
> > > * Any failing tests do not crash (segfault, assertion failure,
> > >   etc.).
> > >
> > > draw_buffers-05.vert is specifically excluded because I'm not sure
> > > the test is correct. One of the items on the TODO list is proper
> > > support for GLSL ES. That work won't happen until the last couple
> > > weeks of August, so I don't think any sort of ES testing is a
> > > suitable merge criterion. Unless, of course, the tests in question
> > > should also work on desktop GLSL.
> > >
> > > We haven't and, for all practical purposes, can't test with Gallium
> > > or other hardware drivers. The new compiler generates the same
> > > assembly IR that the current compiler generates. We haven't added
> > > any instructions. It does, however, generate different combinations
> > > of instructions, different uses of registers, and so on. We don't
> > > expect there to be significant impact on other hardware, but there
> > > may be some. The optimizer in r300 will certainly see different
> > > sequences of instructions than it is accustomed to seeing. I can't
> > > foresee what impact this will have on that driver.
> > >
> > > I have put up the results of master and the glsl2 branch at
> > > http://people.freedesktop.org/~idr/results/. This was run off
> > > compiler.tests from the glsl2 branch of Eric's piglit repository. A
> > > few of the failures (half a dozen or so) on master seem to be mutex
> > > related in the GLX code. Kristian is working on fixing that. :)
> > > We're already very close to our proposed merge criteria.
> >
> > I was recently made aware that glsl2 adds a hard dependency on the
> > talloc library. Is this correct? It doesn't seem that talloc has ever
> > been ported to Windows or MSVC, although it seems small. There's also
> > the fact that it's a dependency with a very different license from
> > Mesa's (LGPLv3). This is not an obstacle in itself, but it makes it
> > harder to simply bundle the code into Mesa and port it ourselves. At
> > first glance it seems that talloc gives a tad more trouble than it's
> > worth. Did you guys investigate other alternatives? talloc.c mentions
> > it was inspired by http://swapped.cc/halloc/, which is BSD licensed
> > and seems trivial to port to MSVC. Would that also fit your needs?
>
> No, it's missing most of the API that talloc provides. Also,
> http://github.com/aras-p/glsl-optimizer/ ported it to windows.

Could then Aras Pranckevicius's talloc port to Windows be merged into
the glsl2 branch before glsl2 is merged into master?
Jose