Re: [Mesa3d-dev] RFC: allow resource_copy_region between different (yet compatible) formats
On 06.09.2010 15:57, José Fonseca wrote:

I'd like to know if there's any objection to changing the resource_copy_region semantics to allow copies between different yet compatible formats, where the definition of compatible formats is: formats for which copying the bytes from the source resource unmodified to the destination resource will achieve the same effect as a textured quad blitter.

There is a helper function, util_is_format_compatible(), to help make this decision. These are the non-trivial conversions that this function currently recognizes (the list was produced by u_format_compatible_test.c):

  b8g8r8a8_unorm - b8g8r8x8_unorm
  a8r8g8b8_unorm - x8r8g8b8_unorm
  b5g5r5a1_unorm - b5g5r5x1_unorm
  b4g4r4a4_unorm - b4g4r4x4_unorm
  l8_unorm - r8_unorm
  i8_unorm - l8_unorm
  i8_unorm - a8_unorm
  i8_unorm - r8_unorm
  l16_unorm - r16_unorm
  z24_unorm_s8_uscaled - z24x8_unorm
  s8_uscaled_z24_unorm - x8z24_unorm
  r8g8b8a8_unorm - r8g8b8x8_unorm
  a8b8g8r8_srgb - x8b8g8r8_srgb
  b8g8r8a8_srgb - b8g8r8x8_srgb
  a8r8g8b8_srgb - x8r8g8b8_srgb
  a8b8g8r8_unorm - x8b8g8r8_unorm
  r10g10b10a2_uscaled - r10g10b10x2_uscaled
  r10sg10sb10sa2u_norm - r10g10b10x2_snorm

Note that format compatibility is not commutative.

For software drivers this means that memcpy/util_copy_rect() will achieve the correct result. For hardware drivers this means that a VRAM-VRAM 2D blit engine will also achieve the correct result. So I'd expect no implementation change of resource_copy_region() for any driver, AFAICT. But I'd like to be sure.

Jose

José, this looks good to me. Note that the analogous function in d3d10, ResourceCopyRegion, only requires formats to be in the same typeless group (hence the same number of bits for all components), which is certainly a broader set of compatible formats than what util_is_format_compatible() produces. As far as I can tell, no conversion happens at all in d3d10; it is just like memcpy.
I think we might want to support that in the future as well, but for now extending this to the formats you listed certainly sounds ok. Roland -- This SF.net Dev2Dev email is sponsored by: Show off your parallel programming skills. Enter the Intel(R) Threading Challenge 2010. http://p.sf.net/sfu/intel-thread-sfd ___ Mesa3d-dev mailing list Mesa3d-dev@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/mesa3d-dev
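The compatibility rule described in the thread above - a bitwise copy must behave like a textured-quad blit, so the destination may only drop channels, never gain them - can be sketched as a toy check in C. This is an illustrative model only, with made-up names; it is not the actual util_is_format_compatible() implementation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <assert.h>

/* Toy model -- NOT the real util_is_format_compatible(). A format is
 * modeled as four channels, each with a bit width and a flag saying
 * whether it carries data (used == false models an "x" padding channel). */
struct fmt_channel { unsigned bits; bool used; };
struct fmt { struct fmt_channel chan[4]; };

/* A bitwise copy behaves like a textured-quad blit when the bit layout
 * matches exactly and the destination only ever *drops* information:
 * a used source channel may land in an unused destination channel, but
 * an unused source channel must never feed a used destination channel.
 * This asymmetry is also why the relation is not commutative. */
static bool fmt_compatible(const struct fmt *src, const struct fmt *dst)
{
   for (size_t i = 0; i < 4; i++) {
      if (src->chan[i].bits != dst->chan[i].bits)
         return false;   /* channel sizes/positions must line up */
      if (dst->chan[i].used && !src->chan[i].used)
         return false;   /* dst would interpret padding as data */
   }
   return true;
}
```

Under this model, for example, b8g8r8a8_unorm to b8g8r8x8_unorm passes (alpha is dropped) while the reverse fails, matching the non-commutativity noted in the mail.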
Re: [Mesa3d-dev] RFC: allow resource_copy_region between different (yet compatible) formats
On 06.09.2010 17:16, Luca Barbieri wrote:

On Mon, Sep 6, 2010 at 3:57 PM, José Fonseca jfons...@vmware.com wrote: I'd like to know if there's any objection to changing the resource_copy_region semantics to allow copies between different yet compatible formats, where the definition of compatible formats is:

I was about to propose something like this. How about a much more powerful change, though, that would make any pair of non-blocked formats of the same bit depth compatible? This way you could copy z24s8 to r8g8b8a8, for instance.

I am not sure this makes a lot of sense. There's no guarantee the bit layout of these is even remotely similar (and it likely won't be on any decent hardware). I think the dx10 restriction makes sense here.

In addition to this, how about explicitly allowing sampler views to use a compatible format, and adding the ability for surfaces to use a compatible format too? (with a new parameter to get_tex_surface)

Note that get_tex_surface is dead (in gallium-array-textures - not merged yet, but it will happen eventually). Its replacement for render targets or depth/stencil, create_surface(), can already be supplied with a format parameter. Compatible formats, though, should ultimately end up as something similar to dx10.

This would allow, for instance, implementing glBlitFramebuffer on stencil buffers by reinterpreting the buffer as r8g8b8a8, and would allow the blitter module to copy depth/stencil buffers by simply treating them as color buffers. The only issue is that some drivers might hold depth/stencil surfaces in compressed formats that cannot be interpreted as a color format, and not have any mechanism for keeping temporaries or doing conversions internally.

I think that's a pretty big if. I could be wrong, but I think operations like blitting stencil buffers are pretty rare anyway (afaik other APIs don't allow things like that).

DirectX seems to have something like this with the _TYPELESS formats.
Yes, and it precisely won't allow you to interpret s24_z8 as r8g8b8a8 or other wonky stuff. Only if all components have the same number of bits.

Roland
Re: [Mesa3d-dev] RFC: allow resource_copy_region between different (yet compatible) formats
On 06.09.2010 22:03, Luca Barbieri wrote:

This way you could copy z24s8 to r8g8b8a8, for instance.

I am not sure this makes a lot of sense. There's no guarantee the bit layout of these is even remotely similar (and it likely won't be on any decent hardware). I think the dx10 restriction makes sense here.

Yes, it depends on the flexibility of the hardware and the driver. Due to depth textures, I think it is actually likely that you can easily treat depth as color. The worst issue right now is that stencil cannot be accessed in a sensible way at all, which makes implementing glBlitFramebuffer of STENCIL_BIT with NEAREST and different rect sizes impossible. Some cards (r600+ at least) can write stencil in shaders, but on some you must reinterpret the surface. And resource_copy_region does not support stretching, so it can't be used. Since not all cards can write stencil in shaders, one either needs to be able to bind depth/stencil as a color buffer, or extend resource_copy_region to support stretching with nearest filtering, or both (possibly in addition to having the option of using stencil export in shaders).

Yes, accessing stencil is a problem - other APIs just disallow it... There are other problems with accessing stencil, for instance WritePixels with a multisampled depth/stencil buffer (which you can't really map, hence CPU fallbacks don't even work). Plus you really don't want any CPU fallbacks anyway. Using stencil export (ARB_shader_stencil_export) seems like a clean solution, but as you said not all cards support it. Plus you can't actually get the stencil values with texture sampling either, so this doesn't help that much (well, you can't get them with GL, though hardware may support it, I guess). When I said it won't work with decent hardware, I really meant it won't work due to compression.
Now, it's quite possible this can be disabled on any chip, but you don't know that beforehand, hence you need to jump through hoops later to get an uncompressed version of your compressed buffer. Do applications actually ever use BlitFramebuffer with the stencil bit (with different sizes - otherwise resource_copy_region could be used)? It just seems to me that casts to completely different formats (well, still with the same total bit width, but still) are very unclean, but I don't have any good solution for this - if no one ever uses it in practice a CPU fallback is just fine, but as said that won't work for multisampled buffers either. Other things would likely benefit, such as GL_NV_copy_depth_to_color.

Roland
Re: [Mesa3d-dev] ARB draw buffers + texenv program
On 13.04.2010 02:52, Dave Airlie wrote:

On Tue, Apr 6, 2010 at 2:00 AM, Brian Paul bri...@vmware.com wrote: Dave Airlie wrote: Just going down the r300g piglit failures I noticed fbo-drawbuffers failed. I've no idea if this passes on Intel hw, but it appears the texenvprogram really needs to understand the draw buffers. The attached patch fixes it here for me on r300g - anyone want to test this on Intel with the piglit test before/after?

The piglit test passes as-is with Mesa/swrast and NVIDIA. It fails with gallium/softpipe both with and without your patch. I think that your patch is on the right track. But multiple render targets are still a bit of an untested area in the st/mesa code. One thing: the patch introduces a dependency on buffer state in the texenvprogram code, so in state.c we should check for the _NEW_BUFFERS flag. Otherwise, I'd like to debug the softpipe failure a bit further to see what's going on. Perhaps you could hold off on committing this for a bit...

Well, Eric pointed out to me the fun line in the spec: "(3) Should gl_FragColor be aliased to gl_FragData[0]? RESOLUTION: No. A shader should write either gl_FragColor, or gl_FragData[n], but not both. Writing to gl_FragColor will write to all draw buffers specified with DrawBuffersARB." So I was really just masking the issue with this. From what I can see softpipe messes up, and I'm not sure where we should be fixing this. swrast does okay; it's just an open question whether we should be doing something in gallium or in the drivers.

Hmm, yes, looks like that's not really well defined. I guess there are several options here:

1) Don't do anything at the state tracker level, and assume that if a fragment shader only writes to color 0 but has several color buffers bound, the color is meant to go to all outputs. Looks like that's what nv50 is doing today.
If a shader writes to FragData[0] but not the others, in gallium that would mean the output still gets replicated to all outputs, but since the spec says unwritten outputs are undefined that would be just fine (for OpenGL - not sure about other APIs).

2) Use some explicit means to distinguish FragData[] from FragColor in gallium. For instance, we could use different semantic names (like TGSI_SEMANTIC_COLOR and TGSI_SEMANTIC_GENERIC for the respective outputs). Or we could have a flag somewhere (not quite sure where) saying whether the color output is to be replicated to all buffers.

3) Translate away the single color output in the state tracker to multiple outputs.

I don't like option 3), though. It means we need to recompile if the attached buffers change. Moreover, it seems both new nvidia and AMD chips (r600 has a MULTIWRITE_ENABLE bit) handle this just fine in hw. I don't like option 1) either; that kind of implicit behavior might be ok, but this kind of guesswork isn't very nice imho.

Opinions?

Roland

-- Download Intel® Parallel Studio Eval Try the new software tools for yourself. Speed compiling, find bugs proactively, and fine-tune applications for parallel performance. See why Intel Parallel Studio got high marks during beta. http://p.sf.net/sfu/intel-sw-dev ___ Mesa3d-dev mailing list Mesa3d-dev@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/mesa3d-dev
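For illustration, the replication semantics under discussion (option 1) could be modeled roughly as follows. This is a hypothetical sketch with made-up names, not actual state tracker or driver code:

```c
#include <stddef.h>
#include <assert.h>

/* Hypothetical model of the replication rule: a shader declaring a
 * single color output (the gl_FragColor case) has that value broadcast
 * to every bound color buffer; a shader declaring several outputs (the
 * gl_FragData[] case) writes each one only to its own buffer, and any
 * buffer without a matching output is left untouched (i.e. undefined
 * as far as the API is concerned). */
static void route_color_outputs(const float *outputs, size_t num_outputs,
                                float cbufs[], size_t num_cbufs)
{
   if (num_outputs == 1) {
      for (size_t i = 0; i < num_cbufs; i++)
         cbufs[i] = outputs[0];           /* replicate output 0 */
   } else {
      for (size_t i = 0; i < num_outputs && i < num_cbufs; i++)
         cbufs[i] = outputs[i];           /* strict 1:1 mapping */
   }
}
```

The branch on the number of declared outputs is exactly the implicit guesswork the mail complains about; an explicit "replicate" flag or distinct semantic names would move that decision out of the driver.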
Re: [Mesa3d-dev] ARB draw buffers + texenv program
On 13.04.2010 20:28, Alex Deucher wrote: On Tue, Apr 13, 2010 at 2:21 PM, Corbin Simpson mostawesomed...@gmail.com wrote: On Tue, Apr 13, 2010 at 6:42 AM, Roland Scheidegger srol...@vmware.com wrote: On 13.04.2010 02:52, Dave Airlie wrote: On Tue, Apr 6, 2010 at 2:00 AM, Brian Paul bri...@vmware.com wrote: Dave Airlie wrote: Just going down the r300g piglit failures and noticed fbo-drawbuffers failed, I've no idea if this passes on Intel hw, but it appears the texenvprogram really needs to understand the draw buffers. The attached patch fixes it here for me on r300g anyone want to test this on Intel with the piglit test before/after? The piglit test passes as-is with Mesa/swrast and NVIDIA. It fails with gallium/softpipe both with and w/out your patch. I think that your patch is on the right track. But multiple render targets are still a bit of an untested area in the st/mesa code. One thing: the patch introduces a dependency on buffer state in the texenvprogram code so in state.c we should check for the _NEW_BUFFERS flag. Otherwise, I'd like to debug the softpipe failure a bit further to see what's going on. Perhaps you could hold off on committing this for a bit... Well Eric pointed out to me the fun line in the spec (3) Should gl_FragColor be aliased to gl_FragData[0]? RESOLUTION: No. A shader should write either gl_FragColor, or gl_FragData[n], but not both. Writing to gl_FragColor will write to all draw buffers specified with DrawBuffersARB. So I was really just masking the issue with this. From what I can see softpipe messes up and I'm not sure where we should be fixing this. swrast does okay, its just whether we should be doing something in gallium or in the drivers is open. Hmm yes looks like that's not really well defined. I guess there are several options here: 1) don't do anything at the state tracker level, and assume that if a fragment shader only writes to color 0 but has several color buffers bound the color is meant to go to all outputs. 
Looks like that's what nv50 is doing today. If a shader writes to FragData[0] but not others, in gallium that would mean that output still gets replicated to all outputs, but since the spec says unwritten outputs are undefined that would be just fine (for OpenGL - not sure about other APIs). 2) Use some explicit means to distinguish FragData[] from FragColor in gallium. For instance, could use different semantic name (like TGSI_SEMANTIC_COLOR and TGSI_SEMANTIC_GENERIC for the respective outputs). Or could have a flag somewhere (not quite sure where) saying if color output is to be replicated to all buffers. 3) Translate away the single color output in state tracker to multiple outputs. I don't like option 3) though. Means we need to recompile if the attached buffers change. Moreover, it seems both new nvidia and AMD chips (r600 has MULTIWRITE_ENABLE bit) handle this just fine in hw. I don't like option 1) neither, that kind of implicit behavior might be ok but this kind of guesswork isn't very nice imho.

Whatever's easiest, just document it. I'd be cool with:

  DECL IN[0], COLOR, PERSPECTIVE
  DECL OUT[0], COLOR
  MOV OUT[0], IN[0]
  END

effectively being a write to all color buffers. However, this one from progs/tests/drawbuffers:

  DCL IN[0], COLOR, LINEAR
  DCL OUT[0], COLOR
  DCL OUT[1], COLOR[1]
  IMM FLT32 { 1., 0., 0., 0. }
  0: MOV OUT[0], IN[0]
  1: SUB OUT[1], IMM[0]., IN[0]
  2: END

would then double-write the second color buffer. Unpleasant. Language like this would work, I suppose? "If only one color output is declared, writes to the color output shall be redirected to all bound color buffers. Otherwise, color outputs shall be bound to their specific color buffer."

Also, keep in mind that writing to multiple color buffers uses additional memory bandwidth, so for performance we should only do so when required.

Do apps really have several color buffers bound but only write to one, leaving the state of the others undefined in the process?
Sounds like a poor app to begin with, to me. Actually, I would restrict that language above further, so only color output 0 gets redirected to all buffers if it's the only one written. As said, though, I'd think some explicit bits somewhere are cleaner. I'm not yet sure that the above would really work for all APIs; it is possible some say other buffers not written to are left as-is instead of undefined.

Roland
Re: [Mesa3d-dev] ARB draw buffers + texenv program
On 14.04.2010 00:38, Dave Airlie wrote: On Wed, Apr 14, 2010 at 8:33 AM, Roland Scheidegger srol...@vmware.com wrote: On 13.04.2010 20:28, Alex Deucher wrote: On Tue, Apr 13, 2010 at 2:21 PM, Corbin Simpson mostawesomed...@gmail.com wrote: On Tue, Apr 13, 2010 at 6:42 AM, Roland Scheidegger srol...@vmware.com wrote: On 13.04.2010 02:52, Dave Airlie wrote: On Tue, Apr 6, 2010 at 2:00 AM, Brian Paul bri...@vmware.com wrote: Dave Airlie wrote: Just going down the r300g piglit failures and noticed fbo-drawbuffers failed, I've no idea if this passes on Intel hw, but it appears the texenvprogram really needs to understand the draw buffers. The attached patch fixes it here for me on r300g anyone want to test this on Intel with the piglit test before/after? The piglit test passes as-is with Mesa/swrast and NVIDIA. It fails with gallium/softpipe both with and w/out your patch. I think that your patch is on the right track. But multiple render targets are still a bit of an untested area in the st/mesa code. One thing: the patch introduces a dependency on buffer state in the texenvprogram code so in state.c we should check for the _NEW_BUFFERS flag. Otherwise, I'd like to debug the softpipe failure a bit further to see what's going on. Perhaps you could hold off on committing this for a bit... Well Eric pointed out to me the fun line in the spec (3) Should gl_FragColor be aliased to gl_FragData[0]? RESOLUTION: No. A shader should write either gl_FragColor, or gl_FragData[n], but not both. Writing to gl_FragColor will write to all draw buffers specified with DrawBuffersARB. So I was really just masking the issue with this. From what I can see softpipe messes up and I'm not sure where we should be fixing this. swrast does okay, its just whether we should be doing something in gallium or in the drivers is open. Hmm yes looks like that's not really well defined. 
I guess there are several options here: 1) don't do anything at the state tracker level, and assume that if a fragment shader only writes to color 0 but has several color buffers bound the color is meant to go to all outputs. Looks like that's what nv50 is doing today. If a shader writes to FragData[0] but not others, in gallium that would mean that output still gets replicated to all outputs, but since the spec says unwritten outputs are undefined that would be just fine (for OpenGL - not sure about other APIs). 2) Use some explicit means to distinguish FragData[] from FragColor in gallium. For instance, could use different semantic name (like TGSI_SEMANTIC_COLOR and TGSI_SEMANTIC_GENERIC for the respective outputs). Or could have a flag somewhere (not quite sure where) saying if color output is to be replicated to all buffers. 3) Translate away the single color output in state tracker to multiple outputs. I don't like option 3) though. Means we need to recompile if the attached buffers change. Moreover, it seems both new nvidia and AMD chips (r600 has MULTIWRITE_ENABLE bit) handle this just fine in hw. I don't like option 1) neither, that kind of implicit behavior might be ok but this kind of guesswork isn't very nice imho. Whatever's easiest, just document it. I'd be cool with: DECL IN[0], COLOR, PERSPECTIVE DECL OUT[0], COLOR MOV OUT[0], IN[0] END Effectively being a write to all color buffers, however, this one from progs/tests/drawbuffers: DCL IN[0], COLOR, LINEAR DCL OUT[0], COLOR DCL OUT[1], COLOR[1] IMM FLT32 { 1., 0., 0., 0. } 0: MOV OUT[0], IN[0] 1: SUB OUT[1], IMM[0]., IN[0] 2: END Would then double-write the second color buffer. Unpleasant. Language like this would work, I suppose? If only one color output is declared, writes to the color output shall be redirected to all bound color buffers. Otherwise, color outputs shall be bound to their specific color buffer. 
Also, keep in mind that writing to multiple color buffers uses additional memory bandwidth, so for performance, we should only do so when required. Do apps really have several color buffers bound but only write to one, leaving the state of the others undefined in the process? Sounds like a poor app to begin with to me. Actually, I would restrict that language above further, so only color output 0 will get redirected to all buffers if it's the only one written. As said though I'd think some explicit bits somewhere are cleaner. I'm not yet sure that the above would really work for all APIs, it is possible some say other buffers not written to are left as is instead of undefined.

Who knows - the GL API allows for it, and I don't see how we can arbitrarily decide to restrict it. I could write an app that uses multiple fragment programs and switches between them, with two output buffers bound, though I'm possibly constructing something very arbitrary.

I fail to see the problem. If you have two color buffers bound
Re: [Mesa3d-dev] gallium-resources branch merge
On 10.04.2010 14:00, Keith Whitwell wrote: Hmm, not sure whether to merge or squash-merge this branch. Any thoughts?

I'm no big fan of squash merges, but the history of the normal merge won't be nice either. Tough call, though I'd prefer a normal merge.

Roland
Re: [Mesa3d-dev] gallium-resources branch merge
On 10.04.2010 16:43, Chia-I Wu wrote: On Sat, Apr 10, 2010 at 8:00 PM, Keith Whitwell keith.whitw...@googlemail.com wrote: Hmm, not sure whether to merge or squash-merge this branch. Any thoughts?

The conversion to pipe_resource seems to be done by components. Maybe a new branch that reorganizes (git rebase -i) the commits in gallium-resources, and merge the new branch to master?

I've never used git rebase -i, but I'm not convinced it can give something sensible here. It wasn't done strictly by components, with a couple of merges from master (and gallium-buffer-usage-cleanup) in between, and fixes for already-converted things...

Roland
Re: [Mesa3d-dev] gallium-resources branch merge
On 10.04.2010 17:10, Keith Whitwell wrote: On Sat, Apr 10, 2010 at 4:05 PM, Keith Whitwell keith.whitw...@googlemail.com wrote: On Sat, Apr 10, 2010 at 3:49 PM, Roland Scheidegger srol...@vmware.com wrote: On 10.04.2010 16:43, Chia-I Wu wrote: On Sat, Apr 10, 2010 at 8:00 PM, Keith Whitwell keith.whitw...@googlemail.com wrote: Hmm, not sure whether to merge or squash-merge this branch. Any thoughts? The conversion to pipe_resource seems to be done by components. Maybe a new branch that reorganizes (git rebase -i) the commits in gallium-resources and merge the new branch to master? I've never used git rebase -i but I'm not convinced that can give something sensible. It wasn't done strictly by components, with a couple merges from master (and gallium-buffer-usage-cleanup) in between and fixes for already converted things...

Squash merge it is. Somewhat arbitrary decision, to avoid stretching this out any further. I don't think the history that was on the branch was very useful, nor does inventing history seem likely to help people searching for regressions, etc. The branch is effectively an atomic change, so let's deal with it like that...

Yeah, you're right. Thinking about it, parts of it were always broken throughout the life of the branch or didn't even build, so a squash merge makes sense. Glad it's merged - no more conflict fixing for merges from master :-).

Roland
Re: [Mesa3d-dev] Mesa (gallium-resources): gallium: fix comments for changed USAGE flags
On 09.04.2010 17:49, Keith Whitwell wrote: On Fri, 2010-04-09 at 08:45 -0700, Roland Scheidegger wrote:

Module: Mesa
Branch: gallium-resources
Commit: faf53328d1154c51d8a59513f2bfcae62272b0bf
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=faf53328d1154c51d8a59513f2bfcae62272b0bf
Author: Roland Scheidegger srol...@vmware.com
Date: Fri Apr 9 17:44:24 2010 +0200

gallium: fix comments for changed USAGE flags

---
 src/gallium/auxiliary/util/u_simple_screen.h  |  9 +
 src/gallium/drivers/svga/svga_winsys.h        | 10 --
 src/gallium/include/pipe/p_screen.h           |  2 +-
 src/gallium/include/state_tracker/sw_winsys.h |  2 +-
 4 files changed, 11 insertions(+), 12 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_simple_screen.h b/src/gallium/auxiliary/util/u_simple_screen.h
index 0042277..1ba59af 100644
--- a/src/gallium/auxiliary/util/u_simple_screen.h
+++ b/src/gallium/auxiliary/util/u_simple_screen.h
@@ -73,9 +73,10 @@ struct pipe_winsys
  * window systems must then implement that interface (rather than the
  * other way around...).
  *
- * usage is a bitmask of PIPE_BUFFER_USAGE_PIXEL/VERTEX/INDEX/CONSTANT. This
- * usage argument is only an optimization hint, not a guarantee, therefore
- * proper behavior must be observed in all circumstances.
+ * usage is a bitmask of PIPE_BIND_*.
+ * XXX is this true?
+ * This usage argument is only an optimization hint, not a guarantee,
+ * therefore proper behavior must be observed in all circumstances.

The new flags are no longer hints - they are supposed to actually specify which operations are permitted on a resource. Unfortunately I don't think this is very well enforced yet -- I intend to add a debug layer to sit between state tracker and driver, based on the drivers/identity layer, which will check for violations of this and other rules.

Ok, I thought this to be the case, but wasn't sure. I'll fix the comment. In the svga code, I actually couldn't figure out the usage flags when a winsys buffer is created.
It looks like usage is always 0, except for queries, which use SVGA_BUFFER_USAGE_PINNED. Of course, that's not a resource but a winsys buffer, but as far as I can tell this ends up in a pb_buffer usage flag. Not sure if that's ok or supposed to be like that...

Roland
Re: [Mesa3d-dev] gallium-resources branch merge
On 09.04.2010 17:29, STEVE555 wrote: Hi all, I've git branched and got the latest commits from the gallium-resources branch and also the latest commits from git master. I did a gmake -B realclean from a previous compile on my copy of git master, and did a git checkout gallium-resources to switch to that branch, and did a ./autogen.sh with the following options:

  --prefix=/usr/local --enable-32-bit --enable-xcb --enable-gallium-nouveau --with-state-trackers=dri,egl,xorg,glx,vega,es --enable-motif --enable-gl-osmesa --disable-gallium-intel --disable-gallium-radeon --with-expat=/usr/lib --with-demos=xdemos,demos,trivial,tests --with-dri-drivers=swrast --enable-gallium-swrast --enable-gallium-svga --with-max-width=4096 --with-max-height=4096 --enable-debug

I then did a gmake to compile my copy of gallium-resources, but it ended with an error at the end:

This should be fixed now.

Roland
Re: [Mesa3d-dev] Mesa (gallium-resources): gallium: fix comments for changed USAGE flags
On 09.04.2010 18:22, José Fonseca wrote: On Fri, 2010-04-09 at 09:02 -0700, Keith Whitwell wrote: On Fri, 2010-04-09 at 08:59 -0700, Roland Scheidegger wrote: On 09.04.2010 17:49, Keith Whitwell wrote: On Fri, 2010-04-09 at 08:45 -0700, Roland Scheidegger wrote: Module: Mesa Branch: gallium-resources Commit: faf53328d1154c51d8a59513f2bfcae62272b0bf URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=faf53328d1154c51d8a59513f2bfcae62272b0bf Author: Roland Scheidegger srol...@vmware.com Date: Fri Apr 9 17:44:24 2010 +0200 gallium: fix comments for changed USAGE flags --- src/gallium/auxiliary/util/u_simple_screen.h |9 + src/gallium/drivers/svga/svga_winsys.h| 10 -- src/gallium/include/pipe/p_screen.h |2 +- src/gallium/include/state_tracker/sw_winsys.h |2 +- 4 files changed, 11 insertions(+), 12 deletions(-) diff --git a/src/gallium/auxiliary/util/u_simple_screen.h b/src/gallium/auxiliary/util/u_simple_screen.h index 0042277..1ba59af 100644 --- a/src/gallium/auxiliary/util/u_simple_screen.h +++ b/src/gallium/auxiliary/util/u_simple_screen.h @@ -73,9 +73,10 @@ struct pipe_winsys * window systems must then implement that interface (rather than the * other way around...). * -* usage is a bitmask of PIPE_BUFFER_USAGE_PIXEL/VERTEX/INDEX/CONSTANT. This -* usage argument is only an optimization hint, not a guarantee, therefore -* proper behavior must be observed in all circumstances. +* usage is a bitmask of PIPE_BIND_*. +* XXX is this true? +* This usage argument is only an optimization hint, not a guarantee, +* therefore proper behavior must be observed in all circumstances. The new flags are no longer hints - they are supposed actually specify which operations are permitted on a resource. Unfortunately I don't think this is very well enforced yet -- I intend to add a debug layer to sit between state-tracker and driver, based on the drivers/identity layer, which will check for violations of this other rules. Ok, I thought this to be the case, but wasn't sure. 
I'll fix the comment. In the svga code, I actually couldn't figure out the usage flags when a winsys buffer is created. It looks like usage is always 0, except for queries, which use SVGA_BUFFER_USAGE_PINNED. Of course, that's not a resource but a winsys buffer, but as far as I can tell this ends up in a pb_buffer usage flag. Not sure if that's ok or supposed to be like that...

Jose has looked at this more recently than I have...

pb_buffer sits between the pipe driver and the winsys, and needs to pass custom buffer flags unmodified from svga to the winsys. SVGA_BUFFER_USAGE_PINNED is one of those usages.

So the svga winsys buffer_create function takes only custom flags, none of the PB_USAGE ones? This is the idea I got from the code (plus the custom flags would clearly overlap with the generic ones), and hence what I updated the comment to (which clearly was wrong). I'm not sure, though, that this really works with the pb code; I thought it might do some checks on the usage flags there, but if you say it works then I'd better believe it...

Roland
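The debug layer mentioned earlier in this thread boils down to treating bind flags as a contract rather than a hint. A minimal sketch of such a check - the flag names and struct here are made up for illustration; the real values are the PIPE_BIND_* defines in gallium's headers:

```c
#include <stdbool.h>
#include <assert.h>

/* Made-up stand-ins for the real PIPE_BIND_* defines. */
enum {
   BIND_RENDER_TARGET = 1 << 0,
   BIND_SAMPLER_VIEW  = 1 << 1,
   BIND_DEPTH_STENCIL = 1 << 2,
};

struct resource { unsigned bind; };

/* Since the new flags are a guarantee rather than a hint, a debug layer
 * sitting between state tracker and driver can reject any operation
 * whose required bind bits were not declared at resource creation. */
static bool bind_flags_ok(const struct resource *res, unsigned required)
{
   return (res->bind & required) == required;
}
```

A layer like the drivers/identity one could call such a check before forwarding, e.g., a set-sampler-view or set-render-target operation to the real driver.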
[Mesa3d-dev] gallium-resources branch merge
I'm planning on merging the gallium-resources branch shortly (after Easter). Due to the amount of code changed, it wouldn't be unexpected if some drivers break here and there, so it would be nice if the respective driver authors could take a look at that branch now. If you've missed the discussion about this branch and what it is about, here it is: http://www.mail-archive.com/mesa3d-dev@lists.sourceforge.net/msg12726.html

I've also removed the video interfaces completely, as they weren't ported to the interface changes, and actually some of the video code missed some earlier interface changes so it didn't build anyway. Video-related work should be done on the pipe-video branch, which already had newer stuff (for video).

Roland
Re: [Mesa3d-dev] How do we init half float tables?
On 02.04.2010 17:09, Luca Barbieri wrote: Additionally, the S3TC library may now support only a subset of the formats. This may be even more useful as further compressed formats are added.

FWIW, I don't see any new s3tc formats. rgtc will not be handled by the s3tc library since it isn't patent encumbered, and util_format_is_s3tc will not include rgtc formats. (Though I guess that external decoding per-pixel is really rather lame; we should do it per-block...)

Roland
Re: [Mesa3d-dev] [PATCH] glsl: optimize sqrt
On 29.03.2010 04:50, Marek Olšák wrote: We were talking a bit on IRC that the GLSL compiler implements the sqrt function somewhat inefficiently. Instead of rsq+rcp+cmp instructions as in the original code, the proposed patch uses just rsq+mul. Please see the patch log for further explanation, and please review.

I'll definitely agree with the mul instead of rcp part, as that should be more efficient on a lot of modern hardware (rcp usually being part of some special function block instead of the main alu). As far as I can tell though, we still need the cmp unfortunately, since invsqrt(0) is infinite and multiplying that by 0 will give some undefined result; for IEEE it should be NaN (well, depending on hardware I guess; if you have an implementation which clamps infinity to its max representable number it should be ok). In any case, glsl says invsqrt(0) is undefined, hence we can't rely on this.

Thinking about it, we'd possibly want a SQRT opcode, both in mesa and tgsi. Because there's actually hardware which can do sqrt (i965 MathBox), and just as importantly because this gives drivers a way to implement this as invsqrt + mul without the cmp, if they can. For instance AMD hardware generally has 3 rounding modes for these ops: IEEE (which gives infinity for invsqrt(0)), DX (clamps to MAX_FLOAT), and FF (which clamps infinity to 0, exactly what you need to implement sqrt with a mul and invsqrt and no cmp; though actually it should work with DX clamping as well).

Roland

-Marek

From 9b834a79a1819f3b4b9868be3e2696667791c83e Mon Sep 17 00:00:00 2001
From: Marek Olšák mar...@gmail.com
Date: Sat, 27 Mar 2010 13:49:09 +0100
Subject: [PATCH] glsl: optimize sqrt

The new version can be derived from sqrt as follows:

sqrt(x) = sqrt(x)^2 / sqrt(x) = x / sqrt(x) = x * rsqrt(x)

Also the need for the CMP instruction is gone because there is no division by zero.
---
 .../shader/slang/library/slang_common_builtin.gc | 22 ++++------------------
 1 files changed, 4 insertions(+), 18 deletions(-)

diff --git a/src/mesa/shader/slang/library/slang_common_builtin.gc b/src/mesa/shader/slang/library/slang_common_builtin.gc
index a25ca55..3f6596c 100644
--- a/src/mesa/shader/slang/library/slang_common_builtin.gc
+++ b/src/mesa/shader/slang/library/slang_common_builtin.gc
@@ -602,50 +602,36 @@ vec4 exp2(const vec4 a)
 float sqrt(const float x)
 {
-   const float nx = -x;
    float r;
    __asm float_rsq r, x;
-   __asm float_rcp r, r;
-   __asm vec4_cmp __retVal, nx, r, 0.0;
+   __retVal = r * x;
 }
 
 vec2 sqrt(const vec2 x)
 {
-   const vec2 nx = -x, zero = vec2(0.0);
    vec2 r;
    __asm float_rsq r.x, x.x;
    __asm float_rsq r.y, x.y;
-   __asm float_rcp r.x, r.x;
-   __asm float_rcp r.y, r.y;
-   __asm vec4_cmp __retVal, nx, r, zero;
+   __retVal = r * x;
 }
 
 vec3 sqrt(const vec3 x)
 {
-   const vec3 nx = -x, zero = vec3(0.0);
    vec3 r;
    __asm float_rsq r.x, x.x;
    __asm float_rsq r.y, x.y;
    __asm float_rsq r.z, x.z;
-   __asm float_rcp r.x, r.x;
-   __asm float_rcp r.y, r.y;
-   __asm float_rcp r.z, r.z;
-   __asm vec4_cmp __retVal, nx, r, zero;
+   __retVal = r * x;
 }
 
 vec4 sqrt(const vec4 x)
 {
-   const vec4 nx = -x, zero = vec4(0.0);
    vec4 r;
    __asm float_rsq r.x, x.x;
    __asm float_rsq r.y, x.y;
    __asm float_rsq r.z, x.z;
    __asm float_rsq r.w, x.w;
-   __asm float_rcp r.x, r.x;
-   __asm float_rcp r.y, r.y;
-   __asm float_rcp r.z, r.z;
-   __asm float_rcp r.w, r.w;
-   __asm vec4_cmp __retVal, nx, r, zero;
+   __retVal = r * x;
 }
Re: [Mesa3d-dev] Mesa (mesa_7_7_branch): mesa: List Quake3 extensions first.
On 16.03.2010 18:52, Keith Whitwell wrote: On Tue, 2010-03-16 at 08:32 -0700, Ian Romanick wrote: I'm also a bit surprised that not detecting GL_EXT_compiled_vertex_array has any impact on our Quake3 performance. After all, our CVA implementation doesn't do anything! Looking at the list, it seems more likely that GL_EXT_texture_env_add is the problem. Not having that will cause Quake3 to use additional rendering passes in quite a few cases.

I think if CVA isn't present, it falls back to glVertex() and friends... Bad...

I'm not sure though that listing that extension first really solves all problems. There's a quite famous bug: when you bring up the information screen with the extension string, it'll actually segfault. I think that got fixed in later versions (though I don't know how; if it just copies the first n bytes of the extension string, it obviously wouldn't solve the problem that it doesn't recognize the CVA extension...). And against this you can't really do anything other than app detection and cutting the string appropriately...

Roland
[Mesa3d-dev] extensions supported or not in gallium
Hi, there are currently a couple of extensions enabled in the mesa state tracker which probably shouldn't be. These were moved there by commit a0ae2ca033ec2024da1e01d1c11c0437837c031b (that is, with dri they were already always enabled before). Does someone know off-hand which ones we can enable or not? I'm going to kill off EXT_cull_vertex and TDFX_texture_compression_FXT1; clearly we can't handle them. The others in question are:

ARB_window_pos
APPLE_client_storage
MESA_pack_invert
NV_vertex_program
NV_vertex_program1_1

(for the latter two, IIRC the problem was that regs needed to be zero-initialized)

Currently gallium dri drivers also have ARB_imaging enabled (via driInitExtensions()); I think that's not correct either.

Roland
Re: [Mesa3d-dev] extensions supported or not in gallium
On 11.03.2010 17:54, Brian Paul wrote: Roland Scheidegger wrote: Hi, there are currently a couple of extensions enabled in the mesa state tracker which probably shouldn't be. These were moved there by commit a0ae2ca033ec2024da1e01d1c11c0437837c031b (that is, with dri they were already always enabled before). Does someone know off-hand which ones we can enable or not? I'm going to kill off EXT_cull_vertex and TDFX_texture_compression_FXT1; clearly we can't handle them. The others in question are:

ARB_window_pos
handled in core mesa.

APPLE_client_storage
should not be enabled by default.

MESA_pack_invert
handled by core mesa.

NV_vertex_program
NV_vertex_program1_1
(for the latter two, IIRC the problem was that regs needed to be zero-initialized)
There may be other issues too. Someone would have to enable the extension(s) and do some testing.

Currently gallium dri drivers also have ARB_imaging enabled (via driInitExtensions()); I think that's not correct either.
Yeah, I think that needs to be disabled.

Ok thanks, I've pushed a fix.

Roland
Re: [Mesa3d-dev] Mesa (master): util: Code generate functions to pack and unpack a single pixel.
On 07.03.2010 01:21, José Fonseca wrote: On Sat, 2010-03-06 at 05:44 -0800, Brian Paul wrote: On Sat, Mar 6, 2010 at 5:44 AM, José Fonseca jfons...@vmware.com wrote: On Mon, 2010-03-01 at 09:03 -0800, Michel Dänzer wrote: On Fri, 2010-02-26 at 08:47 -0800, Jose Fonseca wrote: Module: Mesa Branch: master Commit: 9beb302212a2afac408016cbd7b93c8b859e4910 URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=9beb302212a2afac408016cbd7b93c8b859e4910 Author: José Fonseca jfons...@vmware.com Date: Fri Feb 26 16:45:22 2010 + util: Code generate functions to pack and unpack a single pixel. Should work correctly for all pixel formats except SRGB formats. Generated code made much simpler by defining the pixel format as a C structure. For example this is the generated structure for PIPE_FORMAT_B6UG5SR5S_NORM:

union util_format_b6ug5sr5s_norm {
   uint16_t value;
   struct {
      int r:5;
      int g:5;
      unsigned b:6;
   } chan;
};

José, are you aware that the memory layout of bitfields is mostly implementation dependent? IME this makes them mostly unusable for modelling hardware in a portable manner.

It's not only implementation dependent and slow -- it is also buggy! gcc-4.4.3 is doing something very fishy to single-bit fields. See the attached code. ff ff ff ff is expected, but ff ff ff 01 is printed with gcc-4.4.3, even without any optimization. gcc-4.3.4 works fine. Am I missing something or is this effectively a bug?

Same result with gcc 4.4.1. If pixel.chan.a is put into a temporary int var followed by the scaling arithmetic, it comes out as expected. Looks like a bug to me.

Thanks. I'll submit a bug report then. BTW, it looks like sizeof(union util_format_b5g5r5a1_unorm) == 4, not 2. Yet another reason to stay away from bit fields...

Hmm, might that be because the bitfields are of type unsigned/int, not uint16_t? I've no idea either why it would return 01 and not ff.

Roland
Re: [Mesa3d-dev] dri-extension branch - clean up advertising extensions in Gallium
On 07.03.2010 20:26, Marek Olšák wrote: This branch is aimed to address the following issues: * Extensions are advertised in both st/mesa and st/dri, doing the same thing in two places. * The inability to disable extensions in pipe_screen::get_param because st/dri overrides the decisions of st/mesa. Here's the branch: http://cgit.freedesktop.org/~mareko/mesa/log/?h=dri-extensions The first commit moves the differences between st/dri and st/mesa to the latter and removes dri_init_extensions from st/dri. It doesn't remove any extensions from the list except for those not advertised by pipe_screen. The second commit enables texture_rectangle by default in Gallium. To my knowledge any Gallium hardware can do this and I suspect it was dependent on NPOT textures by accident. All this is of course tested with piglit and glean. Please review. In case it's not OK, please let me know what needs to be done. The second commit looks fine to me. The first one, I'm not sure. Maybe that's ok, but if so I'm wondering why, since this skips all the mapping business driInitExtensions did and just sets the extension enable bits to true. At least I'm fairly sure it was needed in the past... Roland
Re: [Mesa3d-dev] dri-extension branch - clean up advertising extensions in Gallium
On 08.03.2010 14:22, Joakim Sindholt wrote: On Mon, 2010-03-08 at 13:16 +0100, Roland Scheidegger wrote: On 07.03.2010 20:26, Marek Olšák wrote: This branch is aimed to address the following issues: * Extensions are advertised in both st/mesa and st/dri, doing the same thing in two places. * The inability to disable extensions in pipe_screen::get_param because st/dri overrides the decisions of st/mesa. Here's the branch: http://cgit.freedesktop.org/~mareko/mesa/log/?h=dri-extensions The first commit moves the differences between st/dri and st/mesa to the latter and removes dri_init_extensions from st/dri. It doesn't remove any extensions from the list except for those not advertised by pipe_screen. The second commit enables texture_rectangle by default in Gallium. To my knowledge any Gallium hardware can do this and I suspect it was dependent on NPOT textures by accident. All this is of course tested with piglit and glean. Please review. In case it's not OK, please let me know what needs to be done. The second commit looks fine to me. The first one, I'm not sure. Maybe that's ok, but if so I'm wondering why, since this skips all the mapping business driInitExtensions did and just sets the extension enable bits to true. At least I'm fairly sure it was needed in the past... Roland I believe airlied pointed out earlier that http://cgit.freedesktop.org/mesa/mesa/commit/?id=17ef1f6074d6107c167f1956a5c60993904c0b72 fixed that problem. But even with that commit, all drivers still call driInitExtensions at least once, though the parameter list can be NULL. I don't see that happening here. Roland
Re: [Mesa3d-dev] dri-extension branch - clean up advertising extensions in Gallium
Otherwise, looks good to me, but I'd prefer if someone more familiar with the extension handling code could give it a look. Roland On 08.03.2010 17:03, Marek Olšák wrote: Alright, I will add driInitExtensions(ctx, NULL, TRUE) at the end of st_init_extensions. Anything else I missed or is it OK? -Marek On Mon, Mar 8, 2010 at 4:25 PM, Roland Scheidegger srol...@vmware.com wrote: On 08.03.2010 14:22, Joakim Sindholt wrote: On Mon, 2010-03-08 at 13:16 +0100, Roland Scheidegger wrote: On 07.03.2010 20:26, Marek Olšák wrote: This branch is aimed to address the following issues: * Extensions are advertised in both st/mesa and st/dri, doing the same thing in two places. * The inability to disable extensions in pipe_screen::get_param because st/dri overrides the decisions of st/mesa. Here's the branch: http://cgit.freedesktop.org/~mareko/mesa/log/?h=dri-extensions The first commit moves the differences between st/dri and st/mesa to the latter and removes dri_init_extensions from st/dri. It doesn't remove any extensions from the list except for those not advertised by pipe_screen. The second commit enables texture_rectangle by default in Gallium. To my knowledge any Gallium hardware can do this and I suspect it was dependent on NPOT textures by accident. All this is of course tested with piglit and glean. Please review. In case it's not OK, please let me know what needs to be done. The second commit looks fine to me. The first one, I'm not sure. Maybe that's ok, but if so I'm wondering why, since this skips all the mapping business driInitExtensions did and just sets the extension enable bits to true. At least I'm fairly sure it was needed in the past... Roland I believe airlied pointed out earlier that http://cgit.freedesktop.org/mesa/mesa/commit/?id=17ef1f6074d6107c167f1956a5c60993904c0b72 fixed that problem.
But even with that commit, all drivers still call driInitExtensions at least once, though the parameter list can be NULL. I don't see that happening here. Roland
Re: [Mesa3d-dev] dri-extension branch - clean up advertising extensions in Gallium
Well I guess another solution would be to just call it directly from the place the dri_extension code initially was, i.e. in dri_create_context. Roland On 08.03.2010 17:21, Jakob Bornecrantz wrote: Calling dri code from src/mesa/state_tracker is not allowed since it's supposed to be independent of windowing systems. That said, from what I can see, both driInitExtensions and driInitSingleExtension could be folded into mesa core; I can't see anything dri-special about them. Cheers Jakob. On 8 mar 2010, at 16.12, Roland Scheidegger wrote: Otherwise, looks good to me, but I'd prefer if someone more familiar with the extension handling code could give it a look. Roland On 08.03.2010 17:03, Marek Olšák wrote: Alright, I will add driInitExtensions(ctx, NULL, TRUE) at the end of st_init_extensions. Anything else I missed or is it OK? -Marek On Mon, Mar 8, 2010 at 4:25 PM, Roland Scheidegger srol...@vmware.com wrote: On 08.03.2010 14:22, Joakim Sindholt wrote: On Mon, 2010-03-08 at 13:16 +0100, Roland Scheidegger wrote: On 07.03.2010 20:26, Marek Olšák wrote: This branch is aimed to address the following issues: * Extensions are advertised in both st/mesa and st/dri, doing the same thing in two places. * The inability to disable extensions in pipe_screen::get_param because st/dri overrides the decisions of st/mesa. Here's the branch: http://cgit.freedesktop.org/~mareko/mesa/log/?h=dri-extensions The first commit moves the differences between st/dri and st/mesa to the latter and removes dri_init_extensions from st/dri. It doesn't remove any extensions from the list except for those not advertised by pipe_screen. The second commit enables texture_rectangle by default in Gallium. To my knowledge any Gallium hardware can do this and I suspect it was dependent on NPOT textures by accident. All this is of course tested with piglit and glean. Please review.
In case it's not OK, please let me know what needs to be done. The second commit looks fine to me. The first one, I'm not sure. Maybe that's ok, but if so I'm wondering why, since this skips all the mapping business driInitExtensions did and just sets the extension enable bits to true. At least I'm fairly sure it was needed in the past... Roland I believe airlied pointed out earlier that http://cgit.freedesktop.org/mesa/mesa/commit/?id=17ef1f6074d6107c167f1956a5c60993904c0b72 fixed that problem. But even with that commit, all drivers still call driInitExtensions at least once, though the parameter list can be NULL. I don't see that happening here. Roland
Re: [Mesa3d-dev] RFC: gallium-format-cleanup branch (was Gallium format swizzles)
On 03.03.2010 14:07, José Fonseca wrote: On Wed, 2010-03-03 at 04:27 -0800, Luca Barbieri wrote: PIPE_FORMAT_X8B8G8R8_UNORM is being used by mesa. PIPE_FORMAT_R8G8B8X8_UNORM doesn't exist, hence it appears to be unnecessary. So it doesn't make sense to rename. How about D3DFMT_X8B8G8R8? That should map to PIPE_FORMAT_R8G8B8X8_UNORM. Yes, you're right. BTW, we are also missing D3DFMT_X4R4G4B4, D3DFMT_X1R5G5B5, D3DFMT_A4L4, D3DFMT_A1, D3DFMT_L6V5U5, D3DFMT_D15S1, D3DFMT_D24X4S4, D3DFMT_CxV8U8 and perhaps others I did not notice. D3DFMT_L6V5U5 is there (PIPE_FORMAT_R5SG5SB6U_NORM). The others are indeed missing. None of the mentioned formats are required for D3D9 conformance, but we could add them to gallium. D3DFMT_A1 is special: it has less than 1 byte per pixel. Probably the best way to support it would be to treat it as an 8x1 macro pixel, 8 bits, similarly to compressed formats. D3DFMT_CxV8U8 has special semantics too.

And not only are those formats optional, some would be completely pointless in gallium (D15S1, D24X4S4). There's simply no modern hardware which supports 1-bit stencil (I think pretty much the only chip supporting that was savage3d), nor 4-bit stencil (can't remember off-hand any chip supporting that; maybe some of the then-professional chips did). The others sound a bit more plausible and hardware may support them, but I'm not sure they are really missed (A4L4, X4R4G4B4, X1R5G5B5). As José said, CxV8U8 isn't really just a format, and we'll need to add a 1-bit format for DX10.

Roland
Re: [Mesa3d-dev] Does DX9 SM3 - VMware svga with arbitrary semantics work? How?
On 03.03.2010 20:23, Luca Barbieri wrote: And never will... It does not export PIPE_CAP_GLSL, and does not have the shader opcodes to ever do so. Any Gallium driver should be able to support the GLSL subset without control flow. And if we had a proper optimization infrastructure capable of inlining functions, converting conditionals to multiplications and unrolling loops (e.g. look at what the nVidia Cg compiler does), then essentially all GLSL could be supported on any driver, with only limitations on the maximum number of loop iterations. Isn't it worth supporting that? BTW, proprietary drivers do this: for instance nVidia supports GLSL on nv30, which can't do control flow in fragment shaders and doesn't support SM3.

I think the i915 is a lot closer to r300 in that regard (which is quite a bit more limited than nv30), and it's true that ATI also supported glsl on that. As far as I know though, it was quite easy to bump into shaders which wouldn't compile. There's only so much you can do with 4 blocks of (max) 16 instructions to run without any control flow when you need to unroll loops, not to mention lacking instructions for derivatives, or the fact that things like sin/cos will take quite a few instructions... nv30, while processing fragment shaders slowly, had a LOT higher instruction count, IIRC supported derivatives and predication, and had no dependent texturing limit. So that makes it a lot better suited for glsl hacks. So, I'm not sure it really makes a whole lot of sense to support glsl on i915. It'll really only ever work for very simple things (granted, there are apps out there which indeed will only use glsl shaders which are known to compile fine on r300...)

Roland
Re: [Mesa3d-dev] [RFC] gallium-vertexelementcso branch merge
On 02.03.2010 11:37, Keith Whitwell wrote: On Mon, 2010-03-01 at 10:02 -0800, Roland Scheidegger wrote: Hi, this branch turns vertex element state into a cso, so instead of set_vertex_elements there's now the triad of create/bind/delete_vertex_elements_state. I have converted all the drivers except nouveau (I didn't do it because Christoph Bumiller already did nv50, but I can give the rest of them a shot), though that doesn't necessarily mean they are optimized for it (the idea is of course to precalculate state on create, not just copy the pipe structs and do everything on bind) - only i965g really does something close to it (though it still always emits the state). Drivers doing both hw vertex shaders and using draw in some circumstances will of course have to store both representations on create. Also note that util_draw_vertex_buffer semantics have changed a bit (the caller needs to set vertex element state, which is a bit odd).

Roland, The branch looks good to me, happy to see it merged when you're ready to go.

There's actually something in the cso code I was a bit unsure about; I've looked at it again and indeed it seems wrong. The problem is that the count value itself isn't stored for the comparison. So in the unlikely case that the hash value is the same for pipe_vertex_elements with different counts, the comparison can also succeed as long as the leading elements are identical, which seems very wrong. The easiest way to fix this would probably be to just store the count alongside the pipe_vertex_element data, but that would need an additional copy of the incoming data in cso_set_vertex_elements. Hmm...

Roland
[Mesa3d-dev] [RFC] gallium-vertexelementcso branch merge
Hi, this branch turns vertex element into a cso, so instead of set_vertex_elements there's now the triad of create/bind/delete_vertex_elements_state. I have converted all the drivers except nouveau (I didn't do it because Christoph Bumiller already did nv50, but I can give the rest of them a shot), though that doesn't necessarily mean they are optimized for it (the idea is of course to precalculate state on create, not just copy the pipe structs and do everything on bind) - only i965g really does something close to it (though still emits the state always). Drivers doing both hw vertex shaders and using draw in some circumstances of course will have to store both representations on create. Also note that util_draw_vertex_buffer semantics have changed a bit (caller needs to set vertex element state, which is a bit odd). Roland
Re: [Mesa3d-dev] [RFC] gallium-vertexelementcso branch merge
On 01.03.2010 19:02, Roland Scheidegger wrote: Hi, this branch turns vertex element into a cso, so instead of set_vertex_elements there's now the triad of create/bind/delete_vertex_elements_state. I have converted all the drivers except nouveau (I didn't do it because Christoph Bumiller already did nv50, but I can give the rest of them a shot), though that doesn't necessarily mean they are optimized for it (the idea is of course to precalculate state on create, not just copy the pipe structs and do everything on bind) - only i965g really does something close to it (though still emits the state always). Drivers doing both hw vertex shaders and using draw in some circumstances of course will have to store both representations on create. Also note that util_draw_vertex_buffer semantics have changed a bit (caller needs to set vertex element state, which is a bit odd). Ok, I've converted nv30/nv40 too. Not that they'd precalculate any hw state... Roland
Re: [Mesa3d-dev] [RFC] gallium-vertexelementcso branch merge
On 02.03.2010 00:18, Joakim Sindholt wrote: On Mon, 2010-03-01 at 19:02 +0100, Roland Scheidegger wrote: Hi, this branch turns vertex element into a cso, so instead of set_vertex_elements there's now the triad of create/bind/delete_vertex_elements_state. I have converted all the drivers except nouveau (I didn't do it because Christoph Bumiller already did nv50, but I can give the rest of them a shot), though that doesn't necessarily mean they are optimized for it (the idea is of course to precalculate state on create, not just copy the pipe structs and do everything on bind) - only i965g really does something close to it (though still emits the state always). Drivers doing both hw vertex shaders and using draw in some circumstances of course will have to store both representations on create. Also note that util_draw_vertex_buffer semantics have changed a bit (caller needs to set vertex element state, which is a bit odd). Roland

Can I still do things like: element 0: - vbo 5, element 1: - vbo 2, and then set_vertex_buffers() with an array { zeros, zeros, vbo 2, zeros, zeros, vbo 5 }?

The branch doesn't change pipe_vertex_element itself (except that nr_components got removed, as that's really derived from the associated pipe_format), only how those vertex elements are set. Hence you can do exactly the same things you could do before. Though I'm not quite sure what your zeros mean; if that's just an unused vbo it should be ok, but it is probably not ok to just pass in a null pointer for an unused pipe_vertex_buffer.

Roland
Re: [Mesa3d-dev] move normalized texel coordinates bit to sampler view
On 25.02.2010 18:39, michal wrote: Roland Scheidegger wrote on 2010-02-24 15:18: On 24.02.2010 12:48, Christoph Bumiller wrote: This wasn't a problem before because textures and samplers were linked 1:1, but in view of the gallium-gpu4-texture-opcodes branch, this coordinate normalization bit becomes a problem. NV50 hardware has that bit in the RESOURCE binding, and not the SAMPLER binding, and you can imagine that this will lead to us having to jump through a few annoying looking hoops to accommodate. As far as I can see, neither D3D10 nor D3D11 nor OpenGL nor CUDA have sampler states that are decoupled from the texture, and which contain a normalized coordinates bit, so it's worth considering not having it there in gallium either. For OpenGL, unnormalized coordinates are only used for RECT textures, and in this case it makes sense to make it a property of the texture.

I agree this is not sampler state, but I don't quite agree this should be texture state. This changes how texture coordinates get interpreted in the interpolator - in that sense it is similar to the cylindrical texture coord wrap which we moved away from sampler state recently. That one got moved to the shader declaration. I wonder if the normalization bit should be treated the same. Though OTOH you're quite right that in OpenGL this really is a texture property (it is a different texture target after all), and afaik d3d doesn't support non-normalized coords (?). Hmm...

Isn't it the case that for RECT targets we clear the bit, and for others we always set it? In mesa st I see:

   if (texobj->Target != GL_TEXTURE_RECTANGLE_ARB)
      sampler->normalized_coords = 1;

By definition, a RECT texture with normalised coordinates is just an NPOT texture. If we removed this apparently redundant flag, would that make the nouveau developers' life easier?

But we don't have rect targets in gallium, hence we need the flag. I think conceptually this makes sense since for texture layouts etc. drivers won't care one bit if this is a 2d npot or a rect texture. Though I guess introducing rect targets instead would be another option.

Roland

And, finally, I've seen you reverted the changes for independent image and sampler index in the texture opcodes. What's up with that? Is the code not nice enough, or has the idea been discarded and my problem disappears?

Please consider this branch dead. It will be easier for me to introduce new, optional sampler and fetch opcodes à la GL 3.0. There's just too much code to fix and test, and we still want the older hardware not to have to stand on its head trying to translate back to the old model. Thanks.
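What the flag actually changes can be made concrete with a toy texel-address computation; this is only an illustration of normalized vs RECT-style unnormalized addressing under nearest filtering with clamp-to-edge, not softpipe code:

```c
/* Normalized coords are scaled by the texture size before the lookup;
 * RECT-style unnormalized coords address texels directly. */
static int texel_x(float s, int width, int normalized)
{
   float u = normalized ? s * (float)width : s;
   int x = (int)u;                 /* nearest filtering */
   if (x < 0)
      x = 0;                       /* clamp-to-edge */
   if (x >= width)
      x = width - 1;
   return x;
}
```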
Re: [Mesa3d-dev] move normalized texel coordinates bit to sampler view
On 24.02.2010 12:48, Christoph Bumiller wrote: This wasn't a problem before because textures and samplers were linked 1:1, but in view of the gallium-gpu4-texture-opcodes branch, this coordinate normalization bit becomes a problem. NV50 hardware has that bit in the RESOURCE binding, and not the SAMPLER binding, and you can imagine that this will lead to us having to jump through a few annoying looking hoops to accommodate. As far as I can see, neither D3D10 nor D3D11 nor OpenGL nor CUDA have sampler states that are decoupled from the texture, and which contain a normalized coordinates bit, so it's worth considering not having it there in gallium either. For OpenGL, unnormalized coordinates are only used for RECT textures, and in this case it makes sense to make it a property of the texture.

I agree this is not sampler state, but I don't quite agree this should be texture state. This changes how texture coordinates get interpreted in the interpolator - in that sense it is similar to the cylindrical texture coord wrap which we moved away from sampler state recently. That one got moved to the shader declaration. I wonder if the normalization bit should be treated the same. Though OTOH you're quite right that in OpenGL this really is a texture property (it is a different texture target after all), and afaik d3d doesn't support non-normalized coords (?). Hmm...

Roland

And, finally, I've seen you reverted the changes for independent image and sampler index in the texture opcodes. What's up with that? Is the code not nice enough, or has the idea been discarded and my problem disappears?

Best regards, Christoph
Re: [Mesa3d-dev] [PATCH] st/dri: don't enable EXT_draw_buffers2 by default
Marek, I don't particularly like that patch, because it doesn't really fix the problem with the extension handling. There are lots of extensions listed there which should not be advertised by default, so picking one out won't fix the others. I think they are there because driInitExtensions definitely does more than just set ctx->Extensions.foo_bar to enable the extension. Other extensions in this list which are queried by CAP bits but still show up in the extension string regardless are the glsl ones (ARB_fragment_shader and friends), a couple of texture address modes (mirrored_repeat, mirror_clamp), blend_equation_separate, technically even ARB_multitexture (though we probably should skip the test for more than 1 texture unit and always set that to true in st_extensions.c), two-sided stencil, occlusion queries, anisotropic filtering, ycbcr textures, packed depth stencil (there may be more, that was just from a quick look). So if it's ok to remove them all from that list this should be done, but I fear it's not ok and the fix needs to be a bit more complicated (see comments in dri_init_extensions).

Roland

On 21.02.2010 16:00, Marek Olšák wrote: Hi, the attached patch modifies st/dri to not enable EXT_draw_buffers2 by default because r300g and most probably even some other drivers can't support this extension. The drivers reporting support of PIPE_CAP_INDEP_BLEND_ENABLE are not affected by this patch. Please review.
Marek

From ddda2c19b74780263f848ffafe10809bd6385d01 Mon Sep 17 00:00:00 2001
From: Marek Olšák <mar...@gmail.com>
Date: Sun, 21 Feb 2010 01:27:09 +0100
Subject: [PATCH 2/2] st/dri: don't enable EXT_draw_buffers2 by default

---
 src/gallium/state_trackers/dri/dri_extensions.c |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/src/gallium/state_trackers/dri/dri_extensions.c b/src/gallium/state_trackers/dri/dri_extensions.c
index 1259813..7f8ceef 100644
--- a/src/gallium/state_trackers/dri/dri_extensions.c
+++ b/src/gallium/state_trackers/dri/dri_extensions.c
@@ -99,7 +99,6 @@ static const struct dri_extension card_extensions[] = {
    {"GL_EXT_blend_minmax", GL_EXT_blend_minmax_functions},
    {"GL_EXT_blend_subtract", NULL},
    {"GL_EXT_cull_vertex", GL_EXT_cull_vertex_functions},
-   {"GL_EXT_draw_buffers2", GL_EXT_draw_buffers2_functions},
    {"GL_EXT_fog_coord", GL_EXT_fog_coord_functions},
    {"GL_EXT_framebuffer_object", GL_EXT_framebuffer_object_functions},
    {"GL_EXT_multi_draw_arrays", GL_EXT_multi_draw_arrays_functions},
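One shape the "more complicated fix" could take is tying each extension-table entry to the cap it depends on, and only advertising entries whose cap the screen reports. A self-contained sketch of that idea; the table contents, the cap enum and the get_param stand-in are all hypothetical, not the real dri_extensions.c code:

```c
/* Cap-driven extension advertising sketch: an entry with required_cap
 * CAP_NONE is always safe; other entries are advertised only when the
 * (mocked) screen reports the cap. */
enum { CAP_NONE = 0, CAP_INDEP_BLEND_ENABLE, CAP_OCCLUSION_QUERY };

struct ext_entry {
   const char *name;
   int required_cap;
};

static const struct ext_entry card_extensions[] = {
   { "GL_EXT_blend_minmax",    CAP_NONE },
   { "GL_EXT_draw_buffers2",   CAP_INDEP_BLEND_ENABLE },
   { "GL_ARB_occlusion_query", CAP_OCCLUSION_QUERY },
};

/* screen->get_param stand-in: pretend hw without independent blend */
static int get_param(int cap)
{
   return cap == CAP_OCCLUSION_QUERY;
}

static int count_advertised(void)
{
   int n = 0;
   for (unsigned i = 0;
        i < sizeof card_extensions / sizeof card_extensions[0]; i++)
      if (card_extensions[i].required_cap == CAP_NONE ||
          get_param(card_extensions[i].required_cap))
         n++;
   return n;
}
```

With this shape, removing EXT_draw_buffers2 from the always-on list becomes a one-field change instead of deleting the entry.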
Re: [Mesa3d-dev] Mesa (master): r300g: remove L8_UNORM from colorbuffer formats
This isn't actually true any more. See issue (9) of ARB_framebuffer_object, which defines luminance, luminance_alpha and intensity formats as renderable. (I'm not quite sure how color assignment is done; readpixels and the like would define L = R + G + B, but I think it will follow the table from texture image specification instead, hence L = R, I = R.) You are quite right though that this is a recent addition, and in fact for instance i965 can't render to these either (it can render to red or alpha formats, but none of the l/i formats) directly, and neither can r300 (without shader hacking).

Roland

On 19.02.2010 15:35, Marek Olšák wrote: I still think st/xorg should use R8, which is well defined as to which component to store, rather than L8. That's also the reason L8 is not renderable in OpenGL.

2010/2/19 Corbin Simpson <mostawesomed...@gmail.com>: Yeah, I would have nak'd this. Will revert when I get home. Posting from a mobile, pardon my terseness. ~ C.

On Feb 19, 2010 12:56 AM, Michel Dänzer <mic...@daenzer.net> wrote: On Thu, 2010-02-18 at 19:24 -0800, Marek Olšák wrote:

Module: Mesa
Branch: master
Commit: fc427d23439a2702068209957f08990ea29fe21b
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=fc427d23439a2702068209957f08990ea29fe21b
Author: Marek Olšák <mar...@gmail.com>
Date: Fri Feb 19 04:23:06 2010 +0100

r300g: remove L8_UNORM from colorbuffer formats

Not renderable in OpenGL anyway.

The Xorg state tracker uses it though.

--
Earthling Michel Dänzer | http://www.vmware.com
Libre software enthusiast | Debian, X and DRI developer
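Under the L = R interpretation from the texture image specification table mentioned above, reading back a luminance surface that the hardware actually rendered as R8 would just replicate the red channel into R, G and B. A sketch of that assumption (illustration only, not driver code):

```c
/* Expand an R8 readback to RGBA under the "L = R" rule: luminance
 * replicates into R/G/B, alpha of a luminance-only format reads as 1. */
static void r8_to_luminance_rgba(const unsigned char *src, float *dst, int n)
{
   for (int i = 0; i < n; i++) {
      float l = src[i] / 255.0f;
      dst[4 * i + 0] = l;      /* R = L */
      dst[4 * i + 1] = l;      /* G = L */
      dst[4 * i + 2] = l;      /* B = L */
      dst[4 * i + 3] = 1.0f;   /* A = 1 (no alpha channel) */
   }
}
```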
Re: [Mesa3d-dev] Mesa (master): util: Fix descriptors for R32_FLOAT and R32G32_FLOAT formats .
On 12.02.2010 14:44, michal wrote: Keith Whitwell wrote on 2010-02-12 14:28: On Fri, 2010-02-12 at 05:09 -0800, michal wrote: Keith Whitwell wrote on 2010-02-12 13:39: On Fri, 2010-02-12 at 04:32 -0800, Michał Król wrote:

Module: Mesa
Branch: master
Commit: aa0b671422880b99dc178d43d1e4e1a3f766bf7f
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=aa0b671422880b99dc178d43d1e4e1a3f766bf7f
Author: Michal Krol <mic...@vmware.com>
Date: Fri Feb 12 13:32:35 2010 +0100

util: Fix descriptors for R32_FLOAT and R32G32_FLOAT formats.

Michal, is this more like two different users expecting two different results in those unused columns? In particular, we definitely require the missing elements to be extended to (0,0,0,1) when fetching vertex data, and probably also in OpenGL texture sampling (if we supported these formats for that).

Gallium should follow D3D rules, so I've been following D3D here. Also, util_unpack_color_ub() in u_pack_color.h already sets the remaining fields to 0xff.

Note that D3D doesn't have the problem with expanding vertex attribute data, since you can't have X or XY vertex positions, only XYZ (with W extended to 1, as in GL) and XYZW. But surely D3D permits two-component texture coordinates, which would be PIPE_FORMAT_R32G32_FLOAT, and expanded as (r,g,0,1)... Brian added a table of differences between GL and other APIs recently to gallium/docs - does your change agree with that?

Where's that exactly? I can't find it.

It seems like we'd want to be able to support both usages - the alternative in texture sampling would be forcing the state tracker to generate variants of the shader when 2-component textures are bound. I would say that's an unreasonable requirement on the state tracker. It seems like GL would want (0,0,0,1) expansion everywhere, but D3D would want differing expansions in different parts of the pipeline. That indicates a single flag in the context somewhere isn't sufficient to choose between the two. Maybe there need to be two versions of these PIPE_FORMAT_ enums to capture the different values in the missing components? E.g.:

   PIPE_FORMAT_R32G32_0001_FLOAT
   PIPE_FORMAT_R32G32_1111_FLOAT

or something along those lines??

You are right. Alternatively, follow the more sane API (GL apparently), assume 0001 as the default and use the infix to override.

Note it's not just GL. D3D10 uses the same expansion. Only D3D9 is different. Well, for texture sampling anyway; I don't know what d3d does for vertex formats. Though for most hardware it would make sense to have only one format per different expansion, and use some swizzling parameter for sampling, because that's actually how the hardware works. But not all drivers will be able to do this, unfortunately. (Note that for instance, with i965, those two R32G32 formats mentioned here aren't really freely selectable. In OGL/DX10 mode you'll get the former, in d3d9 mode you get the latter. You can switch the mode, but you'll also get different border color interpretation along with it - something which is also not specified in gallium, though I guess you could say this is tied to gl_rasterization_rules. Maybe we could say the same here too: R32G32 is rg01 with gl_rasterization_rules and rg11 without? Seems a bit hackish, though.)

Roland
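The two candidate expansions being debated can be written down directly; a sketch assuming the GL/D3D10 rule of (r,g,0,1) vs the D3D9 rule of (r,g,1,1) for an R32G32 fetch:

```c
/* Expand a two-component fetch to four components.  GL and D3D10 fill
 * the missing B with 0, D3D9 fills it with 1; missing A is 1 either way. */
static void fetch_r32g32(const float *src, float *rgba, int d3d9_rules)
{
   rgba[0] = src[0];
   rgba[1] = src[1];
   rgba[2] = d3d9_rules ? 1.0f : 0.0f;  /* missing B */
   rgba[3] = 1.0f;                      /* missing A */
}
```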
Re: [Mesa3d-dev] Mesa (master): util: Fix descriptors for R32_FLOAT and R32G32_FLOAT formats .
On 12.02.2010 18:42, Keith Whitwell wrote: On Fri, 2010-02-12 at 09:28 -0800, José Fonseca wrote: On Fri, 2010-02-12 at 06:43 -0800, Roland Scheidegger wrote: [...] Though for most hardware it would make sense to have only one format per different expansion, and use some swizzling parameter for sampling, because that's actually how the hardware works. But not all drivers will be able to do this, unfortunately.

You mean having a swizzle in pipe_sampler_state? It sounds like a good idea. In the worst case some component will inevitably need to make shader variants with different swizzles. In this case it probably makes sense for it to be the pipe driver -- it's a tiny shader variation which could be done without recompiling the whole shader, but if the state tracker does it then the pipe driver will always have to recompile. In the best case it is handled by the hardware's texture sampling unit. It's in theory similar to baking the swizzle into the format as Keith suggested, but cleaner IMHO. The question is whether it makes sense to have full xyzw01 swizzles, or just 01 swizzles.

Another alternative is to just add the behaviour we really need - a single flag at context creation time that says what the behaviour of the sampler should be for these textures. Then the driver wouldn't have to worry about variants or mixing two different expansions. Hardware (i965 at least) seems to have one global mode to switch between these, and that's all we need to choose the right behaviour for each state tracker. It might be simpler all round just to specify it at context creation.

Yes, for rg01 vs rg11 this is easiest. It doesn't solve the depth texture mode problem though. Also, we sort of have that flag already; I think there's no reason why this needs to be separate from gl_rasterization_rules (though I guess in that case it's a bit of a misnomer...)

Roland
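The full xyzw01 swizzle option amounts to letting each output channel pick one of the four fetched channels or the constants 0/1. Gallium had no such field in pipe_sampler_state at the time, so this is purely a sketch of the proposal, not an existing interface:

```c
/* Six possible sources per output channel: the four fetched channels
 * plus the constants 0 and 1. */
enum swz { SWZ_X, SWZ_Y, SWZ_Z, SWZ_W, SWZ_0, SWZ_1 };

static void apply_swizzle(const float in[4], const enum swz swz[4],
                          float out[4])
{
   for (int i = 0; i < 4; i++) {
      switch (swz[i]) {
      case SWZ_0: out[i] = 0.0f; break;
      case SWZ_1: out[i] = 1.0f; break;
      default:    out[i] = in[swz[i]]; break;  /* SWZ_X..SWZ_W are 0..3 */
      }
   }
}
```

The rg01-vs-rg11 question above then reduces to two swizzle settings, {X, Y, 0, 1} and {X, Y, 1, 1}, on the same R32G32 format.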
Re: [Mesa3d-dev] Mesa (master): util: Fix descriptors for R32_FLOAT and R32G32_FLOAT formats .
On 12.02.2010 19:00, Keith Whitwell wrote: On Fri, 2010-02-12 at 09:56 -0800, Roland Scheidegger wrote: [...] Yes, for rg01 vs rg11 this is easiest. It doesn't solve the depth texture mode problem though. Also, we sort of have that flag already; I think there's no reason why this needs to be separate from gl_rasterization_rules (though I guess in that case it's a bit of a misnomer...)

I'd prefer to avoid a big "I'm a GL/DX9 context" flag, and split different behaviours into different flags. Sure, a GL state tracker might set them all one way, but that doesn't mean some future state tracker wouldn't want to use a novel combination. The GL rasterization rules flag should be renamed to reflect what it's really asking for.

Ok
[Mesa3d-dev] nouveau changes for gallium-dynamicstencilref
Hi, could one of the nouveau developers please take a look at the nv30 changes I did for the stencil ref changes in the gallium-dynamicstencilref branch? I've just done it in a way I think might make sense, but I've absolutely no idea if it would work like that (and even if it would in theory, there might of course still be bugs in it...). Also, I was a bit confused about the so_new() parameters, as the numbers didn't seem to add up (assuming it's basically the max number of so_method and so_data calls). Anyway, if it makes sense I can do nv40/nv50 too; if not, tell me what needs to be done instead or do it yourself :-). Or it will break after the merge...

Roland
Re: [Mesa3d-dev] nouveau changes for gallium-dynamicstencilref
On 11.02.2010 21:42, Christoph Bumiller wrote: On 02/11/2010 09:02 PM, Roland Scheidegger wrote: Hi, could one of the nouveau developers please take a look at the nv30 changes I did for the stencil ref changes in gallium-dynamicstencilref branch? I've just done that in a way I think it might make sense, but I've absolutely no idea if it would work like that (and even if it would in theory there might of course still be bugs in it...)

Looks like it should work; I can't test nv30 myself though.

Also, I was a bit confused about the so_new() parameters as the numbers didn't seem to add up (assuming it's basically the max number of so_method and so_data calls).

It's (nr of so_method, nr of so_data + nr of so_reloc, nr of so_reloc), since relocs/addresses are considered data.

Ok, that's what I figured. The numbers were just wrong, then (nv30 used 5/21/0 but the actual max was 4/22/0; nv40 uses 4/21/0 and it should also be 4/22/0). Hence the confusion. At least things looked ok for nv50 (though it seems to needlessly split up the back face state into two so_methods - anyway that'll change).

Anyway, if it makes sense I can do nv40/nv50 too, if not tell me what needs to be done instead or do it yourself :-). Or it will break after the merge...

I think you'll get it right. nv40 should be about the same as nv30; nv50 state is a bit less elegant but should be easy to adjust, too.

Ok. Roland
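The sizing rule quoted above is easy to get wrong by hand; trivial bookkeeping keeps it straight. This is just an illustration of the counting convention (methods, data + relocs, relocs), not the nouveau so_* API itself:

```c
/* Count entries while "emitting" state, following the rule that every
 * reloc/address also occupies one data slot. */
struct so_size {
   unsigned methods, data, relocs;
};

static void size_method(struct so_size *s) { s->methods++; }
static void size_data(struct so_size *s)   { s->data++; }
static void size_reloc(struct so_size *s)  { s->data++; s->relocs++; }
```

Running the counters over the emit path and passing the totals to so_new() would avoid mismatches like the 5/21/0-vs-4/22/0 one above.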
Re: [Mesa3d-dev] fix the usual cell breakage
On 06.02.2010 15:07, Marc Dietrich wrote: also update the cell config a bit
---
 configs/linux-cell                                 |  6 ++--
 src/gallium/drivers/cell/common.h                  |  3 +-
 src/gallium/drivers/cell/spu/spu_per_fragment_op.c | 36 ++--
 3 files changed, 22 insertions(+), 23 deletions(-)

Sorry for that. I got confused there and thought the driver was using cell_blend_state rather than pipe_blend_state... cell_blend_state actually seems to be only a leftover from the past, though.

Roland
Re: [Mesa3d-dev] [RFC]: gallium-nopointsizeminmax merge
On 08.02.2010 18:27, Brian Paul wrote: On Mon, Feb 8, 2010 at 10:21 AM, Roland Scheidegger <srol...@vmware.com> wrote: This branch removes point_size_min and point_size_max because most hardware doesn't have any register to clamp this at rasterization time (of all gallium drivers, only r300 had this), and the mesa state tracker never actually used these fields properly. The clamp to the implementation limits will now be done in the vertex shader instead. Also, point_sprite enable is removed and replaced with a point_quad_rasterization field. The reason for this is that OGL actually has quite different rasterization rules for points and point sprites - hence this indicates whether points should be rasterized as points or according to point sprite rules (which decompose them into quads, basically). It is unclear to me if we'd actually really need to do something different for these rules in the draw module, or if hardware can do much with this information, but if there's hardware which can, well, you can use it. The point sprite coord enable no longer also indicates the sprite coord origin, since there's no api interested in this per coord. Testing was done with softpipe: pointblast doesn't work (does not draw any points at all), and spriteblast doesn't work correctly either (some points have their size cut somewhere vertically so they are rectangles). However, these bugs are not introduced by this branch; those must be bugs in the draw module present before - I'm still trying to figure out what goes wrong.

They're OK on master. I fixed some breakage in this area last week. See 54d7ec8e769b588ec93dea5bc04399e91737557e for example.

Yes, you're right, I missed that (probably because it wasn't in the draw module, but the actual drivers). pointblast works again, and spriteblast actually wasn't broken (parts of points simply disappeared due to depth test...) when cherry-picking those commits to the branch.
Roland
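The clamp the branch moves into the vertex path is conceptually just this; a sketch of the idea (the function name and its placement in the vertex path are illustrative, not the actual state tracker code):

```c
/* Clamp a shader-written point size to the implementation limits,
 * as would now happen in the vertex shader rather than via
 * point_size_min/point_size_max rasterizer state. */
static float clamp_point_size(float size, float min_size, float max_size)
{
   if (size < min_size)
      return min_size;
   if (size > max_size)
      return max_size;
   return size;
}
```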
Re: [Mesa3d-dev] Gallium DRI fbconfig/visual setup
On 05.02.2010 22:48, Corbin Simpson wrote: Two things... Are accumbufs still slow in Gallium-land? Should we still mark them as slow? How many multisamples should we actually pretend/advertise? Should we have a cap to check the number of multisamples supported? Should we just say that four samples are done for the fbconfig/visual, and then replace pipe_texture::nr_samples with a multisample boolean flag?

I think it would be nice if we could support multiple MSAA levels. Sure, hardware typically can do 4x MSAA, but maybe you'd really want max quality (with modern hw often offering 8x, and that's not taking specialties like CSAA into account) or only 2x. Maybe the cap should return a bitmask indicating what levels it supports.

Roland
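The bitmask-cap suggestion could look like the following, with bit n meaning "n samples supported". The encoding (and the idea of deriving a maximum from it) is a guess at what such a cap could return, not an existing gallium interface:

```c
/* Query helpers for a hypothetical sample-count bitmask cap. */
static int msaa_supported(unsigned mask, unsigned nr_samples)
{
   return nr_samples < 32 && (mask & (1u << nr_samples)) != 0;
}

static unsigned msaa_max(unsigned mask)
{
   unsigned best = 0;
   for (unsigned n = 1; n < 32; n++)
      if (mask & (1u << n))
         best = n;      /* highest set bit wins */
   return best;
}
```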
Re: [Mesa3d-dev] [RFC] gallium-cylindrical-wrap branch
On 03.02.2010 16:07, michal wrote: Keith, This feature branch adds cylindrical wrap texcoord mode to gallium shader tokens and removes prefilter field from sampler state. Implemented cylindrical wrapping for linear interpolator in softpipe. Not sure whether it makes sense to do it for perspective interpolator. Documented TGSI declaration token. Sample fragment shader declaration that wraps S and T coordinates follows. DCL INPUT[0], GENERIC[0], LINEAR, CYLWRAP_XY Please review so I can merge it to master. Michal, why do you need this for linear interpolator and not perspective? I think d3d mobile let you disable perspective correct texturing, but it is always enabled for normal d3d. Roland
Re: [Mesa3d-dev] [RFC] gallium-cylindrical-wrap branch
On 03.02.2010 17:45, michal wrote: Roland Scheidegger wrote on 2010-02-03 16:47: On 03.02.2010 16:07, michal wrote: Keith, This feature branch adds cylindrical wrap texcoord mode to gallium shader tokens and removes prefilter field from sampler state. Implemented cylindrical wrapping for linear interpolator in softpipe. Not sure whether it makes sense to do it for perspective interpolator. Documented TGSI declaration token. Sample fragment shader declaration that wraps S and T coordinates follows. DCL INPUT[0], GENERIC[0], LINEAR, CYLWRAP_XY Please review so I can merge it to master. Michal, why do you need this for linear interpolator and not perspective? I think d3d mobile let you disable perspective correct texturing, but it is always enabled for normal d3d. I could not think of a use case that uses perspective and cylindrical interpolation at the same time. If you think it's valid, we can implement cylindrical wrapping for perspective interpolator, but then I am not sure how exactly it should be done, i.e. should we divide and then wrap or the opposite? Good question. Unfortunately the description of what the wrap renderstate does doesn't say anything about that. I just assumed since perspective correction is usually always enabled, it would be enabled even when wrapping is used on some coordinates. Not sure what the order of wrap/divide should be... Also, d3d lets you wrap the 4th coordinate, does that really make sense? Roland
Re: [Mesa3d-dev] Grab bag of random questions (whoo)
On 31.01.2010 18:41, Christoph Bumiller wrote: On 31.01.2010 01:37, Roland Scheidegger wrote: Marek Olšák wrote: 6) GL_ARB_shadow_ambient and texture compare-fail values. A comment in the fragment constants reads, Since Gallium doesn't support GL_ARB_shadow_ambient, this is always (0,0,0,0), right? I think the extension could be added to Gallium since the r300 compiler can generate code for it. It could. But generally, gallium doesn't implement features common hardware can't do (not only because most drivers except software based ones couldn't implement it, but those features also turn out to be rarely used, for obvious reasons). r300 is an exception here since it emulates ARB_shadow anyway. Though I think if you can make a case why this is really necessary it could be done, but that's not my call. Another comment reads, Gallium doesn't provide us with any information regarding this mode, so we are screwed. I'm setting 0 = LUMINANCE, above the texture compare modes. I don't really like that section of code, but it probably can't get cleaner, right? Even though this is a rarely used feature in OpenGL nowadays, it should get fixed if we want to be GL-compliant. That means adding depth texture modes in pipe_sampler_state and setting them in the Mesa state tracker. The R300 compiler can already generate code for these modes as well. Note R300 is again special a bit here. Actually, I realized my earlier answer doesn't make sense. Hardware which actually supports EXT_texture_swizzle (and native ARB_shadow) should be able to implement this easily. Hardware like i965 which doesn't support EXT_texture_swizzle could do it in the shader. Maybe it would make sense to add EXT_texture_swizzle capability in gallium (in the sampler state). That would solve this in a bit more generic way than some special bits for depth texture mode. From my point of view adding a swizzle in the sampler state is a bad idea. 
On nv50, this would make texture setup dependent on sampler state: we have an Image and Sampler configuration buffer containing entries that can be bound to texture and sampler units. The texture swizzle would be supported by setting a different format in the image entry, like BGRA instead of RGBA, just that it also supports RGRG or whatever you like. Well, the normalization bit seems to be stored in the TIC entries instead of the TSC ones already, I guess that comes from the rectangle texture type, but let's ignore that. I don't see texture swizzle in d3d10 (but then, I don't know d3d10 very well), and OpenGL doesn't separate textures and samplers anyway, so I'd put it in texture state. Keeping a bunch of shaders for texture swizzles doesn't sound nice either. Of course, if other hardware would prefer this in sampler state, then ... ah, I should probably let go of the illusion that gallium state will continue to nicely map to my hardware ... I don't know if other hardware likes it in sampler state, but the problem is it really is sampler state. This is not a property of the texture, it can change anytime and you don't want to recreate the texture just because this changes, I think. I'm not sure how to implement this nicely either, but I'd guess we'd at least want the swizzle fields to correspond to hardware channels (so for a luminance_alpha texture, the swizzle would indicate rrrg for the first and second channels respectively), not the GL-after-sampling mapping as the extension uses. Hence depth textures used as luminance would be rrr1, as alpha 000r, and as intensity rrrr. So, basically, an a8l8 texture would be equivalent to a r8g8 texture, when used for sampling with the same swizzling. Note that an easy solution (for depth textures) would be to add just new depth texture formats (one for each of the alpha, luminance, and intensity modes), but then again you make this part of the texture, which it is not.
Those are just some quick thoughts however, I don't think anyone would be opposed if you can come up with a nice solution for this. Roland
Re: [Mesa3d-dev] Grab bag of random questions (whoo)
On 01.02.2010 20:23, Brian Paul wrote: Speaking of texture formats and texture sampling, one area of Gallium that's under-specified is what (x,y,z,w) values are returned by TEX instructions when sampling from each of the various texture formats. A while back I started a table comparing OpenGL to D3D:

  texture components   OpenGL                   D3D
  ------------------   ----------------------   -----------------------------
  R,G,B,A              (R,G,B,A)                (R,G,B,A)
  R,G,B                (R,G,B,1)                (R,G,B,1)
  R,G                  (R,G,0,1)                (R,G,1,1)
  R                    (R,0,0,1)                (R,1,1,1)
  A                    (0,0,0,A)                (0,0,0,A)
  L                    (L,L,L,1)                (L,?,?,1) (probably L,L,L,1)
  I                    (I,I,I,I)                (?,?,?,?)
  UV                   (0,0,0,1)*               (U,V,1,1)
  Z                    (Z,Z,Z,Z) or (0,Z,0,1)   (Z,Z,Z,1) or (0,0,0,Z)**
  other formats?       ...                      ...

A,L: should be (L,L,L,A) for both OGL and D3D. And yes, (L,L,L,1) is correct for D3D (that's what i965 at least does). There are no intensity textures in d3d (unless you can somehow support that via cap bits, but in that case I'd certainly expect it to be (I,I,I,I)). UV is of course really odd in OGL, since it says the sample result is constant but of course you still use the UV components for the bump target. That's just an oddity to make that fit somehow into the fixed function pipeline rather than it having anything to do with hardware. Note that the D3D column is only valid for DX9 (and older). DX10 uses the same mappings as OpenGL (if it supports the format; all luminance, alpha etc. textures are gone, as are the swizzled bgra formats). * per http://www.opengl.org/registry/specs/ATI/envmap_bumpmap.txt ** depends on GL_DEPTH_TEXTURE_MODE state For OpenGL, see page 141 of the OpenGL 3.1 spec. For D3D, see http://msdn.microsoft.com/en-us/library/ee422472(VS.85).aspx We should first add a column to the above table for Gallium and then decide whether to implement swizzling (and GL_DEPTH_TEXTURE_MODE) with extra GPU instructions or new texture/sampler swizzle state. But most gpus can do arbitrary swizzling natively, hence inserting gpu instructions really must be optional.
Even hardware which can't do arbitrary swizzling can sometimes do both OGL and D3D mapping, hence we don't really want additional instructions there either (i965 being the example, though it's not easy to switch behavior, since that affects not only the format of the border color but also how the border color is used if the particular channel isn't in the texture). I think we'd want DX10/OGL behavior, and u_format defines it that way. Except for depth/stencil formats, where the depth always ends up in the red channel and stencil in green (with the rest undefined). i965 actually has different depth/stencil formats (a24x8, l24x8, i24x8) just for those depth texture modes (though the code suggests it won't do anything if shadow comparison is enabled). Or maybe we'd want additional formats just for DX9 - sounds like overkill though. The different border color interpretation of i965 suggests to me that it won't do much on its own for conformance anyway. I think the swizzle values used by u_format are nice. Using xyzw rather than rgba to refer to the first, etc. channel avoids confusion. Hence I'd propose we'd use the same for the hypothetical sampler swizzle state (that is x,y,z,w,0,1, not sure if the _ undefined makes sense there). The swizzling would be the same as that indicated in u_format for all textures initially, except depth/stencil. Roland
Re: [Mesa3d-dev] Grab bag of random questions (whoo)
On 30.01.2010 13:06, Corbin Simpson wrote: Handful of random things bugging me. 2) progs/tests/drawbuffers and progs/tests/drawbuffers2, and possibly others, segfault with both softpipe and the HW driver at sl_pp_version.c:45. I think there's some codegen going on there? At any rate, if anybody has any hints on how to solve it, that'd be nice. Works for me (with softpipe). 6) GL_ARB_shadow_ambient and texture compare-fail values. A comment in the fragment constants reads, Since Gallium doesn't support GL_ARB_shadow_ambient, this is always (0,0,0,0), right? I'd think so. This extension isn't in core GL (and d3d can't do it neither), and AFAIK there's no hardware (which doesn't emulate the shadow functionality in the fragment shader) which could actually do it. Another comment reads, Gallium doesn't provide us with any information regarding this mode, so we are screwed. I'm setting 0 = LUMINANCE, above the texture compare modes. I don't really like that section of code, but it probably can't get cleaner, right? Yes, that's not very clean, but there doesn't seem to be an easy solution for this. Exposing this in gallium only seems marginally useful, since again modern hardware can't really do anything useful with that information neither. Maybe would need to tweak the shader if actually the wrong channels are used (probably shouldn't be the driver's responsibility to do this), but I guess assuming default LUMINANCE works just fine usually. New OGL won't have that problem... 7) Is there more information on the dual-source blend modes? I'm not sure if I can do them; might have to bug AMD for the register values. Pretty sure most pre-DX10 hardware can't do that, the blend unit just doesn't have access to multiple source colors. I've attached a small patch which shows how softpipe implements it. (But I still need to write a testcase probably for the python statetracker to see if it actually works...). 
Pretty easy really, just using pixel shader color outputs 0 and 1 for blending (note that in this form this restricts dual source blending to one render target, this is the same restriction DX10 enforces, and if you look at i965 docs it actually has this restriction in hardware). I think that's it for now. Sorry for all the questions, but I'm really starting to get a good handle on the hardware and interface, and I'm ready to start beating the classic driver in serious benchmarks; I think that r300's probably the most mature driver alongside nv50 and maybe nv40. Great! Roland

diff --git a/src/gallium/drivers/softpipe/sp_quad_blend.c b/src/gallium/drivers/softpipe/sp_quad_blend.c
index d65307b..85fda0b 100644
--- a/src/gallium/drivers/softpipe/sp_quad_blend.c
+++ b/src/gallium/drivers/softpipe/sp_quad_blend.c
@@ -222,7 +222,7 @@ logicop_quad(struct quad_stage *qs,
 static void
 blend_quad(struct quad_stage *qs,
-           float (*quadColor)[4],
+           float (*quadColors)[4][4],
            float (*dest)[4],
            unsigned cbuf)
 {
@@ -230,6 +230,7 @@ blend_quad(struct quad_stage *qs,
    static const float one[4] = { 1, 1, 1, 1 };
    struct softpipe_context *softpipe = qs->softpipe;
    float source[4][QUAD_SIZE] = { { 0 } };
+   float (*quadColor)[4] = quadColors[cbuf];
 
    /*
     * Compute src/first term RGB
@@ -298,11 +299,23 @@ blend_quad(struct quad_stage *qs,
       }
       break;
    case PIPE_BLENDFACTOR_SRC1_COLOR:
-      assert(0); /* to do */
-      break;
+      {
+         float (*quadColor1)[4] = quadColors[1];
+         assert(cbuf == 0);
+         VEC4_MUL(source[0], quadColor[0], quadColor1[0]); /* R */
+         VEC4_MUL(source[1], quadColor[1], quadColor1[1]); /* G */
+         VEC4_MUL(source[2], quadColor[2], quadColor1[2]); /* B */
+      }
+      break;
    case PIPE_BLENDFACTOR_SRC1_ALPHA:
-      assert(0); /* to do */
-      break;
+      {
+         const float *alpha = quadColors[1][3];
+         assert(cbuf == 0);
+         VEC4_MUL(source[0], quadColor[0], alpha); /* R */
+         VEC4_MUL(source[1], quadColor[1], alpha); /* G */
+         VEC4_MUL(source[2], quadColor[2], alpha); /* B */
+      }
+      break;
    case PIPE_BLENDFACTOR_ZERO:
       VEC4_COPY(source[0], zero); /* R */
       VEC4_COPY(source[1], zero); /* G */
@@ -372,11 +385,29 @@ blend_quad(struct quad_stage *qs,
       }
       break;
    case PIPE_BLENDFACTOR_INV_SRC1_COLOR:
-      assert(0); /* to do */
-      break;
+      {
+         float (*quadColor1)[4] = quadColors[1];
+         float inv_comp[4];
+         assert(cbuf == 0);
+         VEC4_SUB(inv_comp, one, quadColor1[0]); /* R */
+         VEC4_MUL(source[0], quadColor[0], inv_comp); /* R */
+         VEC4_SUB(inv_comp, one, quadColor1[1]); /* G */
+         VEC4_MUL(source[1], quadColor[1], inv_comp); /* G */
+         VEC4_SUB(inv_comp, one, quadColor1[2]); /* B */
+         VEC4_MUL(source[2], quadColor[2], inv_comp); /* B */
+      }
+      break;
    case PIPE_BLENDFACTOR_INV_SRC1_ALPHA:
-      assert(0); /* to do */
-      break;
+      {
+         const float *alpha = quadColors[1][3];
+         float
Re: [Mesa3d-dev] Grab bag of random questions (whoo)
Marek Olšák wrote: 6) GL_ARB_shadow_ambient and texture compare-fail values. A comment in the fragment constants reads, Since Gallium doesn't support GL_ARB_shadow_ambient, this is always (0,0,0,0), right? I think the extension could be added to Gallium since the r300 compiler can generate code for it. It could. But generally, gallium doesn't implement features common hardware can't do (not only because most drivers except software based ones couldn't implement it, but those features also turn out to be rarely used, for obvious reasons). r300 is an exception here since it emulates ARB_shadow anyway. Though I think if you can make a case why this is really necessary it could be done, but that's not my call. Another comment reads, Gallium doesn't provide us with any information regarding this mode, so we are screwed. I'm setting 0 = LUMINANCE, above the texture compare modes. I don't really like that section of code, but it probably can't get cleaner, right? Even though this is a rarely used feature in OpenGL nowadays, it should get fixed if we want to be GL-compliant. That means adding depth texture modes in pipe_sampler_state and setting them in the Mesa state tracker. The R300 compiler can already generate code for these modes as well. Note R300 is again special a bit here. Actually, I realized my earlier answer doesn't make sense. Hardware which actually supports EXT_texture_swizzle (and native ARB_shadow) should be able to implement this easily. Hardware like i965 which doesn't support EXT_texture_swizzle could do it in the shader. Maybe it would make sense to add EXT_texture_swizzle capability in gallium (in the sampler state). That would solve this in a bit more generic way than some special bits for depth texture mode. 7) Is there more information on the dual-source blend modes? I'm not sure if I can do them; might have to bug AMD for the register values. I bet R300 can't do these modes. It's only a Direct3D 10.0 feature, not present in Direct3D 10.1. 
MS must have a good reason to remove it. Where did you see that it's removed in 10.1? Here's a list of blend ops in d3d11: http://msdn.microsoft.com/en-us/library/ee416042(VS.85).aspx Note this feature can be present (via cap bits in some limited form) in D3D9Ex too, and I thought windows actually used it for (antialiased) text rendering (but don't quote me on that). BTW I looked at some of your patches and r3xx-r5xx cards don't even support separate blend enables, therefore the cap should be 0. Or are you going to emulate this using independent color channel masks and two rendering passes? That could be done in the state tracker. Also, I think the indep. color masks are r5xx-only. I also think even r500 shouldn't say this is supported. Just changing the colormasks isn't going to be very correct... Roland
Re: [Mesa3d-dev] Grab bag of random questions (whoo)
Corbin Simpson wrote: Another comment reads, Gallium doesn't provide us with any information regarding this mode, so we are screwed. I'm setting 0 = LUMINANCE, above the texture compare modes. I don't really like that section of code, but it probably can't get cleaner, right? Yes, that's not very clean, but there doesn't seem to be an easy solution for this. Exposing this in gallium only seems marginally useful, since again modern hardware can't really do anything useful with that information neither. Maybe would need to tweak the shader if actually the wrong channels are used (probably shouldn't be the driver's responsibility to do this), but I guess assuming default LUMINANCE works just fine usually. New OGL won't have that problem... New OGL? GL3? Sweet. Well, all the luminance/intensity stuff goes away, so problem solved :-). Even when mesa gets to GL3 eventually, we'd still need to deal with that for ARB_compatibility, at least in theory. See also my answer in the other email, I was quite wrong that hardware typically can't do much with it. Roland
Re: [Mesa3d-dev] [PATCH] hack around commas in macro argument
On 26.01.2010 09:18, Marvin wrote: Jose, Brian, Marc, Why is this necessary? It has been working fine so far. Which gcc version are you using? What commas are you referring to? the PIPE_ALIGN_TYPE macro is so far only used in the cell driver in src/gallium/drivers/cell/spu/spu_main.c (this is probably why no one noticed it). The macro takes a type, a struct in this case, which can include commas:

PIPE_ALIGN_TYPE(16, struct spu_framebuffer {
   void *color_start;              /** addr of color surface in main memory */
   void *depth_start;              /** addr of depth surface in main memory */
   enum pipe_format color_format;
   enum pipe_format depth_format;
   uint width, height;             /** size in pixels */
   uint width_tiles, height_tiles; /** width and height in tiles */
   uint color_clear_value;
   uint depth_clear_value;
   uint zsize;                     /** 0, 2 or 4 bytes per Z */
   float zscale;                   /** 65535.0, 2^24-1 or 2^32-1 */
});

This will cause a problem, as the macro will treat each comma as an argument separator and thus the number of arguments is larger than 2. Hmm, maybe we could just avoid the problem by not using commas in the struct declaration? Roland
[Mesa3d-dev] perrtblend merge
Hi, I'm planning on merging this branch to master soon. This will make it possible to do per render target blend enables, colormasks, and also per rendertarget blend funcs (with a different CAP bit for the latter, and this one isn't actually used in mesa state tracker yet). None of the drivers other than softpipe implement any of it, but they were adapted to the interface changes so should continue to run. Apparently, that functionality is only interesting for drivers supporting multiple render targets, and the hw probably needs to be quite new (I know that i965 could support it (well not the multiple blend funcs but the rest), but the driver currently only supports 1 render target). Roland
Re: [Mesa3d-dev] perrtblend merge
Oh, I should have added the PIPE_CAP bits (even if not supported) to all drivers. Good catch. I'll do that for the other drivers now. Roland (btw, I think r500 could do separate colormasks, but not separate blend enables, and there might be more hardware like that. However, this is not exposed by GL, it might be supported by some DX9 cap bit, but it didn't seem worthwhile to add a separate gallium cap bit for supporting per-rt blend enables and colormasks, respectively.) On 26.01.2010 16:37, Corbin Simpson wrote: Yeah, r300 doesn't but r600 does. I've read through the branch, and the r300g patch looks perfect. I've pushed another patch on top for the pipe caps, to avoid post-merge cleanups for myself. On Tue, Jan 26, 2010 at 7:00 AM, Alex Deucher alexdeuc...@gmail.com wrote: On Tue, Jan 26, 2010 at 9:44 AM, Roland Scheidegger srol...@vmware.com wrote: Hi, I'm planning on merging this branch to master soon. This will make it possible to do per render target blend enables, colormasks, and also per rendertarget blend funcs (with a different CAP bit for the latter, and this one isn't actually used in mesa state tracker yet). None of the drivers other than softpipe implement any of it, but they were adapted to the interface changes so should continue to run. Apparently, that functionality is only interesting for drivers supporting multiple render targets, and the hw probably needs to be quite new (I know that i965 could support it (well not the multiple blend funcs but the rest), but the driver currently only supports 1 render target). FWIW, AMD R6xx+ hw supports MRTs and per-MRT blends as well, although at the moment the driver also only supports 1 RT. Alex
Re: [Mesa3d-dev] What about gl_rasterization_rules?
On 21.01.2010 18:47, Luca Barbieri wrote: On Thu, Jan 21, 2010 at 6:34 PM, Corbin Simpson mostawesomed...@gmail.com wrote: Maybe it's just me, since I actually wrote the docs, but does anybody else read them? From cso/rasterizer.html (viewable at e.g. http://people.freedesktop.org/~csimpson/gallium-docs/cso/rasterizer.html ): gl_rasterization_rules Whether the rasterizer should use (0.5, 0.5) pixel centers. When not set, the rasterizer will use (0, 0) for pixel centers. So why aren't these patches using pipe_rasterizer_state::gl_rasterization_rules? It's a different thing. gl_rasterization_rules affects the way fragments are rasterized, i.e. the set of fragments which a primitive is mapped to. Changing it is equivalent to adding/subtracting a subpixel offset to the viewport (which seemingly depends on the primitive type). The pixel center convention instead sets how the values look like in the fragment shader. Changing it is equivalent to adding/subtracting 0.5 to the fragment.position in the fragment shader. In other words, yes, if you set gl_rasterization_rules and the pixel center in a mismatched way, fragment.position will not be the coordinate of the rasterization center. As another example, suppose you do a blit with the 3D engine using fragment.position to sample from a texture rectangle with bilinear filtering. A wrong rasterization convention may cause 1 pixel black bars at the borders. A wrong pixel center convention will cause a 2x2 blur filter to be applied to the texture. BTW, gl_rasterization_rules is ignored by almost all drivers Most but not all. Not the software based ones, for instance. Should be easy to add to r300 (and the nouveau ones, I assume), I guess these simply don't care enough about environments with different (= DX9) rasterization rules :-). Roland From the spec: The scope of this extension deals *only* with how the fragment coordinate XY location appears during programming fragment processing. 
Beyond the scope of this extension are coordinate conventions used for rasterization or transformation. -- Throughout its 18-year history, RSA Conference consistently attracts the world's best and brightest in the field, creating opportunities for Conference attendees to learn about information security's most important issues through interactions with peers, luminaries and emerging and established companies. http://p.sf.net/sfu/rsaconf-dev2dev ___ Mesa3d-dev mailing list Mesa3d-dev@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/mesa3d-dev
Re: [Mesa3d-dev] [RFC] gallium-multiple-constant-buffers merge
On 21.01.2010 20:20, michal wrote: Hi, This simple feature branch adds support for two-dimensional constant buffers in TGSI. An example shader would look like this:

FRAG
DCL IN[0], COLOR, LINEAR
DCL OUT[0], COLOR
DCL CONST[1][1..2]
MAD OUT[0], IN[0], CONST[1][2], CONST[1][1]
END

For this to work, one needs to bind a buffer to slot nr 1 containing at least 3 vectors. Looks good to me - I wondered how you'd use the multiple constant buffers possible by the gallium interface, and that is how :-). Is that something we'd need a cap bit for in the future? Would this be also used by ARB_uniform_buffer_object / GL 3.1? Roland
Re: [Mesa3d-dev] [PATCH 2/2] st: don't assert on empty fragment program
On 18.01.2010 19:15, Luca Barbieri wrote: Breakpoint 3, _mesa_ProgramStringARB (target=34820, format=34933, len=70, string=0x85922ba) at shader/arbprogram.c:434 434 GET_CURRENT_CONTEXT(ctx); $31 = 0x85922ba !!ARBfp1.0\n\nOPTION ARB_precision_hint_fastest;\n\n\n\nEND\n Not sure why Sauerbraten does this, but it does, at least on my system (Ubuntu Karmic, nv40 driver) and it should be legal. Probably depth writes only enabled for things like shadows? Roland
Re: [Mesa3d-dev] Gallium feature levels
On 11.01.2010 22:03, Zack Rusin wrote: On Monday 11 January 2010 15:17:00 Roland Scheidegger wrote: - extra mirror wrap modes - i don't think mirror repeat was ever supported and mirror clamp was removed in d3d10 but it seems that some hardware kept support for those Mirror repeat is a core feature in GL since 1.4 hence we can't just drop it. I wasn't suggesting that. I was just pointing out what happens with it from the D3D side. I think all hardware we'd ever care about would support it. mirror clamp / mirror clamp to edge are only an extension, though (ATI_texture_mirror_once). (I think the dx mirror once definition is probably mirror_clamp_to_edge in opengl parlance). That's possible. As mentioned I'm not really sure what to do with this feature. - shadow maps - it's more of a researched guess since it's largely based on format support, but as far as i can tell all d3d10 hardware supports it, earlier it varies (e.g. nvidia did it for ages) Required for GL 1.4. I thought it was pretty much required for d3d sm2.0, though you're right you could probably just not support the texture format there. Anyway, most hardware should support it, and I believe even chips which didn't really support it natively at DX9 SM 2.0 time could handle it (chips like radeon r300 lacked the hw to do the comparison in the texture unit, but it can be more or less easily implemented in the pixel shader, though the implementation will suck as it certainly won't do PCF, just use some point sampling version - unless you're willing to do a much more complex implementation in the pixel shader, but then on this generation of hardware you might exceed maximum shader length). I believe all hardware supporting SM 2.0 could at least do some sampling of depth textures, though possibly only 16 bit and I'm not sure filtering worked in all cases. Yes, but the issue is that I'm not sure how to represent it from a feature level perspective. Are you saying we should just enable it for all feature levels? That'd be nice.
Hmm, maybe. I think the other stuff is acceptable. Take a look at the docs and let me know what you think. What is feature level 1 useful for? I thought we'd really wanted DX9 level functionality as a bare minimum. GL2.x certainly needs cards supporting shader model 2 (and that is a cheat, in reality it would be shader model 3). The main issue was having something without hardware vertex shaders in the feature levels. It was supposed to be whatever the current i915 driver currently supports, but yea, I think it doesn't make sense and level 2 should be minimum. Also, I don't quite get the shader model levels. I thought there were mainly two different DX9 versions, one with sm 2.0 the other with 3.0, with no one caring about other differences (as most stuff was cappable anyway). However, you've got 3 and all of them have 2.0 shader model? As mentioned this is based on the D3D feature level concept. It's the first link I put in the references: http://msdn.microsoft.com/en-us/library/ee422086(VS.85).aspx#Overview It's there because that's what Microsoft defined as feature level and I'm assuming it's because they had a good need for it :) Ah, that's why it doesn't make much sense :-). I'm not sure what requirements got them to these levels. I definitely think those 3 dx9 levels are very odd and don't even make sense for d3d only, much less for gallium. For example, requires at least max aniso 16? You've got to be kidding; the aniso spec is so fuzzy you can pass almost any cheap point filter as compliant anyway, so it doesn't make any sense (plus, this only really enhances filtering quality, it makes absolutely zero difference for writing applications). I think the retrofit of 9_1, 9_2, 9_3 to some arbitrary DX9 versions doesn't really match hardware either. The most distinguishable feature of DX9.0c (which was the last version IIRC) was definitely SM 3.0, but of course like everything else (multiple render targets, etc.) it was optional.
I think for gallium it would make way more sense to expose only 2 feature levels - basically drop 9_1, and additionally bump 9_3 to include SM 3.0 (I wonder if that's not just a typo there, after all the model is called ps_4_0_level_9_3 unlike the others which are called 9_1 only). Though granted nv20/25 can't do separate alpha blend (but it can't really do fragment shaders either so I don't know how well that driver is ever going to work), i915 may not be able to do occlusion queries (not sure if the hw can't do it but the current driver doesn't implement it), everybody (I think) can do mirror_once, and I don't know what overlapping vertex elements are. More comments below.

+static const enum pipe_feature_level
+i915_feature_level(struct pipe_screen *screen)
+{
+   return PIPE_FEATURE_LEVEL_1;
+}

What's the reason this is not feature level 2? Yea, I was winging it for all the drivers because I couldn't be bothered to do a cross
[Mesa3d-dev] gallium-noconstbuf merge
Hi, I'll plan to merge gallium-noconstbuf today. It's a pretty simple API change, so things should continue to run :-). Roland -- This SF.Net email is sponsored by the Verizon Developer Community Take advantage of Verizon's best-in-class app development support A streamlined, 14 day to market process makes app distribution fast and easy Join now and get one step closer to millions of Verizon customers http://p.sf.net/sfu/verizon-dev2dev ___ Mesa3d-dev mailing list Mesa3d-dev@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/mesa3d-dev
Re: [Mesa3d-dev] gallium-noconstbuf merge
On 11.01.2010 16:42, Keith Whitwell wrote: On Mon, 2010-01-11 at 07:33 -0800, Roland Scheidegger wrote: Hi, I'll plan to merge gallium-noconstbuf today. It's a pretty simple API change, so things should continue to run :-). Roland, Before you do this, can you make sure that the set_constant_buffer() entrypoint is properly documented in gallium/docs? I was planning to do that after the merge, since the branch is too old to include docs, so I'd have to merge from master just for that if I did it before the merge. Roland
Re: [Mesa3d-dev] gallium-noconstbuf merge
On 11.01.2010 16:53, Keith Whitwell wrote: On Mon, 2010-01-11 at 07:50 -0800, Roland Scheidegger wrote: On 11.01.2010 16:42, Keith Whitwell wrote: On Mon, 2010-01-11 at 07:33 -0800, Roland Scheidegger wrote: Hi, I'll plan to merge gallium-noconstbuf today. It's a pretty simple API change, so things should continue to run :-). Roland, Before you do this, can you make sure that the set_constant_buffer() entrypoint is properly documented in gallium/docs? I was planning to do that after the merge, since the branch is too old to include docs, so I'd have to merge from master just for that if I did it before the merge. OK -- that's a decent excuse... Can you post a first draft of the docs here before merging? Ok here's a first stab. Actually I'm not sure what documentation should look like, there are no other functions really commented yet. Should these include function parameters / return values? Also I'll need to work on the syntax a bit I know... Roland

diff --git a/src/gallium/docs/source/context.rst b/src/gallium/docs/source/context.rst
index 21f5f91..a6fe408 100644
--- a/src/gallium/docs/source/context.rst
+++ b/src/gallium/docs/source/context.rst
@@ -34,6 +34,12 @@ buffers, surfaces) are bound to the driver.
 * ``set_constant_buffer``
+void (*set_constant_buffer)( struct pipe_context *,
+                             uint shader, uint index,
+                             struct pipe_constant_buffer *buf );
+Sets a constant buffer to be used in a given shader type. index is
+used to indicate which buffer to set (note that some APIs allow multiple ones
+to be set, though drivers are mostly restricted to the first one right now).
* ``set_framebuffer_state``
* ``set_fragment_sampler_textures``
* ``set_vertex_sampler_textures``
Re: [Mesa3d-dev] Gallium feature levels
On 11.01.2010 18:49, Zack Rusin wrote: Hey, knowing that we're starting to have serious issues with figuring out what features the given device supports and what APIs/extensions can be reasonably implemented on top of it, I've spent the weekend trying to define feature levels. Feature levels were effectively defined by the Direct3D version numbers. Attached is a patch and documentation for the feature levels. I'm also attaching a gallium_feature_levels.rst file which documents what each feature level means and what apis can be reasonably supported by each (I figured it's going to be easier to look at it outside the diff). There's a few features that are a bit problematic, in no particular order:
- unnormalized coordinates, we don't even have a cap for those right now but since that feature doesn't exist in direct3d (all coords are always normalized in d3d) the support for it is hard to define in terms of a feature level
- two-sided stencil - d3d supports it in d3d10 but tons of hardware supported it earlier
- extra mirror wrap modes - i don't think mirror repeat was ever supported and mirror clamp was removed in d3d10 but it seems that some hardware kept support for those
Mirror repeat is a core feature in GL since 1.4 hence we can't just drop it. I think all hardware we'd ever care about would support it. mirror clamp / mirror clamp to edge are only an extension, though (ATI_texture_mirror_once). (I think the dx mirror once definition is probably mirror_clamp_to_edge in opengl parlance).
- shadow maps - it's more of a researched guess since it's largely based on format support, but as far as i can tell all d3d10 hardware supports it, earlier it varies (e.g. nvidia did it for ages)
Required for GL 1.4. I thought it was pretty much required for d3d sm2.0, though you're right you could probably just not support the texture format there.
Anyway, most hardware should support it; I believe even those chips which didn't really support it at DX9 SM 2.0 time could handle it (chips like radeon r300 lacked the hw to do the comparison in the texture unit, but it can be more or less easily implemented in the pixel shader, though the implementation will suck as it certainly won't do PCF, just use some point sampling version - unless you're willing to do a much more complex implementation in the pixel shader, but then on this generation of hardware you might exceed maximum shader length). I believe all hardware supporting SM 2.0 could at least do some sampling of depth textures, though possibly only 16 bit and I'm not sure filtering worked in all cases. I think the other stuff is acceptable. Take a look at the docs and let me know what you think. What is feature level 1 useful for? I thought we'd really wanted DX9 level functionality as a bare minimum. GL2.x certainly needs cards supporting shader model 2 (and that is a cheat, in reality it would be shader model 3). Also, I don't quite get the shader model levels. I thought there were mainly two different DX9 versions, one with sm 2.0 the other with 3.0, with no one caring about other differences (as most stuff was cappable anyway). However, you've got 3 and all of them have 2.0 shader model? More comments below.

+static const enum pipe_feature_level
+i915_feature_level(struct pipe_screen *screen)
+{
+   return PIPE_FEATURE_LEVEL_1;
+}

What's the reason this is not feature level 2?

+static const enum pipe_feature_level
+nv30_screen_feature_level(struct pipe_screen *screen)
+{
+   return PIPE_FEATURE_LEVEL_1;
+}
+

Hmm in theory this should be feature level 2. Maybe the driver doesn't quite cut it though...

+static const enum pipe_feature_level r300_feature_level(
+   struct pipe_screen* pscreen)
+{
+   if (r300screen->caps->is_r500) {
+      return PIPE_FEATURE_LEVEL_2;
+   } else {
+      return PIPE_FEATURE_LEVEL_1;
+   }
+}

Shouldn't one be feature level 3 (or maybe 4?) the other 2?
Profile          7 (2009)   6 (2008)   5 (2006)     4 (2004)       3 (2003)       2 (2002)       1 (2000)
API Support      DX11       DX10.1     DX10/GL3.2   DX9.2          DX9.1          DX9.0          DX7.0
                 GL4.0      GL3.2+     GL3.2        GL3.0          GL2.x          GL2.x          GL2.x
                 VG         VG         VG           VG             VG             VG             VG
                 CL1.0      CL1.0      CL1.0
Shader Model     5.0        4.x        4.0          2.0            2.0            2.0            1.0
Fragment Shader                                     4_0_level_9_3  4_0_level_9_1  4_0_level_9_1
Re: [Mesa3d-dev] RFC: gallium changes for conditional rendering
On 04.01.2010 15:48, Brian Paul wrote: Keith Whitwell wrote: On Thu, 2009-12-31 at 15:57 -0800, Brian Paul wrote: The BY_REGION modes indicate that it's OK for the GPU to discard the fragments in the region(s) which failed the occlusion test (perhaps skipping other per-fragment ops that would have otherwise occurred). See the spec at http://www.opengl.org/registry/specs/NV/conditional_render.txt for details. I'd be happy to omit those modes for now. But since they're in the NV spec, I suspect NVIDIA hardware (at least) can make use of them. Brian, Lets leave them in - I'm presuming the no-op implementation which maps them down to the regular tokens is fine. Yes. Incidentally, it would be fairly easy to take advantage of the BY_REGION modes in the llvm driver. If the number of samples passed in a tile during occlusion testing is zero, the tile can be skipped entirely when doing the conditional render. I'll check in these changes later today. I think the main benefit for the by-region modes might have been saving the vertex processing for the second GPU, but it's nice that these modes seem useful for other cases as well. (Remember for split-frame SLI, there will be two hardware occlusion query results, one for each gpu, and by-region modes will make it possible to run the rendering commands only on one when using conditional render). Roland
Re: [Mesa3d-dev] [PATCH] [RFC] Remove PIPE_TEX_FILTER_ANISO to properly implement GL_EXT_texture_filter_anisotropic
On 01.01.2010 23:32, Luca Barbieri wrote: Currently Gallium defines a specific filtering mode for anisotropic filtering. This however prevents proper implementation of GL_EXT_texture_filter_anisotropic. The spec (written by nVidia) contains the following text: A texture's maximum degree of anisotropy is specified independent from the texture's minification and magnification filter (as opposed to being supported as an entirely new filtering mode). Implementations are free to use the specified minification and magnification filter to select a particular anisotropic texture filtering scheme. For example, a NEAREST filter with a maximum degree of anisotropy of two could be treated as a 2-tap filter that accounts for the direction of anisotropy. Implementations are also permitted to ignore the minification or magnification filter and implement the highest quality of anisotropic filtering possible. and Should there be a particular anisotropic texture filtering minification and magnification mode? RESOLUTION: NO. The maximum degree of anisotropy should control when anisotropic texturing is used. Making this orthogonal to the minification and magnification filtering modes allows these settings to influence the anisotropic scheme used. Yes, such an anisotropic filtering scheme exists in hardware. Gallium does the opposite, and this prevents use of nearest anisotropic filtering which is supported in nVidia hardware and also introduces redundant state. This patch removes PIPE_TEX_FILTER_ANISO. Anisotropic filtering is enabled if and only if max_anisotropy > 1.0. Values between 0.0 and 1.0, inclusive, of max_anisotropy are to be considered equivalent, all meaning that anisotropic filtering is off. This approach has the small drawback of eliminating the possibility of enabling anisotropic filtering on either minification or magnification separately, which Radeon hardware seems to support and which is currently supported by Gallium but not exposed to OpenGL.
If this is actually useful it could be handled by splitting max_anisotropy in two values and adding an appropriate OpenGL extension. How does Radeon anisotropic magnification differ from linear magnification? Note that different 3d apis have different requirements - ideally we should be able to choose some state which suits all of them. In particular, d3d10/11 have a separate filter mode for aniso (which applies to all of min/mag/mip filters at the same time). d3d9 also has a special aniso filter, but it can be set separately for min and mag - apart from aniso d3d9 also has some more filters like 4-sample tent/gaussian, all of them with undefined results if used as mip filter. max aniso values with d3d can be from (uint) 1 to 16, and I haven't seen hardware yet which could use float values for that. So it seems for a conformant d3d9 (but not d3d10) implementation you'll need to be able to enable aniso for min/mag separately. Meanwhile, you said This however prevents proper implementation of GL_EXT_texture_filter_anisotropic. This isn't quite true - you've quoted it yourself: Implementations are also permitted to ignore the minification or magnification filter and implement the highest quality of anisotropic filtering possible. I don't think it's terribly useful to be able to enable anisotropic filtering with other min/mag filters, and d3d never allowed it, hence hardware support for this will likely be rare. I don't really have a strong opinion though if we should allow this in the api or not, I guess it might make some drivers (except nvidia ones) (plus d3d state trackers...) a bit more complicated but it shouldn't be too bad, and maybe there's actually one app out there which would use it - or maybe it'll give better results for things like forced aniso.
Roland
Re: [Mesa3d-dev] Yet more r300g fear and loathing...
On 21.12.2009 15:13, Henri Verbeet wrote: 2009/12/21 Corbin Simpson mostawesomed...@gmail.com: So, yet another thing that r300 sucks balls at: NPOT textures. We've been talking it over on IRC, and here are the options. 1) Don't do NPOT. Stop advertising PIPE_CAP_NPOT, refuse to accept NPOT dimensions on textures. This sucks because it means that we don't get GL 2.0, which means most apps (bless their non-compliant souls) will refuse to attempt GLSL, which means that there's really no point in continuing this driver. 2) Don't do NPOT in the pipe, but do it in the state tracker instead, as needed. Write up the appropriate fallbacks, and then let ARB_npot be advertised by the state tracker regardless of whether PIPE_CAP_NPOT is set. Lots of typing, though. Lots and lots of typing. 3) Same as above, but put all the fallbacks in the pipe instead of the state tracker. I am *really* not fond of this, since PIPE_CAP was not intended for lies, but it was mentioned in IRC, so I gotta mention it here. 4) The fglrx special: Don't require ARB_npot for advertising GL 2.0. I figured this wasn't on the table, but you never know... This is not really about where to implement the fallbacks, but as far as Wine is concerned, we'd mostly care about not triggering those if we can avoid them, e.g. by restrictions on clamping and filtering. We don't care much if GL 2.0 is supported or not, so a hypothetical MESA_conditional_npot extension would work for us. Other applications might care though, in which case an extension that allows us to query what situations trigger fallbacks would work for us as well. The fglrx solution mostly just sucks, for an important part because there's (afaik) no real documentation on what the restrictions are, and the reported extensions are now inconsistent with the reported GL version. That said, Wine has code to handle this case now, and I imagine other applications do as well.
This is a very common hardware problem, there's lots of hardware out there which can do some (like r300) or even all glsl shaders but lack true npot support. I suspect there might be a few apps which try to see if ARB_texture_npot is supported, and if not, they'll assume that functionality isn't supported even if the driver says GL 2.0. There's certainly precedent for not announcing extensions even if you have to support them for a given gl version, one prominent example would be the nvidia GF3/4 cards which were GL 1.4 but couldn't do blend_func_separate - they didn't announce support for EXT_blend_func_separate and just used software fallback when needed. So of course just not announcing support for it isn't sufficient; you still need to implement this somehow (unless you just want to misrender...) but it might give apps a hint, even though the API wasn't really designed for this. Sounds like it'll just pollute things though. Last time I checked, the extension mechanism in gallium when used with the dri state tracker was broken though and needed some work anyway (because dri_init_extensions was called after st_create_context, and the former just enables lots of extensions regardless of any cap bits, hence the extension string will have lots of extensions which might not be supported). Anyway, doing this in a utility module sounds good, though I'm not sure what exactly you want to do. You could certainly fix up all texture lookups in the shader by doing address calculations manually and so on, but that gets a bit complicated quite soon I guess (in the case of r300 it probably also greatly increases the chances a shader won't fit into hardware). Maybe misrendering things would still be an option, I think it would mostly be clamp modes which wouldn't work correctly, since AFAIK you could make even mipmaps work (on r300 at least).
Roland
Re: [Mesa3d-dev] Yet more r300g fear and loathing...
The draw module approach can only work if the texcoords are used directly for texture lookups, not for calculated coords (it should be possible to detect these cases though). Roland On 21.12.2009 19:32, Keith Whitwell wrote: Faking those wrap modes is something that could be done either in the draw module (by decomposing triangles and adjusting the texcoords) or in the pixel shader (by adding logic to adjust the texcoord on a per-pixel basis). Probably the draw-module approach is the easiest to implement and is appropriate for an infrequently used path - you still get hardware rasterization speeds, just a more expensive vertex path. Keith From: Alex Deucher [alexdeuc...@gmail.com] Sent: Monday, December 21, 2009 10:18 AM To: tom fogal Cc: Mesa3D-Development Subject: Re: [Mesa3d-dev] Yet more r300g fear and loathing... I work on real-time visualization apps; the one in particular I'm thinking of does texture sampling of potentially-NPOT textures via GLSL. If sampling a NPOT texture is not going to run in hardware, the app is useless. Further, our app keeps track of the amount of GL memory allocated for textures, FBOs and the like. If a texture is going to be silently extended, that messes with our management routines [1]. The hardware supports rectangular texture sampling. What's missing is support for certain wrap modes and mipmaps with npot textures. Neither of which are used that often. 
[Mesa3d-dev] gallium-edgeflags branch
Hello, I plan to merge the gallium-edgeflags branch soon. I should have fixed up drivers syntactically, but note some will break if applications use edgeflags. In particular the drivers which so far have chosen to ignore edgeflags completely and haven't implemented a fallback to the draw module might break (I'm looking at you, r300 and nv30!...). If those drivers want to continue to just have broken edgeflags support but you just don't want them to crash, you'll need to fix them up so they map the edgeflag output of the vertex shader to something halfway meaningful for the hw, like an unneeded temp or so. But really the right solution is to fix them so they use the draw module for things they can't handle, like svga and nv40 do (or, of course, make them handle edgeflags properly in hardware, but that might be dx10-class hardware only which truly can do it). Drivers for hardware without a hw vertex unit shouldn't have any problem, since draw will handle everything for them. You can use progs/trivial/tri-edgeflag for instance to see what happens. Roland
Re: [Mesa3d-dev] [PATCH] Fix u_pack_color.h rgb pack/unpack functions
On 15.12.2009 14:14, michal wrote: Guys, Does the attached patch make sense to you? I replaced the incomplete switch-cases with calls to u_format_access functions that are complete but are going to be a bit more expensive to call. Since they are not used very often in the mesa state tracker, I thought it's a good compromise. They are not only used in state trackers, but in drivers as well. That said, it's probably not really a performance critical path. Though I'm not sure it makes sense to even keep these functions around if they'll just do a single function call. Also, I'm pretty sure your usage of the union isn't strict aliasing compliant (as far as I can tell you could just go back and remove that ugly union again), though it's probably one of the cases gcc won't complain about (and hopefully won't miscompile).
Re: [Mesa3d-dev] [PATCH] Fix u_pack_color.h rgb pack/unpack functions
On 15.12.2009 18:02, michal wrote: Roland Scheidegger wrote: On 15.12.2009 14:14, michal wrote: Guys, Does the attached patch make sense to you? I replaced the incomplete switch-cases with calls to u_format_access functions that are complete but are going to be a bit more expensive to call. Since they are not used very often in the mesa state tracker, I thought it's a good compromise. They are not only used in state trackers, but in drivers as well. That said, it's probably not really a performance critical path. Though I'm not sure it makes sense to even keep these functions around if they'll just do a single function call. Also, I'm pretty sure your usage of the union isn't strict aliasing compliant (as far as I can tell you could just go back and remove that ugly union again), though it's probably one of the cases gcc won't complain about (and hopefully won't miscompile). I am casting to (void *) and then u_format casts it back to whatever it needs to. I think I am innocent. Casts to void * and back to something are only safe if the something is the same as it initially was. Well in theory anyway. That's also where some of the initial warnings came from, callers using some pointer to unsigned, which then in the end got cast to ubyte * or whatever. An intermediate cast to void * doesn't change anything. That said, the callers probably couldn't have handled the formats not returning the right type anyway. Often though gcc won't complain about aliasing if you use some void * pointer in a function call and cast it to something other than what it was, I think it usually won't be able to figure out what the original type was, hence it needs to assume it can alias with anything. Anyway, I will go with Keith's suggestion and fill in only the switch-default case. We can always nuke the special cases later when/if we realise the performance impact is negligible. Yes, sounds good.
Roland
Re: [Mesa3d-dev] r300 driver help needed
On 14.12.2009 10:29, michael wang wrote: Dear Mesa developers, I am learning OpenGL on my notebook (with an old ATI Radeon X600 video card), but I cannot get GL_LINE_STIPPLE to work. It draws solid lines only. glxinfo shows I'm using the R300 driver, and from some study of the source code I find it falls back (to software rendering I suppose) when I enable GL_LINE_STIPPLE. So my question is: 1. How can I check why my software rendering does not do line stipple? You could try it with software mesa and see if it works there. Also, the fallback for r300 only happens if you don't have disable_lowimpact_fallbacks set, so if this is set for whatever reason you will indeed get a solid line. If you set RADEON_DEBUG=fall it should print out a warning if it hits that line stipple fallback. 2. Is the R300 project still active? If so, where should I report this bug to? The project is still alive; if it's a driver bug and not your app you could file a bug at bugs.freedesktop.org. Roland
Re: [Mesa3d-dev] glsl-pp-rework-2 branch merge
On 09.12.2009 18:58, michal wrote: Keith Whitwell wrote: On Wed, 2009-12-09 at 09:16 -0800, michal wrote: Hi all, I would like to merge this branch back to master this week. If anyone could test if the build works on his/her system, it would be nice. Thanks. Michal, Can you detail what testing you've done on this branch and which environments you have/haven't built on? Testing: * Captured the output of the old syntax parser and compared with the output of the new parser. No regressions found. Used a set of over 400 shaders to perform the comparison. * Ran GLSL Parser Test to see if the new parser successfully integrates with the rest of Mesa. No regressions found. So far I have been building that with scons on windows. I am planning to fix the build with make and scons on linux. Seems to compile just fine now with make. Too bad all the strict-aliasing violations are still there (in grammar.c), I'll give this a look (but don't wait for it for merging). Also, there seems to be some char/byte uncleanliness, I get a gazillion warnings like:

shader/grammar/grammar.c: In function ‘get_spec’:
shader/grammar/grammar.c:1978: warning: pointer targets in passing argument 1 of ‘strlen’ differ in signedness
/usr/include/string.h:397: note: expected ‘const char *’ but argument is of type ‘byte *’
shader/grammar/grammar.c:1978: warning: pointer targets in passing argument 1 of ‘__builtin_strcmp’ differ in signedness
shader/grammar/grammar.c:1978: note: expected ‘const char *’ but argument is of type ‘byte *’
shader/grammar/grammar.c:1978: warning: pointer targets in passing argument 1 of ‘strlen’ differ in signedness

Roland
Re: [Mesa3d-dev] glsl-pp-rework-2 branch merge
On 09.12.2009 18:16, michal wrote: Hi all, I would like to merge this branch back to master this week. If anyone could test if the build works on his/her system, it would be nice. Good stuff! Looks like only the scons build system is working though. Roland
Re: [Mesa3d-dev] Branch pipe-format-simplify open for review
On 08.12.2009 15:55, michal wrote: This branch simplifies pipe/p_format.h by making enum pipe_format what it should have been -- an enum. As a result there is no extra information encoded in it and one needs to use auxiliary/util/u_format.h to get that info instead. Linking to the auxiliary/util lib is necessary. Please review, and if you can test that it doesn't break your setup, I will appreciate it. I would like to hear from the r300 and nouveau guys, as those drivers were using some internal macros and I wasn't 100% sure I got the conversion right. Looks nice, though it is unfortunately based on pre gallium-noblocks merge, so I suspect you'll get a conflict for almost every patch chunk at least in drivers if you try to merge it... Roland
Re: [Mesa3d-dev] Branch pipe-format-simplify open for review
On 08.12.2009 16:49, michal wrote: Roland Scheidegger pisze: On 08.12.2009 15:55, michal wrote: This branch simplifies pipe/p_format.h by making enum pipe_format what it should have been -- an enum. As a result there is no extra information encoded in it and one needs to use auxiliary/util/u_format.h to get that info instead. Linking to the auxiliary/util lib is necessary. Please review, and if you can test that it doesn't break your setup, I will appreciate it. I would like to hear from the r300 and nouveau guys, as those drivers were using some internal macros and I wasn't 100% sure I got the conversion right. Looks nice, though it is unfortunately based on pre gallium-noblocks merge, so I suspect you'll get a conflict for almost every patch chunk at least in drivers if you try to merge it... I didn't touch pipe blocks -- I left pf_get_block* and friends in pipe_format.h intact. Yes, but you're bound to get lots of conflicts because you replaced for instance pf_format_get_block with util_format_get_block, whereas that stuff is removed from master because pipe_format_block (and the block/nblocksx/nblocksy variables in pipe_texture and pipe_transfer) are gone completely. I quickly tried a merge and there were conflicts in over 40 files - from a quick glance though they should be trivial to resolve. And I don't think there's too much hidden stuff which won't work any longer - just let util_format_get_block die and it should probably work out ok. How severe is the gallium-noblocks change? I would like to avoid merging master into this branch. It's not really that severe, it just touched a lot of the same places in drivers this change does. btw, I also avoided merging master to feature branch when I merged gallium-noblocks, and instead fixed up conflicts on merge to master (and adapted stuff which needed changes later). Is there some policy for this? Roland
Re: [Mesa3d-dev] gallium-strict-aliasing branch merge
Keith, I think there might be some slight issue with some of the changes in the drivers I did. In particular, I was under the impression it would be ok to do something like

union a_union {
   int i;
   double d;
};

int f() {
   double d = 3.0;
   return ((union a_union *) &d)->i;
}

but in fact the gcc manpage tells me it's not (the example is from the gcc 4.4 manpage) - this site told me this is ok ("casting through a union (2)"): http://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-aliasing.html, I guess it was considered ok in 2006 but not now (though I'm not sure why not)... I did that in some places because otherwise there's no way around assigning the value to the union and passing that around instead. Curiously though, despite the gcc manpage saying the code might not be ok, gcc doesn't warn about it in the places I used it. Anyway, I'm not sure it's worth bothering with this now, as drivers could be fixed up without any interface changes. Roland

On 08.12.2009 17:19, Keith Whitwell wrote: Roland, This looks OK to me, hopefully this will see us getting on top of strict aliasing issues after all these years... Keith On Mon, 2009-12-07 at 18:14 -0800, Roland Scheidegger wrote: Hello, I'm planning to merge the gallium-strict-aliasing branch soon, which will bring another gallium api change. The pipe_reference function has different arguments, because the old version was pretty much not really useful for strict-aliasing compliant code (the util_color_pack functions also get an update for the same reason). The goal of course is to enable builds which no longer need -fno-strict-aliasing. scons builds already didn't do this (which was a bug since the builds were indeed broken). I didn't check all drivers for strict-aliasing compliance, but for gallium everybody should make sure the code they are submitting is according to strict aliasing rules (*).
One downside of compiling with -fno-strict-aliasing is also that you don't get the warnings wrt strict aliasing, so you might have missed that in the past. (There are no build system changes yet; there's still some strict-aliasing violating code in shader/grammar which should get replaced soon anyway.) (*) Strictly speaking, it looks like c99 actually has undefined behaviour writing and reading different members of a union (wtf?), but this is considered acceptable here, and all compilers should support it. Roland
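For reference, the punning variants under discussion can be put side by side. This is a minimal sketch with made-up names, not code from the branch; the cast-through-union form is the one the gcc manpage flags as undefined, while the store-then-read form and memcpy are the portable alternatives:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

union a_union {
   uint32_t i;
   float f;
};

/* Cast-through-union: the pattern from the example above. gcc documents
 * this as undefined because the object accessed is a float, not a union,
 * so it may break under -fstrict-aliasing. */
static uint32_t bits_via_cast(float f)
{
   return ((union a_union *) &f)->i;
}

/* Sanctioned form: store into one union member, read another. This is
 * what "assigning the value to the union and passing that around" means. */
static uint32_t bits_via_union(float f)
{
   union a_union u;
   u.f = f;
   return u.i;
}

/* memcpy is always safe; compilers turn a fixed-size copy like this into
 * a plain register move, so there is no performance cost. */
static uint32_t bits_via_memcpy(float f)
{
   uint32_t i;
   memcpy(&i, &f, sizeof i);
   return i;
}
```

All three return the same bits on a typical build; the difference is only in which ones the standard (and gcc's optimizer) guarantees.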
Re: [Mesa3d-dev] gallium-strict-aliasing branch merge
On 08.12.2009 17:37, Keith Whitwell wrote: On Tue, 2009-12-08 at 08:31 -0800, Roland Scheidegger wrote: Keith, I think there might be some slight issue with some of the changes in the drivers I did. In particular, I was under the impression it would be ok to do something like union a_union { int i; double d; }; int f() { double d = 3.0; return ((union a_union *) &d)->i; } but in fact the gcc manpage tells me it's not (the example is from the gcc 4.4 manpage) - this site told me this is ok ("casting through a union (2)"): http://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-aliasing.html, I guess it was considered ok in 2006 but not now (though I'm not sure why not)... I did that in some places because otherwise there's no way around assigning the value to the union and passing that around instead. Curiously though, despite the gcc manpage saying the code might not be ok, gcc doesn't warn about it in the places I used it. Anyway, I'm not sure it's worth bothering with this now, as drivers could be fixed up without any interface changes. Is it a lot of extra work to fix? I wouldn't mind getting on top of this once and for all. Not in the places I touched. It'll just make the code uglier, though at least the compiler might still optimize the extra assignments away. For example in st_atom_pixeltransfer.c it now looks like this:

util_pack_color_ub(r, g, b, a, pt->format, (union util_color *)(dest + k));

and I'd need to change it to:

union util_color uc;
util_pack_color_ub(r, g, b, a, pt->format, &uc);
*(dest + k) = uc.ui;

Ok, not really a lot more ugly. Will do this then, though there are other places where things like that might already be used, and since the compiler does not issue any warnings it might be a bit time consuming to find all of them. Roland
Re: [Mesa3d-dev] gallium-strict-aliasing branch merge
On 08.12.2009 18:12, Roland Scheidegger wrote: On 08.12.2009 17:37, Keith Whitwell wrote: On Tue, 2009-12-08 at 08:31 -0800, Roland Scheidegger wrote: Keith, I think there might be some slight issue with some of the changes in the drivers I did. In particular, I was under the impression it would be ok to do something like union a_union { int i; double d; }; int f() { double d = 3.0; return ((union a_union *) &d)->i; } but in fact the gcc manpage tells me it's not (the example is from the gcc 4.4 manpage) - this site told me this is ok ("casting through a union (2)"): http://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-aliasing.html, I guess it was considered ok in 2006 but not now (though I'm not sure why not)... I did that in some places because otherwise there's no way around assigning the value to the union and passing that around instead. Curiously though, despite the gcc manpage saying the code might not be ok, gcc doesn't warn about it in the places I used it. Anyway, I'm not sure it's worth bothering with this now, as drivers could be fixed up without any interface changes. Is it a lot of extra work to fix? I wouldn't mind getting on top of this once and for all. Not in the places I touched. It'll just make the code uglier, though at least the compiler might still optimize the extra assignments away. For example in st_atom_pixeltransfer.c it now looks like this:

util_pack_color_ub(r, g, b, a, pt->format, (union util_color *)(dest + k));

and I'd need to change it to:

union util_color uc;
util_pack_color_ub(r, g, b, a, pt->format, &uc);
*(dest + k) = uc.ui;

Ok, not really a lot more ugly. Will do this then, though there are other places where things like that might already be used, and since the compiler does not issue any warnings it might be a bit time consuming to find all of them. Ok, unfortunately the code in vg_translate.c got a lot more verbose :-(. Also, I think there's quite some usage of casting void * to other types.
That could also lead to strict-aliasing violations, as you're only allowed to do casts back to the original type it had (hence the strict-aliasing warnings if you do *(float *) (void *) &some_uint_value, because the compiler is able to determine the original type). Might be safe though as long as gcc doesn't do too much interprocedural optimization, and if it does it should probably be able to at least output a warning, since in this case it should also be able to determine the original type I guess... Roland
Re: [Mesa3d-dev] gallium-strict-aliasing branch merge
On 08.12.2009 20:57, Martin Olsson wrote: Roland Scheidegger wrote: Keith, I think there might be some slight issue with some of the changes in the drivers I did. In particular, I was under the impression it would be ok to do something like union a_union { int i; double d; }; int f() { double d = 3.0; return ((union a_union *) &d)->i; } but in fact the gcc manpage tells me it's not (the example is from the gcc 4.4 manpage) I think the issue you are describing is explained here: http://patrakov.blogspot.com/2009/03/dont-use-old-dtoac.html Yes, probably. Note though it says gcc generates warnings for it, which didn't happen, so I think gcc would actually not miscompile it. (I suspect gcc doesn't complain and does not miscompile as long as it can't determine the original type of the value.) Still, the explanation is imho not really satisfactory. I think a lot of people used to think this would be perfectly fine (see for instance http://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-aliasing.html, "casting through a union (2)"). Also note the link he posts to the GCC manual: http://gcc.gnu.org/onlinedocs/gcc-4.3.2/gcc/Optimize-Options.html#index-fstrict_002daliasing-721 Yep, that's the same stuff I used for the example. Roland
[Mesa3d-dev] gallium-strict-aliasing branch merge
Hello, I'm planning to merge the gallium-strict-aliasing branch soon, which will bring another gallium api change. The pipe_reference function has different arguments, because the old version was pretty much not really useful for strict-aliasing compliant code (the util_color_pack functions also get an update for the same reason). The goal of course is to enable builds which no longer need -fno-strict-aliasing. scons builds already didn't do this (which was a bug since the builds were indeed broken). I didn't check all drivers for strict-aliasing compliance, but for gallium everybody should make sure the code they are submitting is according to strict aliasing rules (*). One downside of compiling with -fno-strict-aliasing is also that you don't get the warnings wrt strict aliasing, so you might have missed that in the past. (There are no build system changes yet; there's still some strict-aliasing violating code in shader/grammar which should get replaced soon anyway.) (*) Strictly speaking, it looks like c99 actually has undefined behaviour writing and reading different members of a union (wtf?), but this is considered acceptable here, and all compilers should support it. Roland
Re: [Mesa3d-dev] [RFC] Move _mesa_memcpy to imports.h and inline it
On 04.12.2009 11:24, Kenneth Graunke wrote: On Thursday 03 December 2009 12:47:36 Brian Paul wrote: [snip] I've been meaning to go over imports.[ch] and make a bunch of the wrapper functions inlines. A lot of the wrappers aren't needed any more. Back before valgrind I used the memory-related wrappers quite often. For now, let's keep the wrappers so we don't have to touch tons of other files right away. Matt, feel free to submit a patch. -Brian I've attached patches to remove a number of the wrappers, should you decide you want to go that way.

diff --git a/src/mesa/main/imports.c b/src/mesa/main/imports.c
index 6a34aec..0f10111 100644
--- a/src/mesa/main/imports.c
+++ b/src/mesa/main/imports.c
@@ -268,17 +268,6 @@ _mesa_bzero( void *dst, size_t n )
 #endif
 }
-/** Wrapper around memcmp() */
-int
-_mesa_memcmp( const void *s1, const void *s2, size_t n )
-{
-#if defined(SUNOS4)
-   return memcmp( (char *) s1, (char *) s2, (int) n );
-#else
-   return memcmp(s1, s2, n);
-#endif
-}
-
 /*@}*/

So is the different implementation on SUNOS4 no longer relevant? Roland -- Join us December 9, 2009 for the Red Hat Virtual Experience, a free event focused on virtualization and cloud computing. Attend in-depth sessions from your desk. Your couch. Anywhere. http://p.sf.net/sfu/redhat-sfdev2dev ___ Mesa3d-dev mailing list Mesa3d-dev@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/mesa3d-dev
Re: [Mesa3d-dev] mesa/gallium strict aliasing bugs
On 03.12.2009 01:38, Jose Fonseca wrote: Interesting. Yes we want to fix the problem, as we're missing out on potential optimizations. For fixing reference counting, couldn't we fix it by doing the final *pdst = src; in each pipe_xxx_reference function in the bottom of p_state.h, and pass only &(*pdst)->reference, &src->reference to p_refcnt.h's pipe_reference() instead (i.e., just pointers, and no pointers to pointers)? I haven't tested it but it seems that that would eliminate all casts, hence should be correct. That is pretty much what I did (except I used a temporary pointer to pass its address to pipe_reference so I didn't have to change the pipe_reference function at all, so it still worked save the aliasing issues for other callers). Some callers use this directly; I can fix them I guess, it's just a slightly less convenient function, if this is the approach we'll take. Roland Jose From: Roland Scheidegger [srol...@vmware.com] Sent: Wednesday, December 02, 2009 23:19 To: Jose Fonseca Cc: mesa3d-...@lists.sf.net Subject: Re: [Mesa3d-dev] mesa/gallium strict aliasing bugs On 02.12.2009 18:33, José Fonseca wrote: I've seen strict aliasing assumptions causing bugs in other gallium components. It seems endemic to our code. Unless we actively decide to go and chase the strict aliasing bugs now we should add -fno-strict-aliasing to all our builds. Do we ever want to fix strict aliasing? If we do, I think the problem with refcounting is pretty fundamental (I traced the crash to aliasing problems there, and hacked up some bogus version which didn't segfault for the testcase I used). At least I can't see a way to make this really work in some nice way. Supposedly gcc supports __attribute__((__may_alias__)) but I had no luck with it. In gallium (not core mesa) there's only one other offender causing a large amount of warnings, that is util_pack_color, and I think it won't actually cause problems.
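José's proposal can be sketched roughly as below. The names mirror gallium's pipe_reference/p_refcnt.h conventions, but the bodies are simplified mock-ups for illustration (no atomics, no debug tracking), not the actual branch code. The key point is that only `struct pipe_reference *` pointers cross the helper boundary, so no pointer-to-pointer cast is needed and no aliasing rule is violated:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct pipe_reference {
   int count;
};

/* Generic helper: takes plain reference pointers, never pointers to
 * pointers. Returns true when the old object's count dropped to zero
 * and the caller must destroy it. */
static bool pipe_reference(struct pipe_reference *dst,
                           struct pipe_reference *src)
{
   bool destroy = false;
   if (dst != src) {
      if (src)
         src->count++;
      if (dst && --dst->count == 0)
         destroy = true;
   }
   return destroy;
}

struct pipe_buffer {
   struct pipe_reference reference;
   /* ... actual resource state ... */
};

/* Typed wrapper, as in p_state.h: the final *pdst = src happens here on
 * a pipe_buffer** directly, so there is no cast between incompatible
 * pointer types anywhere. */
static void pipe_buffer_reference(struct pipe_buffer **pdst,
                                  struct pipe_buffer *src)
{
   if (pipe_reference(*pdst ? &(*pdst)->reference : NULL,
                      src ? &src->reference : NULL)) {
      /* destroy(*pdst) would go here */
   }
   *pdst = src;
}
```

The old interface took a `void **` (or cast `struct pipe_buffer **` to `struct pipe_reference **`), which is exactly the aliasing pattern that produced the p_refcount.h warnings.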
Re: [Mesa3d-dev] mesa/gallium strict aliasing bugs
On 03.12.2009 11:17, Keith Whitwell wrote: On Wed, 2009-12-02 at 12:46 -0800, Roland Scheidegger wrote: On 02.12.2009 18:33, José Fonseca wrote: I've seen strict aliasing assumptions causing bugs in other gallium components. It seems endemic to our code. Unless we actively decide to go and chase the strict aliasing bugs now we should add -fno-strict-aliasing to all our builds. Hmm, actually some of them (in mesa at least) seem to be really unnecessary. Take the COPY_4FV macro for instance. I replaced that in a simple test program (attached) to either just do direct assignment without cast, or use memcpy instead. That comment was probably true in 1999 -- but possibly not any longer... The results are actually interesting, the comment says the cast is done to avoid going through fp registers, but looking at the assembly (at least with optimization) that doesn't happen anyway, and the generated code is actually nearly identical, but in fact it not only triggers strict-aliasing warnings but doesn't work correctly (when compiled with -O3 or similar parameters invoking -fstrict-aliasing). ... Doesn't use 128bit sse moves but looks like an improvement... When using no optimization the code certainly gets much less readable and the memcpy version will call glibc memcpy (which itself will still be optimized hence probably faster despite the function call). So I'll kill at least this one and just use _mesa_memcpy there, unless there are good reasons not to. I think pretty much all compilers should have builtin memcpy optimizations. I didn't realize COPY_4FV and friends were related to our strict aliasing problems -- if that's the case, let's kill or reimplement them straight away. Actually, that was the simplest one, and most of the other macros don't do this. There's also plenty of warnings in the shader/grammar code; apart from that there's actually not that many warnings, at least when not compiling legacy drivers...
So I guess getting rid of strict-aliasing issues is doable for gallium. Roland
Re: [Mesa3d-dev] gallium-noblocks branch merge
On 03.12.2009 20:55, Christoph Bumiller wrote: Roland Scheidegger schrieb: Hi, I'm planning to merge the gallium-noblocks branch to master soon. This api change may affect your driver, statetracker, whatever. I _should_ have fixed up all in-tree stuff using it, but that's not a guarantee it will still run correctly (the nv50 driver was strange for instance), and What's strange with nv50? There's this one if (!pt->nblocksx[level]) { in nv50_transfer.c that was an unnecessary leftover because I hadn't seen that miptree_blanket forgot to initialize these and pushed a bit too early; thankfully this is now gone automatically. Ok, this is mostly what was strange. That and it was the driver which by far needed the most changes :-). I just need the y blocks everywhere instead of just y because things like offset = stride * y are simply wrong if you have *actual* multi-pixel blocks (pitch as in nblocksx * width). Yes, drivers are encouraged to use the block helpers. This way they don't need to special case any formats, as it should work for uncompressed, dxt, or things like ycbcr just the same. I hope no one will try to transfer just parts of a block (makes not much sense for DXT imo though). Yes, this shouldn't happen. Neither ogl nor dx should trigger this (it's not allowed for CompressedTexSubImage), so transfers are required to only happen along block boundaries. Roland
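The point about offset = stride * y being wrong for multi-pixel blocks can be made concrete. This is a hypothetical sketch in the spirit of the u_format block helpers (the struct and function names are illustrative, not the actual gallium API):

```c
#include <assert.h>

/* Hypothetical per-format block description: a DXT1 block, for example,
 * covers 4x4 pixels in 8 bytes; an uncompressed RGBA8 "block" is 1x1x4. */
struct block_desc {
   unsigned width;   /* pixels per block in x */
   unsigned height;  /* pixels per block in y */
   unsigned bytes;   /* bytes per block */
};

/* Number of blocks needed to cover a given width/height, rounding up. */
static unsigned nblocksx(const struct block_desc *b, unsigned width)
{
   return (width + b->width - 1) / b->width;
}

static unsigned nblocksy(const struct block_desc *b, unsigned height)
{
   return (height + b->height - 1) / b->height;
}

/* Byte offset of the block containing pixel (x, y). stride is bytes per
 * row of *blocks* (nblocksx * bytes). offset = stride * y would step in
 * pixel rows and overshoot by a factor of block height. */
static unsigned block_offset(const struct block_desc *b, unsigned stride,
                             unsigned x, unsigned y)
{
   return (y / b->height) * stride + (x / b->width) * b->bytes;
}
```

For a 64-pixel-wide DXT1 surface the stride is 16 blocks * 8 bytes = 128 bytes, and pixel (8, 4) lands in block (2, 1) at offset 128 + 16 = 144; the same code handles uncompressed formats with a 1x1 block, which is why drivers using the helpers need no special cases.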
Re: [Mesa3d-dev] [RFC] Move _mesa_memcpy to imports.h and inline it
On 03.12.2009 19:46, Matt Turner wrote: Most of the functions in imports.c are very small, so the function call overhead is large relative to their size. Can't we do something like in the attached patch and move them to imports.h and mark them static inline? Things like memcpy and memset are often optimized out by the compiler, and by hiding them in these wrapper functions, we're probably losing any benefits we'd otherwise see. ++ from me, at least for the very simple wrappers. _mesa_memcpy especially I think can be very nicely used for array assignments and the like, and in case of (very) small amounts of data to copy call overhead might be significant. Similarly, if we're going to use a magic sqrtf algorithm (apparently for speed) then shouldn't we also let the compiler properly inline the function? Not sure here, the function is still quite complex, I don't think call overhead will make any difference. I've looked at the code though when it wasn't using the fast path (with -O3 but DEBUG - why is this different?) This version though adds a lot of overhead: call overhead for _mesa_sqrtf, overhead converting to double, overhead converting back. In the generated code the actual sqrtf code was a single assembly instruction (sqrtsd %xmm0, %xmm0) - granted that's SSE2 only, and it requires quite a few cycles. Still, I guess the overhead is significant, not to mention that if we'd just use a float instead of double not only would we not have to convert the type but the compiler would actually issue sqrtss %xmm0, %xmm0 instead, which is (depending on the cpu) twice as fast. Not sure why we use double there, are there platforms where sqrtf(float x) isn't supported? So really, call overhead is a tiny fraction of the optimization potential for this function. When not using DEBUG (and USE_IEEE is defined) the function is still quite a few cycles, so call overhead doesn't look that bad either.
I don't actually know which version is faster (or more accurate - I think though sqrtss is actually fully accurate). Of course using sqrtf(x) will only be fast if the cpu supports some kind of fast float unit (and the compiler knows how to use it). If you'd want to do some more optimization, there's for instance _mesa_inv_sqrtf - it is supposedly fast, but sse2 offers rsqrtss, which is really fast. However, I remember we got some bugs some time ago when gcc actually used that, because precision wasn't enough - it will do this if you enable -funsafe-math-optimizations, -mrecip or similar. I've just seen though that at least gcc 4.4 does an additional newton-raphson step when you do 1.0f/sqrtf(x) (so it will issue rsqrtss plus a couple muls and adds), which might still be less or even more accurate, and almost certainly be faster than the manual version. So there's probably far more optimization potential than the call overhead. Most of those functions are probably never used in any performance critical path anyway. I also don't quite understand wrapper functions like double _mesa_pow(double x, double y) { return pow(x, y); } Maybe at one time these had #ifdefs in them like _mesa_memcpy, but I can't see any reason not to remove it now. Someone enlighten me. I guess there might have been indeed #ifdefs in the past. In any case, using a wrapper would make it easier to implement such optimizations in the future if anyone wants to, not that this is something which you probably want to do (that stuff is probably better left up to the compiler). So, at least if they are inlined, they shouldn't really hurt either.
[Mesa3d-dev] mesa/gallium strict aliasing bugs
Hi, I've come across some bug (which I thought might be related to the gallium-noblocks branch, but it's not) which caused a segfault, but only when not using debug builds. I think this is the same issue Vinson was seeing some time ago. Looks like an impossible backtrace:

#0 st_texture_image_copy (pipe=0x612640, dst=0x0, dstLevel=<value optimized out>, src=0x6e1dd0, face=0) at src/mesa/state_tracker/st_texture.c:306
#1 0x7759b383 in copy_image_data_to_texture (ctx=<value optimized out>, pipe=<value optimized out>, tObj=0x6919d0, needFlush=<value optimized out>) at src/mesa/state_tracker/st_cb_texture.c:1673
#2 st_finalize_texture (ctx=<value optimized out>, pipe=<value optimized out>, tObj=0x6919d0, needFlush=<value optimized out>) at src/mesa/state_tracker/st_cb_texture.c:1807
#3 0x7758fd9d in finalize_textures (st=0x68a9c0) at src/mesa/state_tracker/st_atom_texture.c:144

Segfault seems to be because dst is 0x0, but if you look at the call stack it is easy to see this is impossible. That would point to a gcc optimizer issue (using gcc 4.4.1), except there are quite a few warnings during compile, especially about violating strict-aliasing rules... So, in the gallium.py scons file there's actually this:

if debug:
    ccflags += ['-O0', '-g3']
elif env['CCVERSION'].startswith('4.2.'):
    # gcc 4.2.x optimizer is broken
    print 'warning: gcc 4.2.x optimizer is broken -- disabling optimizations'
    ccflags += ['-O0', '-g3']
else:
    ccflags += ['-O3', '-g3']

So I added -fno-strict-aliasing and indeed, the segfault is gone. Hence I believe this is incorrectly accusing the gcc 4.2 optimizer, whereas it's actually a code bug, and certainly it is not restricted to gcc 4.2 (unless this addressed a different problem). Not quite sure though why the code violates strict-aliasing rules in all those places - about half of the warnings are from pipe_reference (p_refcount.h:85).
Not sure if all warnings are actually real issues, and not sure how this should be fixed (should we try to fix this for real or just force -fno-strict-aliasing). Roland
Re: [Mesa3d-dev] mesa/gallium strict aliasing bugs
On 02.12.2009 18:33, José Fonseca wrote: On Wed, 2009-12-02 at 09:05 -0800, Roland Scheidegger wrote: Hi, I've come across some bug (which I thought might be related to the gallium-noblocks branch, but it's not) which caused a segfault, but only when not using debug builds. I think this is the same issue Vinson was seeing some time ago. Looks like an impossible backtrace:

#0 st_texture_image_copy (pipe=0x612640, dst=0x0, dstLevel=<value optimized out>, src=0x6e1dd0, face=0) at src/mesa/state_tracker/st_texture.c:306
#1 0x7759b383 in copy_image_data_to_texture (ctx=<value optimized out>, pipe=<value optimized out>, tObj=0x6919d0, needFlush=<value optimized out>) at src/mesa/state_tracker/st_cb_texture.c:1673
#2 st_finalize_texture (ctx=<value optimized out>, pipe=<value optimized out>, tObj=0x6919d0, needFlush=<value optimized out>) at src/mesa/state_tracker/st_cb_texture.c:1807
#3 0x7758fd9d in finalize_textures (st=0x68a9c0) at src/mesa/state_tracker/st_atom_texture.c:144

Segfault seems to be because dst is 0x0, but if you look at the call stack it is easy to see this is impossible. That would point to a gcc optimizer issue (using gcc 4.4.1), except there are quite a few warnings during compile, especially about violating strict-aliasing rules... So, in the gallium.py scons file there's actually this:

if debug:
    ccflags += ['-O0', '-g3']
elif env['CCVERSION'].startswith('4.2.'):
    # gcc 4.2.x optimizer is broken
    print 'warning: gcc 4.2.x optimizer is broken -- disabling optimizations'
    ccflags += ['-O0', '-g3']
else:
    ccflags += ['-O3', '-g3']

So I added -fno-strict-aliasing and indeed, the segfault is gone. Hence I believe this is incorrectly accusing the gcc 4.2 optimizer, whereas it's actually a code bug, and certainly it is not restricted to gcc 4.2 (unless this addressed a different problem). It addressed a different problem. Type git show bb8f3090ba37aa3f24943fdb43c4120776289658 to see explanation of it. Ok.
Not quite sure though why the code violates strict-aliasing rules in all those places - about half of the warnings are from pipe_reference (p_refcount.h:85). Not sure if all warnings are actually real issues, and not sure how this should be fixed (should we try to fix this for real or just force -fno-strict-aliasing). I read (forgot where) that gcc strict aliasing warnings don't catch all cases. The gcc man page states this (-Wstrict-aliasing=n). Says though (gcc 4.4.1) with n=3 (default) there should be very few false positives and few false negatives. I've seen strict aliasing assumptions causing bugs in other gallium components. It seems endemic to our code. Unless we actively decide to go and chase the strict aliasing bugs now we should add -fno-strict-aliasing to all our builds. ok. I guess though there's no guarantee it won't break other compilers where we haven't set any flags for this. Roland
Re: [Mesa3d-dev] mesa/gallium strict aliasing bugs
On 02.12.2009 18:33, José Fonseca wrote: I've seen strict aliasing assumptions causing bugs in other gallium components. It seems endemic to our code. Unless we actively decide to go and chase the strict aliasing bugs now we should add -fno-strict-aliasing to all our builds. Hmm, actually some of them (in mesa at least) seem to be really unnecessary. Take the COPY_4FV macro for instance. I replaced that in a simple test program (attached) to either just do direct assignment without cast, or use memcpy instead. The results are actually interesting, the comment says the cast is done to avoid going through fp registers, but looking at the assembly (at least with optimization) that doesn't happen anyway, and the generated code is actually nearly identical, but in fact it not only triggers strict-aliasing warnings but doesn't work correctly (when compiled with -O3 or similar parameters invoking -fstrict-aliasing).

assign_cast:
.LFB45:
	.cfi_startproc
	movl	(%rsi), %edx
	leaq	4(%rdi), %rax
	movl	%edx, 4(%rdi)
	movl	4(%rsi), %edx
	movl	%edx, 4(%rax)
	movl	8(%rsi), %edx
	movl	%edx, 8(%rax)
	movl	12(%rsi), %edx
	movl	%edx, 12(%rax)
	ret
	.cfi_endproc

assign:
.LFB46:
	.cfi_startproc
	movl	(%rsi), %eax
	movl	%eax, 4(%rdi)
	movl	4(%rsi), %eax
	movl	%eax, 8(%rdi)
	movl	8(%rsi), %eax
	movl	%eax, 12(%rdi)
	movl	12(%rsi), %eax
	movl	%eax, 16(%rdi)
	ret
	.cfi_endproc

But clearly using memcpy the compiler does a better job:

assign_cpy:
.LFB44:
	.cfi_startproc
	movq	(%rsi), %rax
	movq	%rax, 4(%rdi)
	movq	8(%rsi), %rax
	movq	%rax, 12(%rdi)
	ret
	.cfi_endproc
.LFE44:

Doesn't use 128bit sse moves but looks like an improvement... When using no optimization the code certainly gets much less readable and the memcpy version will call glibc memcpy (which itself will still be optimized hence probably faster despite the function call). So I'll kill at least this one and just use _mesa_memcpy there, unless there are good reasons not to. I think pretty much all compilers should have builtin memcpy optimizations.
Roland

#include <string.h>
#include <stdio.h>

#define COPY_4FV( DST, SRC ) \
do { \
   const unsigned *_s = (const unsigned *) (SRC); \
   unsigned *_d = (unsigned *) (DST); \
   _d[0] = _s[0]; \
   _d[1] = _s[1]; \
   _d[2] = _s[2]; \
   _d[3] = _s[3]; \
} while (0)

#define COPY_4FV_NOCAST( DST, SRC ) \
do { \
   (DST)[0] = (SRC)[0]; \
   (DST)[1] = (SRC)[1]; \
   (DST)[2] = (SRC)[2]; \
   (DST)[3] = (SRC)[3]; \
} while (0)

#define COPY_4FV_MEMCPY( DST, SRC ) \
do { \
   memcpy(DST, SRC, sizeof(float) * 4); \
} while (0)

struct sfloat {
   unsigned unused;
   float p[4];
};

void assign_cpy(struct sfloat *s, float *param)
{
   COPY_4FV_MEMCPY(s->p, param);
}

void assign_cast(struct sfloat *s, float *param)
{
   COPY_4FV(s->p, param);
}

void assign(struct sfloat *s, float *param)
{
   COPY_4FV_NOCAST(s->p, param);
}

int main(void)
{
   float fl[4] = {0.1, 0.2, 0.3, 0.4};
   struct sfloat s1;
   struct sfloat s2;
   struct sfloat s3;
   assign(&s1, fl);
   fprintf(stderr, "assigned values are %f %f %f %f\n",
           s1.p[0], s1.p[1], s1.p[2], s1.p[3]);
   assign_cpy(&s2, fl);
   fprintf(stderr, "assigned values are %f %f %f %f\n",
           s2.p[0], s2.p[1], s2.p[2], s2.p[3]);
   assign_cast(&s3, fl);
   fprintf(stderr, "assigned values are %f %f %f %f\n",
           s3.p[0], s3.p[1], s3.p[2], s3.p[3]);
   return 0;
}

-- Join us December 9, 2009 for the Red Hat Virtual Experience, a free event focused on virtualization and cloud computing. Attend in-depth sessions from your desk. Your couch. Anywhere. http://p.sf.net/sfu/redhat-sfdev2dev ___ Mesa3d-dev mailing list Mesa3d-dev@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/mesa3d-dev
[Mesa3d-dev] gallium-noblocks branch merge
Hi, I'm planning to merge the gallium-noblocks branch to master soon. This API change may affect your driver, state tracker, whatever. I _should_ have fixed up all in-tree stuff using it, but that's no guarantee it will still run correctly (the nv50 driver was strange, for instance), and certainly if you have out-of-tree things they will break. The changes themselves should be fairly simple; you can read more about them in the git log. Roland
Re: [Mesa3d-dev] mesa/gallium strict aliasing bugs
On 02.12.2009 18:33, José Fonseca wrote: I've seen strict aliasing assumptions causing bugs in other gallium components. It seems endemic to our code. Unless we actively decide to go and chase the strict aliasing bugs now we should add -fno-strict-aliasing to all our builds. Do we ever want to fix strict aliasing? If we do, I think the problem with refcounting is pretty fundamental (I traced the crash to aliasing problems there, and hacked up some bogus version which didn't segfault for the testcase I used). At least I can't see a way to make this really work in some nice way. Supposedly gcc supports __attribute__((__may_alias__)) but I had no luck with it. In gallium (not core mesa) there's only one other offender causing a large number of warnings, namely util_pack_color, and I think it won't actually cause problems.
Re: [Mesa3d-dev] Mesa (mesa_7_7_branch): mesa: Fix array out-of-bounds access by _mesa_TexGeni.
On 01.12.2009 11:16, Ian Romanick wrote: Speaking of which... there are a bunch of conflicts merging 7.7 to master in Galliumland. Could one of you guys take a look at it? I have no clue what's going on over there. Quite a few of those were due to the gallium interface changes (introduced by the width0 branch merge). It will only get worse when I merge the gallium-noblocks branch (not quite there yet). Those changes are fairly intrusive, as the API changes affect a lot of files/code. Roland
Re: [Mesa3d-dev] Mesa (mesa_7_7_branch): mesa: Fix array out-of-bounds access by _mesa_TexGeni.
On 01.12.2009 15:35, Keith Whitwell wrote: On Tue, 2009-12-01 at 06:31 -0800, Roland Scheidegger wrote: On 01.12.2009 11:16, Ian Romanick wrote: Speaking of which... there are a bunch of conflicts merging 7.7 to master in Galliumland. Could one of you guys take a look at it? I have no clue what's going on over there. Quite a few of those were due to the gallium interface changes (introduced by the width0 branch merge). It will only get worse when I merge the gallium-noblocks branch (not quite there yet). Those changes are fairly intrusive, as the API changes affect a lot of files/code. They were pretty minimal really - but there was some knowledge required of what is new and what is old. It's not much fun resolving conflicts in code you don't know about, but the conflicts themselves weren't onerous. Yes, the changes themselves are pretty simple - it's just that because so many files are affected there's a lot of potential for future merge conflicts. Nothing really difficult to resolve, but annoying nonetheless (and there's no way to avoid it). Roland
Re: [Mesa3d-dev] [PATCH] Add entrypoints for setting vertex texture state
On 27.11.2009 19:32, michal wrote: Why is the MAX here smaller than for fragment samplers? Doesn't GL require them to be the same, because GL effectively binds the same set of sampler states in both cases? Can you take a closer look at what the GL state tracker would have to do to expose this functionality and make sure it's valid? It's all good. There is GL_MAX_VERTEX_TEXTURE_UNITS that tells how many samplers can be used in a vertex shader. Anything above that is used only with fragment shaders and ignored for vertex shaders. I fail to see though why the limit needs to be that low. All modern hardware nowadays can use the same number of texture samplers for both fragment and vertex shading (it's the same sampler hardware, after all). Some older hardware (typically non-unified, D3D9 shader model 3 compliant) indeed only had limited support for this (like the GeForce 6/7 series), probably only supporting 4 (can't remember exactly), while other hardware never implemented it at all despite d3d9 sm3 requiring it (thanks to an API loophole). Roland
[Mesa3d-dev] gallium width0 branch merge
Hi, just a warning: I'm planning on merging the width0 branch to master tomorrow. This is an interface change eliminating the width/height/depth arrays from pipe_texture, instead just storing the base width/height/depth. In-tree drivers/state trackers should be fixed (I think though there might be bugs with rbug), but obviously if you have any out-of-tree drivers they will break (though they should be trivial to fix). Roland
[Mesa3d-dev] st_shader-varients merge tomorrow
I'm planning to merge the st_shader-varients branch to master tomorrow. This should not adversely affect drivers, unless they rely on generic inputs/outputs semantic_index always starting at 0 without holes (something that they shouldn't do, but it would have worked previously). Feedback for hw drivers welcome; I'll try i915 myself, but I can't test the others, though a quick glance seemed to suggest they should be ok. Roland
Re: [Mesa3d-dev] Blit support for r300
On 23.10.2009 08:37, Maciej Cencora wrote: Hi, as you may already know the r300 classic driver is in pretty good shape these days, but there's one thing that causes major slowdowns in many games: lack of a hardware accelerated blit operation. The same is true for r100/r200... Currently all glCopyTex[Sub]Image operations are done through span functions, which is slow as hell. We could use the hw blitter unit, but using it causes stalls because of the 2D/3D mode switch. A long time ago I implemented this as a hack for r200 (just blit directly to the texture in vram, so never touching the backup texture in system memory). Worked quite well in practice (good enough for doom3 special effects). I didn't notice any obvious slowdowns due to 2d/3d sync issues (though maybe I didn't do any syncs...). I was wondering how this could be fixed and I got this crazy idea of porting the everything-is-a-texture concept from gallium to classic mesa. Actually not all of it, just the pieces that make the renderbuffers look like textures for the driver. You could probably just try to hack up a blit using the 3D engine? Though of course lots of setup would be needed. A nice thing about not using the blitter (apart from potential performance issues) is of course that you also get format conversion for free. Brian, what do you think about this idea? Is it feasible and worth doing? Maybe you have better ideas how to resolve this issue? Not sure what Brian's opinion on that is, but I'm not sure there's really much point in trying to port over half of gallium to classic mesa. Looks to me like time might be better spent working on gallium drivers instead... Roland
Re: [Mesa3d-dev] [PATCH 1/2] mesa: Compact state key for TexEnv program cache
Hmm, I'm not actually sure this will always reduce the state key size. I think the compiler is still allowed to pad the mode_opt struct out to whatever it likes (maybe #pragma pack(1) can prevent this), even though maybe gcc does not. I don't like pragmas too much, but it looks like the only way to do this in some clean c99 way would be to get rid of the mode_opt struct entirely? Roland

On 02.09.2009 16:23, Brian Paul wrote: Unfortunately gcc (version 4.3.2 anyway) warns on this:

main/texenvprogram.c:87: warning: type of bit-field ‘Source’ is a GCC extension
main/texenvprogram.c:88: warning: type of bit-field ‘Operand’ is a GCC extension

I'm trying to find a #pragma or something to silence the warning... -Brian

Keith Whitwell wrote: Looks great Chris. Keith

On Wed, 2009-09-02 at 05:11 -0700, Chris Wilson wrote: By rearranging the bitfields within the key we can reduce the size of the key from 644 to 196 bytes, reducing the cost of both the hashing and equality tests.
---
 src/mesa/main/texenvprogram.c |    7 ++++---
 1 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/mesa/main/texenvprogram.c b/src/mesa/main/texenvprogram.c
index 5913957..3851937 100644
--- a/src/mesa/main/texenvprogram.c
+++ b/src/mesa/main/texenvprogram.c
@@ -82,8 +82,8 @@ texenv_doing_secondary_color(GLcontext *ctx)
 #define DISASSEM (MESA_VERBOSE & VERBOSE_DISASSEM)
 
 struct mode_opt {
-   GLuint Source:4;   /**< SRC_x */
-   GLuint Operand:3;  /**< OPR_x */
+   GLubyte Source:4;  /**< SRC_x */
+   GLubyte Operand:3; /**< OPR_x */
 };
 
 struct state_key {
@@ -103,10 +103,11 @@ struct state_key {
       GLuint NumArgsRGB:3;  /**< up to MAX_COMBINER_TERMS */
       GLuint ModeRGB:5;     /**< MODE_x */
-      struct mode_opt OptRGB[MAX_COMBINER_TERMS];
 
       GLuint NumArgsA:3;  /**< up to MAX_COMBINER_TERMS */
       GLuint ModeA:5;     /**< MODE_x */
+
+      struct mode_opt OptRGB[MAX_COMBINER_TERMS];
       struct mode_opt OptA[MAX_COMBINER_TERMS];
    } unit[MAX_TEXTURE_UNITS];
 };
Re: [Mesa3d-dev] Merging asm-shader-rework-1 branch today
On 23.08.2009 01:50, Ian Romanick wrote: Philipp Heise wrote: Ian Romanick wrote: Roland Scheidegger wrote: glprogs/R200_interaction.vp GL_PROGRAM_ERROR_STRING_ARB: line 1, char 43: error: syntax error, unexpected $undefined Okay. I posted a patch to bug #23457 that should fix this. Could you give it a test on R200, and let me know? I've only run this on my laptop, and I don't have Doom3 installed there. I haven't yet tested it in-game. Hi Ian, thanks for your great work! The problem is not the header of the vp, but the additional carriage return at the end of each line ... DOS newline format. Therefore the parser fails at the end of the header line. The attached patch should fix the problem. Oh good grief! It's always the little things. Hmm... Unix uses \n, and DOS uses \r\n. Don't Macs use \r? If that's the case, the proposed patch could cause the line numbers to be incorrect if the shaders are authored on Macs. I should be able to whip up a patch that will handle that case, but it will have to wait until later today. Thanks for tracking this down. Works perfectly now indeed. Sorry for leading you down the wrong track first; I don't know why the doom3 error output doesn't include the header of the shader even though it's actually there. Roland
Re: [Mesa3d-dev] Merging asm-shader-rework-1 branch today
On 21.08.2009 20:26, Ian Romanick wrote: All, In the next couple hours I'm planning to merge the asm-shader-rework-1 branch to master. In my testing I have found that it passes at least as many (and in a couple cases more) tests than the current code. One of our internal tests runs about 89,000 vertex programs. This test takes about 30 minutes (1,800 seconds) on current Mesa master. On the new code it takes about 25 seconds. Good work! It seems to break (all of) doom3's (vertex, at least) shaders however. At least with r200; here's the doom3 output for the main r200 vertex shader (the others break in exactly the same way).

glprogs/R200_interaction.vp
GL_PROGRAM_ERROR_STRING_ARB: line 1, char 43: error: syntax error, unexpected $undefined
error at 34: ariant ;

# this is slightly simpler than the ARB interaction,
# because the R200 can only emit six texture coordinates,
# so we assume that the diffuse and specular matrixes are
# the same, with higher level code splitting it into two
# passes if it isn't
#
# I am using texcoords instead of attribs, because a separate
# extension is required to use attribs with vertex array objects.
#
# input:
#
# TEX0 texture coordinates
# TEX1 tangent[0]
# TEX2 tangent[1]
# TEX3 normal
# COL vertex color
#
# c[4] localLightOrigin
# c[5] localViewOrigin
# c[6] lightProjection S
# c[7] lightProjection T
# c[8] lightProjection Q
# c[9] lightFalloff S
# c[10] bumpMatrix S
# c[11] bumpMatrix T
# c[12] diffuseMatrix S
# c[13] diffuseMatrix T
# c[14] specularMatrix S
# c[15] specularMatrix T
#
# output:
#
# texcoord 0 = light projection texGen
# texcoord 1 = light falloff texGen
# texcoord 2 = bumpmap texCoords
# texcoord 3 = specular / diffuse texCoords
# texcoord 4 = normalized halfangle vector in tangent space
# texcoord 5 = unnormalized vector to light in tangent space

TEMP R0, R1, R2, lightDir;
PARAM defaultTexCoord = { 0, 0.5, 0, 1 };

# texture 0 has three texgens
DP4 result.texcoord[0].x, vertex.position, program.env[6];
DP4 result.texcoord[0].y, vertex.position, program.env[7];
DP4 result.texcoord[0].w, vertex.position, program.env[8];

# texture 1 has one texgen
MOV result.texcoord[1], defaultTexCoord;
DP4 result.texcoord[1].x, vertex.position, program.env[9];

# texture 2 takes the base coordinates by the texture matrix
MOV result.texcoord[2], defaultTexCoord;
DP4 result.texcoord[2].x, vertex.texcoord[0], program.env[10];
DP4 result.texcoord[2].y, vertex.texcoord[0], program.env[11];

# texture 3 takes the base coordinates by the texture matrix
MOV result.texcoord[3], defaultTexCoord;
DP4 result.texcoord[3].x, vertex.texcoord[0], program.env[12];
DP4 result.texcoord[3].y, vertex.texcoord[0], program.env[13];

# texture 4's texcoords will be the halfangle in tangent space

# calculate normalized vector to light in R0
SUB lightDir, program.env[4], vertex.position;
DP3 R1, lightDir, lightDir;
RSQ R1, R1.x;
MUL R0, lightDir, R1.x;

# calculate normalized vector to viewer in R1
SUB R1, program.env[5], vertex.position;
DP3 R2, R1, R1;
RSQ R2, R2.x;
MUL R1, R1, R2.x;

# add together to become the half angle vector in object space (non-normalized)
ADD R0, R0, R1;

# put into texture space
DP3 result.texcoord[4].x, vertex.texcoord[1], R0;
DP3 result.texcoord[4].y, vertex.texcoord[2], R0;
DP3 result.texcoord[4].z, vertex.texcoord[3], R0;

# texture 5's texcoords will be the unnormalized lightDir in tangent space
DP3 result.texcoord[5].x, vertex.texcoord[1], lightDir;
DP3 result.texcoord[5].y, vertex.texcoord[2], lightDir;
DP3 result.texcoord[5].z, vertex.texcoord[3], lightDir;

# generate the vertex color, which can be 1.0, color, or 1.0 - color
# for 1.0 : env[16] = 0, env[17] = 1
# for color : env[16] = 1, env[17] = 0
# for 1.0-color : env[16] = -1, env[17] = 1
MAD result.color, vertex.color, program.env[16], program.env[17];
END
Re: [Mesa3d-dev] Mesa (master): i965: Use _MaxElement instead of index-calculated min/ max for VBO bounds.
On 13.08.2009 12:19, Michel Dänzer wrote: On Wed, 2009-08-12 at 11:31 -0700, Eric Anholt wrote: Module: Mesa Branch: master Commit: e643bc5fc7afb563028f5a089ca5e38172af41a8 URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=e643bc5fc7afb563028f5a089ca5e38172af41a8 Author: Eric Anholt e...@anholt.net Date: Tue Aug 11 12:59:09 2009 -0700 i965: Use _MaxElement instead of index-calculated min/max for VBO bounds. This change breaks things all over the place here. E.g. progs/glsl/array and .../skinning are missing most of the geometry, and a lot of the other glsl progs have weird lighting. The problem here is that for vertex buffer elements which have zero stride, count is 1. This is now used as a bounds check, which will hence fail for most indices; according to the docs the hardware will simply return 0 in this case (and not the data at the start index or something like that, which would work). I'll fix this (it should be ok to just disable the bounds check for these cases, since in fact any index is valid if stride is 0). (Looks like this isn't an issue for IGDNG (Ironlake, I assume) since it appears this one checks against an address range instead of a maximum index, according to the code...) Roland
Re: [Mesa3d-dev] ATI Mobility Radeon X300: Blender menus all black (or white)
On 31.07.2009 10:26, Terry Barnaby wrote: Hi, I have a problem with the Mesa DRI Radeon 300 driver in that I cannot use the blender application, as the menus are not displayed correctly. See bug: https://bugs.freedesktop.org/show_bug.cgi?id=21774 I would like to get this fixed as I need to be able to run blender at a reasonable speed on this system. I am running the latest DRM/MESA/xf86-video-ati code from git. Are there any pointers on how to start debugging this? For example, can I turn off various hardware acceleration features one by one until I find the source of the problem? This looks like an issue with dri2 and front buffer rendering. Did you try whether this still works with dri1? You probably can't just disable dri2, but disabling kms (nomodeset boot param) should force the driver to use dri1 (for radeon cards) I think. Roland
Re: [Mesa3d-dev] ATI Mobility Radeon X300: Blender menus all black (or white)
On 31.07.2009 15:35, Terry Barnaby wrote: On 07/31/2009 02:15 PM, Roland Scheidegger wrote: On 31.07.2009 10:26, Terry Barnaby wrote: Hi, I have a problem with the Mesa DRI Radeon 300 driver in that I cannot use the blender application, as the menus are not displayed correctly. See bug: https://bugs.freedesktop.org/show_bug.cgi?id=21774 I would like to get this fixed as I need to be able to run blender at a reasonable speed on this system. I am running the latest DRM/MESA/xf86-video-ati code from git. Are there any pointers on how to start debugging this? For example, can I turn off various hardware acceleration features one by one until I find the source of the problem? This looks like an issue with dri2 and front buffer rendering. Did you try whether this still works with dri1? You probably can't just disable dri2, but disabling kms (nomodeset boot param) should force the driver to use dri1 (for radeon cards) I think. Roland Thanks for the reply. I have tried a lot of user level configuration options such as nomodeset, disabling EXA etc. None of these had any effect on the blender issue. Using nomodeset on the kernel boot line did appear to change the DRI interface from DRI2 to DRI1 (at least from the glxinfo report). Hence my move to trying the latest code from git ... Oh, so it also happens with DRI1. It would probably be easiest (as others have suggested) to do a git bisect then. Alternatively, since this appears to work on r200 but not r300, you could try finding differences wrt front buffer rendering between those drivers manually - it seems unlikely this would happen to work with git master on r200 but not r300, since pretty much all of the code which could affect this issue should be shared. Roland
Re: [Mesa3d-dev] ATI R200 code currently broken in git
On 31.07.2009 17:36, Terry Barnaby wrote: I have just compiled/installed the latest drm/mesa/xf86-video-ati code from git under Fedora 11 on a system with an ATI Technologies Inc RV280 [Radeon 9200 PRO] graphics board. 2D appears fine. 3D is quite broken. glxgears runs showing about 500 frames/sec; however it shows moving gears for 1 second followed by 5 seconds of still frame, repeated. blender just shows a really corrupted screen with chess board patterns, lots of horizontal tearing, half-drawn menus etc ... Hmm, I don't see that. I'm using a very old drm/ddx on that box though, so no kms/dri2, and no compositing either. Roland