Re: [Mesa3d-dev] [RFC] gallium-sampler-view branch merge
michal wrote on 2010-03-12 15:00: michal wrote on 2010-03-11 17:59: Keith Whitwell wrote on 2010-03-11 16:16: On Thu, 2010-03-11 at 06:05 -0800, michal wrote: Keith Whitwell wrote on 2010-03-11 14:21: On Thu, 2010-03-11 at 03:16 -0800, michal wrote:
Hi, I would like to merge the branch in subject this week. This feature branch allows state trackers to bind sampler views instead of textures to shader stages. A sampler view object holds a reference to a texture and also overrides the internal texture format (resource casting) and specifies an RGBA swizzle (needed for the GL_EXT_texture_swizzle extension).
Michal, I've got some issues with the way the sampler views are being generated and used inside the CSO module. The point of a sampler view is that it gives the driver an opportunity to do expensive operations required for special sampling modes (which may include copying surface data if hardware is deficient in some way). This approach works if a sampler view is created once, then used multiple times before being deleted. Unfortunately, it seems the changes to support this in the CSO module provide only a single-shot usage model. Sampler views are created in cso_set_XXX_sampler_textures, bound to the context, and then dereferenced/destroyed on the next bind.
The reason CSO code looks like this is because it was meant to be an intermediate step towards migration to the sampler view model. Fully converting all existing state trackers is non-trivial and thus I chose this conservative approach. State trackers that do not care about extra features a sampler view provides will keep using this one-shot CSO interface with the hope that creation of sampler objects is lightweight (format matches texture format, swizzle matches native texel layout, etc.).
On the surface, this hope isn't likely to be fulfilled - lots of hardware doesn't support non-zero first_level. Most cases of drivers implementing sampler views internally are to catch this issue.
Of course, it seems like your branch also leaves the existing driver-specific sampler view code in place, so that there are potentially two implementations of sampler views in those drivers. I guess this means that you can get away with the current implementation for now, but it prevents drivers actually taking advantage of the fact that these entities exist in the interface -- they will continue to have to duplicate the concept internally until the state trackers and/or CSO module start caching views.
Ideally, everybody moves on and we stop using CSO for sampler views. I prefer putting my effort into incremental migration of state trackers rather than caching something that by definition doesn't need to be cached.
The CSO module exists to manage this type of caching on behalf of state trackers. I would have thought that this was a sensible extension of the existing purpose of the CSO module. Won't all state-trackers implementing APIs which don't expose sampler views to the application require essentially the same caching logic, as is the case with regular state? Wouldn't it be least effort to do that caching once only in the CSO module?
OK, I see your point. I will make the necessary changes and ping you when that's done.
Keith, I changed my mind, went ahead and implemented sampler view caching in the mesa state tracker, rather than inside the cso context. I strongly believe that doing caching on the cso side would be slower and more complicated. A state tracker has a better understanding of the relationship between a texture and a sampler view. In the case of mesa, this is a trivial 1-to-1 mapping. Later, when we need more sampler views per texture, we can have a per-texture cache for that, and yes, the code for that would be in cso. There are two other state trackers that need to be fixed: xorg and vega. The transition should be similar to mesa -- I can help with doing that, but I can't do it myself. Once that's done we can purge one-shot sampler view wrappers. What do you think?
Keith, I just finished transforming mesa and auxiliary modules to the new sampler view interfaces. The remaining bits are the vega and xorg state trackers -- I will need help with them, but they could be fixed after the merge, as they are not broken, and just set sampler views in a suboptimal fashion. Please review, thanks.
-- Download Intel® Parallel Studio Eval Try the new software tools for yourself. Speed compiling, find bugs proactively, and fine-tune applications for parallel performance. See why Intel Parallel Studio got high marks during beta. http://p.sf.net/sfu/intel-sw
Re: [Mesa3d-dev] [RFC] gallium-sampler-view branch merge
Keith Whitwell wrote on 2010-03-15 15:19: On Mon, 2010-03-15 at 07:08 -0700, michal wrote: michal wrote on 2010-03-12 15:00: michal wrote on 2010-03-11 17:59: Keith Whitwell wrote on 2010-03-11 16:16: On Thu, 2010-03-11 at 06:05 -0800, michal wrote: Keith Whitwell wrote on 2010-03-11 14:21: On Thu, 2010-03-11 at 03:16 -0800, michal wrote:
Hi, I would like to merge the branch in subject this week. This feature branch allows state trackers to bind sampler views instead of textures to shader stages. A sampler view object holds a reference to a texture and also overrides the internal texture format (resource casting) and specifies an RGBA swizzle (needed for the GL_EXT_texture_swizzle extension).
Michal, I've got some issues with the way the sampler views are being generated and used inside the CSO module. The point of a sampler view is that it gives the driver an opportunity to do expensive operations required for special sampling modes (which may include copying surface data if hardware is deficient in some way). This approach works if a sampler view is created once, then used multiple times before being deleted. Unfortunately, it seems the changes to support this in the CSO module provide only a single-shot usage model. Sampler views are created in cso_set_XXX_sampler_textures, bound to the context, and then dereferenced/destroyed on the next bind.
The reason CSO code looks like this is because it was meant to be an intermediate step towards migration to the sampler view model. Fully converting all existing state trackers is non-trivial and thus I chose this conservative approach. State trackers that do not care about extra features a sampler view provides will keep using this one-shot CSO interface with the hope that creation of sampler objects is lightweight (format matches texture format, swizzle matches native texel layout, etc.).
On the surface, this hope isn't likely to be fulfilled - lots of hardware doesn't support non-zero first_level.
Most cases of drivers implementing sampler views internally are to catch this issue. Of course, it seems like your branch also leaves the existing driver-specific sampler view code in place, so that there are potentially two implementations of sampler views in those drivers. I guess this means that you can get away with the current implementation for now, but it prevents drivers actually taking advantage of the fact that these entities exist in the interface -- they will continue to have to duplicate the concept internally until the state trackers and/or CSO module start caching views.
Ideally, everybody moves on and we stop using CSO for sampler views. I prefer putting my effort into incremental migration of state trackers rather than caching something that by definition doesn't need to be cached.
The CSO module exists to manage this type of caching on behalf of state trackers. I would have thought that this was a sensible extension of the existing purpose of the CSO module. Won't all state-trackers implementing APIs which don't expose sampler views to the application require essentially the same caching logic, as is the case with regular state? Wouldn't it be least effort to do that caching once only in the CSO module?
OK, I see your point. I will make the necessary changes and ping you when that's done.
Keith, I changed my mind, went ahead and implemented sampler view caching in the mesa state tracker, rather than inside the cso context. I strongly believe that doing caching on the cso side would be slower and more complicated. A state tracker has a better understanding of the relationship between a texture and a sampler view. In the case of mesa, this is a trivial 1-to-1 mapping. Later, when we need more sampler views per texture, we can have a per-texture cache for that, and yes, the code for that would be in cso. There are two other state trackers that need to be fixed: xorg and vega.
The transition should be similar to mesa -- I can help with doing that, but I can't do it myself. Once that's done we can purge one-shot sampler view wrappers. What do you think?
Keith, I just finished transforming mesa and auxiliary modules to new sampler view interfaces. The remaining bits are vega and xorg state trackers -- I will need help with them, but they could be fixed after the merge, as they are not broken, and just set sampler view in suboptimal fashion. Please review, thanks.
Michal, Did you get a chance to look at the double
Re: [Mesa3d-dev] [RFC] gallium-sampler-view branch merge
michal wrote on 2010-03-11 17:59: Keith Whitwell wrote on 2010-03-11 16:16: On Thu, 2010-03-11 at 06:05 -0800, michal wrote: Keith Whitwell wrote on 2010-03-11 14:21: On Thu, 2010-03-11 at 03:16 -0800, michal wrote:
Hi, I would like to merge the branch in subject this week. This feature branch allows state trackers to bind sampler views instead of textures to shader stages. A sampler view object holds a reference to a texture and also overrides the internal texture format (resource casting) and specifies an RGBA swizzle (needed for the GL_EXT_texture_swizzle extension).
Michal, I've got some issues with the way the sampler views are being generated and used inside the CSO module. The point of a sampler view is that it gives the driver an opportunity to do expensive operations required for special sampling modes (which may include copying surface data if hardware is deficient in some way). This approach works if a sampler view is created once, then used multiple times before being deleted. Unfortunately, it seems the changes to support this in the CSO module provide only a single-shot usage model. Sampler views are created in cso_set_XXX_sampler_textures, bound to the context, and then dereferenced/destroyed on the next bind.
The reason CSO code looks like this is because it was meant to be an intermediate step towards migration to the sampler view model. Fully converting all existing state trackers is non-trivial and thus I chose this conservative approach. State trackers that do not care about extra features a sampler view provides will keep using this one-shot CSO interface with the hope that creation of sampler objects is lightweight (format matches texture format, swizzle matches native texel layout, etc.).
On the surface, this hope isn't likely to be fulfilled - lots of hardware doesn't support non-zero first_level. Most cases of drivers implementing sampler views internally are to catch this issue.
Of course, it seems like your branch also leaves the existing driver-specific sampler view code in place, so that there are potentially two implementations of sampler views in those drivers. I guess this means that you can get away with the current implementation for now, but it prevents drivers actually taking advantage of the fact that these entities exist in the interface -- they will continue to have to duplicate the concept internally until the state trackers and/or CSO module start caching views.
Ideally, everybody moves on and we stop using CSO for sampler views. I prefer putting my effort into incremental migration of state trackers rather than caching something that by definition doesn't need to be cached.
The CSO module exists to manage this type of caching on behalf of state trackers. I would have thought that this was a sensible extension of the existing purpose of the CSO module. Won't all state-trackers implementing APIs which don't expose sampler views to the application require essentially the same caching logic, as is the case with regular state? Wouldn't it be least effort to do that caching once only in the CSO module?
OK, I see your point. I will make the necessary changes and ping you when that's done.
Keith, I changed my mind, went ahead and implemented sampler view caching in the mesa state tracker, rather than inside the cso context. I strongly believe that doing caching on the cso side would be slower and more complicated. A state tracker has a better understanding of the relationship between a texture and a sampler view. In the case of mesa, this is a trivial 1-to-1 mapping. Later, when we need more sampler views per texture, we can have a per-texture cache for that, and yes, the code for that would be in cso. There are two other state trackers that need to be fixed: xorg and vega. The transition should be similar to mesa -- I can help with doing that, but I can't do it myself. Once that's done we can purge one-shot sampler view wrappers. What do you think?
___ Mesa3d-dev mailing list Mesa3d-dev@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/mesa3d-dev
Re: [Mesa3d-dev] Mesa (gallium-sampler-view): st/mesa: Associate a sampler view with an st texture object.
Keith Whitwell wrote on 2010-03-12 14:46:
Michal, Is the intention to have 1 sampler view active in the Mesa state tracker, specifically in the cases where min_lod varies? In other words, you seem to have two ways of specifying the same state: pipe_sampler_view::first_level and pipe_sampler::min_lod. Is there a case to keep both of these? Or is one enough?
It looks like one has to go away, and that would be pipe_sampler::min_lod. And we want to have a per-texture cache of sampler views in mesa.
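The create-once, reuse-many model discussed in this thread could be sketched roughly as below. This is only an illustration: every `demo_*` name is hypothetical and stands in for the real pipe and state tracker structures, and the view creation is reduced to a plain allocation, whereas a real driver might do expensive work at that point.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are NOT the real Gallium structs. */
struct demo_texture {
    int width, height;
};

struct demo_sampler_view {
    struct demo_texture *texture; /* texture this view samples from */
    unsigned first_level;         /* lowest mip level visible through the view */
};

/* State-tracker-side texture object that owns at most one cached view
 * (the trivial 1-to-1 mapping mentioned for mesa). */
struct demo_st_texture {
    struct demo_texture tex;
    struct demo_sampler_view *view; /* NULL until first use */
    unsigned views_created;         /* counts potentially expensive creations */
};

/* Return the cached view, creating it only on first use. */
static struct demo_sampler_view *
demo_get_sampler_view(struct demo_st_texture *st)
{
    if (!st->view) {
        st->view = calloc(1, sizeof(*st->view));
        if (!st->view)
            return NULL;
        st->view->texture = &st->tex;
        st->view->first_level = 0;
        st->views_created++;
    }
    return st->view;
}

/* Drop the cached view, e.g. when the texture object is destroyed. */
static void
demo_release_sampler_view(struct demo_st_texture *st)
{
    free(st->view);
    st->view = NULL;
}
```

Binding the same texture on several draws then calls `demo_get_sampler_view()` each time but pays the creation cost once, which is the property the one-shot CSO wrappers lack.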
Re: [Mesa3d-dev] [RFC] gallium-sampler-view branch merge
Keith Whitwell wrote on 2010-03-11 14:21: On Thu, 2010-03-11 at 03:16 -0800, michal wrote:
Hi, I would like to merge the branch in subject this week. This feature branch allows state trackers to bind sampler views instead of textures to shader stages. A sampler view object holds a reference to a texture and also overrides the internal texture format (resource casting) and specifies an RGBA swizzle (needed for the GL_EXT_texture_swizzle extension).
Michal, I've got some issues with the way the sampler views are being generated and used inside the CSO module. The point of a sampler view is that it gives the driver an opportunity to do expensive operations required for special sampling modes (which may include copying surface data if hardware is deficient in some way). This approach works if a sampler view is created once, then used multiple times before being deleted. Unfortunately, it seems the changes to support this in the CSO module provide only a single-shot usage model. Sampler views are created in cso_set_XXX_sampler_textures, bound to the context, and then dereferenced/destroyed on the next bind.
The reason CSO code looks like this is because it was meant to be an intermediate step towards migration to the sampler view model. Fully converting all existing state trackers is non-trivial and thus I chose this conservative approach. State trackers that do not care about extra features a sampler view provides will keep using this one-shot CSO interface with the hope that creation of sampler objects is lightweight (format matches texture format, swizzle matches native texel layout, etc.).
Ideally, everybody moves on and we stop using CSO for sampler views. I prefer putting my effort into incremental migration of state trackers rather than caching something that by definition doesn't need to be cached.
Thanks for having a look. To make this change worthwhile, we'd want to somehow cache sampler views and reuse them on multiple draws.
Currently drivers that implement views internally hang them off the relevant texture. The choices in this branch are to do it in the CSO module, or push it up to the state tracker.
Re: [Mesa3d-dev] [RFC] gallium-sampler-view branch merge
Keith Whitwell wrote on 2010-03-11 16:16: On Thu, 2010-03-11 at 06:05 -0800, michal wrote: Keith Whitwell wrote on 2010-03-11 14:21: On Thu, 2010-03-11 at 03:16 -0800, michal wrote:
Hi, I would like to merge the branch in subject this week. This feature branch allows state trackers to bind sampler views instead of textures to shader stages. A sampler view object holds a reference to a texture and also overrides the internal texture format (resource casting) and specifies an RGBA swizzle (needed for the GL_EXT_texture_swizzle extension).
Michal, I've got some issues with the way the sampler views are being generated and used inside the CSO module. The point of a sampler view is that it gives the driver an opportunity to do expensive operations required for special sampling modes (which may include copying surface data if hardware is deficient in some way). This approach works if a sampler view is created once, then used multiple times before being deleted. Unfortunately, it seems the changes to support this in the CSO module provide only a single-shot usage model. Sampler views are created in cso_set_XXX_sampler_textures, bound to the context, and then dereferenced/destroyed on the next bind.
The reason CSO code looks like this is because it was meant to be an intermediate step towards migration to the sampler view model. Fully converting all existing state trackers is non-trivial and thus I chose this conservative approach. State trackers that do not care about extra features a sampler view provides will keep using this one-shot CSO interface with the hope that creation of sampler objects is lightweight (format matches texture format, swizzle matches native texel layout, etc.).
On the surface, this hope isn't likely to be fulfilled - lots of hardware doesn't support non-zero first_level. Most cases of drivers implementing sampler views internally are to catch this issue.
Of course, it seems like your branch also leaves the existing driver-specific sampler view code in place, so that there are potentially two implementations of sampler views in those drivers. I guess this means that you can get away with the current implementation for now, but it prevents drivers actually taking advantage of the fact that these entities exist in the interface -- they will continue to have to duplicate the concept internally until the state trackers and/or CSO module start caching views.
Ideally, everybody moves on and we stop using CSO for sampler views. I prefer putting my effort into incremental migration of state trackers rather than caching something that by definition doesn't need to be cached.
The CSO module exists to manage this type of caching on behalf of state trackers. I would have thought that this was a sensible extension of the existing purpose of the CSO module. Won't all state-trackers implementing APIs which don't expose sampler views to the application require essentially the same caching logic, as is the case with regular state? Wouldn't it be least effort to do that caching once only in the CSO module?
OK, I see your point. I will make the necessary changes and ping you when that's done.
Thanks.
Re: [Mesa3d-dev] Gallium questions ...
Jerome Glisse wrote on 2010-03-11 18:13:
Hi all, I have been a little bit out of the loop on the mesa side, so now I have a bunch of questions relating to gallium; apologies if I am asking about obvious things. First, in the tgsi compiler there is a Dimension field (struct tgsi_dimension) that I don't understand. It seems all drivers are ignoring it. From a quick glimpse at the tgsi code it's for 2d array addressing, but I think glsl only talks about 1d arrays. What are the expectations for this field? Should drivers care?
It makes sense for geometry shaders, where you use one dimension to address the input vertex, and another one to index a particular input attribute within that vertex.
What is the indirect boolean for in the src or dst operand of an instruction? What is the GLSL equivalent of it?
This is used to e.g. address constant registers with a non-constant index.
A more practical question: which gallium branches are likely to be merged in the next few weeks? I will likely have the r600g driver in good enough shape in the next few weeks to consider merging it with master, but I would like first to port it to the latest gallium changes before merging it so I don't put the burden on people working on those branches. What are the plans to expand TGSI to support new shader features? (double precision op, ...
I am planning to add a new set of texture fetch/sampler instructions in the immediate future. The gallium-double-opcodes branch has stalled, though, so it won't be merged any time soon.
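To make the two answers above concrete, here is a small sketch of what 2D register addressing and indirect addressing amount to when a shader is interpreted on the CPU. The types, limits, and the out-of-range policy are assumptions made for the example; they are not taken from the TGSI code.

```c
#include <assert.h>

#define DEMO_MAX_VERTS 6 /* hypothetical limits for this sketch */
#define DEMO_MAX_ATTRS 4

struct demo_vec4 { float x, y, z, w; };

/* Geometry-shader style input file: the first index selects the vertex of
 * the primitive, the second selects the attribute within that vertex. */
struct demo_gs_inputs {
    struct demo_vec4 in[DEMO_MAX_VERTS][DEMO_MAX_ATTRS];
};

static struct demo_vec4
demo_fetch_input_2d(const struct demo_gs_inputs *gs,
                    unsigned vertex, unsigned attr)
{
    return gs->in[vertex][attr];
}

/* Indirect addressing: the register index is a fixed offset plus the value
 * of an address register that is only known at run time. */
static struct demo_vec4
demo_fetch_const_indirect(const struct demo_vec4 *consts, unsigned count,
                          int base, int addr_reg)
{
    struct demo_vec4 zero = { 0.0f, 0.0f, 0.0f, 0.0f };
    int index = base + addr_reg;

    if (index < 0 || (unsigned)index >= count)
        return zero; /* out-of-range policy is an assumption of this sketch */
    return consts[index];
}
```

In GLSL terms the second function corresponds to indexing a uniform array with a value computed in the shader rather than a compile-time constant.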
Re: [Mesa3d-dev] PK/UP* and NV_[vertex|fragment]_program* support in Gallium?
Luca Barbieri wrote on 2010-03-01 18:25:
I see that PK2US and friends are being removed. These would be necessary to implement NV_fragment_program_option, NV_fragment_program2 and NV_gpu_program4. Currently no drivers (including Nouveau) support them, but since we already have some support in Mesa (even parsers for the nVidia syntax), it would be nice to support them in Gallium eventually. Not sure about STR/SFL though: they can be encoded/decoded as MOV x, 0/1, but they complete the SETcond instruction set. How about keeping them and adding a capability bit for them?
I don't know if anybody cares about those NV extensions, and if there's somebody eager enough to add support for them to the whole gallium stack, nothing stops him/her from re-adding those opcodes. The point of gallium-no-nvidia-opcodes is to strip down the TGSI instruction set to what's actually being used by state trackers and implemented by drivers.
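As a rough illustration of why STR and SFL are redundant yet complete the set-on-condition family, the scalar semantics can be written out as below. The enum names are hypothetical and are not the TGSI opcode values; the sketch only assumes the usual convention that a set-on-condition opcode writes 1.0 when the condition holds and 0.0 otherwise.

```c
#include <assert.h>

/* Hypothetical opcode names for this sketch. */
enum demo_set_op {
    DEMO_SEQ, DEMO_SNE, DEMO_SLT, DEMO_SGE, DEMO_SGT, DEMO_SLE,
    DEMO_STR, /* set on true: the condition always holds */
    DEMO_SFL  /* set on false: the condition never holds */
};

/* Evaluate one component of a set-on-condition instruction. */
static float
demo_set_on_condition(enum demo_set_op op, float a, float b)
{
    int cond = 0;

    switch (op) {
    case DEMO_SEQ: cond = (a == b); break;
    case DEMO_SNE: cond = (a != b); break;
    case DEMO_SLT: cond = (a <  b); break;
    case DEMO_SGE: cond = (a >= b); break;
    case DEMO_SGT: cond = (a >  b); break;
    case DEMO_SLE: cond = (a <= b); break;
    case DEMO_STR: cond = 1; break; /* same result as moving 1.0 */
    case DEMO_SFL: cond = 0; break; /* same result as moving 0.0 */
    }
    return cond ? 1.0f : 0.0f;
}
```

Because the last two cases ignore their operands, a driver that lacks them can emit a plain move of a constant instead.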
[Mesa3d-dev] [RFC] gallium-no-rhw-position branch merge
Hi, This branch removes the bypass_vs_clip_and_viewport flag from pipe rasterizer state. The benefits of having this bit around were dubious, and it was burdensome for driver writers. All the utility code that relied on this flag has been rewritten to pass vertex positions in clip space and set clip and viewport state. I would like to ask the maintainers of the u_blitter module to please test my changes and provide feedback. Please review. Thanks.
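For utility code that used to emit window-space positions, the conversion to clip space can be sketched as the inverse of the viewport transform. This is a minimal illustration under stated assumptions: the struct and function names are hypothetical, the viewport covers the whole render target, w is fixed at 1, and the y orientation and depth range conventions are left aside because they depend on the API in use.

```c
#include <assert.h>

/* Hypothetical viewport description: window = clip * scale + translate
 * (for w == 1). */
struct demo_viewport {
    float scale[2];
    float translate[2];
};

/* Viewport that maps clip-space [-1, 1] onto a width x height target. */
static struct demo_viewport
demo_viewport_for_target(float width, float height)
{
    struct demo_viewport vp;

    vp.scale[0] = width * 0.5f;
    vp.scale[1] = height * 0.5f;
    vp.translate[0] = width * 0.5f;
    vp.translate[1] = height * 0.5f;
    return vp;
}

/* Invert the viewport transform to get a clip-space position for a vertex
 * that was previously specified in window coordinates. */
static void
demo_window_to_clip(const struct demo_viewport *vp,
                    float wx, float wy, float clip[4])
{
    clip[0] = (wx - vp->translate[0]) / vp->scale[0];
    clip[1] = (wy - vp->translate[1]) / vp->scale[1];
    clip[2] = 0.0f; /* depth is not handled in this sketch */
    clip[3] = 1.0f;
}
```

A full-target quad therefore ends up with corners at (-1, -1) and (1, 1), and the viewport state set alongside it maps those back to the original pixel rectangle.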
Re: [Mesa3d-dev] move normalized texel coordinates bit to sampler view
Roland Scheidegger wrote on 2010-02-24 15:18: On 24.02.2010 12:48, Christoph Bumiller wrote:
This wasn't a problem before because textures and samplers were linked 1:1, but in view of the gallium-gpu4-texture-opcodes branch, this coordinate normalization bit becomes a problem. NV50 hardware has that bit in the RESOURCE binding, and not the SAMPLER binding, and you can imagine that this will lead to us having to jump through a few annoying looking hoops to accommodate. As far as I can see, neither D3D10 nor D3D11 nor OpenGL nor CUDA have sampler states that are decoupled from the texture, and which contain a normalized coordinates bit, so it's worth considering not having it there in gallium either. For OpenGL, unnormalized coordinates are only used for RECT textures, and in this case it makes sense to make it a property of the texture.
I agree this is not sampler state, but I don't quite agree this should be texture state. This changes how texture coordinates get interpreted in the interpolator - in that sense it is similar to the cylindrical texture coord wrap which we moved away from sampler state recently. This one got moved to shader declaration. I wonder if the normalization bit should be treated the same. Though OTOH you're quite right that in OpenGL this really is a texture property (it is a different texture target after all), and afaik d3d doesn't support non-normalized coords (?).
Hmm... Isn't it the case that for RECT targets we clear the bit, and for others we always set it? In mesa st I see: if (texobj->Target != GL_TEXTURE_RECTANGLE_ARB) sampler->normalized_coords = 1; By definition, a RECT texture with normalised coordinates is just an NPOT. If we removed this apparently redundant flag, would that make nouveau developers' lives easier?
And, finally, I've seen you reverted the changes for independent image and sampler index in the texture opcodes. What's up with that? Is the code not nice enough, or has the idea been discarded and my problem disappears?
Please consider this branch dead. It will be easier for me to introduce new, optional sampler and fetch opcodes à la GL 3.0. There's just too much code to fix and test and we still want the older hardware not to stand on its head to try and translate back to the old model. Thanks.
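The observation in this thread that the normalization bit is fully determined by the texture target can be sketched as follows. The enum and function names are hypothetical; the only fact taken from the discussion is that the mesa state tracker sets normalized coordinates for every target except RECT.

```c
#include <assert.h>

/* Hypothetical targets for this sketch; gallium had no RECT target at the
 * time of the thread. */
enum demo_target { DEMO_TEXTURE_2D, DEMO_TEXTURE_RECT };

/* The bit follows from the target, so it carries no extra information. */
static int
demo_normalized_coords(enum demo_target target)
{
    return target != DEMO_TEXTURE_RECT;
}

/* Convert a texture coordinate to a texel-space position along one axis.
 * Normalized coordinates are scaled by the texture size, unnormalized
 * (RECT) coordinates are already in texels. */
static float
demo_texel_position(enum demo_target target, float coord, unsigned size)
{
    return demo_normalized_coords(target) ? coord * (float)size : coord;
}
```

Sampling a 100-texel-wide 2D texture at 0.5 and a RECT texture of the same width at 50.0 lands on the same texel position, which is why a RECT texture with normalized coordinates is indistinguishable from an NPOT 2D texture.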
Re: [Mesa3d-dev] move normalized texel coordinates bit to sampler view
Christoph Bumiller wrote on 2010-02-25 19:39: On 25.02.2010 19:00, Brian Paul wrote: Roland Scheidegger wrote: On 25.02.2010 18:39, michal wrote: Roland Scheidegger wrote on 2010-02-24 15:18: On 24.02.2010 12:48, Christoph Bumiller wrote:
This wasn't a problem before because textures and samplers were linked 1:1, but in view of the gallium-gpu4-texture-opcodes branch, this coordinate normalization bit becomes a problem. NV50 hardware has that bit in the RESOURCE binding, and not the SAMPLER binding, and you can imagine that this will lead to us having to jump through a few annoying looking hoops to accommodate. As far as I can see, neither D3D10 nor D3D11 nor OpenGL nor CUDA have sampler states that are decoupled from the texture, and which contain a normalized coordinates bit, so it's worth considering not having it there in gallium either. For OpenGL, unnormalized coordinates are only used for RECT textures, and in this case it makes sense to make it a property of the texture.
I agree this is not sampler state, but I don't quite agree this should be texture state. This changes how texture coordinates get interpreted in the interpolator - in that sense it is similar to the cylindrical texture coord wrap which we moved away from sampler state recently. This one got moved to shader declaration. I wonder if the normalization bit should be treated the same. Though OTOH you're quite right that in OpenGL this really is a texture property (it is a different texture target after all), and afaik d3d doesn't support non-normalized coords (?).
Hmm... Isn't it the case that for RECT targets we clear the bit, and for others we always set it? In mesa st I see: if (texobj->Target != GL_TEXTURE_RECTANGLE_ARB) sampler->normalized_coords = 1; By definition, a RECT texture with normalised coordinates is just an NPOT. If we removed this apparently redundant flag, would that make nouveau developers' lives easier?
But we don't have rect targets in gallium hence we need the flag.
I think conceptually this makes sense since for texture layouts etc. drivers won't care one bit if this is a 2d npot or rect texture. Though I guess introducing rect targets instead would be another option.
We should also be thinking about texture array targets. With a 2D texture array, the S and T coords would be normalized, but not R. I think we either need new texture targets for RECT, 1D_ARRAY, 2D_ARRAY, etc. or per-dimension normalization flags. I'm thinking the former may be better (simpler) since textures are created as a particular type and not changed afterward. We also know the texture type/target when we execute TEX shader instructions. If it's part of sampler state it gives the impression that it's variable state, but it really isn't.
We'd also need a BUFFER target then, they also have scaled coordinates. The problem is I think that this drives gallium a little towards catering to specific APIs (OpenGL). OpenCL for instance does have a per sampler normalization bit iirc, but it seems there's no hardware that reflects this property. Then again, TGSI does have a RECT target already, so we might as well add corresponding PIPE targets. I want to remind again that the normalization bit only becomes problematic once samplers and textures can be independently combined, and that it seems older hardware can't nicely do this anyway, except if they take it upon them to recompile their shaders (although I hear some need to do that already ...) I admit I'm actually being a bit selfish here, trying to get the interface more adapted to nv50, but, if other hardware doesn't have conflicting views, why not? Maybe I should accept nv50 is getting old.
Why do you say that? NV50 is a DX10-level card -- it deserves better treatment. Your request is valid and we should go and ask gallium gatekeepers to get this change pushed. Thanks.
Re: [Mesa3d-dev] Mesa (master): util: Fix descriptors for R32_FLOAT and R32G32_FLOAT formats .
Roland Scheidegger wrote on 2010-02-12 20:55: On 12.02.2010 20:20, Corbin Simpson wrote: On Fri, Feb 12, 2010 at 10:49 AM, Brian Paul bri...@vmware.com wrote: Roland Scheidegger wrote: On 12.02.2010 19:00, Keith Whitwell wrote: On Fri, 2010-02-12 at 09:56 -0800, Roland Scheidegger wrote: On 12.02.2010 18:42, Keith Whitwell wrote: On Fri, 2010-02-12 at 09:28 -0800, José Fonseca wrote: On Fri, 2010-02-12 at 06:43 -0800, Roland Scheidegger wrote: On 12.02.2010 14:44, michal wrote: Keith Whitwell wrote on 2010-02-12 14:28: On Fri, 2010-02-12 at 05:09 -0800, michal wrote: Keith Whitwell wrote on 2010-02-12 13:39: On Fri, 2010-02-12 at 04:32 -0800, Michał Król wrote:
Module: Mesa
Branch: master
Commit: aa0b671422880b99dc178d43d1e4e1a3f766bf7f
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=aa0b671422880b99dc178d43d1e4e1a3f766bf7f
Author: Michal Krol mic...@vmware.com
Date: Fri Feb 12 13:32:35 2010 +0100
util: Fix descriptors for R32_FLOAT and R32G32_FLOAT formats.
Michal, Is this more like two different users expecting two different results in those unused columns? In particular, we definitely require the missing elements to be extended to (0,0,0,1) when fetching vertex data, and probably also in OpenGL texture sampling (if we supported these formats for that).
Gallium should follow D3D rules, so I've been following D3D here. Also, util_unpack_color_ub() in u_pack_color.h already sets the remaining fields to 0xff. Note that D3D doesn't have the problem with expanding vertex attribute data since you can't have X or XY vertex positions, only XYZ (with W extended to 1 as in GL) and XYZW.
But surely D3D permits two-component texture coordinates, which would be PIPE_FORMAT_R32G32_FLOAT, and expanded as (r,g,0,1)... Brian added a table of differences between GL and other APIs recently to gallium/docs - does your change agree with that?
Where's that exactly, I can't find it?
It seems like we'd want to be able to support both usages - the alternative in texture sampling would be forcing the state tracker to generate varients of the shader when 2-component textures are bound. I would say that's an unreasonable requirement on the state tracker. It seems like in GL would want (0,0,0,1) expansion everywhere, but D3D would want differing expansions in different parts of the pipeline. That indicates a single flag in the context somewhere isn't sufficient to choose between the two. Maybe there need to be two versions of these PIPE_FORMAT_ enums to capture the different values in the missing components? EG: PIPE_FORMAT_R32G32_0001_FLOAT PIPE_FORMAT_R32G32__FLOAT ? or something along those lines?? You are right. Alternatively, follow the more sane API (GL apparently), assume 0001 as default and use the infix to override. Note it's not just GL. D3D10 uses same expansion. Only D3D9 is different. Well for texture sampling anyway, I don't know what d3d does for vertex formats. Though for most hardware it would make sense to have only one format per different expansion, and use some swizzling parameter for sampling, because that's actually how the hardware works. But not all drivers will be able to do this, unfortunately. You mean, having a swizzle in pipe_sampler_state ? It sounds a good idea. In the worst case some component will inevitably need to make shader variants with different swizzles. In this case it probably makes sense to be the pipe driver -- it's a tiny shader variation which could be done without recompiling the whole shader, but if the state tracker does it then the pipe driver will always have to recompile. In the best case it is handled by the hardware's texture sampling unit. It's in theory similar to baking the swizzle in the format as Keith suggested, but cleaner IMHO. The question is whether it makes sense to have full xwyz01 swizzles, or just 01 swizzles. 
Another alternative is to just add the behaviour we really need - a single flag at context creation time that says what the behaviour of the sampler should be for these textures. Then the driver wouldn't have to worry about varients or mixing two different expansions. Hardware (i965 at least) seems to have one global mode to switch between these, and that's all we need to choose the right behaviour for each state tracker. It might be simpler all round just to specify it at context creation. Yes, for rg01 vs rg11
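Editor's sketch of the two expansion conventions debated in this thread: filling the missing components of a one- or two-component format either as (0,0,0,1) or with ones (the "rg01 vs rg11" case). The enum and function names are hypothetical and are not part of the Gallium interface; which API uses which mode is taken only from the statements quoted above.

```c
#include <assert.h>

/* Hypothetical names for illustration only; not Gallium API. */
enum expand_mode {
   EXPAND_0001,  /* missing components read as (0, 0, 0, 1) */
   EXPAND_1111   /* missing components read as 1, e.g. (r, g, 1, 1) */
};

/* Expand `count` (1..4) stored floats into a full four-component vector. */
static void
expand_rgba(const float *src, unsigned count, enum expand_mode mode,
            float dst[4])
{
   unsigned i;

   assert(count >= 1 && count <= 4);

   for (i = 0; i < 4; i++) {
      if (i < count)
         dst[i] = src[i];          /* component is present in the format */
      else if (i == 3 || mode == EXPAND_1111)
         dst[i] = 1.0f;            /* W is 1 in both modes */
      else
         dst[i] = 0.0f;            /* missing X/Y/Z under the 0001 rule */
   }
}
```

A per-context flag, a per-sampler swizzle, or separate format enums would all amount to different ways of selecting the `mode` argument here.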
Re: [Mesa3d-dev] Mesa (master): util: Fix descriptors for R32_FLOAT and R32G32_FLOAT formats .
Keith Whitwell wrote on 2010-02-12 13:39: On Fri, 2010-02-12 at 04:32 -0800, Michał Król wrote:

Module: Mesa
Branch: master
Commit: aa0b671422880b99dc178d43d1e4e1a3f766bf7f
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=aa0b671422880b99dc178d43d1e4e1a3f766bf7f
Author: Michal Krol mic...@vmware.com
Date: Fri Feb 12 13:32:35 2010 +0100

util: Fix descriptors for R32_FLOAT and R32G32_FLOAT formats.

Michal, Is this more like two different users expecting two different results in those unused columns? In particular, we definitely require the missing elements to be extended to (0,0,0,1) when fetching vertex data, and probably also in OpenGL texture sampling (if we supported these formats for that).

Gallium should follow D3D rules, so I've been following D3D here. Also, util_unpack_color_ub() in u_pack_color.h already sets the remaining fields to 0xff. Note that D3D doesn't have the problem with expanding vertex attribute data since you can't have X or XY vertex positions, only XYZ (with W extended to 1 as in GL) and XYZW.

Brian added a table of differences between GL and other APIs recently to gallium/docs - does your change agree with that?

Where's that exactly, I can't find it?
Re: [Mesa3d-dev] Mesa (master): util: Fix descriptors for R32_FLOAT and R32G32_FLOAT formats .
Keith Whitwell wrote on 2010-02-12 14:28: On Fri, 2010-02-12 at 05:09 -0800, michal wrote: Keith Whitwell wrote on 2010-02-12 13:39: On Fri, 2010-02-12 at 04:32 -0800, Michał Król wrote:

Module: Mesa
Branch: master
Commit: aa0b671422880b99dc178d43d1e4e1a3f766bf7f
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=aa0b671422880b99dc178d43d1e4e1a3f766bf7f
Author: Michal Krol mic...@vmware.com
Date: Fri Feb 12 13:32:35 2010 +0100

util: Fix descriptors for R32_FLOAT and R32G32_FLOAT formats.

Michal, Is this more like two different users expecting two different results in those unused columns? In particular, we definitely require the missing elements to be extended to (0,0,0,1) when fetching vertex data, and probably also in OpenGL texture sampling (if we supported these formats for that).

Gallium should follow D3D rules, so I've been following D3D here. Also, util_unpack_color_ub() in u_pack_color.h already sets the remaining fields to 0xff. Note that D3D doesn't have the problem with expanding vertex attribute data since you can't have X or XY vertex positions, only XYZ (with W extended to 1 as in GL) and XYZW.

But surely D3D permits two-component texture coordinates, which would be PIPE_FORMAT_R32G32_FLOAT, and expanded as (r,g,0,1)...

Brian added a table of differences between GL and other APIs recently to gallium/docs - does your change agree with that?

Where's that exactly, I can't find it?

It seems like we'd want to be able to support both usages - the alternative in texture sampling would be forcing the state tracker to generate variants of the shader when 2-component textures are bound. I would say that's an unreasonable requirement on the state tracker. It seems like GL would want (0,0,0,1) expansion everywhere, but D3D would want differing expansions in different parts of the pipeline. That indicates a single flag in the context somewhere isn't sufficient to choose between the two. Maybe there need to be two versions of these PIPE_FORMAT_ enums to capture the different values in the missing components? EG: PIPE_FORMAT_R32G32_0001_FLOAT PIPE_FORMAT_R32G32__FLOAT ? or something along those lines?

You are right. Alternatively, follow the more sane API (GL apparently), assume 0001 as default and use the infix to override.
[Mesa3d-dev] Gallium query types
Hi, I can't find any information regarding two Gallium query types. No documentation, no source code.

    #define PIPE_QUERY_PRIMITIVES_GENERATED 1
    #define PIPE_QUERY_PRIMITIVES_EMITTED   2

Do they have something to do with the NV_transform_feedback extension? If not, do they mean the number of primitives before clipping, and after clipping, respectively? Thanks.
Re: [Mesa3d-dev] [RFC] gallium-cylindrical-wrap branch
Brian Paul wrote on 2010-02-04 22:07: michal wrote: Brian Paul wrote on 2010-02-03 17:58: Keith Whitwell wrote:

Michal, why do you need this for linear interpolator and not perspective? I think d3d mobile let you disable perspective correct texturing, but it is always enabled for normal d3d.

I could not think of a use case that uses perspective and cylindrical interpolation at the same time. If you think it's valid, we can implement cylindrical wrapping for perspective interpolator, but then I am not sure how exactly it should be done, i.e. should we divide and then wrap or the opposite?

Is there some way we can figure out what DX9 does here? Maybe a quick test? I suspect cylindrical wrapping would be done after the divide.

A quick test shows it is legal to have perspective and cylindrical interpolation. In fact, I see no difference between projected and non-projected version with REF device -- both are perspective correct. I think I am stuck at this point and need further help. I am trying to modify tri_persp_coeff() in softpipe in a similar manner to tri_linear_coeff(), but all I get are lousy rendering artifacts. If we need to do cylindrical wrapping after the divide, it must be done as part of the shader interpolator, but the only place where we have enough information to do wrapping is in primitive setup.

Do you have a patch relative to gallium-cylindrical-wrap? I'll take a look.

Brian, I have no half-working patch for you, sorry. I tried a few approaches, but they were nonsensical. The linear coeff calculation is simple: calculate the distance between two coordinates, and if it's greater than 0.5, apply wrapping by adjusting the distance. However, for the perspective correct coeffs, we divide early by position.w before calculating the distance, and so my approach that worked for linear fails here. I am either not comprehending the math here (why do we divide the second time in the interpolator, for instance?) or we need to put more information into struct tgsi_interp_coef so that the interpolator code has enough information to do wrapping on its own without the help of primitive setup.
Re: [Mesa3d-dev] [RFC] gallium-cylindrical-wrap branch
michal wrote on 2010-02-05 11:05: Brian Paul wrote on 2010-02-04 22:07: michal wrote: Brian Paul wrote on 2010-02-03 17:58: Keith Whitwell wrote:

Michal, why do you need this for linear interpolator and not perspective? I think d3d mobile let you disable perspective correct texturing, but it is always enabled for normal d3d.

I could not think of a use case that uses perspective and cylindrical interpolation at the same time. If you think it's valid, we can implement cylindrical wrapping for perspective interpolator, but then I am not sure how exactly it should be done, i.e. should we divide and then wrap or the opposite?

Is there some way we can figure out what DX9 does here? Maybe a quick test? I suspect cylindrical wrapping would be done after the divide.

A quick test shows it is legal to have perspective and cylindrical interpolation. In fact, I see no difference between projected and non-projected version with REF device -- both are perspective correct. I think I am stuck at this point and need further help. I am trying to modify tri_persp_coeff() in softpipe in a similar manner to tri_linear_coeff(), but all I get are lousy rendering artifacts. If we need to do cylindrical wrapping after the divide, it must be done as part of the shader interpolator, but the only place where we have enough information to do wrapping is in primitive setup.

Do you have a patch relative to gallium-cylindrical-wrap? I'll take a look.

Brian, I have no half-working patch for you, sorry. I tried a few approaches, but they were nonsensical. The linear coeff calculation is simple: calculate the distance between two coordinates, and if it's greater than 0.5, apply wrapping by adjusting the distance. However, for the perspective correct coeffs, we divide early by position.w before calculating the distance, and so my approach that worked for linear fails here. I am either not comprehending the math here (why do we divide the second time in the interpolator, for instance?) or we need to put more information into struct tgsi_interp_coef so that the interpolator code has enough information to do wrapping on its own without the help of primitive setup.

OK, I managed to correctly implement cylindrical wrap in softpipe both for linear and perspective interpolation, both for lines and triangles. Tested with Brian's cylwrap test app -- it works. Please re-review. Thanks.
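Editor's sketch of the linear case described above: take the distance between two texture coordinates and, if its magnitude exceeds 0.5, adjust it so that interpolation goes the short way around the cylinder. This is only an illustration of that rule, not the softpipe code; the function name is hypothetical, and the restriction of inputs to [0, 1] is an assumption made to keep the example small.

```c
#include <assert.h>

/* Hypothetical helper, not softpipe code.
 * Returns the signed difference b - a between two texture coordinates,
 * wrapped so that it never spans more than half of the [0, 1] range.
 * Assumption: both coordinates already lie in [0, 1]. */
static float
cylwrap_delta(float a, float b)
{
   float d;

   assert(a >= 0.0f && a <= 1.0f);
   assert(b >= 0.0f && b <= 1.0f);

   d = b - a;
   if (d > 0.5f)
      d -= 1.0f;   /* shorter to go backwards across the seam */
   else if (d < -0.5f)
      d += 1.0f;   /* shorter to go forwards across the seam */
   return d;
}
```

For example, going from 0.9 to 0.1 yields a delta of about +0.2 (crossing the seam) rather than -0.8. The perspective-correct case discussed in the thread is harder precisely because the coordinates have already been divided by position.w before such a distance can be computed.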
Re: [Mesa3d-dev] [RFC] gallium-cylindrical-wrap branch
Brian Paul wrote on 2010-02-03 17:58: Keith Whitwell wrote:

Michal, why do you need this for linear interpolator and not perspective? I think d3d mobile let you disable perspective correct texturing, but it is always enabled for normal d3d.

I could not think of a use case that uses perspective and cylindrical interpolation at the same time. If you think it's valid, we can implement cylindrical wrapping for perspective interpolator, but then I am not sure how exactly it should be done, i.e. should we divide and then wrap or the opposite?

Is there some way we can figure out what DX9 does here? Maybe a quick test? I suspect cylindrical wrapping would be done after the divide.

A quick test shows it is legal to have perspective and cylindrical interpolation. In fact, I see no difference between projected and non-projected version with REF device -- both are perspective correct. I think I am stuck at this point and need further help. I am trying to modify tri_persp_coeff() in softpipe in a similar manner to tri_linear_coeff(), but all I get are lousy rendering artifacts. If we need to do cylindrical wrapping after the divide, it must be done as part of the shader interpolator, but the only place where we have enough information to do wrapping is in primitive setup.
[Mesa3d-dev] [RFC] gallium-cylindrical-wrap branch
Keith, This feature branch adds a cylindrical wrap texcoord mode to gallium shader tokens and removes the prefilter field from sampler state. I implemented cylindrical wrapping for the linear interpolator in softpipe. I am not sure whether it makes sense to do it for the perspective interpolator. I also documented the TGSI declaration token. A sample fragment shader declaration that wraps the S and T coordinates follows.

    DCL INPUT[0], GENERIC[0], LINEAR, CYLWRAP_XY

Please review so I can merge it to master. Thanks.
Re: [Mesa3d-dev] light_twoside RE: [PATCH] glsl: put varyings in texcoord slots
Luca Barbieri wrote on 2010-02-01 21:42:

1. All the semantic indices in OpenGL are limited, according to the ARB specification
2. All the semantic indices in DirectX 9/10 are limited, according to http://msdn.microsoft.com/en-us/library/ee418355%28VS.85%29.aspx

At least for SM3.0, one can specify a vertex shader output semantic like COLOR15 and have it running as long as one also has a pixel shader with a matching input semantic. Though I agree with you that we don't really want to go this route and should have something more sensible instead. We could, for example, limit COLOR and BCOLOR indices to [0, 1], remove the FOG and NORMAL names, and have a well-defined limit on the GENERIC index value. After all, we only need non-generic semantics to communicate with the fixed-function part of the pipeline, that is, the rasteriser.

    name        index range
    POSITION    no limit?
    COLOR       0..1, explicit clamp?
    BCOLOR      0..1, explicit clamp?
    FOG         remove?
    PSIZE       0
    GENERIC     0..max generics
    NORMAL      remove
    FACE        0
    EDGEFLAG    0
    PRIMID      0
    INSTANCEID  0

As for the routing table thing, I am not really convinced. The GLSL mechanism to link shaders based on varying names is GL-specific and thus should stay inside the Mesa state tracker. In fact, the D3D10 runtime is doing exactly the same thing and generating shader variants on the fly as they are mixed and matched by the application.
Re: [Mesa3d-dev] [RFC] gallium-multiple-constant-buffers merge
Brian Paul wrote on 2010-01-22 17:56: michal wrote: Brian Paul wrote on 2010-01-21 21:57: michal wrote:

Hi, This simple feature branch adds support for two-dimensional constant buffers in TGSI. An example shader would look like this:

    FRAG
    DCL IN[0], COLOR, LINEAR
    DCL OUT[0], COLOR
    DCL CONST[1][1..2]
    MAD OUT[0], IN[0], CONST[1][2], CONST[1][1]
    END

For this to work, one needs to bind a buffer to slot number 1 containing at least 3 vectors.

Just a terminology thing: this feature really implements arrays of constant buffers, not really two-dimensional buffers, right?

That's correct -- the access to constbuf data is two-dimensional, but the constbufs themselves are an array of differently-sized constant buffers.

In p_state.h we should probably rename PIPE_MAX_CONSTANT to PIPE_MAX_CONSTANT_BUFFERS to be clearer. Don't we need a new PIPE_CAP_MAX_CONSTANT_BUFFERS query? Maybe even a query per shader stage?

What about the maximum size of a single constant buffer? I would think this is a more critical parameter than the number of constbuf slots the driver supports.

Yeah, I thought we already had a query for that, but we don't. I'd suggest:

    PIPE_CAP_MAX_CONST_BUFFERS
    PIPE_CAP_MAX_CONST_BUFFER_SIZE (in bytes)

All, Thanks for your comments, I have committed my changes to the branch and am awaiting more comments.
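Editor's sketch of what the two-dimensional addressing in the example shader means: CONST[1][2] first selects the buffer bound to slot 1 and then vector 2 inside it, which is why that buffer must hold at least 3 vectors. The structures and function names below are hypothetical stand-ins for illustration and are not the Gallium or TGSI interpreter types.

```c
#include <assert.h>

/* Hypothetical stand-ins; these are not the Gallium structures. */
struct vec4 { float f[4]; };

struct const_buffer {
   const struct vec4 *data;  /* vectors stored in this buffer */
   unsigned count;           /* number of vectors; may differ per slot */
};

/* Resolve CONST[buf][idx]: pick the buffer slot, then the vector in it. */
static const struct vec4 *
fetch_const(const struct const_buffer *bufs, unsigned num_bufs,
            unsigned buf, unsigned idx)
{
   assert(buf < num_bufs);
   assert(idx < bufs[buf].count);
   return &bufs[buf].data[idx];
}

/* MAD dst, a, b, c computes dst = a * b + c for each channel. */
static void
mad(struct vec4 *dst, const struct vec4 *a,
    const struct vec4 *b, const struct vec4 *c)
{
   int i;

   for (i = 0; i < 4; i++)
      dst->f[i] = a->f[i] * b->f[i] + c->f[i];
}
```

With these helpers, the MAD instruction from the shader corresponds to `mad(&out, &in, fetch_const(bufs, n, 1, 2), fetch_const(bufs, n, 1, 1))`. Because each slot carries its own `count`, the slots form an array of differently-sized buffers rather than a rectangular two-dimensional array, which is the terminology point raised in the thread.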
[Mesa3d-dev] TGSI build and sse2 removal
I would like to have those two modules go away, as they are a maintenance pain with no real benefits. The build module has been superseded by the ureg module, and apparently all third-party code has already migrated or is in the process of porting to the new interface. I would like to nuke it if nobody minds. For sse2, I am looking at simplifying it enough to be able to accelerate pass-thru fragment shaders and simple vertex shaders. That's it. For more sophisticated stuff we already have llvmpipe. If nobody objects, I am going to start the rework next week. Thanks.
Re: [Mesa3d-dev] TGSI build and sse2 removal
Brian Paul wrote on 2010-01-25 16:09: José Fonseca wrote: Michal, On Mon, 2010-01-25 at 06:27 -0800, michal wrote:

I would like to have those two modules go away, as they are a maintenance pain with no real benefits. The build module has been superseded by the ureg module, and apparently all third-party code has already migrated or is in the process of porting to the new interface. I would like to nuke it if nobody minds.

I'm fine with this.

We can't remove this until we switch to ureg in the draw code. The draw_pipe_pstipple.c, draw_pipe_aaline.c and draw_pipe_aapoint.c files still haven't been converted to use tgsi_ureg. There may be some other uses elsewhere. Michal, can you update that code first?

That's the plan.

For sse2, I am looking at simplifying it enough to be able to accelerate pass-thru fragment shaders and simple vertex shaders. That's it. For more sophisticated stuff we already have llvmpipe.

I agree with this in principle, but I think it's better not to get too far ahead of ourselves here: drivers are using tgsi_exec/sse2 for software vertex processing fallbacks. And while the plan is indeed to move the LLVM JIT code generation out of llvmpipe into the auxiliary modules so that all pipe drivers can use that for fallbacks, the fact is we're not there yet. So for tgsi_sse2 I think it's better not to introduce any performance regressions in vertex processing until llvm code generation is in place and working for everybody.

I agree. It's too early to remove the sse2 code.

OK, that makes sense, I will leave it alone for the time being. Thanks, guys.
Re: [Mesa3d-dev] [RFC] gallium-multiple-constant-buffers merge
Brian Paul wrote on 2010-01-21 21:57: michal wrote:

Hi, This simple feature branch adds support for two-dimensional constant buffers in TGSI. An example shader would look like this:

    FRAG
    DCL IN[0], COLOR, LINEAR
    DCL OUT[0], COLOR
    DCL CONST[1][1..2]
    MAD OUT[0], IN[0], CONST[1][2], CONST[1][1]
    END

For this to work, one needs to bind a buffer to slot number 1 containing at least 3 vectors.

Just a terminology thing: this feature really implements arrays of constant buffers, not really two-dimensional buffers, right?

That's correct -- the access to constbuf data is two-dimensional, but the constbufs themselves are an array of differently-sized constant buffers.

In p_state.h we should probably rename PIPE_MAX_CONSTANT to PIPE_MAX_CONSTANT_BUFFERS to be clearer. Don't we need a new PIPE_CAP_MAX_CONSTANT_BUFFERS query? Maybe even a query per shader stage?

What about the maximum size of a single constant buffer? I would think this is a more critical parameter than the number of constbuf slots the driver supports.
Re: [Mesa3d-dev] python and constant buffers
José Fonseca wrote on 2010-01-20 14:32: On Wed, 2010-01-20 at 03:55 -0800, michal wrote:

Jose, How can one upload data to buffers in the python state tracker? I am trying to do the following:

+cb0_data = [
+    0.0, 0.0, 0.0, 0.0,
+    0.0, 0.0, 0.0, 1.0,
+    1.0, 1.0, 1.0, 1.0,
+    2.0, 4.0, 8.0, 1.0,
+]
+
+constbuf0 = dev.buffer_create(
+    16,
+    PIPE_BUFFER_USAGE_CONSTANT |
+    PIPE_BUFFER_USAGE_GPU_READ |
+    PIPE_BUFFER_USAGE_GPU_WRITE |
+    PIPE_BUFFER_USAGE_CPU_READ |
+    PIPE_BUFFER_USAGE_CPU_WRITE,
+    4 * 4 * 4)
+
+constbuf0.write_(cb0_data, 4 * 4 * 4)

But I can't find a way to convert a list of floats to (char *). Do you have an idea how to do it?

Hi Michal, The Gallium - Python bindings are autogenerated by SWIG and there are several things which are not very pythonic. Writing data into/out of the buffers is one of them. ATM the only way to do this is using the python struct module, and pack the floats into a string... That is:

    import struct
    data = ''
    data += struct.pack('4f', 1.0, 2.0, 3.0, 4.0)
    ...

That's perfect. Thanks, Jose.
[Mesa3d-dev] [RFC] gallium-multiple-constant-buffers merge
Hi, This simple feature branch adds support for two-dimensional constant buffers in TGSI. An example shader would look like this:

    FRAG
    DCL IN[0], COLOR, LINEAR
    DCL OUT[0], COLOR
    DCL CONST[1][1..2]
    MAD OUT[0], IN[0], CONST[1][2], CONST[1][1]
    END

For this to work, one needs to bind a buffer to slot number 1 containing at least 3 vectors.
[Mesa3d-dev] python and constant buffers
Jose, How can one upload data to buffers in the python state tracker? I am trying to do the following:

+cb0_data = [
+    0.0, 0.0, 0.0, 0.0,
+    0.0, 0.0, 0.0, 1.0,
+    1.0, 1.0, 1.0, 1.0,
+    2.0, 4.0, 8.0, 1.0,
+]
+
+constbuf0 = dev.buffer_create(
+    16,
+    PIPE_BUFFER_USAGE_CONSTANT |
+    PIPE_BUFFER_USAGE_GPU_READ |
+    PIPE_BUFFER_USAGE_GPU_WRITE |
+    PIPE_BUFFER_USAGE_CPU_READ |
+    PIPE_BUFFER_USAGE_CPU_WRITE,
+    4 * 4 * 4)
+
+constbuf0.write_(cb0_data, 4 * 4 * 4)

But I can't find a way to convert a list of floats to (char *). Do you have an idea how to do it?
Re: [Mesa3d-dev] [PATCH] Implement double opcodes: ddiv, dmul, dmax, dmin, dslt, dsge, dseq, drcp, dsqrt and dmad
Igor Oliveira wrote on 2010-01-18 19:55:

The patches implement the gallium opcodes ddiv, dmul, dmax, dmin, dslt, dsge, dseq, drcp, dsqrt and dmad and add tests for them. They are applicable to the gallium-double-opcode branch. In the next patches I will add documentation and the missing double opcode implementations like dfrac, dldexp and dfracexp.

Excellent, committed with cosmetic changes. Thanks!
Re: [Mesa3d-dev] [PATCH] add dfrac, dfracexp, dldexp opcodes to gallium
Igor Oliveira wrote on 2010-01-20 00:37:

Hi, These patches add support for the dfrac, dldexp and dfracexp opcodes. The dfracexp opcode, I think, is the only opcode that uses 2 DST registers. The first one is used to store the fractional part (stored as a double) and the second one is used to store the exponent part (an int). In the tests we can see it working.

 static void
+micro_dfrac(union tgsi_double_channel *dst,
+            const union tgsi_double_channel *src)
+{
+   dst->d[0] = src->d[0] - floor(src->d[0]);
+   dst->d[1] = src->d[1] - floor(src->d[0]);
+   dst->d[2] = src->d[2] - floor(src->d[0]);
+   dst->d[3] = src->d[3] - floor(src->d[0])

Igor, Shouldn't the second line have floor(src->d[1]), and so on?
Re: [Mesa3d-dev] [RFC] instanced-arrays branch
Chia-I Wu wrote on 2010-01-16 02:28: On Fri, Jan 15, 2010 at 07:22:52PM +0100, michal wrote:

I think I will try to manually patch it later. Thanks!

The first line of the patch is somehow garbled. But I am not sure if that is a good fix, so please go ahead.

Committed, thanks.
Re: [Mesa3d-dev] [PATCH] Add GALLIUM_DUMP_VS environment variable
Luca Barbieri wrote on 2009-12-26 02:06:

Add GALLIUM_DUMP_VS to dump the vertex shader to the console like GALLIUM_DUMP_FS in softpipe.

Committed, thanks.
Re: [Mesa3d-dev] [RFC] instanced-arrays branch
Chia-I Wu wrote on 2010-01-15 15:09: On Fri, Jan 15, 2010 at 09:57:32PM +0800, Chia-I Wu wrote: On Wed, Jan 13, 2010 at 2:02 AM, michal mic...@vmware.com wrote:

I would like to merge this branch to master soon.

I am seeing all sorts of funny behaviors after the merge with OpenVG. The attached patch seems to fix the problem. I am not sure if this is the right fix...

There are two typos in the changes to auxiliary/vl/. Here is the updated patch. Sorry for the trouble.

The patch does not apply.

/c/src/mesa (master) $ git am 0001-gallium-Fix-uninitialized-instance-divisor-and-index-v2.patch
Applying: gallium: Fix uninitialized instance divisor and index.
c:/src/mesa/.git/rebase-apply/patch:18: trailing whitespace.
unsigned instance_id_index = ~0;
c:/src/mesa/.git/rebase-apply/patch:30: trailing whitespace.
velements[i].instance_divisor = 0;
c:/src/mesa/.git/rebase-apply/patch:42: trailing whitespace.
c->vertex_elems[0].instance_divisor = 0;
c:/src/mesa/.git/rebase-apply/patch:50: trailing whitespace.
c->vertex_elems[1].instance_divisor = 0;
c:/src/mesa/.git/rebase-apply/patch:62: trailing whitespace.
r->vertex_elems[0].instance_divisor = 0;
error: patch failed: src/gallium/auxiliary/draw/draw_pt_fetch_shade_pipeline.c:60
error: src/gallium/auxiliary/draw/draw_pt_fetch_shade_pipeline.c: patch does not apply
error: patch failed: src/gallium/auxiliary/util/u_draw_quad.c:61
error: src/gallium/auxiliary/util/u_draw_quad.c: patch does not apply
error: patch failed: src/gallium/auxiliary/vl/vl_compositor.c:316
error: src/gallium/auxiliary/vl/vl_compositor.c: patch does not apply
error: patch failed: src/gallium/auxiliary/vl/vl_mpeg12_mc_renderer.c:891
error: src/gallium/auxiliary/vl/vl_mpeg12_mc_renderer.c: patch does not apply
error: patch failed: src/gallium/state_trackers/vega/polygon.c:293
error: src/gallium/state_trackers/vega/polygon.c: patch does not apply
Patch failed at 0001 gallium: Fix uninitialized instance divisor and index.
When you have resolved this problem run git am --resolved.
If you would prefer to skip this patch, instead run git am --skip.
To restore the original branch and stop patching run git am --abort.

I think I will try to manually patch it later. Thanks!
Re: [Mesa3d-dev] [RFC] instanced-arrays branch
michal wrote on 2010-01-12 19:02: Keith, I would like to merge this branch to master soon. It adds new entrypoints to pipe_context -- draw_arrays_instanced() and draw_elements_instanced(). A new system value is introduced to TGSI that allows vertex shaders to access current instance ID. The new entrypoints are implemented in draw module, and softpipe driver has been properly adjusted for that. There is no capability bit defined for that -- I wasn't sure whether we still want to go this route. Please review, thanks.
Attached a patch that adds documentation for drawing commands, done against master branch since instanced-arrays branch doesn't have the gallium docs tree yet. Thanks.

diff --git a/src/gallium/docs/source/context.rst b/src/gallium/docs/source/context.rst
index 21f5f91..9686537 100644
--- a/src/gallium/docs/source/context.rst
+++ b/src/gallium/docs/source/context.rst
@@ -72,12 +72,64 @@ stencil-only clears of packed depth-stencil buffers.
 Drawing
 ^^^
 
-``draw_arrays``
+``draw_arrays`` draws a specified primitive.
 
-``draw_elements``
+This command is equivalent to calling ``draw_arrays_instanced``
+with ``startInstance`` set to 0 and ``instanceCount`` set to 1.
+
+``draw_elements`` draws a specified primitive using an optional
+index buffer.
+
+This command is equivalent to calling ``draw_elements_instanced``
+with ``startInstance`` set to 0 and ``instanceCount`` set to 1.
 
 ``draw_range_elements``
 
+XXX: this is (probably) a temporary entrypoint, as the range
+information should be available from the vertex_buffer state.
+Using this to quickly evaluate a specialized path in the draw
+module.
+
+``draw_arrays_instanced`` draws multiple instances of the same primitive.
+
+This command is equivalent to calling ``draw_elements_instanced``
+with ``indexBuffer`` set to NULL and ``indexSize`` set to 0.
+
+``draw_elements_instanced`` draws multiple instances of the same primitive
+using an optional index buffer.
+
+For instanceID in the range between ``startInstance``
+and ``startInstance``+``instanceCount``-1, inclusive, draw a primitive
+specified by ``mode`` and sequential numbers in the range between ``start``
+and ``start``+``count``-1, inclusive.
+
+If ``indexBuffer`` is not NULL, it specifies an index buffer with index
+byte size of ``indexSize``. The sequential numbers are used to lookup
+the index buffer and the resulting indices in turn are used to fetch
+vertex attributes.
+
+If ``indexBuffer`` is NULL, the sequential numbers are used directly
+as indices to fetch vertex attributes.
+
+If a given vertex element has ``instance_divisor`` set to 0, it is said
+it contains per-vertex data and vertex attribute address needs
+to be recalculated for every index.
+
+    attribAddr = ``stride`` * index + ``src_offset``
+
+If a given vertex element has ``instance_divisor`` set to non-zero,
+it is said it contains per-instance data and vertex attribute address
+needs to recalculated for every ``instance_divisor``-th instance.
+
+    attribAddr = ``stride`` * instanceID / ``instance_divisor`` + ``src_offset``
+
+In the above formulas, ``src_offset`` is taken from the given vertex element
+and ``stride`` is taken from a vertex buffer associated with the given
+vertex element.
+
+The value of ``instanceID`` can be read in a vertex shader through a system
+value register declared with INSTANCEID semantic name.
+
 Queries
 ^^^
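A minimal C sketch of the two address formulas from the documentation patch, assuming hypothetical, simplified structs (`vertex_element`, `vertex_buffer`) rather than the real Gallium state objects. It reads the per-instance formula as stride * (instanceID / instance_divisor) with integer division, which is what "recalculated for every instance_divisor-th instance" suggests; that grouping is an assumption, since the formula as written is not parenthesized.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the vertex element and vertex buffer state;
 * only the fields named in the documentation patch are kept. */
struct vertex_element {
   unsigned src_offset;        /* byte offset of the attribute within an entry */
   unsigned instance_divisor;  /* 0 = per-vertex, N = advance every N instances */
};

struct vertex_buffer {
   unsigned stride;            /* byte distance between consecutive entries */
};

/* Byte offset of an attribute for a given vertex index and instance ID. */
static size_t
attrib_offset(const struct vertex_element *ve,
              const struct vertex_buffer *vb,
              unsigned index, unsigned instance_id)
{
   assert(ve != NULL && vb != NULL);

   if (ve->instance_divisor == 0) {
      /* Per-vertex data: the address changes with every index. */
      return (size_t)vb->stride * index + ve->src_offset;
   }

   /* Per-instance data: integer division, so the address only changes
    * once every instance_divisor instances (assumed grouping). */
   return (size_t)vb->stride * (instance_id / ve->instance_divisor)
          + ve->src_offset;
}
```

With a divisor of 2, instances 0 and 1 fetch the same entry and instances 2 and 3 fetch the next one, while the vertex index is ignored for that element.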
Re: [Mesa3d-dev] [PATH] Add double opcodes to TGSI Revision 2
Igor Oliveira wrote on 2010-01-12 12:52: Michal: I am looking at the double opcode branch; I can move the opcode code to use exec_double_binary/unary. Igor, Yes, that was my intention. It would be great if you looked at what has been done in that branch and, for each new opcode, provided a reference implementation in tgsi_exec.c and documented it in gallium/docs/source/tgsi.rst, and it would be super cool if you could add a unit test in python/tests/regress/fragment-shader. Thanks. -- This SF.Net email is sponsored by the Verizon Developer Community Take advantage of Verizon's best-in-class app development support A streamlined, 14 day to market process makes app distribution fast and easy Join now and get one step closer to millions of Verizon customers http://p.sf.net/sfu/verizon-dev2dev ___ Mesa3d-dev mailing list Mesa3d-dev@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/mesa3d-dev
[Mesa3d-dev] [RFC] instanced-arrays branch
Keith, I would like to merge this branch to master soon. It adds new entrypoints to pipe_context -- draw_arrays_instanced() and draw_elements_instanced(). A new system value is introduced to TGSI that allows vertex shaders to access current instance ID. The new entrypoints are implemented in draw module, and softpipe driver has been properly adjusted for that. There is no capability bit defined for that -- I wasn't sure whether we still want to go this route. Please review, thanks.
Re: [Mesa3d-dev] Mesa (glsl-pp-rework-2): scons: Get GLSL code building correctly when cross compiling.
José Fonseca wrote on 2010-01-12 19:51: On Mon, 2010-01-11 at 15:28 -0800, Stephan Raue wrote: Hi all, On 10.12.2009 17:36, José Fonseca wrote: On Thu, 2009-12-10 at 08:31 -0800, Jose Fonseca wrote:
Module: Mesa
Branch: glsl-pp-rework-2
Commit: 491f384c3958067e6c4c994041f5d8d413b806bc
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=491f384c3958067e6c4c994041f5d8d413b806bc
Author: José Fonseca jfons...@vmware.com
Date: Thu Dec 10 16:29:04 2009 +
scons: Get GLSL code building correctly when cross compiling. This is quite messy. GLSL code has to be built twice: one for the host OS, another for the target OS.
Is there also a solution for building without scons? I get the following when I cross-compile for an x86_64 target on an i386 host:
gmake[3]: Entering directory `/home/stephan/projects/openelec/build.OpenELEC-intel_x64.x86_64.devel/Mesa-master-20100108/src/mesa/shader/slang/library'
../../../../../src/glsl/apps/compile fragment slang_common_builtin.gc slang_common_builtin_gc.h
../../../../../src/glsl/apps/compile: ../../../../../src/glsl/apps/compile: cannot execute binary file
gmake[3]: *** [slang_common_builtin_gc.h] Error 126
gmake[3]: Leaving directory `/home/stephan/projects/openelec/build.OpenELEC-intel_x64.x86_64.devel/Mesa-master-20100108/src/mesa/shader/slang/library'
gmake[2]: *** [glsl_builtin] Error 1
gmake[2]: Leaving directory `/home/stephan/projects/openelec/build.OpenELEC-intel_x64.x86_64.devel/Mesa-master-20100108/src/mesa'
make[1]: *** [subdirs] Error 1
make[1]: Leaving directory `/home/stephan/projects/openelec/build.OpenELEC-intel_x64.x86_64.devel/Mesa-master-20100108/src'
make: *** [default] Error 1
Nope, and I don't think it will be easy, since Mesa's makefile system doesn't support building stuff in a separate dir. Yes, and that's a good reason to go back to how it was before -- the files are regenerated and checked in by the developer who changed the input files. I will make the necessary changes.
Re: [Mesa3d-dev] [PATCH] add double opcodes to tgsi
Igor Oliveira wrote on 2010-01-11 14:37: These patches add support for double opcodes, as discussed on the mailing list. The opcodes created are: movd, ddiv, dadd, dseq, dmax, dmin, dmul, dmuladd, drcp and dslt. They are used as suggested by Zack: MOVD A.xy, C.xy, c.xy where x is the lsb and y is the msb. Some opcodes are still not implemented (I will send the code soon): dfrac, dfracexp, dldexp and conversion between float and double. Igor, There are some bits and pieces in your patch that I am not sure are correct. To understand that, let me first create a new feature branch (gallium-double-opcodes) and add a few basic opcodes (F2D, D2F, DMOV, DADD). Also, since there is no API state tracker that supports doubles, I will add a test to the python state tracker to see how well things are going. Once done, it will be a lot easier for us to read your patches that introduce new opcodes. What do you think?
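A small C sketch of the "x is the lsb and y is the msb" convention mentioned above, assuming it means that the low 32 bits of the double go into the first channel of a pair and the high 32 bits into the second. The `reg4x32` struct and the `store_double`/`load_double` helpers are hypothetical and are not taken from tgsi_exec.c; the sketch also assumes a 64-bit IEEE 754 `double`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of one TGSI register: four 32-bit channels. */
struct reg4x32 {
   uint32_t chan[4];   /* x, y, z, w */
};

/* Store a double in the channel pair starting at `first`
 * (0 for .xy, 2 for .zw): low half first, high half second. */
static void
store_double(struct reg4x32 *r, unsigned first, double value)
{
   uint64_t bits;

   assert(first == 0 || first == 2);
   assert(sizeof(double) == sizeof(uint64_t));

   memcpy(&bits, &value, sizeof bits);
   r->chan[first]     = (uint32_t)(bits & 0xffffffffu);
   r->chan[first + 1] = (uint32_t)(bits >> 32);
}

/* Reassemble the double held in the channel pair starting at `first`. */
static double
load_double(const struct reg4x32 *r, unsigned first)
{
   uint64_t bits;
   double value;

   assert(first == 0 || first == 2);

   bits = ((uint64_t)r->chan[first + 1] << 32) | r->chan[first];
   memcpy(&value, &bits, sizeof value);
   return value;
}
```

Under this reading one 128-bit register holds two doubles, one in .xy and one in .zw, which is the layout the swizzle discussion in the thread relies on.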
Re: [Mesa3d-dev] [mesa] svga: Fix error: cannot take address of bit-field 'texture_target' in svga_tgsi.h
Sedat Dilek wrote on 2010-01-06 18:54: Compile-tested OK. Thanks, committed.
Re: [Mesa3d-dev] [RFC] add support to double opcodes
José Fonseca wrote on 2010-01-07 14:45: On Thu, 2010-01-07 at 05:25 -0800, Zack Rusin wrote: On Thursday 07 January 2010 06:50:36 José Fonseca wrote: I wonder if storage size of registers is such a big issue. Knowing the storage size of a register matters mostly for indexable temps. For regular assignments and intermediate computation storage, everything gets transformed into SSA form, and the register size can be determined from the instructions where it is generated/used, so there is no need for consistency. For example, imagine a shader that has:
TEX TEMP[0], SAMP[0], IN[0] // SAMP[0] is a PIPE_FORMAT_R32G32B32_FLOAT -- use 4x32bit float registers
MAX ??
...
TEX TEMP[0], SAMP[1], IN[0] // SAMP[1] is a PIPE_FORMAT_R64G64B64A64_FLOAT -- use 4x64bit double registers
DMAX , TEMP[0], ???
That's not an issue because such a format doesn't exist. There's no 256bit sampling in any api. It's one of the self-inflicted wounds that we have. R64G64 is the most you'll get right now. That's interesting. Never realized that.
TEX TEMP[0], SAMP[2], IN[0] // texture 0 and rendertarget are both PIPE_FORMAT_R8G8B8A8_UNORM -- use 4x8bit unorm registers
MOV OUT[0], TEMP[0]
etc.
There is actually programmable 3d hardware out there that has special 4x8bit registers, and for performance the compiler has to deduce where to use those 4x8bit registers. llvmpipe will need to do a similar thing, as the smaller the bit-width the higher the throughput. And at least the current gallium state trackers will reuse temps with no attempt to maintain consistency in use. So if the compilers already need to deal with this, I wonder if this notion that registers are 128 bits is really necessary, and whether it will prevail in the long term. Somehow this is the core issue: it's the fact that TGSI is untyped. Anything but a constant register size implies TGSI is typed, but the actual types have to be deduced by the drivers, which goes against what Gallium was about (we put the complexity in the driver).
The questions of 8bit vs 32bit and 64bit vs 32bit are really different questions. The first one is about optimization - it will work perfectly well if 128bit registers are used; the second one is about correctness - it will not work if 128bit registers are used for doubles and it will not work if 256bit registers are used for floats. True. Also we don't have 4x8bit instructions, they're all 4x32bit instructions (float, unsigned ints, signed ints), so doubles will be the first differently sized instructions. Which in turn will mean that either TGSI will have to be actually statically typed, but not typed declared, i.e. D_ADD will only be able to take two 256bit registers as inputs and if anything else is passed it has to throw an error, which is especially difficult given that those registers don't have a declared size, so it would have to be inferred from previous instructions; or we'd have to allow mixing sizes of all inputs, e.g. D_ADD can operate on both 4x32 or 4x64, which simply moves the problem from above into the driver. Really, unless we say the entire pipeline can run in 4x64 like we did for floats, I don't see an easier way of dealing with this than the xy, zw swizzle form. Ok. I didn't feel strongly either way, but now I'm more convinced that restricting to xy zw swizzles is less painful. Thanks for explaining this, Zack.
Zack,
1. Do I understand correctly that while D_ADD dst.xy, src1.xy, src2.zw will add one double, the following code, D_ADD dst, src1, src2.zwxy, is also valid and results in two doubles being added together?
2. Is the list of double-precision opcodes proposed by Igor roughly enough for an OpenCL implementation?
Thanks.
[Mesa3d-dev] [PATCH] Make sure we use only signed/unsigned ints with bitfields.
Attached.
From c91abe0b58abc69743c162fd55f7461a716b9141 Mon Sep 17 00:00:00 2001
From: Michal Krol mic...@vmware.com
Date: Wed, 6 Jan 2010 09:48:41 +0100
Subject: [PATCH] Make sure we use only signed/unsigned ints with bitfields.

Seems to be the only way to stay fully portable.
---
 src/gallium/drivers/svga/svga_tgsi.h | 18 +-
 .../dri/r300/compiler/radeon_pair_regalloc.c |2 +-
 .../drivers/dri/r300/compiler/radeon_program.h | 14 +++---
 .../dri/r300/compiler/radeon_program_pair.h| 10 +-
 4 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/src/gallium/drivers/svga/svga_tgsi.h b/src/gallium/drivers/svga/svga_tgsi.h
index 896c90a..d132525 100644
--- a/src/gallium/drivers/svga/svga_tgsi.h
+++ b/src/gallium/drivers/svga/svga_tgsi.h
@@ -39,24 +39,24 @@ struct tgsi_token;
 struct svga_vs_compile_key {
-   ubyte need_prescale:1;
-   ubyte allow_psiz:1;
    unsigned zero_stride_vertex_elements;
-   ubyte num_zero_stride_vertex_elements:6;
+   unsigned need_prescale:1;
+   unsigned allow_psiz:1;
+   unsigned num_zero_stride_vertex_elements:6;
 };
 struct svga_fs_compile_key {
-   boolean light_twoside:1;
-   boolean front_cw:1;
+   unsigned light_twoside:1;
+   unsigned front_cw:1;
    ubyte num_textures;
    ubyte num_unnormalized_coords;
    struct {
-      ubyte compare_mode : 1;
-      ubyte compare_func : 3;
-      ubyte unnormalized : 1;
+      unsigned compare_mode : 1;
+      unsigned compare_func : 3;
+      unsigned unnormalized : 1;
 
-      ubyte width_height_idx : 7;
+      unsigned width_height_idx : 7;
 
       ubyte texture_target;
    } tex[PIPE_MAX_SAMPLERS];
diff --git a/src/mesa/drivers/dri/r300/compiler/radeon_pair_regalloc.c b/src/mesa/drivers/dri/r300/compiler/radeon_pair_regalloc.c
index 828d0c8..b2fe7f7 100644
--- a/src/mesa/drivers/dri/r300/compiler/radeon_pair_regalloc.c
+++ b/src/mesa/drivers/dri/r300/compiler/radeon_pair_regalloc.c
@@ -49,7 +49,7 @@ struct register_info {
    unsigned int Used:1;
    unsigned int Allocated:1;
-   rc_register_file File:3;
+   unsigned int File:3;
    unsigned int Index:RC_REGISTER_INDEX_BITS;
 };
diff --git a/src/mesa/drivers/dri/r300/compiler/radeon_program.h b/src/mesa/drivers/dri/r300/compiler/radeon_program.h
index 0359288..e318867 100644
--- a/src/mesa/drivers/dri/r300/compiler/radeon_program.h
+++ b/src/mesa/drivers/dri/r300/compiler/radeon_program.h
@@ -39,7 +39,7 @@ struct radeon_compiler;
 struct rc_src_register {
-   rc_register_file File:3;
+   unsigned int File:3;
    /** Negative values may be used for relative addressing. */
    signed int Index:(RC_REGISTER_INDEX_BITS+1);
@@ -55,7 +55,7 @@ struct rc_src_register {
 };
 struct rc_dst_register {
-   rc_register_file File:3;
+   unsigned int File:3;
    /** Negative values may be used for relative addressing. */
    signed int Index:(RC_REGISTER_INDEX_BITS+1);
@@ -79,20 +79,20 @@ struct rc_sub_instruction {
    /**
     * Opcode of this instruction, according to \ref rc_opcode enums.
     */
-   rc_opcode Opcode:8;
+   unsigned int Opcode:8;
    /**
     * Saturate each value of the result to the range [0,1] or [-1,1],
     * according to \ref rc_saturate_mode enums.
     */
-   rc_saturate_mode SaturateMode:2;
+   unsigned int SaturateMode:2;
    /**
     * Writing to the special register RC_SPECIAL_ALU_RESULT
     */
    /*@{*/
-   rc_write_aluresult WriteALUResult:2;
-   rc_compare_func ALUResultCompare:3;
+   unsigned int WriteALUResult:2;
+   unsigned int ALUResultCompare:3;
    /*@}*/
    /**
@@ -103,7 +103,7 @@ struct rc_sub_instruction {
    unsigned int TexSrcUnit:5;
    /** Source texture target, one of the \ref rc_texture_target enums */
-   rc_texture_target TexSrcTarget:3;
+   unsigned int TexSrcTarget:3;
    /** True if tex instruction should do shadow comparison */
    unsigned int TexShadow:1;
diff --git a/src/mesa/drivers/dri/r300/compiler/radeon_program_pair.h b/src/mesa/drivers/dri/r300/compiler/radeon_program_pair.h
index 1600598..6685ade 100644
--- a/src/mesa/drivers/dri/r300/compiler/radeon_program_pair.h
+++ b/src/mesa/drivers/dri/r300/compiler/radeon_program_pair.h
@@ -52,12 +52,12 @@ struct r300_fragment_program_compiler;
 struct radeon_pair_instruction_source {
    unsigned int Used:1;
-   rc_register_file File:3;
+   unsigned int File:3;
    unsigned int Index:RC_REGISTER_INDEX_BITS;
 };
 struct radeon_pair_instruction_rgb {
-   rc_opcode Opcode:8;
+   unsigned int Opcode:8;
    unsigned int DestIndex:RC_REGISTER_INDEX_BITS;
    unsigned int WriteMask:3;
    unsigned int OutputWriteMask:3;
@@ -74,7 +74,7 @@ struct radeon_pair_instruction_rgb {
 };
 struct radeon_pair_instruction_alpha
Re: [Mesa3d-dev] [PATCH] Make sure we use only signed/unsigned ints with bitfields.
Keith Whitwell wrote on 2010-01-06 10:43: On Wed, 2010-01-06 at 00:50 -0800, michal wrote:
diff --git a/src/gallium/drivers/svga/svga_tgsi.h b/src/gallium/drivers/svga/svga_tgsi.h
index 896c90a..d132525 100644
--- a/src/gallium/drivers/svga/svga_tgsi.h
+++ b/src/gallium/drivers/svga/svga_tgsi.h
@@ -39,24 +39,24 @@ struct tgsi_token;
 struct svga_vs_compile_key {
-   ubyte need_prescale:1;
-   ubyte allow_psiz:1;
    unsigned zero_stride_vertex_elements;
-   ubyte num_zero_stride_vertex_elements:6;
+   unsigned need_prescale:1;
+   unsigned allow_psiz:1;
+   unsigned num_zero_stride_vertex_elements:6;
 };
 struct svga_fs_compile_key {
-   boolean light_twoside:1;
-   boolean front_cw:1;
+   unsigned light_twoside:1;
+   unsigned front_cw:1;
    ubyte num_textures;
    ubyte num_unnormalized_coords;
    struct {
-      ubyte compare_mode : 1;
-      ubyte compare_func : 3;
-      ubyte unnormalized : 1;
+      unsigned compare_mode : 1;
+      unsigned compare_func : 3;
+      unsigned unnormalized : 1;
 
-      ubyte width_height_idx : 7;
+      unsigned width_height_idx : 7;
 
       ubyte texture_target;
    } tex[PIPE_MAX_SAMPLERS];
Michal, these two structs should be kept as small as possible. It looks like there has been some drift away from well-packed fields anyway, but if you're making this change can you please take a moment to repack the fields as a result and get these down to as small as possible? In particular, it looks like fs_compile_key::tex array has probably doubled in size - could you repack it by changing texture_target to, eg: unsigned texture_target:8; or similar? The same would apply for the other ubyte fields that are now probably no longer tightly packed.
Attached an update. There was nothing more I could do to svga_vs_compile_key, though, as the zero_stride_vertex_elements field is being fully used.
From af7c95dd2539e6b5d64ad62c30ef6952e83fcf98 Mon Sep 17 00:00:00 2001
From: Michal Krol mic...@vmware.com
Date: Wed, 6 Jan 2010 11:23:43 +0100
Subject: [PATCH] Make sure we use only signed/unsigned ints with bitfields.

Seems to be the only way to stay fully portable.
---
 src/gallium/drivers/svga/svga_tgsi.h | 26 +--
 .../dri/r300/compiler/radeon_pair_regalloc.c |2 +-
 .../drivers/dri/r300/compiler/radeon_program.h | 14 +-
 .../dri/r300/compiler/radeon_program_pair.h| 10
 4 files changed, 25 insertions(+), 27 deletions(-)

diff --git a/src/gallium/drivers/svga/svga_tgsi.h b/src/gallium/drivers/svga/svga_tgsi.h
index 896c90a..1309c33 100644
--- a/src/gallium/drivers/svga/svga_tgsi.h
+++ b/src/gallium/drivers/svga/svga_tgsi.h
@@ -39,26 +39,24 @@ struct tgsi_token;
 struct svga_vs_compile_key {
-   ubyte need_prescale:1;
-   ubyte allow_psiz:1;
    unsigned zero_stride_vertex_elements;
-   ubyte num_zero_stride_vertex_elements:6;
+   unsigned need_prescale:1;
+   unsigned allow_psiz:1;
+   unsigned num_zero_stride_vertex_elements:6;
 };
 struct svga_fs_compile_key {
-   boolean light_twoside:1;
-   boolean front_cw:1;
-   ubyte num_textures;
-   ubyte num_unnormalized_coords;
+   unsigned light_twoside:1;
+   unsigned front_cw:1;
+   unsigned num_textures:8;
+   unsigned num_unnormalized_coords:8;
    struct {
-      ubyte compare_mode : 1;
-      ubyte compare_func : 3;
-      ubyte unnormalized : 1;
-
-      ubyte width_height_idx : 7;
-
-      ubyte texture_target;
+      unsigned compare_mode:1;
+      unsigned compare_func:3;
+      unsigned unnormalized:1;
+      unsigned width_height_idx:7;
+      unsigned texture_target:8;
    } tex[PIPE_MAX_SAMPLERS];
 };
diff --git a/src/mesa/drivers/dri/r300/compiler/radeon_pair_regalloc.c b/src/mesa/drivers/dri/r300/compiler/radeon_pair_regalloc.c
index 828d0c8..b2fe7f7 100644
--- a/src/mesa/drivers/dri/r300/compiler/radeon_pair_regalloc.c
+++ b/src/mesa/drivers/dri/r300/compiler/radeon_pair_regalloc.c
@@ -49,7 +49,7 @@ struct register_info {
    unsigned int Used:1;
    unsigned int Allocated:1;
-   rc_register_file File:3;
+   unsigned int File:3;
    unsigned int Index:RC_REGISTER_INDEX_BITS;
 };
diff --git a/src/mesa/drivers/dri/r300/compiler/radeon_program.h b/src/mesa/drivers/dri/r300/compiler/radeon_program.h
index 0359288..e318867 100644
--- a/src/mesa/drivers/dri/r300/compiler/radeon_program.h
+++ b/src/mesa/drivers/dri/r300/compiler/radeon_program.h
@@ -39,7 +39,7 @@ struct radeon_compiler;
 struct rc_src_register {
-   rc_register_file File:3;
+   unsigned int File:3;
    /** Negative values may be used for relative addressing. */
    signed int Index:(RC_REGISTER_INDEX_BITS+1);
@@ -55,7 +55,7 @@ struct rc_src_register {
 };
 struct rc_dst_register {
-   rc_register_file File:3;
+   unsigned int File:3;
    /** Negative values may be used for relative addressing. */
    signed int
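A hedged C sketch of the packing concern raised in the review: the per-texture key written once with every field as an `unsigned` bit-field (as in the updated patch) and once with `texture_target` left as a plain byte (as in the first patch, with `ubyte` assumed to be `unsigned char`). The struct names and the `make_tex_key` helper are hypothetical. Bit-field layout is implementation-defined, so the actual sizes have to be checked with the compilers that matter; some compilers pack both layouts into the same number of bytes.

```c
#include <assert.h>
#include <stddef.h>

/* All fields as unsigned bit-fields: 20 bits in total. */
struct tex_key_packed {
   unsigned compare_mode:1;
   unsigned compare_func:3;
   unsigned unnormalized:1;
   unsigned width_height_idx:7;
   unsigned texture_target:8;
};

/* Same fields, but texture_target is a separate byte-sized member. */
struct tex_key_mixed {
   unsigned compare_mode:1;
   unsigned compare_func:3;
   unsigned unnormalized:1;
   unsigned width_height_idx:7;
   unsigned char texture_target;
};

/* Build a packed key; the remaining fields are left at zero. */
static struct tex_key_packed
make_tex_key(unsigned texture_target, unsigned width_height_idx)
{
   struct tex_key_packed key = { 0 };

   /* Bit-fields silently truncate, so check the ranges up front. */
   assert(texture_target < (1u << 8));
   assert(width_height_idx < (1u << 7));

   key.texture_target = texture_target;
   key.width_height_idx = width_height_idx;
   return key;
}
```

Comparing `sizeof(struct tex_key_packed)` with `sizeof(struct tex_key_mixed)` on a given compiler shows whether the mixed declaration actually costs extra space in an array such as `tex[PIPE_MAX_SAMPLERS]`.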
Re: [Mesa3d-dev] Mystery of u_format.csv
José Fonseca wrote on 2010-01-06 15:03: On Tue, 2010-01-05 at 23:36 -0800, michal wrote: michal wrote on 2010-01-06 07:58: michal wrote on 2009-12-22 10:00: Marek Olšák wrote on 2009-12-22 08:40: Hi, I noticed that gallium/auxiliary/util/u_format.csv contains some weird swizzling, for example see this:
$ grep zyxw u_format.csv
PIPE_FORMAT_A8R8G8B8_UNORM, arith , 1, 1, un8 , un8 , un8 , un8 , zyxw, rgb
PIPE_FORMAT_A1R5G5B5_UNORM, arith , 1, 1, un5 , un5 , un5 , un1 , zyxw, rgb
PIPE_FORMAT_A4R4G4B4_UNORM, arith , 1, 1, un4 , un4 , un4 , un4 , zyxw, rgb
PIPE_FORMAT_A8B8G8R8_SNORM, arith , 1, 1, sn8 , sn8 , sn8 , sn8 , zyxw, rgb
PIPE_FORMAT_B8G8R8A8_SRGB , arith , 1, 1, u8 , u8 , u8 , u8 , zyxw, srgb
It's hard to believe that ARGB, ABGR, and BGRA have the same swizzling. Let's continue our journey:
$ grep A8R8G8B8 u_format.csv
PIPE_FORMAT_A8R8G8B8_UNORM, arith , 1, 1, un8 , un8 , un8 , un8 , zyxw, rgb
PIPE_FORMAT_A8R8G8B8_SRGB , arith , 1, 1, u8 , u8 , u8 , u8 , wxyz, srgb
Same formats, different swizzling? Also:
$ grep B8G8R8A8 u_format.csv
PIPE_FORMAT_B8G8R8A8_UNORM, arith , 1, 1, un8 , un8 , un8 , un8 , yzwx, rgb
PIPE_FORMAT_B8G8R8A8_SRGB , arith , 1, 1, u8 , u8 , u8 , u8 , zyxw, srgb
Same formats, different swizzling? I don't really get it. And there's much more cases like these. Could someone tell me what the intended order of channels should be? (or possibly propose a fix) The meaning of the whole table is self-contradictory and it's definitely the source of some r300g bugs.
Marek, Yes, that seems like a defect. The format swizzle field tells us how to swizzle the incoming pixel so that its components are ordered in some predefined order. For RGB and SRGB colorspaces the order is R, G, B and A. For depth-stencil, ie. ZS color space the order is Z and then S. I will have a look at this.
Marek, Jose, Can you review the attached patch?
Ouch, it looks like we will have to leave 24-bit (s)rgb formats with array layout, as the current code generator will bite us on big endian platforms. Attached an updated patch. Why are you changing the layout from array to arith? Please leave that alone. I did this because in the other thread you defined arith layout to apply to 32-or-less-bit formats. Since I still believe arith and array layout are somewhat redundant, we can go the other way round and convert other arith layouts to array, save for 16-or-less-bit formats. Yes, the code generator needs a big endian -> little endian call to be correct on big endian platforms, as gallium formats should always be thought of in little endian terms, just like most hardware is.
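A short C sketch of what the swizzle column is described as doing in this thread: reordering the channels of an incoming pixel into the predefined R, G, B, A order. The `swz` enum and `apply_swizzle` helper are hypothetical and are not the u_format API; which stored channel counts as "x" (first byte in memory or least significant bits of a packed word) is exactly the arith versus array question under discussion, so the sketch simply takes the stored channels in whatever order the format lists them.

```c
#include <assert.h>

/* Hypothetical swizzle selectors; the real u_format code has its own. */
enum swz { SWZ_X = 0, SWZ_Y, SWZ_Z, SWZ_W };

/* Reorder the channels of an unpacked pixel into R, G, B, A order.
 * stored[] holds the channels in the order the format lists them;
 * swizzle[i] names the stored channel that supplies output component i. */
static void
apply_swizzle(const unsigned char stored[4], const enum swz swizzle[4],
              unsigned char rgba[4])
{
   int i;

   for (i = 0; i < 4; i++) {
      assert(swizzle[i] <= SWZ_W);
      rgba[i] = stored[swizzle[i]];
   }
}
```

For a pixel whose stored channels are B, G, R, A, a `zyxw` swizzle picks R from the third channel and B from the first, giving R, G, B, A; a table entry with a different swizzle for the same channel order would produce a different result, which is the inconsistency Marek points out.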
Re: [Mesa3d-dev] Mystery of u_format.csv
Christoph Bumiller wrote on 2010-01-06 12:08: On 06.01.2010 08:36, michal wrote: michal wrote on 2010-01-06 07:58: michal wrote on 2009-12-22 10:00: Marek Olšák wrote on 2009-12-22 08:40: Hi, I noticed that gallium/auxiliary/util/u_format.csv contains some weird swizzling, for example see this: $ grep zyxw u_format.csv PIPE_FORMAT_A8R8G8B8_UNORM, arith , 1, 1, un8 , un8 , un8 , un8 , zyxw, rgb PIPE_FORMAT_A1R5G5B5_UNORM, arith , 1, 1, un5 , un5 , un5 , un1 , zyxw, rgb PIPE_FORMAT_A4R4G4B4_UNORM, arith , 1, 1, un4 , un4 , un4 , un4 , zyxw, rgb PIPE_FORMAT_A8B8G8R8_SNORM, arith , 1, 1, sn8 , sn8 , sn8 , sn8 , zyxw, rgb PIPE_FORMAT_B8G8R8A8_SRGB , arith , 1, 1, u8 , u8 , u8 , u8 , zyxw, srgb It's hard to believe that ARGB, ABGR, and BGRA have the same swizzling. Let's continue our journey: $ grep A8R8G8B8 u_format.csv PIPE_FORMAT_A8R8G8B8_UNORM, arith , 1, 1, un8 , un8 , un8 , un8 , zyxw, rgb PIPE_FORMAT_A8R8G8B8_SRGB , arith , 1, 1, u8 , u8 , u8 , u8 , wxyz, srgb Same formats, different swizzling? Also: $ grep B8G8R8A8 u_format.csv PIPE_FORMAT_B8G8R8A8_UNORM, arith , 1, 1, un8 , un8 , un8 , un8 , yzwx, rgb PIPE_FORMAT_B8G8R8A8_SRGB , arith , 1, 1, u8 , u8 , u8 , u8 , zyxw, srgb Same formats, different swizzling? I don't really get it. And there's much more cases like these. Could someone tell me what the intended order of channels should be? (or possibly propose a fix) The meaning of the whole table is self-contradictory and it's definitely the source of some r300g bugs. Marek, Yes, that seems like a defect. The format swizzle field tells us how to swizzle the incoming pixel so that its components are ordered in some predefined order. For RGB and SRGB colorspaces the order is R, G, B and A. For depth-stencil, ie. ZS color space the order is Z and then S. I will have a look at this. Marek, Jose, Can you review the attached patch? 
Ouch, it looks like we will have to leave 24-bit (s)rgb formats with array layout, as the current code generator will bite us on big endian platforms. Attached an updated patch. It looks like this is how I always thought the formats should be interpreted. Which means the vertex element formats in mesa/state_tracker/st_draw.c should be corrected - the R8G8B8A8 and R8G8 vertex elements should be reversed, and the BGRA format should be A8R8G8B8. At least this would fix my (gallium/drivers/nv50/nv50.vbo) if (desc->swizzle[0] == UTIL_FORMAT_SWIZZLE_Z) /* BGRA */ check. I'm afraid you will also need to check the desc->layout field, and if it is array, compare against desc->swizzle[3].
Re: [Mesa3d-dev] Mystery of u_format.csv
José Fonseca wrote on 2010-01-06 15:26: On Wed, 2010-01-06 at 06:11 -0800, michal wrote: José Fonseca wrote on 2010-01-06 15:03: On Tue, 2010-01-05 at 23:36 -0800, michal wrote: michal wrote on 2010-01-06 07:58: michal wrote on 2009-12-22 10:00: Marek Olšák wrote on 2009-12-22 08:40:

Hi, I noticed that gallium/auxiliary/util/u_format.csv contains some weird swizzling, for example see this:

$ grep zyxw u_format.csv
PIPE_FORMAT_A8R8G8B8_UNORM, arith , 1, 1, un8 , un8 , un8 , un8 , zyxw, rgb
PIPE_FORMAT_A1R5G5B5_UNORM, arith , 1, 1, un5 , un5 , un5 , un1 , zyxw, rgb
PIPE_FORMAT_A4R4G4B4_UNORM, arith , 1, 1, un4 , un4 , un4 , un4 , zyxw, rgb
PIPE_FORMAT_A8B8G8R8_SNORM, arith , 1, 1, sn8 , sn8 , sn8 , sn8 , zyxw, rgb
PIPE_FORMAT_B8G8R8A8_SRGB , arith , 1, 1, u8 , u8 , u8 , u8 , zyxw, srgb

It's hard to believe that ARGB, ABGR, and BGRA have the same swizzling. Let's continue our journey:

$ grep A8R8G8B8 u_format.csv
PIPE_FORMAT_A8R8G8B8_UNORM, arith , 1, 1, un8 , un8 , un8 , un8 , zyxw, rgb
PIPE_FORMAT_A8R8G8B8_SRGB , arith , 1, 1, u8 , u8 , u8 , u8 , wxyz, srgb

Same formats, different swizzling? Also:

$ grep B8G8R8A8 u_format.csv
PIPE_FORMAT_B8G8R8A8_UNORM, arith , 1, 1, un8 , un8 , un8 , un8 , yzwx, rgb
PIPE_FORMAT_B8G8R8A8_SRGB , arith , 1, 1, u8 , u8 , u8 , u8 , zyxw, srgb

Same formats, different swizzling? I don't really get it. And there's much more cases like these. Could someone tell me what the intended order of channels should be? (or possibly propose a fix) The meaning of the whole table is self-contradictory and it's definitely the source of some r300g bugs.

Marek, Yes, that seems like a defect. The format swizzle field tells us how to swizzle the incoming pixel so that its components are ordered in some predefined order. For RGB and SRGB colorspaces the order is R, G, B and A. For depth-stencil, ie. ZS color space the order is Z and then S. I will have a look at this.

Marek, Jose, Can you review the attached patch?

Ouch, it looks like we will have to leave 24-bit (s)rgb formats with array layout as the current code generator will bite us on big endian platforms. Attached an updated patch.

Why are you changing the layout from array to arith? Please leave that alone.

I did this because in the other thread you defined arith layout to apply to 32-or-less-bit formats. Since I still believe arith and array layout are somewhat redundant, we can go the other way round and convert other arith layouts to array, save for 16-or-less-bit formats.

Indeed arith applies to 32-or-less-bit formats, but I never meant to say that all 32-or-less-bit formats must be in arith.

Understood.

They are indeed redundant, but array is/will be more efficient and when code generation is more robust and big-endian-safe all x8, x8x8, x8x8x8, x8x8x8x8x8 formats will be likely in array layout.

That is okay, we agree on that part. The question is what is the reason we treat PIPE_FORMAT_R8G8B8A8_UNORM as having array layout (before my patch), and e.g. PIPE_FORMAT_B8G8R8A8_UNORM as having arith layout?
Re: [Mesa3d-dev] Mystery of u_format.csv
Michel Dänzer wrote on 2010-01-06 15:23: On Wed, 2010-01-06 at 14:03 +, José Fonseca wrote: On Tue, 2010-01-05 at 23:36 -0800, michal wrote: michal wrote on 2010-01-06 07:58: michal wrote on 2009-12-22 10:00: Marek Olšák wrote on 2009-12-22 08:40:

Hi, I noticed that gallium/auxiliary/util/u_format.csv contains some weird swizzling, for example see this:

$ grep zyxw u_format.csv
PIPE_FORMAT_A8R8G8B8_UNORM, arith , 1, 1, un8 , un8 , un8 , un8 , zyxw, rgb
PIPE_FORMAT_A1R5G5B5_UNORM, arith , 1, 1, un5 , un5 , un5 , un1 , zyxw, rgb
PIPE_FORMAT_A4R4G4B4_UNORM, arith , 1, 1, un4 , un4 , un4 , un4 , zyxw, rgb
PIPE_FORMAT_A8B8G8R8_SNORM, arith , 1, 1, sn8 , sn8 , sn8 , sn8 , zyxw, rgb
PIPE_FORMAT_B8G8R8A8_SRGB , arith , 1, 1, u8 , u8 , u8 , u8 , zyxw, srgb

It's hard to believe that ARGB, ABGR, and BGRA have the same swizzling. Let's continue our journey:

$ grep A8R8G8B8 u_format.csv
PIPE_FORMAT_A8R8G8B8_UNORM, arith , 1, 1, un8 , un8 , un8 , un8 , zyxw, rgb
PIPE_FORMAT_A8R8G8B8_SRGB , arith , 1, 1, u8 , u8 , u8 , u8 , wxyz, srgb

Same formats, different swizzling? Also:

$ grep B8G8R8A8 u_format.csv
PIPE_FORMAT_B8G8R8A8_UNORM, arith , 1, 1, un8 , un8 , un8 , un8 , yzwx, rgb
PIPE_FORMAT_B8G8R8A8_SRGB , arith , 1, 1, u8 , u8 , u8 , u8 , zyxw, srgb

Same formats, different swizzling? I don't really get it. And there's much more cases like these. Could someone tell me what the intended order of channels should be? (or possibly propose a fix) The meaning of the whole table is self-contradictory and it's definitely the source of some r300g bugs.

Marek, Yes, that seems like a defect. The format swizzle field tells us how to swizzle the incoming pixel so that its components are ordered in some predefined order. For RGB and SRGB colorspaces the order is R, G, B and A. For depth-stencil, ie. ZS color space the order is Z and then S. I will have a look at this.

Marek, Jose, Can you review the attached patch?

Ouch, it looks like we will have to leave 24-bit (s)rgb formats with array layout as the current code generator will bite us on big endian platforms. Attached an updated patch.

Why are you changing the layout from array to arith? Please leave that alone. Yes, the code generator needs a big endian -> little endian call to be correct on big endian platforms, as gallium formats should always be thought of in little endian terms, just like most hardware is.

Actually, 'array' formats should be endianness neutral, and IMO 'arith' formats should be defined in the CPU endianness. Though as discussed before, having 'reversed' formats defined in the other endianness as well might be useful. Drivers which can work on setups where the CPU endianness doesn't match the GPU endianness should possibly only use 'array' formats, but then there might need to be some kind of mapping between the two kinds of formats somewhere, maybe in the state trackers or an auxiliary module...

Interesting. Is there any reference that would say which formats are 'array', and which are not? Or is it a simple rule that when every component's bitsize is greater-or-equal to, say, 16, then it's an array format?
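
To make the array/arith distinction in this exchange concrete, here is a minimal sketch. It assumes that 'array' means one channel per byte at increasing addresses and that 'arith' means channels packed into a native 32-bit word with shifts. All helper names are hypothetical and are not gallium API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: 'array' layout writes one channel per byte in
 * memory order, so the bytes are identical on any CPU endianness. */
static void pack_rgba8_array(uint8_t dst[4],
                             uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
   assert(dst != NULL);
   dst[0] = r;
   dst[1] = g;
   dst[2] = b;
   dst[3] = a;
}

/* Hypothetical helper: 'arith' layout packs channels into a native word
 * with shifts. The word value is fixed, but the order of its bytes in
 * memory depends on the CPU endianness. */
static uint32_t pack_rgba8_arith(uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
   return ((uint32_t)r << 24) | ((uint32_t)g << 16) |
          ((uint32_t)b << 8) | (uint32_t)a;
}

/* Returns 1 on a little-endian CPU, 0 on a big-endian one. */
static int cpu_is_little_endian(void)
{
   const uint32_t one = 1;
   uint8_t first;
   memcpy(&first, &one, 1);
   return first == 1;
}

/* Returns the byte that ends up first in memory for a packed word. */
static uint8_t first_byte_in_memory(uint32_t word)
{
   uint8_t bytes[4];
   memcpy(bytes, &word, sizeof bytes);
   return bytes[0];
}
```

With the arith packer, red is the first byte in memory on a big-endian CPU and the last one on a little-endian CPU, which is the kind of mismatch the code generator has to compensate for.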
Re: [Mesa3d-dev] [PATCH] Make sure we use only signed/unsigned ints with bitfields.
Keith Whitwell wrote on 2010-01-06 11:31: Looks good to me Michal.

Thanks, committed.
Re: [Mesa3d-dev] [mesa] svga: Fix error: cannot take address of bit-field 'texture_target' in svga_tgsi.h
Brian Paul wrote on 2010-01-06 18:07: Sedat Dilek wrote:

Hi, this patch fixes a build-error in mesa GIT master after...

commit 251363e8f1287b54dc7734e690daf2ae96728faf (patch)
configs: set INTEL_LIBS, INTEL_CFLAGS, etc

From my build-log:

...
In file included from svga_pipe_fs.c:37:
svga_tgsi.h: In function 'svga_fs_key_size':
svga_tgsi.h:122: error: cannot take address of bit-field 'texture_target'
make[4]: *** [svga_pipe_fs.o] Error 1

Might be introduced in...

commit 955f51270bb60ad77dba049799587dc7c0fb4dda
Make sure we use only signed/unsigned ints with bitfields.

Kind Regards, - Sedat -

I just fixed that.

Actually, we could go back to bitfields and fix broken svga_fs_key_size(). Attached a patch. Can somebody review, test-build and commit?

From 7321aef0dfc5bb160ec8a33d1d4e686419f2ed3d Mon Sep 17 00:00:00 2001
From: Michal Krol mic...@vmware.com
Date: Wed, 6 Jan 2010 18:36:45 +0100
Subject: [PATCH] svga: Fix fs key size computation and key comparison.

This also allows us to have texture_target back as a bitfield and save us a few bytes.
---
 src/gallium/drivers/svga/svga_state_fs.c |    9 +++--
 src/gallium/drivers/svga/svga_tgsi.h     |    5 ++---
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/svga/svga_state_fs.c b/src/gallium/drivers/svga/svga_state_fs.c
index 272d1dd..bba80a9 100644
--- a/src/gallium/drivers/svga/svga_state_fs.c
+++ b/src/gallium/drivers/svga/svga_state_fs.c
@@ -40,8 +40,13 @@
 static INLINE int compare_fs_keys( const struct svga_fs_compile_key *a,
                                    const struct svga_fs_compile_key *b )
 {
-   unsigned keysize = svga_fs_key_size( a );
-   return memcmp( a, b, keysize );
+   unsigned keysize_a = svga_fs_key_size( a );
+   unsigned keysize_b = svga_fs_key_size( b );
+
+   if (keysize_a != keysize_b) {
+      return (int)(keysize_a - keysize_b);
+   }
+   return memcmp( a, b, keysize_a );
 }
 
 
diff --git a/src/gallium/drivers/svga/svga_tgsi.h b/src/gallium/drivers/svga/svga_tgsi.h
index 043b991..737a221 100644
--- a/src/gallium/drivers/svga/svga_tgsi.h
+++ b/src/gallium/drivers/svga/svga_tgsi.h
@@ -56,7 +56,7 @@ struct svga_fs_compile_key
       unsigned compare_func:3;
       unsigned unnormalized:1;
       unsigned width_height_idx:7;
-      ubyte texture_target;
+      unsigned texture_target:8;
    } tex[PIPE_MAX_SAMPLERS];
 };
 
@@ -119,8 +119,7 @@ static INLINE unsigned svga_vs_key_size( const struct svga_vs_compile_key *key )
 
 static INLINE unsigned svga_fs_key_size( const struct svga_fs_compile_key *key )
 {
-   return (const char *)&key->tex[key->num_textures].texture_target -
-      (const char *)key;
+   return (const char *)&key->tex[key->num_textures] - (const char *)key;
 }
 
 struct svga_shader_result *
--
1.6.4.msysgit.0
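
The idea behind the patch above can be sketched in isolation. This is a simplified illustration, not the svga driver code: the struct, field names and MAX_SAMPLERS are stand-ins. It shows why taking the address of the array element works where taking the address of a bit-field does not, and why the key sizes are compared before memcmp(). The sketch returns an explicit -1/1 for the size mismatch rather than the subtraction used in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SAMPLERS 16   /* stand-in for PIPE_MAX_SAMPLERS */

/* Hypothetical, simplified compile key. Keys are assumed to be zeroed
 * with memset() before use so that padding compares equal. */
struct fs_key {
   unsigned num_textures:8;
   struct {
      unsigned compare_mode:1;
      unsigned texture_target:8;   /* bit-field: '&' on it is an error */
   } tex[MAX_SAMPLERS];
};

/* Size of the used prefix of the key: everything before tex[num_textures].
 * Taking the address of the array element itself is always allowed. */
static size_t fs_key_size(const struct fs_key *key)
{
   assert(key->num_textures <= MAX_SAMPLERS);
   return (size_t)((const char *)&key->tex[key->num_textures] -
                   (const char *)key);
}

/* Keys of different size can never match, so compare sizes first and
 * only then compare the used prefix byte by byte. */
static int compare_fs_keys(const struct fs_key *a, const struct fs_key *b)
{
   size_t size_a = fs_key_size(a);
   size_t size_b = fs_key_size(b);

   if (size_a != size_b)
      return size_a < size_b ? -1 : 1;
   return memcmp(a, b, size_a);
}
```

Entries past num_textures are ignored by the comparison, so stale data in the unused tail of a key does not cause a mismatch.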
Re: [Mesa3d-dev] [RFC] gallium-integer-opcodes branch
Keith Whitwell wrote on 2010-01-04 18:46: On Mon, 2010-01-04 at 09:39 -0800, Brian Paul wrote: michal wrote:

Hi, I would like to merge gallium-integer-opcodes branch to master this week. This feature branch adds support for integer operations in TGSI that is required by GLSL 1.30. In summary:

* add a bunch of opcodes operating on signed and unsigned integers,
* add signed/unsigned integer immediate types,
* add new opcodes for switch-case statements,
* source operand modifiers (abs, neg) treat operands as integers if a particular opcode expects integers as arguments.

Since integer opcodes are a dependency for other future features, the plan is to merge this branch to master and then fork again to develop other features that sit on top of it. Please review and comment.

Looks pretty good, Michal. You can merge whenever you want, as far as I'm concerned.

Yes, looks good to me too Michal -- feel free to merge when you're ready.

Thanks for having a look, guys.
Re: [Mesa3d-dev] Merge gallium-docs
José Fonseca wrote on 2010-01-05 17:12: On Tue, 2010-01-05 at 07:57 -0800, Keith Whitwell wrote:

This doesn't really need to be on a branch, and by merging it I can start to ask for people to keep it up-to-date with interface changes... If nobody objects, I'll do this in the next couple of days. Keith

Sounds good to me. It makes it official, so we can all start improving it. A minor nitpick: I'd prefer for the derived html not to be committed into git. It forces everybody to have the tool installed, adds noise to patches, and I really don't see the point if the .rst is already so easy to read. I'd rather have a cronjob in fdo.org generating and publishing the HTML/etc from the master HEAD.

Also, it looks like you have to have the tool installed anyway -- at least to view the math formulas (e.g. TGSI pages).
[Mesa3d-dev] [RFC] gallium-integer-opcodes branch
Hi, I would like to merge gallium-integer-opcodes branch to master this week. This feature branch adds support for integer operations in TGSI that is required by GLSL 1.30. In summary:

* add a bunch of opcodes operating on signed and unsigned integers,
* add signed/unsigned integer immediate types,
* add new opcodes for switch-case statements,
* source operand modifiers (abs, neg) treat operands as integers if a particular opcode expects integers as arguments.

Since integer opcodes are a dependency for other future features, the plan is to merge this branch to master and then fork again to develop other features that sit on top of it. Please review and comment. Thanks.
Re: [Mesa3d-dev] [PATCH] fix missing semantic name in tgsi_text.c
Igor Oliveira wrote on 2010-01-01 18:03:

Hi, I found a tgsi bug running the vega state tracker. The bug happens because in tgsi_text.c line 991:

for (i = 0; i < TGSI_SEMANTIC_COUNT; i++)

TGSI_SEMANTIC_COUNT is bigger than the number of entries in semantic_names declared in tgsi_text.c:

static const char *semantic_names[TGSI_SEMANTIC_COUNT] =
{
   "POSITION",
   "COLOR",
   "BCOLOR",
   "FOG",
   "PSIZE",
   "GENERIC",
   "NORMAL",
   "FACE",
   "PRIM_ID"
};

TGSI_SEMANTIC_COUNT is 10 but there are only 9 elements. Looking at other files I see that one semantic name is missing: EDGEFLAG. The patch below adds EDGEFLAG to semantic_names.

diff --git a/src/gallium/auxiliary/tgsi/tgsi_text.c b/src/gallium/auxiliary/tgsi/tgsi_text.c
index 2e3f9a9..9fcffed 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_text.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_text.c
@@ -932,6 +932,7 @@ static const char *semantic_names[TGSI_SEMANTIC_COUNT] =
    "GENERIC",
    "NORMAL",
    "FACE",
+   "EDGEFLAG",
    "PRIM_ID"
 };

Committed, thanks.
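
A bug of this kind is silent because an array declared with an explicit size gets NULL for every missing initializer. One possible guard, sketched here with hypothetical names (SEM_* and parse_semantic are not TGSI identifiers), is to leave the array unsized and check its length against the enum count at compile time.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the TGSI_SEMANTIC_x values. */
enum semantic {
   SEM_POSITION, SEM_COLOR, SEM_BCOLOR, SEM_FOG, SEM_PSIZE,
   SEM_GENERIC, SEM_NORMAL, SEM_FACE, SEM_EDGEFLAG, SEM_PRIM_ID,
   SEM_COUNT
};

/* Unsized on purpose: the compiler counts the initializers for us. */
static const char *const semantic_names[] = {
   "POSITION", "COLOR", "BCOLOR", "FOG", "PSIZE",
   "GENERIC", "NORMAL", "FACE", "EDGEFLAG", "PRIM_ID"
};

/* Compile-time check that works in C89: the array size becomes -1, and
 * compilation fails, if a name is missing or one too many is listed. */
typedef char semantic_names_match_enum[
   (sizeof(semantic_names) / sizeof(semantic_names[0]) == SEM_COUNT) ? 1 : -1];

/* Returns the semantic index for a name, or -1 if it is unknown. */
static int parse_semantic(const char *name)
{
   int i;

   assert(name != NULL);
   for (i = 0; i < SEM_COUNT; i++) {
      if (strcmp(name, semantic_names[i]) == 0)
         return i;
   }
   return -1;
}
```

With this in place, dropping "EDGEFLAG" from the table would stop the build instead of leaving a NULL entry for the parser loop to trip over.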
Re: [Mesa3d-dev] Mesa (instanced-arrays): translate: Implement instancing for linear SSE run .
Keith Whitwell wrote on 2009-12-30 16:22:

Michal, Did you update the 'C' version of translate for this new functionality? You can't just extend the fast path - the fallback/default mode needs to be updated as well.

Yes, I did that in the previous commit.

Also, I'm sure it's not necessary to do a divide per vertex-element to achieve instancing. It can't be that hard to throw some more counters at the problem and do this with a couple of adds instead of a divide...

The division is done once per instance, not per every vertex attrib. Are you serious about optimising such low-profile things?

Keith

On Wed, 2009-12-30 at 05:23 -0800, Michał Król wrote:

Module: Mesa
Branch: instanced-arrays
Commit: 09c0287b84725098c0b365668231ddf00487c84c
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=09c0287b84725098c0b365668231ddf00487c84c
Author: Michal Krol mic...@vmware.com
Date: Wed Dec 30 14:23:12 2009 +0100

translate: Implement instancing for linear SSE run.

---
 src/gallium/auxiliary/translate/translate_sse.c |  154 ++-
 1 files changed, 120 insertions(+), 34 deletions(-)

diff --git a/src/gallium/auxiliary/translate/translate_sse.c b/src/gallium/auxiliary/translate/translate_sse.c
index edd0be1..ddfa4c6 100644
--- a/src/gallium/auxiliary/translate/translate_sse.c
+++ b/src/gallium/auxiliary/translate/translate_sse.c
@@ -49,6 +49,7 @@
 typedef void (PIPE_CDECL *run_func)( struct translate *translate,
                                      unsigned start,
                                      unsigned count,
+                                     unsigned instance_id,
                                      void *output_buffer );
 
 typedef void (PIPE_CDECL *run_elts_func)( struct translate *translate,
@@ -59,7 +60,12 @@ typedef void (PIPE_CDECL *run_elts_func)( struct translate *translate,
 struct translate_buffer {
    const void *base_ptr;
    unsigned stride;
-   void *ptr;                   /* updated per vertex */
+};
+
+struct translate_buffer_varient {
+   unsigned buffer_index;
+   unsigned instance_divisor;
+   void *ptr;                   /* updated either per vertex or per instance */
 };
 
 
@@ -81,6 +87,16 @@ struct translate_sse {
    struct translate_buffer buffer[PIPE_MAX_ATTRIBS];
    unsigned nr_buffers;
 
+   /* Multiple buffer varients can map to a single buffer. */
+   struct translate_buffer_varient buffer_varient[PIPE_MAX_ATTRIBS];
+   unsigned nr_buffer_varients;
+
+   /* Multiple elements can map to a single buffer varient. */
+   unsigned element_to_buffer_varient[PIPE_MAX_ATTRIBS];
+
+   boolean use_instancing;
+   unsigned instance_id;
+
    run_func      gen_run;
    run_elts_func gen_run_elts;
 
@@ -360,31 +376,59 @@ static boolean init_inputs( struct translate_sse *p,
 {
    unsigned i;
    if (linear) {
-      for (i = 0; i < p->nr_buffers; i++) {
+      struct x86_reg instance_id = x86_make_disp(p->machine_EDX,
+                                                 get_offset(p, &p->instance_id));
+
+      for (i = 0; i < p->nr_buffer_varients; i++) {
+         struct translate_buffer_varient *varient = &p->buffer_varient[i];
+         struct translate_buffer *buffer = &p->buffer[varient->buffer_index];
          struct x86_reg buf_stride   = x86_make_disp(p->machine_EDX,
-                                                     get_offset(p, &p->buffer[i].stride));
+                                                     get_offset(p, &buffer->stride));
          struct x86_reg buf_ptr      = x86_make_disp(p->machine_EDX,
-                                                     get_offset(p, &p->buffer[i].ptr));
+                                                     get_offset(p, &varient->ptr));
          struct x86_reg buf_base_ptr = x86_make_disp(p->machine_EDX,
-                                                     get_offset(p, &p->buffer[i].base_ptr));
+                                                     get_offset(p, &buffer->base_ptr));
          struct x86_reg elt = p->idx_EBX;
-         struct x86_reg tmp = p->tmp_EAX;
-
+         struct x86_reg tmp_EAX = p->tmp_EAX;
 
          /* Calculate pointer to first attrib:
+          *   base_ptr + stride * index, where index depends on instance divisor
           */
-         x86_mov(p->func, tmp, buf_stride);
-         x86_imul(p->func, tmp, elt);
-         x86_add(p->func, tmp, buf_base_ptr);
+         if (varient->instance_divisor) {
+            /* Our index is instance ID divided by instance divisor.
+             */
+            x86_mov(p->func, tmp_EAX, instance_id);
+
+            if (varient->instance_divisor != 1) {
+               struct x86_reg tmp_EDX = p->machine_EDX;
+               struct x86_reg tmp_ECX = p->outbuf_ECX;
+
+               /* TODO: Add x86_shr() to rtasm and use it whenever
+                *       instance divisor is power of two.
+                */
+
+               x86_push(p
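
The "counters instead of a divide" idea raised in this thread can be sketched in plain C. This is only an illustration of the arithmetic, under the assumption that instances are visited in increasing order starting from 0. The struct and function names are hypothetical and do not come from the translate module.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-attribute state: tracks instance_id / divisor
 * incrementally so no division is needed per instance. */
struct instance_stepper {
   unsigned divisor;   /* instance divisor, must be non-zero */
   unsigned count;     /* instances seen since the index last advanced */
   unsigned index;     /* equals instance_id / divisor */
};

static void stepper_init(struct instance_stepper *s, unsigned divisor)
{
   assert(divisor > 0);
   s->divisor = divisor;
   s->count = 0;
   s->index = 0;
}

/* Advance to the next instance using one add and one compare. */
static void stepper_next(struct instance_stepper *s)
{
   if (++s->count == s->divisor) {
      s->count = 0;
      s->index++;
   }
}

/* Pointer to the attribute for the current index:
 * base_ptr + stride * index, as in the comment in the patch above. */
static const void *attrib_ptr(const void *base_ptr, unsigned stride,
                              unsigned index)
{
   return (const char *)base_ptr + (size_t)stride * index;
}
```

Whether this is worth doing is exactly what the thread debates: with the division done once per instance, the saving is small.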
Re: [Mesa3d-dev] geometry shading patches
Zack Rusin wrote on 2009-12-24 14:24: yo, after our discussions i hacked a new version of geometry shading support in gallium. the new geometry shading syntax looks as follows:

Zack, That looks nice. Once you commit I will take a closer look at patch #10 and see what's the issue there without bothering you.
Re: [Mesa3d-dev] Mystery of u_format.csv
Marek Olšák wrote on 2009-12-22 08:40:

Hi, I noticed that gallium/auxiliary/util/u_format.csv contains some weird swizzling, for example see this:

$ grep zyxw u_format.csv
PIPE_FORMAT_A8R8G8B8_UNORM, arith , 1, 1, un8 , un8 , un8 , un8 , zyxw, rgb
PIPE_FORMAT_A1R5G5B5_UNORM, arith , 1, 1, un5 , un5 , un5 , un1 , zyxw, rgb
PIPE_FORMAT_A4R4G4B4_UNORM, arith , 1, 1, un4 , un4 , un4 , un4 , zyxw, rgb
PIPE_FORMAT_A8B8G8R8_SNORM, arith , 1, 1, sn8 , sn8 , sn8 , sn8 , zyxw, rgb
PIPE_FORMAT_B8G8R8A8_SRGB , arith , 1, 1, u8 , u8 , u8 , u8 , zyxw, srgb

It's hard to believe that ARGB, ABGR, and BGRA have the same swizzling. Let's continue our journey:

$ grep A8R8G8B8 u_format.csv
PIPE_FORMAT_A8R8G8B8_UNORM, arith , 1, 1, un8 , un8 , un8 , un8 , zyxw, rgb
PIPE_FORMAT_A8R8G8B8_SRGB , arith , 1, 1, u8 , u8 , u8 , u8 , wxyz, srgb

Same formats, different swizzling? Also:

$ grep B8G8R8A8 u_format.csv
PIPE_FORMAT_B8G8R8A8_UNORM, arith , 1, 1, un8 , un8 , un8 , un8 , yzwx, rgb
PIPE_FORMAT_B8G8R8A8_SRGB , arith , 1, 1, u8 , u8 , u8 , u8 , zyxw, srgb

Same formats, different swizzling? I don't really get it. And there's much more cases like these. Could someone tell me what the intended order of channels should be? (or possibly propose a fix) The meaning of the whole table is self-contradictory and it's definitely the source of some r300g bugs.

Marek, Yes, that seems like a defect. The format swizzle field tells us how to swizzle the incoming pixel so that its components are ordered in some predefined order. For RGB and SRGB colorspaces the order is R, G, B and A. For depth-stencil, ie. ZS color space the order is Z and then S. I will have a look at this. Thanks.
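
The swizzle rule described in the reply can be sketched as follows. This is an illustration under one stated assumption: letter i of the swizzle names the stored channel (x, y, z or w, in storage order) that supplies output component i of R, G, B, A. Whether storage order follows the format name left to right is exactly what the thread is trying to settle, so the sketch takes storage order as given. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Map a swizzle letter to a stored-channel index, or -1 if invalid. */
static int swizzle_index(char c)
{
   switch (c) {
   case 'x': return 0;
   case 'y': return 1;
   case 'z': return 2;
   case 'w': return 3;
   default:  return -1;
   }
}

/* Reorder stored channels into R, G, B, A order.
 * Assumption: swz[i] selects the stored channel for output component i.
 * Returns 0 on success, -1 if the swizzle contains an unknown letter. */
static int apply_swizzle(const char *swz, const uint8_t stored[4],
                         uint8_t rgba[4])
{
   int i;

   assert(swz != NULL && stored != NULL && rgba != NULL);
   for (i = 0; i < 4; i++) {
      int src = swizzle_index(swz[i]);
      if (src < 0)
         return -1;
      rgba[i] = stored[src];
   }
   return 0;
}
```

Under this assumption, channels stored as B, G, R, A need "zyxw" and channels stored as A, R, G, B need "yzwx", so two formats with different storage orders cannot both be correct with the same swizzle.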
Re: [Mesa3d-dev] Vega State-Tracker not compiling
STEVE555 wrote on 2009-12-20 18:02: Hi everyone, I've noticed in the last few days of compiling Mesa from git that I get an error when I include Vega as one of the state-trackers in my configure options. It keeps coming up with this error:

Fixed in git.
[Mesa3d-dev] [PATCH] Fix u_pack_color.h rgb pack/unpack functions
Guys, Does the attached patch make sense to you? I replaced the incomplete switch-cases with calls to u_format_access functions that are complete but are going to be a bit more expensive to call. Since they are used not very often in mesa state tracker, I thought it's a good compromise. Thanks.

diff --git a/src/gallium/auxiliary/util/u_pack_color.h b/src/gallium/auxiliary/util/u_pack_color.h
index a2e0f26..6080f1a 100644
--- a/src/gallium/auxiliary/util/u_pack_color.h
+++ b/src/gallium/auxiliary/util/u_pack_color.h
@@ -37,6 +37,7 @@
 
 #include "pipe/p_compiler.h"
 #include "pipe/p_format.h"
+#include "util/u_format.h"
 #include "util/u_math.h"
 
 
@@ -55,85 +56,14 @@ static INLINE void
 util_pack_color_ub(ubyte r, ubyte g, ubyte b, ubyte a,
                    enum pipe_format format, union util_color *uc)
 {
-   switch (format) {
-   case PIPE_FORMAT_R8G8B8A8_UNORM:
-      {
-         uc->ui = (r << 24) | (g << 16) | (b << 8) | a;
-      }
-      return;
-   case PIPE_FORMAT_R8G8B8X8_UNORM:
-      {
-         uc->ui = (r << 24) | (g << 16) | (b << 8) | 0xff;
-      }
-      return;
-   case PIPE_FORMAT_A8R8G8B8_UNORM:
-      {
-         uc->ui = (a << 24) | (r << 16) | (g << 8) | b;
-      }
-      return;
-   case PIPE_FORMAT_X8R8G8B8_UNORM:
-      {
-         uc->ui = (0xff << 24) | (r << 16) | (g << 8) | b;
-      }
-      return;
-   case PIPE_FORMAT_B8G8R8A8_UNORM:
-      {
-         uc->ui = (b << 24) | (g << 16) | (r << 8) | a;
-      }
-      return;
-   case PIPE_FORMAT_B8G8R8X8_UNORM:
-      {
-         uc->ui = (b << 24) | (g << 16) | (r << 8) | 0xff;
-      }
-      return;
-   case PIPE_FORMAT_R5G6B5_UNORM:
-      {
-         uc->us = ((r & 0xf8) << 8) | ((g & 0xfc) << 3) | (b >> 3);
-      }
-      return;
-   case PIPE_FORMAT_A1R5G5B5_UNORM:
-      {
-         uc->us = ((a & 0x80) << 8) | ((r & 0xf8) << 7) | ((g & 0xf8) << 2) | (b >> 3);
-      }
-      return;
-   case PIPE_FORMAT_A4R4G4B4_UNORM:
-      {
-         uc->us = ((a & 0xf0) << 8) | ((r & 0xf0) << 4) | ((g & 0xf0) << 0) | (b >> 4);
-      }
-      return;
-   case PIPE_FORMAT_A8_UNORM:
-      {
-         uc->ub = a;
-      }
-      return;
-   case PIPE_FORMAT_L8_UNORM:
-   case PIPE_FORMAT_I8_UNORM:
-      {
-         uc->ub = a;
-      }
-      return;
-   case PIPE_FORMAT_R32G32B32A32_FLOAT:
-      {
-         uc->f[0] = (float)r / 255.0f;
-         uc->f[1] = (float)g / 255.0f;
-         uc->f[2] = (float)b / 255.0f;
-         uc->f[3] = (float)a / 255.0f;
-      }
-      return;
-   case PIPE_FORMAT_R32G32B32_FLOAT:
-      {
-         uc->f[0] = (float)r / 255.0f;
-         uc->f[1] = (float)g / 255.0f;
-         uc->f[2] = (float)b / 255.0f;
-      }
-      return;
+   ubyte src[4];
 
-   /* XXX lots more cases to add */
-   default:
-      uc->ui = 0; /* keep compiler happy */
-      debug_print_format("gallium: unhandled format in util_pack_color_ub()", format);
-      assert(0);
-   }
+   src[0] = r;
+   src[1] = g;
+   src[2] = b;
+   src[3] = a;
+
+   util_format_write_4ub(format, src, 0, uc, 0, 0, 0, 1, 1);
 }
 
 
@@ -144,150 +74,15 @@ static INLINE void
 util_unpack_color_ub(enum pipe_format format, union util_color *uc,
                      ubyte *r, ubyte *g, ubyte *b, ubyte *a)
 {
-   switch (format) {
-   case PIPE_FORMAT_R8G8B8A8_UNORM:
-      {
-         uint p = uc->ui;
-         *r = (ubyte) ((p >> 24) & 0xff);
-         *g = (ubyte) ((p >> 16) & 0xff);
-         *b = (ubyte) ((p >> 8) & 0xff);
-         *a = (ubyte) ((p >> 0) & 0xff);
-      }
-      return;
-   case PIPE_FORMAT_R8G8B8X8_UNORM:
-      {
-         uint p = uc->ui;
-         *r = (ubyte) ((p >> 24) & 0xff);
-         *g = (ubyte) ((p >> 16) & 0xff);
-         *b = (ubyte) ((p >> 8) & 0xff);
-         *a = (ubyte) 0xff;
-      }
-      return;
-   case PIPE_FORMAT_A8R8G8B8_UNORM:
-      {
-         uint p = uc->ui;
-         *r = (ubyte) ((p >> 16) & 0xff);
-         *g = (ubyte) ((p >> 8) & 0xff);
-         *b = (ubyte) ((p >> 0) & 0xff);
-         *a = (ubyte) ((p >> 24) & 0xff);
-      }
-      return;
-   case PIPE_FORMAT_X8R8G8B8_UNORM:
-      {
-         uint p = uc->ui;
-         *r = (ubyte) ((p >> 16) & 0xff);
-         *g = (ubyte) ((p >> 8) & 0xff);
-         *b = (ubyte) ((p >> 0) & 0xff);
-         *a = (ubyte) 0xff;
-      }
-      return;
-   case PIPE_FORMAT_B8G8R8A8_UNORM:
-      {
-         uint p = uc->ui;
-         *r = (ubyte) ((p >> 8) & 0xff);
-         *g = (ubyte) ((p >> 16) & 0xff);
-         *b = (ubyte) ((p >> 24) & 0xff);
-         *a = (ubyte) ((p >> 0) & 0xff);
-      }
-      return;
-   case PIPE_FORMAT_B8G8R8X8_UNORM:
-      {
-         uint p = uc->ui;
-         *r = (ubyte) ((p >> 8) & 0xff);
-         *g = (ubyte) ((p >> 16) & 0xff);
-         *b = (ubyte) ((p >> 24) & 0xff);
-         *a = (ubyte) 0xff;
-      }
-      return;
-   case PIPE_FORMAT_R5G6B5_UNORM:
-      {
-         ushort p = uc->us;
-         *r = (ubyte) (((p >> 8) & 0xf8) | ((p >> 13) & 0x7));
-         *g = (ubyte) (((p >> 3) & 0xfc) | ((p >> 9) & 0x3));
-         *b = (ubyte) (((p << 3) & 0xf8) | ((p >> 2) & 0x7));
-         *a = (ubyte) 0xff;
-      }
-      return;
-   case
Re: [Mesa3d-dev] [PATCH] Fix u_pack_color.h rgb pack/unpack functions
Roland Scheidegger wrote: On 15.12.2009 14:14, michal wrote:

Guys, Does the attached patch make sense to you? I replaced the incomplete switch-cases with calls to u_format_access functions that are complete but are going to be a bit more expensive to call. Since they are used not very often in mesa state tracker, I thought it's a good compromise.

They are not only used in state trackers, but drivers for instance as well. That said, it's probably not really a performance critical path. Though I'm not sure it makes sense to keep these functions even around if they'll just do a single function call. Also, I'm pretty sure your usage of the union isn't strict aliasing compliant (as far as I can tell you could just go back and remove that ugly union again), though it's probably one of the cases gcc won't complain (and hopefully won't miscompile).

I am casting to (void *) and then u_format casts it back to whatever it needs to. I think I am innocent.

Anyway, I will go after Keith's suggestion and fill in only the switch-default case. We can always nuke the special cases later when/if we realise the performance impact can be neglected. Thanks.
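
The compromise settled on here, keeping the hand-written fast paths and routing everything else through a generic packer in the switch default, might look roughly like the sketch below. The format enum, the descriptor table and pack_generic() are hypothetical stand-ins for the u_format machinery, and the bit positions are the ones used by the switch cases in the original patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical format list, much smaller than the real pipe_format. */
enum fmt { FMT_A8R8G8B8, FMT_R5G6B5, FMT_B8G8R8A8, FMT_COUNT };

/* Hypothetical descriptor: bit width and shift per channel, in
 * r, g, b, a order. A width of 0 means the channel is absent. */
struct fmt_desc {
   uint8_t bits[4];
   uint8_t shift[4];
};

static const struct fmt_desc fmt_table[FMT_COUNT] = {
   /* FMT_A8R8G8B8 */ { {8, 8, 8, 8}, {16, 8, 0, 24} },
   /* FMT_R5G6B5   */ { {5, 6, 5, 0}, {11, 5, 0, 0} },
   /* FMT_B8G8R8A8 */ { {8, 8, 8, 8}, {8, 16, 24, 0} }
};

/* Generic, table-driven packer: complete but slower than a fast path. */
static uint32_t pack_generic(enum fmt f, const uint8_t rgba[4])
{
   const struct fmt_desc *d;
   uint32_t v = 0;
   int i;

   assert((unsigned)f < FMT_COUNT);
   d = &fmt_table[f];
   for (i = 0; i < 4; i++) {
      if (d->bits[i])
         v |= (uint32_t)(rgba[i] >> (8 - d->bits[i])) << d->shift[i];
   }
   return v;
}

/* Fast path for a common format, generic fallback for the rest. */
static uint32_t pack_color_ub(uint8_t r, uint8_t g, uint8_t b, uint8_t a,
                              enum fmt f)
{
   switch (f) {
   case FMT_A8R8G8B8:
      return ((uint32_t)a << 24) | ((uint32_t)r << 16) |
             ((uint32_t)g << 8) | (uint32_t)b;
   default: {
      uint8_t src[4];
      src[0] = r;
      src[1] = g;
      src[2] = b;
      src[3] = a;
      return pack_generic(f, src);
   }
   }
}
```

The fast path and the generic path must agree for any format that has both, which is easy to check for the one special case kept here.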
[Mesa3d-dev] [PATCH] Add extra dimension info to TGSI declarations.
To fully support geometry shaders, we need some means to declare a two-dimensional register file. The following declaration

DCL IN[3][0]

would declare an input register with index 0 (first dimension) and size 3 (second dimension). Since the second dimension is a size, not an index (or, for that matter, an index range), a new token has been added that specifies the declared size of the register. Thanks.

diff --git a/src/gallium/include/pipe/p_shader_tokens.h b/src/gallium/include/pipe/p_shader_tokens.h
index 588ca5e..e5a723f 100644
--- a/src/gallium/include/pipe/p_shader_tokens.h
+++ b/src/gallium/include/pipe/p_shader_tokens.h
@@ -107,10 +107,11 @@ struct tgsi_declaration
    unsigned File        : 4;  /**< one of TGSI_FILE_x */
    unsigned UsageMask   : 4;  /**< bitmask of TGSI_WRITEMASK_x flags */
    unsigned Interpolate : 4;  /**< one of TGSI_INTERPOLATE_x */
+   unsigned Dimension   : 1;  /**< BOOL, any second dimension info? */
    unsigned Semantic    : 1;  /**< BOOL, any semantic info? */
    unsigned Centroid    : 1;  /**< centroid sampling? */
    unsigned Invariant   : 1;  /**< invariant optimization? */
-   unsigned Padding     : 5;
+   unsigned Padding     : 4;
 };
 
 struct tgsi_declaration_range
@@ -119,6 +120,12 @@ struct tgsi_declaration_range
    unsigned Last    : 16; /**< UINT */
 };
 
+struct tgsi_declaration_dimension
+{
+   unsigned Size2D  : 16; /**< Size of the second dimension */
+   unsigned Padding : 16;
+};
+
 #define TGSI_SEMANTIC_POSITION 0
 #define TGSI_SEMANTIC_COLOR    1
 #define TGSI_SEMANTIC_BCOLOR   2 /**< back-face color */
Re: [Mesa3d-dev] [PATCH] Add extra dimension info to TGSI declarations.
Zack Rusin wrote: On Monday 14 December 2009 09:29:03 Keith Whitwell wrote: On Mon, 2009-12-14 at 06:23 -0800, michal wrote:

To fully support geometry shaders, we need some means to declare a two-dimensional register file. The following declaration

DCL IN[3][0]

would declare an input register with index 0 (first dimension) and size 3 (second dimension). Since the second dimension is a size, not an index (or, for that matter, an index range), a new token has been added that specifies the declared size of the register.

Is this a good representation? What would happen if there was:

DCL IN[4][0]
DCL IN[3][1]

Presumably the 3 is always going to be 3, and it's a property of the geometry shader - I think Zack has a patch which adds something like:

PROP GS_VERTICES_IN 3

Then couldn't we just have the equivalent of:

DCL IN[][0]
DCL IN[][1]

with the size of the first dimension specified by the property?

Yea, that's what I thought the dimensional arrays should look like for GS in TGSI (they already do in GLSL and HLSL).

Actually, GS_VERTICES_IN could be derived from GS_INPUT_PRIM property. GL_ARB_geometry_shader4 has this mapping:

                              Value of built-in
Input primitive type          gl_VerticesIn
-----------------------       -----------------
POINTS                        1
LINES                         2
LINES_ADJACENCY_ARB           4
TRIANGLES                     3
TRIANGLES_ADJACENCY_ARB       6

But that also defeats the purpose of this patch -- INPUT registers would have implied two-dimensionality when declared inside GS.

Are there going to be cases where this doesn't work? I don't think so.

Also if we decide to go with DCL IN[x][1] notation then it probably should be DCL IN[a..b][1] because otherwise it just looks weird that one component declares a range while the other the index.
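
Deriving the vertex count from the input primitive, as suggested above, amounts to a small lookup. The sketch below encodes the GL_ARB_geometry_shader4 table quoted in the message. The enum and function names are hypothetical and are not TGSI or gallium identifiers.

```c
#include <assert.h>

/* Hypothetical stand-ins for the geometry shader input primitive types. */
enum gs_input_prim {
   GS_PRIM_POINTS,
   GS_PRIM_LINES,
   GS_PRIM_LINES_ADJACENCY,
   GS_PRIM_TRIANGLES,
   GS_PRIM_TRIANGLES_ADJACENCY
};

/* Number of input vertices per primitive (gl_VerticesIn), following
 * the table quoted from GL_ARB_geometry_shader4. */
static unsigned gs_vertices_in(enum gs_input_prim prim)
{
   switch (prim) {
   case GS_PRIM_POINTS:              return 1;
   case GS_PRIM_LINES:               return 2;
   case GS_PRIM_LINES_ADJACENCY:     return 4;
   case GS_PRIM_TRIANGLES:           return 3;
   case GS_PRIM_TRIANGLES_ADJACENCY: return 6;
   default:
      assert(0);   /* unknown primitive type */
      return 0;    /* safe default for non-debug builds */
   }
}
```

With such a mapping the size of the first dimension never needs to be spelled out in each DCL, which is the point made in the reply.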
Re: [Mesa3d-dev] gallium: add blitter
Keith Whitwell wrote: On Sun, 2009-12-13 at 15:27 -0800, Marek Olšák wrote:

+static INLINE
+void util_blitter_save_fragment_sampler_states(
+                  struct blitter_context *blitter,
+                  int num_sampler_states,
+                  void **sampler_states)
+{
+   assert(num_textures <= 32);
+
+   blitter->saved_num_sampler_states = num_sampler_states;
+   memcpy(blitter->saved_sampler_states, sampler_states,
+          num_sampler_states * sizeof(void *));
+}
+

Have you tried compiling with debug enabled? The assert above fails to compile. Also, can you use Elements() or similar instead of the hard-coded 32?

Maybe we can figure out how to go back to having asserts keep exposing their contents to the compiler even on non-debug builds. This used to work without problem on linux and helped a lot to avoid these type of problems.

Precisely. Recently I've been thinking about mapping assert() to __assume() for non-debug builds on windows and MSVC.

http://msdn.microsoft.com/en-us/library/1b3fsfxw%28VS.80%29.aspx
Re: [Mesa3d-dev] gallium: add blitter
José Fonseca wrote: On Mon, 2009-12-14 at 08:22 -0800, Keith Whitwell wrote: On Mon, 2009-12-14 at 08:19 -0800, Keith Whitwell wrote: On Mon, 2009-12-14 at 08:04 -0800, José Fonseca wrote: On Mon, 2009-12-14 at 05:39 -0800, Keith Whitwell wrote: On Sun, 2009-12-13 at 15:27 -0800, Marek Olšák wrote:
    +static INLINE
    +void util_blitter_save_fragment_sampler_states(
    +                  struct blitter_context *blitter,
    +                  int num_sampler_states,
    +                  void **sampler_states)
    +{
    +   assert(num_textures <= 32);
    +
    +   blitter->saved_num_sampler_states = num_sampler_states;
    +   memcpy(blitter->saved_sampler_states, sampler_states,
    +          num_sampler_states * sizeof(void *));
    +}
    +
Have you tried compiling with debug enabled? The assert above fails to compile. Also, can you use Elements() or similar instead of the hard-coded 32? Maybe we can figure out how to go back to having asserts keep exposing their contents to the compiler even on non-debug builds. This used to work without problem on Linux and helped a lot to avoid these types of problems. I wouldn't say without a problem: defining assert(expr) as (void)0 instead of (void)(expr) on release builds yielded a non-negligible performance improvement. I don't recall the exact figure, but I believe it was around 3-5% for the driver I was benchmarking at the time. YMMV. Different drivers will give different results, but there's nothing platform specific about this. It's not hard to avoid executing code... For instance we could always have it translated to something like:
    if (0) { (void)(expr); } (void)(0)
Obviously I would have meant to say something cleaner like:
    do { if (0) { (void)(expr); } } while (0)
This only works if expr has no calls, or just inline calls. Using my earlier example, if very_expensive_check() is in another file then the compiler has to assume the function will have side effects, and the call can't be removed. I'm not sure the __assume keyword that Michal mentioned helps. It's more a hint to the compiler to help it optimize code around the assertion, but perhaps it helps with the warnings too. If I try to compile this:
    __assume(lalala);
I get:
    error C2065: 'lalala' : undeclared identifier
On the other hand, the compiler is going to be serious about the assumptions inside __assume(), and if they happen to be false, the application can behave unexpectedly. This is against the current Gallium paradigm, where we put assertions, but also do the same check in non-debug builds to early out from a function or provide default values (e.g. in switch-case statements).
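Editor's sketch of the do { if (0) ... } while (0) idiom quoted above, written as plain C. The macro names debug_build_assert and release_build_assert and the counting helper are hypothetical and are not taken from the Mesa tree; the sketch only shows that the expression stays visible to the compiler (so a misspelled identifier is still a compile error) while nothing is evaluated at run time. Whether a given compiler also drops the dead call from the generated code is not shown here.

```c
#include <assert.h>

/* Hypothetical names for illustration only. */

/* Debug flavour: the expression is evaluated and checked. */
#define debug_build_assert(expr) assert(expr)

/* Release flavour: the expression is still parsed and type-checked,
 * but it sits in a branch that is never taken, so it is never run. */
#define release_build_assert(expr) do { if (0) { (void)(expr); } } while (0)

static int side_effects = 0;

/* Stand-in for an expensive check; it counts how often it really runs. */
static int very_expensive_check(void)
{
   ++side_effects;
   return 1;
}

/* Uses the release flavour and reports how many times the check ran. */
static int run_release_style_check(void)
{
   release_build_assert(very_expensive_check());
   return side_effects;
}
```

Calling run_release_style_check() leaves the counter untouched, whereas replacing the macro with debug_build_assert would run the check once per call.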
Re: [Mesa3d-dev] gallium: add blitter
José Fonseca wrote: On Mon, 2009-12-14 at 08:58 -0800, michal wrote: José Fonseca wrote: On Mon, 2009-12-14 at 08:22 -0800, Keith Whitwell wrote: On Mon, 2009-12-14 at 08:19 -0800, Keith Whitwell wrote: On Mon, 2009-12-14 at 08:04 -0800, José Fonseca wrote: On Mon, 2009-12-14 at 05:39 -0800, Keith Whitwell wrote: On Sun, 2009-12-13 at 15:27 -0800, Marek Olšák wrote:
    +static INLINE
    +void util_blitter_save_fragment_sampler_states(
    +                  struct blitter_context *blitter,
    +                  int num_sampler_states,
    +                  void **sampler_states)
    +{
    +   assert(num_textures <= 32);
    +
    +   blitter->saved_num_sampler_states = num_sampler_states;
    +   memcpy(blitter->saved_sampler_states, sampler_states,
    +          num_sampler_states * sizeof(void *));
    +}
    +
Have you tried compiling with debug enabled? The assert above fails to compile. Also, can you use Elements() or similar instead of the hard-coded 32? Maybe we can figure out how to go back to having asserts keep exposing their contents to the compiler even on non-debug builds. This used to work without problem on Linux and helped a lot to avoid these types of problems. I wouldn't say without a problem: defining assert(expr) as (void)0 instead of (void)(expr) on release builds yielded a non-negligible performance improvement. I don't recall the exact figure, but I believe it was around 3-5% for the driver I was benchmarking at the time. YMMV. Different drivers will give different results, but there's nothing platform specific about this. It's not hard to avoid executing code... For instance we could always have it translated to something like:
    if (0) { (void)(expr); } (void)(0)
Obviously I would have meant to say something cleaner like:
    do { if (0) { (void)(expr); } } while (0)
This only works if expr has no calls, or just inline calls. Using my earlier example, if very_expensive_check() is in another file then the compiler has to assume the function will have side effects, and the call can't be removed. I'm not sure the __assume keyword that Michal mentioned helps. It's more a hint to the compiler to help it optimize code around the assertion, but perhaps it helps with the warnings too. If I try to compile this:
    __assume(lalala);
I get:
    error C2065: 'lalala' : undeclared identifier
On the other hand, the compiler is going to be serious about the assumptions inside __assume(), and if they happen to be false, the application can behave unexpectedly. This is against the current Gallium paradigm, where we put assertions, but also do the same check in non-debug builds to early out from a function or provide default values (e.g. in switch-case statements). Bummer... that's no good. On the third hand, we could transform the following idiom
    switch (foo) {
    case 1:
       bar = 22;
       break;
    default:
       assert(0);
       bar = 11; /* Safe value. */
    }
to use some flavour of assert() that doesn't get substituted with __assume() on non-debug builds. Something like weak_assert() or warning(). Then assert() could be used in places where there is no backup plan and the app is going to crash anyway. Or... do the opposite and introduce strong_assert() that translates to __assume() and leave assert() as it is now.
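Editor's sketch of the weak_assert()/strong_assert() split proposed above. Both macros are hypothetical: the thread only names them, so the definitions here are assumptions. In particular, strong_assert is mapped to the standard assert() as a portable stand-in; the proposal was that a real version could become __assume() on MSVC release builds. The DEBUG symbol and the two helper functions are likewise illustrative.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical: for places with no fallback, where a failure is fatal anyway.
 * Portable stand-in; the thread suggests __assume() for MSVC release builds. */
#define strong_assert(expr) assert(expr)

/* Hypothetical: for places that also carry a run-time fallback.  It only
 * warns in debug builds and never turns into a compiler assumption. */
#ifdef DEBUG
#define weak_assert(expr) \
   do { if (!(expr)) fprintf(stderr, "weak_assert failed: %s\n", #expr); } while (0)
#else
#define weak_assert(expr) do { if (0) { (void)(expr); } } while (0)
#endif

/* The switch idiom from the thread: the default branch both flags the
 * unexpected value and still provides a safe result. */
static int pick_bar(int foo)
{
   int bar;

   switch (foo) {
   case 1:
      bar = 22;
      break;
   default:
      weak_assert(0);
      bar = 11; /* Safe value. */
      break;
   }
   return bar;
}

/* No backup plan here: a NULL pointer would crash on the next line anyway. */
static int first_element(const int *values)
{
   strong_assert(values != NULL);
   return values[0];
}
```

With this split, an unexpected foo still yields the safe value 11 in every build, while first_element keeps the hard check.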
Re: [Mesa3d-dev] [PATCH] Add extra dimension info to TGSI declarations.
Keith Whitwell wrote: On Mon, 2009-12-14 at 06:51 -0800, michal wrote: Zack Rusin wrote: On Monday 14 December 2009 09:29:03 Keith Whitwell wrote: On Mon, 2009-12-14 at 06:23 -0800, michal wrote: To fully support geometry shaders, we need some means to declare a two-dimensional register file. The following declaration
    DCL IN[3][0]
would declare an input register with index 0 (first dimension) and size 3 (second dimension). Since the second dimension is a size, not an index (or, for that matter, an index range), a new token has been added that specifies the declared size of the register. Is this a good representation? What would happen if there was:
    DCL IN[4][0]
    DCL IN[3][1]
Presumably the 3 is always going to be 3, and it's a property of the geometry shader - I think Zack has a patch which adds something like:
    PROP GS_VERTICES_IN 3
Then couldn't we just have the equivalent of:
    DCL IN[][0]
    DCL IN[][1]
with the size of the first dimension specified by the property? Yea, that's what I thought the dimensional arrays should look like for GS in TGSI (they already do in GLSL and HLSL). Actually, GS_VERTICES_IN could be derived from GS_INPUT_PRIM property. GL_ARB_geometry_shader4 has this mapping:
    Input primitive type       Value of built-in gl_VerticesIn
    POINTS                     1
    LINES                      2
    LINES_ADJACENCY_ARB        4
    TRIANGLES                  3
    TRIANGLES_ADJACENCY_ARB    6
But that also defeats the purpose of this patch -- INPUT registers would have implied two-dimensionality when declared inside GS. We have agreed on that, it's true... So is this patch necessary? Is it sufficient to simply make the statements that: a) Geometry shader INPUTs are always two dimensional b) The first dimension is determined by the input primitive type? Yes, thanks.
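Editor's sketch of statement (b) above: the size of the first INPUT dimension follows from the input primitive type, using the GL_ARB_geometry_shader4 mapping quoted in the thread. The enum and function names are hypothetical and are not the Gallium definitions; only the numeric mapping comes from the message.

```c
#include <assert.h>

/* Hypothetical enum for illustration; not the Gallium primitive defines. */
enum gs_input_prim {
   GS_PRIM_POINTS,
   GS_PRIM_LINES,
   GS_PRIM_LINES_ADJACENCY,
   GS_PRIM_TRIANGLES,
   GS_PRIM_TRIANGLES_ADJACENCY
};

/* Returns the number of vertices per input primitive, i.e. the implied
 * size of the first dimension of a geometry shader's IN[][] file. */
static unsigned gs_vertices_in(enum gs_input_prim prim)
{
   switch (prim) {
   case GS_PRIM_POINTS:              return 1;
   case GS_PRIM_LINES:               return 2;
   case GS_PRIM_LINES_ADJACENCY:     return 4;
   case GS_PRIM_TRIANGLES:           return 3;
   case GS_PRIM_TRIANGLES_ADJACENCY: return 6;
   }
   assert(0 && "unknown input primitive");
   return 0;
}
```

Under this scheme a shader declaring TRIANGLES input would have valid first-dimension indices 0 through 2 without any explicit size in the declaration.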
Re: [Mesa3d-dev] [PATCH] Add extra dimension info to TGSI declarations.
Zack Rusin wrote: On Monday 14 December 2009 12:49:53 michal wrote: Keith Whitwell wrote: On Mon, 2009-12-14 at 06:51 -0800, michal wrote: Zack Rusin wrote: On Monday 14 December 2009 09:29:03 Keith Whitwell wrote: On Mon, 2009-12-14 at 06:23 -0800, michal wrote: To fully support geometry shaders, we need some means to declare a two-dimensional register file. The following declaration
    DCL IN[3][0]
would declare an input register with index 0 (first dimension) and size 3 (second dimension). Since the second dimension is a size, not an index (or, for that matter, an index range), a new token has been added that specifies the declared size of the register. Is this a good representation? What would happen if there was:
    DCL IN[4][0]
    DCL IN[3][1]
Presumably the 3 is always going to be 3, and it's a property of the geometry shader - I think Zack has a patch which adds something like:
    PROP GS_VERTICES_IN 3
Then couldn't we just have the equivalent of:
    DCL IN[][0]
    DCL IN[][1]
with the size of the first dimension specified by the property? Yea, that's what I thought the dimensional arrays should look like for GS in TGSI (they already do in GLSL and HLSL). Actually, GS_VERTICES_IN could be derived from GS_INPUT_PRIM property. GL_ARB_geometry_shader4 has this mapping:
    Input primitive type       Value of built-in gl_VerticesIn
    POINTS                     1
    LINES                      2
    LINES_ADJACENCY_ARB        4
    TRIANGLES                  3
    TRIANGLES_ADJACENCY_ARB    6
But that also defeats the purpose of this patch -- INPUT registers would have implied two-dimensionality when declared inside GS. We have agreed on that, it's true... So is this patch necessary? Is it sufficient to simply make the statements that: a) Geometry shader INPUTs are always two dimensional b) The first dimension is determined by the input primitive type? Yes, thanks. k, i'm a bit confused. i can't say it's very pretty but it works so i'm cool with any form of declarations but where does that leave the problem of actually accessing those inputs? i mean how will we access the color of the second vertex if multidimensional arrays don't exist. will it be
    GEOM
    PROPERTY GS_INPUT_PRIMITIVE TRIANGLES
This basically says: The first dimension of IN is 3.
    DCL IN[0], POSITION
This should read DCL IN[][0], POSITION
    DCL OUT[0], POSITION
    MOV OUT[0], IN[0][0]
    EMIT_VERTEX
    MOV OUT[0], IN[1][0]
    EMIT_VERTEX
    MOV OUT[0], IN[2][0]
    EMIT_VERTEX
    END_PRIMITIVE
    END
And the above can already be expressed with what's inside p_shader_token.h, including 2D INPUT access.
Re: [Mesa3d-dev] Gallium3d shader declarations
Zack Rusin wrote: On Wednesday 09 December 2009 15:07:45 michal wrote: Keith Whitwell wrote: On Wed, 2009-12-09 at 10:19 -0800, Keith Whitwell wrote: On Wed, 2009-12-09 at 07:18 -0800, Zack Rusin wrote: On Wednesday 09 December 2009 10:05:13 michal wrote: Zack Rusin wrote: On Wednesday 09 December 2009 08:55:09 michal wrote: Zack Rusin wrote: On Wednesday 09 December 2009 08:44:20 Keith Whitwell wrote: On Wed, 2009-12-09 at 04:41 -0800, michal wrote: Zack Rusin wrote: Hi, currently Gallium3d shaders predefine all their inputs/outputs. We've handled all inputs/outputs the same way. e.g.
    VERT
    DCL IN[0]
    DCL OUT[0], POSITION
    DCL OUT[1], COLOR
    DCL CONST[0..9]
    DCL TEMP[0..3]
or
    FRAG
    DCL IN[0], COLOR, LINEAR
    DCL OUT[0], COLOR
There are certain inputs/outputs which don't really follow the typical rules for inputs/outputs though and we've been imitating those with extra normal semantics (e.g. front face). It all falls apart a bit on anything with shader model 4.x and up. That's because in there we've got what Microsoft calls system-value semantics ( http://msdn.microsoft.com/en-us/library/ee418355(VS.85).aspx#System_Value ). They all represent system-generated inputs/outputs for shaders. And while so far we only really had to handle front-face, since shader model 4.x we have to deal with lots of them (geometry shaders, domain shaders, compute shaders... they all have system generated inputs/outputs). I'm thinking of adding something similar to what D3D does to Gallium3d. So just adding a new DCL type, e.g. DCL_SV which takes the vector name and the system-value semantic as its inputs, so
    FRAG
    DCL IN[0], COLOR, LINEAR
    DCL IN[1], COLOR[1], LINEAR
    DCL IN[2], FACE, CONSTANT
would become
    FRAG
    DCL IN[0], COLOR, LINEAR
    DCL IN[1], COLOR[1], LINEAR
    DCL_SV IN[2], FACE
It likely could be done in a more generic fashion though. Opinions? Zack, What would be the difference between
    DCL IN[2], FACE, CONSTANT
and
    DCL_SV IN[2], FACE
then? Maybe the example is bad, but I don't see what DCL_SV would give us the existing DCL doesn't. I'd have proposed something slightly different where the SV values don't land in the INPUT file but some new register file. The reason is that when we start looking at geometry shaders, the INPUT register file becomes two-dimensional, but these SV values remain single-dimensional. That means that for current TGSI we'd have stuff like:
    DCL IN[0..3][0] POSITION
    DCL IN[0..3][1] COLOR
    DCL IN[2] SOME_SYSTEM_VALUE
Which is pretty nasty - half of the input file is one dimensional, half two-dimensional, and you need to look at the index of the first dimension to figure out whether the input reg is legal or not. So, I'm thinking some new register file to handle these system-generated values is one possibility, as in:
    DCL SV[0], FACE
or
    DCL SV[1], PRIMITIVE_ID
Thoughts? Yea, I like that. And then separate syntax to handle the properties or overloading DCL? i.e.
    DCL GS_INFO PRIM_IN TRIANGLES
vs
    PROPERTY GS_INFO PRIM_IN TRIANGLES
? I think a geometry shader should have its own GS_INFO token that would convey the information it needs, i.e. no overloading of the DCL token.
    GS_INFO PRIM_IN TRIANGLES
    GS_INFO PRIM_OUT TRIANGLE_STRIP
    GS_INFO MAX_VERTEX_COUNT 3 /* vertices_out for gl */
We'll be adding more of those then. Basically we'll need an extra token for every shader we have.
    COMPUTE_INFO WORK_GROUP_SIZE 4 4 4 /*x, y, z*/
    DS_INFO DOMAIN 3 /*domain shader*/
    HS_INFO MAXTESSFACTOR 3 /*hull shader*/
    FS_INFO EARLYDEPTSTENCIL 1
etc. To me it looks uglier than a special declaration token that could handle all of them. Can you propose a patch against p_shader_tokens.h that introduces a PROPERTY token? I could do that but only if we agree it's in the name of love. So is everyone ok with a new register SV for system generated values and new declaration token called PROPERTY for shader specific properties (btw, d3d calls those attributes, but since attributes already have a meaning in glsl I think it'd probably be wise to not try to redefine it). I'm OK with this general plan, though I'm not sure about these FS properties - early depth/stencil depends on more than just the shader as long as we continue to support legacy alphatest, for instance. This is probably something the driver has to figure out for itself based on the peculiarities of the hardware - some hardware may not even have such a concept. In terms of the SV register file, what do we do with the existing system values -- I'm guessing things like the FACE input semantic in fragment shaders is now a SV, right? Also, how
Re: [Mesa3d-dev] glsl-pp-rework-2 branch merge
Roland Scheidegger wrote: On 09.12.2009 18:58, michal wrote: Keith Whitwell wrote: On Wed, 2009-12-09 at 09:16 -0800, michal wrote: Hi all, I would like to merge this branch back to master this week. If anyone could test if the build works on his/her system, it would be nice. Thanks. Michal, Can you detail what testing you've done on this branch and which environments you have/haven't built on? Testing:
* Capture the output of the old syntax parser and compare with the output of the new parser. No regressions found. Use a set of over 400 shaders to perform the comparison.
* Run GLSL Parser Test to see if the new parser successfully integrates with the rest of Mesa. No regressions found.
So far I have been building that with scons on Windows. I am planning to fix the build with make and scons on Linux. Seems to compile just fine now with make. Too bad all the strict-aliasing violations are still there (in grammar.c), I'll give this a look (but don't wait for it for merging). Also, there seems to be some char/byte uncleanliness, I get a gazillion warnings like:
    shader/grammar/grammar.c: In function ‘get_spec’:
    shader/grammar/grammar.c:1978: warning: pointer targets in passing argument 1 of ‘strlen’ differ in signedness
    /usr/include/string.h:397: note: expected ‘const char *’ but argument is of type ‘byte *’
    shader/grammar/grammar.c:1978: warning: pointer targets in passing argument 1 of ‘__builtin_strcmp’ differ in signedness
    shader/grammar/grammar.c:1978: note: expected ‘const char *’ but argument is of type ‘byte *’
    shader/grammar/grammar.c:1978: warning: pointer targets in passing argument 1 of ‘strlen’ differ in signedness
Don't worry about it, Roland. This will go away once we merge (there's still grammar dependency in the branch, but not any more in master).
Re: [Mesa3d-dev] Mesa (glsl-pp-rework-2): scons: Get GLSL code building correctly when cross compiling.
José Fonseca wrote: On Thu, 2009-12-10 at 08:31 -0800, Jose Fonseca wrote:
    Module: Mesa
    Branch: glsl-pp-rework-2
    Commit: 491f384c3958067e6c4c994041f5d8d413b806bc
    URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=491f384c3958067e6c4c994041f5d8d413b806bc
    Author: José Fonseca jfons...@vmware.com
    Date: Thu Dec 10 16:29:04 2009 +
    scons: Get GLSL code building correctly when cross compiling.
    This is quite messy. GLSL code has to be built twice: one for the host OS, another for the target OS.
Michal, I managed to get linux-windows cross compilation working again, but it was *very* complicated, because src/glsl/{pp,cl} has to be built twice -- one for the host os, another for the target os --, therefore we must ensure the .o and .a files are stored in different places. It's really messy and ugly, and any Linux distribution which uses cross compilers as part of their build process will have a hard time packaging this. Is this absolutely necessary? No, it isn't. Another option is to write a python script that converts the .gc files into headers containing string literals and compile the library from that at runtime. I should be able to produce that script when I get some free time on my hands, and in the meantime use the existing solution. Thanks for helping me with that!
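Editor's sketch of the conversion step mentioned above, turning source text into a C string literal that could be placed in a generated header. The thread proposes a Python script; this sketch is written in C instead, purely to illustrate the escaping, and is not the script that was eventually written. The function names are hypothetical, and only quotes, backslashes and newlines are escaped, which is an assumption about what .gc files need.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Appends one character, always leaving room for the final NUL.
 * Returns 1 on success, 0 if the output buffer is too small. */
static int put_char(char *out, size_t out_size, size_t *n, char c)
{
   if (*n + 1 >= out_size)
      return 0;
   out[(*n)++] = c;
   return 1;
}

/* Hypothetical helper: writes `text` into `out` as a quoted C string
 * literal.  Returns the literal's length, or -1 if `out` is too small. */
static int text_to_c_literal(const char *text, char *out, size_t out_size)
{
   size_t n = 0;
   const char *p;
   int ok;

   assert(text != NULL && out != NULL && out_size > 0);

   ok = put_char(out, out_size, &n, '"');
   for (p = text; ok && *p != '\0'; ++p) {
      switch (*p) {
      case '"':
         ok = put_char(out, out_size, &n, '\\') && put_char(out, out_size, &n, '"');
         break;
      case '\\':
         ok = put_char(out, out_size, &n, '\\') && put_char(out, out_size, &n, '\\');
         break;
      case '\n':
         ok = put_char(out, out_size, &n, '\\') && put_char(out, out_size, &n, 'n');
         break;
      default:
         ok = put_char(out, out_size, &n, *p);
         break;
      }
   }
   ok = ok && put_char(out, out_size, &n, '"');
   if (!ok)
      return -1;
   out[n] = '\0';
   return (int)n;
}
```

A generator built on this would emit one such literal per input file, so the target build only compiles the header and no longer needs a host-side build of the GLSL compiler.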
Re: [Mesa3d-dev] Gallium3d shader declarations
Zack Rusin wrote: Hi, currently Gallium3d shaders predefine all their inputs/outputs. We've handled all inputs/outputs the same way. e.g.
    VERT
    DCL IN[0]
    DCL OUT[0], POSITION
    DCL OUT[1], COLOR
    DCL CONST[0..9]
    DCL TEMP[0..3]
or
    FRAG
    DCL IN[0], COLOR, LINEAR
    DCL OUT[0], COLOR
There are certain inputs/outputs which don't really follow the typical rules for inputs/outputs though and we've been imitating those with extra normal semantics (e.g. front face). It all falls apart a bit on anything with shader model 4.x and up. That's because in there we've got what Microsoft calls system-value semantics ( http://msdn.microsoft.com/en-us/library/ee418355(VS.85).aspx#System_Value ). They all represent system-generated inputs/outputs for shaders. And while so far we only really had to handle front-face, since shader model 4.x we have to deal with lots of them (geometry shaders, domain shaders, compute shaders... they all have system generated inputs/outputs). I'm thinking of adding something similar to what D3D does to Gallium3d. So just adding a new DCL type, e.g. DCL_SV which takes the vector name and the system-value semantic as its inputs, so
    FRAG
    DCL IN[0], COLOR, LINEAR
    DCL IN[1], COLOR[1], LINEAR
    DCL IN[2], FACE, CONSTANT
would become
    FRAG
    DCL IN[0], COLOR, LINEAR
    DCL IN[1], COLOR[1], LINEAR
    DCL_SV IN[2], FACE
It likely could be done in a more generic fashion though. Opinions? Zack, What would be the difference between
    DCL IN[2], FACE, CONSTANT
and
    DCL_SV IN[2], FACE
then? Maybe the example is bad, but I don't see what DCL_SV would give us the existing DCL doesn't. Thanks.
Re: [Mesa3d-dev] Mesa (pipe-format-simplify): Simplify the redundant meaning of format layout.
José Fonseca wrote: This is not true. UTIL_FORMAT_LAYOUT_* are needed for code generation in u_format_access.py and llvmpipe. It seems what you want here is an is-compressed-or-not flag. If so, add that flag to util_format_description, modify u_format_table.py to generate that flag, and leave UTIL_FORMAT_LAYOUT untouched. Agreed -- I had no idea you were going to use UTIL_FORMAT_LAYOUT information in llvmpipe without using the .csv file. I think I am going to provide a convenience function that returns the information I want. Thanks.
Re: [Mesa3d-dev] Gallium3d shader declarations
Zack Rusin wrote: On Wednesday 09 December 2009 08:44:20 Keith Whitwell wrote: On Wed, 2009-12-09 at 04:41 -0800, michal wrote: Zack Rusin wrote: Hi, currently Gallium3d shaders predefine all their inputs/outputs. We've handled all inputs/outputs the same way. e.g.
    VERT
    DCL IN[0]
    DCL OUT[0], POSITION
    DCL OUT[1], COLOR
    DCL CONST[0..9]
    DCL TEMP[0..3]
or
    FRAG
    DCL IN[0], COLOR, LINEAR
    DCL OUT[0], COLOR
There are certain inputs/outputs which don't really follow the typical rules for inputs/outputs though and we've been imitating those with extra normal semantics (e.g. front face). It all falls apart a bit on anything with shader model 4.x and up. That's because in there we've got what Microsoft calls system-value semantics ( http://msdn.microsoft.com/en-us/library/ee418355(VS.85).aspx#System_Value ). They all represent system-generated inputs/outputs for shaders. And while so far we only really had to handle front-face, since shader model 4.x we have to deal with lots of them (geometry shaders, domain shaders, compute shaders... they all have system generated inputs/outputs). I'm thinking of adding something similar to what D3D does to Gallium3d. So just adding a new DCL type, e.g. DCL_SV which takes the vector name and the system-value semantic as its inputs, so
    FRAG
    DCL IN[0], COLOR, LINEAR
    DCL IN[1], COLOR[1], LINEAR
    DCL IN[2], FACE, CONSTANT
would become
    FRAG
    DCL IN[0], COLOR, LINEAR
    DCL IN[1], COLOR[1], LINEAR
    DCL_SV IN[2], FACE
It likely could be done in a more generic fashion though. Opinions? Zack, What would be the difference between
    DCL IN[2], FACE, CONSTANT
and
    DCL_SV IN[2], FACE
then? Maybe the example is bad, but I don't see what DCL_SV would give us the existing DCL doesn't. I'd have proposed something slightly different where the SV values don't land in the INPUT file but some new register file. The reason is that when we start looking at geometry shaders, the INPUT register file becomes two-dimensional, but these SV values remain single-dimensional. That means that for current TGSI we'd have stuff like:
    DCL IN[0..3][0] POSITION
    DCL IN[0..3][1] COLOR
    DCL IN[2] SOME_SYSTEM_VALUE
Which is pretty nasty - half of the input file is one dimensional, half two-dimensional, and you need to look at the index of the first dimension to figure out whether the input reg is legal or not. So, I'm thinking some new register file to handle these system-generated values is one possibility, as in:
    DCL SV[0], FACE
or
    DCL SV[1], PRIMITIVE_ID
Thoughts? Yea, I like that. And then separate syntax to handle the properties or overloading DCL? i.e.
    DCL GS_INFO PRIM_IN TRIANGLES
vs
    PROPERTY GS_INFO PRIM_IN TRIANGLES
? I think a geometry shader should have its own GS_INFO token that would convey the information it needs, i.e. no overloading of the DCL token.
    GS_INFO PRIM_IN TRIANGLES
    GS_INFO PRIM_OUT TRIANGLE_STRIP
    GS_INFO MAX_VERTEX_COUNT 3 /* vertices_out for gl */
Re: [Mesa3d-dev] Gallium3d shader declarations
Zack Rusin wrote: On Wednesday 09 December 2009 08:55:09 michal wrote: Zack Rusin wrote: On Wednesday 09 December 2009 08:44:20 Keith Whitwell wrote: On Wed, 2009-12-09 at 04:41 -0800, michal wrote: Zack Rusin wrote: Hi, currently Gallium3d shaders predefine all their inputs/outputs. We've handled all inputs/outputs the same way. e.g.
    VERT
    DCL IN[0]
    DCL OUT[0], POSITION
    DCL OUT[1], COLOR
    DCL CONST[0..9]
    DCL TEMP[0..3]
or
    FRAG
    DCL IN[0], COLOR, LINEAR
    DCL OUT[0], COLOR
There are certain inputs/outputs which don't really follow the typical rules for inputs/outputs though and we've been imitating those with extra normal semantics (e.g. front face). It all falls apart a bit on anything with shader model 4.x and up. That's because in there we've got what Microsoft calls system-value semantics ( http://msdn.microsoft.com/en-us/library/ee418355(VS.85).aspx#System_Value ). They all represent system-generated inputs/outputs for shaders. And while so far we only really had to handle front-face, since shader model 4.x we have to deal with lots of them (geometry shaders, domain shaders, compute shaders... they all have system generated inputs/outputs). I'm thinking of adding something similar to what D3D does to Gallium3d. So just adding a new DCL type, e.g. DCL_SV which takes the vector name and the system-value semantic as its inputs, so
    FRAG
    DCL IN[0], COLOR, LINEAR
    DCL IN[1], COLOR[1], LINEAR
    DCL IN[2], FACE, CONSTANT
would become
    FRAG
    DCL IN[0], COLOR, LINEAR
    DCL IN[1], COLOR[1], LINEAR
    DCL_SV IN[2], FACE
It likely could be done in a more generic fashion though. Opinions? Zack, What would be the difference between
    DCL IN[2], FACE, CONSTANT
and
    DCL_SV IN[2], FACE
then? Maybe the example is bad, but I don't see what DCL_SV would give us the existing DCL doesn't. I'd have proposed something slightly different where the SV values don't land in the INPUT file but some new register file. The reason is that when we start looking at geometry shaders, the INPUT register file becomes two-dimensional, but these SV values remain single-dimensional. That means that for current TGSI we'd have stuff like:
    DCL IN[0..3][0] POSITION
    DCL IN[0..3][1] COLOR
    DCL IN[2] SOME_SYSTEM_VALUE
Which is pretty nasty - half of the input file is one dimensional, half two-dimensional, and you need to look at the index of the first dimension to figure out whether the input reg is legal or not. So, I'm thinking some new register file to handle these system-generated values is one possibility, as in:
    DCL SV[0], FACE
or
    DCL SV[1], PRIMITIVE_ID
Thoughts? Yea, I like that. And then separate syntax to handle the properties or overloading DCL? i.e.
    DCL GS_INFO PRIM_IN TRIANGLES
vs
    PROPERTY GS_INFO PRIM_IN TRIANGLES
? I think a geometry shader should have its own GS_INFO token that would convey the information it needs, i.e. no overloading of the DCL token.
    GS_INFO PRIM_IN TRIANGLES
    GS_INFO PRIM_OUT TRIANGLE_STRIP
    GS_INFO MAX_VERTEX_COUNT 3 /* vertices_out for gl */
We'll be adding more of those then. Basically we'll need an extra token for every shader we have.
    COMPUTE_INFO WORK_GROUP_SIZE 4 4 4 /*x, y, z*/
    DS_INFO DOMAIN 3 /*domain shader*/
    HS_INFO MAXTESSFACTOR 3 /*hull shader*/
    FS_INFO EARLYDEPTSTENCIL 1
etc. To me it looks uglier than a special declaration token that could handle all of them. Can you propose a patch against p_shader_tokens.h that introduces a PROPERTY token?
[Mesa3d-dev] glsl-pp-rework-2 branch merge
Hi all, I would like to merge this branch back to master this week. If anyone could test if the build works on his/her system, it would be nice. Thanks.
Re: [Mesa3d-dev] glsl-pp-rework-2 branch merge
Keith Whitwell wrote: On Wed, 2009-12-09 at 09:16 -0800, michal wrote: Hi all, I would like to merge this branch back to master this week. If anyone could test if the build works on his/her system, it would be nice. Thanks. Michal, Can you detail what testing you've done on this branch and which environments you have/haven't built on? Testing:
* Capture the output of the old syntax parser and compare with the output of the new parser. No regressions found. Use a set of over 400 shaders to perform the comparison.
* Run GLSL Parser Test to see if the new parser successfully integrates with the rest of Mesa. No regressions found.
So far I have been building that with scons on Windows. I am planning to fix the build with make and scons on Linux.
Re: [Mesa3d-dev] glsl-pp-rework-2 branch merge
Kenneth Graunke wrote: On Wednesday 09 December 2009 09:16:57 michal wrote: Hi all, I would like to merge this branch back to master this week. If anyone could test if the build works on his/her system, it would be nice. Thanks. Hi Michal, I don't see any code in the new branch that handles the #extension directive. In particular, the old branch had code to support ARB_draw_buffers and ARB_texture_rectangle. What's the status of these in glsl-pp-rework-2? This is handled in sl_pp_extension.c. Also, according to the spec for those extensions, they're supposed to #define their name (i.e. #define GL_ARB_draw_buffers 1)... but I don't see code in either branch to do this. Thanks, this bit is missing. I will fix that. --Kenneth
[Mesa3d-dev] Branch pipe-format-simplify open for review
This branch simplifies pipe/p_format.h by making enum pipe_format what it should have been -- an enum. As a result there is no extra information encoded in it and one needs to use auxiliary/util/u_format.h to get that info instead. Linking to the auxiliary/util lib is necessary. Please review, and if you can test that it doesn't break your setup, I will appreciate it. I would like to hear from r300 and nouveau guys, as those drivers were using some internal macros and I wasn't 100% sure I got the conversion right. Thanks!
Re: [Mesa3d-dev] Branch pipe-format-simplify open for review
Christoph Bumiller wrote: michal wrote: This branch simplifies pipe/p_format.h by making enum pipe_format what it should have been -- an enum. ... I would like to hear from r300 and nouveau guys, as those drivers were using some internal macros and I wasn't 100% sure I got the conversion right. Hi! In nv50_vbo.c/nv50_vbo_type_to_hw you imply that UTIL_FORMAT_LAYOUT_ARITH means normalized (UNORM, SNORM) and LAYOUT_ARRAY means SCALED, which seems not to be the case. PIPE_FORMAT_R32G32B32A32_SNORM for instance also has layout ARRAY. I'm not sure what ARRAY/ARITH are supposed to mean ... Thanks, I will fix that. Anyway, you could probably base the check on channel[0].normalized, since the formats used for vertex elements are not mixed. I still don't see how to distinguish SCALED and INT though, which at some point will have to indicate integer attributes ... Aren't those the same? What's the distinction between SCALED and INT on NV hardware?
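As background to the SCALED question above, here is a minimal C sketch of how the same signed 16-bit vertex attribute is interpreted as normalized versus scaled, assuming the usual API-level meaning of those terms. It says nothing about NV hardware specifically, and the helper names are hypothetical rather than taken from Mesa. An INT attribute, by contrast, would reach the shader as an integer with no float conversion at all.

```c
#include <assert.h>
#include <stdint.h>

/* Signed normalized (SNORM): map the integer range onto [-1.0, 1.0]. */
static float example_snorm16_to_float(int16_t v)
{
    float f = (float)v / 32767.0f;

    /* -32768 would land slightly below -1.0, so clamp it. */
    return f < -1.0f ? -1.0f : f;
}

/* Signed scaled (SSCALED): plain integer-to-float conversion. */
static float example_sscaled16_to_float(int16_t v)
{
    return (float)v;
}
```

With these helpers the raw value 32767 becomes 1.0 when treated as SNORM and 32767.0 when treated as SSCALED, which is why a driver has to look at the normalized flag rather than at the layout.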
Re: [Mesa3d-dev] [PATCH] adds glsl-shader-loader which is a framework that loads glsl shaders from a .ini file. The files can include test requirements, uniforms to pass to the shaders and expected va
Ian Romanick wrote:

+   }
+}
+
+void
+testFile_parse()
+{
+   FILE* filePointer;
+   int i=0, secLength=0, fileLength=0, state=0, currentTest=0;
+   char c;
+   char word[32];
+   char *cP;
+
+   filePointer = fopen(filename, "rt");
+   if(!filePointer)
+      piglit_report_result(PIGLIT_FAILURE);
+
+
+   while(fgetc(filePointer)!=EOF)
+      fileLength++;
+
+   if(fileLength < 1)
+      piglit_report_result(PIGLIT_FAILURE);
+
+   fclose(filePointer);
+
+   filePointer = fopen(filename, "rt");
+   buffer = (char*) malloc(fileLength+1);
+
+   c = fgetc(filePointer);
+   while(c != EOF)
+   {
+      buffer[i] = c;
+      ++i;
+      c = fgetc(filePointer);
+   }
+
+   buffer[i] = '\0';
+   fclose(filePointer);

The code above made my eyes bleed.

   fp = fopen(filename, "r");
   if (fp == NULL)
      /* error */ ;
   fseek(fp, 0, SEEK_END);
   fileLength = ftell(fp);
   fseek(fp, 0, SEEK_SET);
   buffer = malloc(fileLength + 1);
   fread(buffer, fileLength, 1, fp);
   fclose(fp);
   buffer[fileLength] = '\0';

This won't always work on Windows due to newline conversion taking place on fread(), but not being taken into account when calculating file size using fseek()/ftell() pair.

Or just use piglit_load_text_file.

   buffer = piglit_load_text_file(filename, &fileLength);

Yes, use that one or open the file in binary mode.
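The binary-mode suggestion can be sketched as follows. This is an illustration only, not the piglit helper: the function name read_text_file is hypothetical, and it assumes a platform where fseek() to SEEK_END works on binary streams (true on POSIX and Windows, though the C standard does not strictly require it). Terminating the buffer at the number of bytes fread() actually returned avoids relying on the ftell() size.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Read a whole file into a NUL-terminated heap buffer.
 * Returns NULL on failure; the caller frees the result. */
static char *read_text_file(const char *filename, size_t *length_out)
{
    FILE *fp = fopen(filename, "rb");   /* binary: no newline translation */
    char *buffer;
    long size;
    size_t got;

    if (fp == NULL)
        return NULL;

    /* Determine the file size, bailing out if any step fails. */
    if (fseek(fp, 0, SEEK_END) != 0 || (size = ftell(fp)) < 0 ||
        fseek(fp, 0, SEEK_SET) != 0) {
        fclose(fp);
        return NULL;
    }

    buffer = malloc((size_t)size + 1);
    if (buffer == NULL) {
        fclose(fp);
        return NULL;
    }

    got = fread(buffer, 1, (size_t)size, fp);
    fclose(fp);

    buffer[got] = '\0';                 /* terminate at the bytes actually read */
    if (length_out != NULL)
        *length_out = got;
    return buffer;
}
```

Because the file is opened with "rb", a "\r\n" pair stays two bytes on every platform, so the size from ftell() and the count from fread() agree.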
Re: [Mesa3d-dev] TGSI simplification branch
Keith Whitwell wrote: On Thu, 2009-11-26 at 10:42 -0800, michal wrote: Keith Whitwell wrote: On Wed, 2009-11-25 at 08:51 -0800, michal wrote: michal wrote: Keith Whitwell wrote: On Wed, 2009-11-25 at 06:28 -0800, michal wrote: Keith Whitwell wrote: I've pushed a feature branch with some tgsi simplifications in it. With these I've removed the biggest remaining oddities of that language, and it's getting to a place where I'm starting to be happy with it as a foundation for future work. Most of the surprising stuff like multiple negate flags, etc, is gone now, and the core tokens are quite a bit easier to understand than in previous iterations. I've still got my eye on reducing the verbosity of the names in the tgsi_parse.h FullToken world, and promoting the tgsi_any_token union into p_shader_tokens.h. It would be good if people can review the interface changes and provide feedback, and also test out their drivers on this branch. I've done minimal softpipe testing so far but will do more over the next few days. All looks good to me, I'm happy somebody had the guts to cut off all the cruft in one shot. I see some compile errors in the Windows build -- I will fix those along with other minor bugs I have spotted. Now, looking at the interface, I'm thinking about removing some more tokens. 1) Remove tgsi_dimension and use tgsi_src_register directly with some well-defined constraints. 2) Do the same to tgsi_instruction_predicate. Really, it's just an optional src operand with some restrictions. Interesting. I'd be keen to see a patch. Attached. But the more I look at it the more lame it gets. Another option would be to define tgsi_any_register that would have File, Index, Indirect and Dimension fields. Then there would be more specialised tgsi_*_register tokens, that would be binary compatible with the first one. One could cast them using a union and avoid more mistakes at compile time. 
That way we don't have to put the constraints in comments, but be more strict and use the compiler to enforce them. I will follow up with a patch. Attached. This makes me wonder about a couple of other things, like whether 16 bits is sufficient for the index value. Probably it's fine, but it's not beyond belief to consider a constant buffer of 256k or larger. I'd consider dropping the generic_register struct and any idea of a union of these registers. I'm not really sure we want to encourage the idea of people casting between these registers -- for the most part they should be building these things with ureg-style functions rather than messing around with the tokens directly. If you can easily cast between registers, that defeats any static constraints you attempt to impose via the type system, and you may as well just use src_register for predicates and dimensions. An interpreter which might benefit from being able to share some code paths for the different registers doesn't need the union to be public. Basically, this looks like a good regularization/cleanup, but let's drop generic_register and not create any public union of these register structs. Attached an updated patch. One thing to note in general is that by removing the Extended flags and the fact that some of the tokens already use up all the available 32 bits, the only way to extend the language may be by incrementing the version number in shader's header. This can be a good or a bad thing, depending on the direction Gallium is heading, but with a bit of discipline that should be a good thing. Michal, What's the status of this change? Are you working on building up a full change based on this patch? I'd like to merge this branch sooner rather than later, so if you haven't got something that's pretty much ready to go, let's handle this change in a branch of its own. Nothing more has been done for it, so go ahead and merge. 
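To make the 16-bit index concern concrete, here is a small C sketch of a register token with a signed 16-bit Index field and a builder function that refuses out-of-range indices, in the spirit of constructing tokens through helper functions rather than by hand. The struct is hypothetical: the field names follow the thread, but the widths other than Index and the helper names are assumptions, not the actual p_shader_tokens.h layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical token layout: field names follow the thread,
 * widths other than Index are illustrative only. */
struct example_src_register {
    unsigned   File      : 4;
    unsigned   Indirect  : 1;
    unsigned   Dimension : 1;
    signed int Index     : 16;
    unsigned   Padding   : 10;
};

/* True if the register index fits the signed 16-bit Index field. */
static int example_index_fits(long index)
{
    return index >= INT16_MIN && index <= INT16_MAX;
}

/* Build a token, refusing values the bitfields cannot hold. */
static struct example_src_register
example_make_src(unsigned file, long index)
{
    struct example_src_register reg = { 0 };

    assert(file < 16);
    assert(example_index_fits(index));
    reg.File = file;
    reg.Index = (int)index;
    return reg;
}
```

Any register index above 32767 is rejected here, which is the practical limit a 16-bit signed field would impose on very large constant buffers.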
Re: [Mesa3d-dev] [PATCH] Add entrypoints for setting vertex texture state
Attached an updated patch.
* Increased PIPE_MAX_VERTEX_SAMPLERS to 16.
* Removed PIPE_CAP_MAX_VERTEX_TEXTURES since there's already an equivalent PIPE_CAP_MAX_VERTEX_TEXTURE_UNITS.
* Added PIPE_CAP_MAX_COMBINED_SAMPLERS to query maximum texture image units accessible from vertex and fragment shaders combined.
Michal Krol wrote: That means we need an additional cap bit to support GL_MAX_COMBINED_TEXTURE_IMAGE_UNITS because it's no longer a simple sum of max vertex and fragment samplers. For i965 max vertex/fragment/combined samplers would be then 16. -- Michal Krol From: Keith Whitwell Sent: 28 November 2009 00:40 To: Michal Krol; Roland Scheidegger Cc: mesa3d-dev Subject: RE: [Mesa3d-dev] [PATCH] Add entrypoints for setting vertex texture state The i965 can surely do 16, though maybe shared with the fragment shaders. Keith From: michal [mic...@vmware.com] Sent: Friday, November 27, 2009 2:20 PM To: Roland Scheidegger Cc: Keith Whitwell; mesa3d-dev Subject: Re: [Mesa3d-dev] [PATCH] Add entrypoints for setting vertex texture state Roland Scheidegger wrote: On 27.11.2009 19:32, michal wrote: Why is the MAX here smaller than for fragment samplers? Doesn't GL require them to be the same, because GL effectively binds the same set of sampler states in both cases? Can you take a closer look at what the GL state tracker would have to do to expose this functionality and make sure it's valid? It's all good. There is GL_MAX_VERTEX_TEXTURE_UNITS that tells how many samplers can be used in a vertex shader. Anything above that is used only with fragment shaders and ignored for vertex shaders. I fail to see though why the limit needs to be that low. All modern hardware nowadays can use the same number of texture samplers for both fragment and vertex shading (it's the same sampler hardware, after all). 
Some older hardware (typically non-unified, D3D9 shader model 3 compliant) though indeed only had limited support for this (like the GeForce 6/7 series) probably only supporting 4 (can't remember exactly), though other hardware never implemented it despite d3d9 sm3 requiring it (thanks to an API loophole). Wow, it looks like I need to upgrade my hardware. I thought 4 vertex texture units is generous. I have no problem with setting that limit to, say, 16.

diff --git a/src/gallium/include/pipe/p_context.h b/src/gallium/include/pipe/p_context.h
index 5569001..4456620 100644
--- a/src/gallium/include/pipe/p_context.h
+++ b/src/gallium/include/pipe/p_context.h
@@ -123,7 +123,12 @@ struct pipe_context {
    void * (*create_sampler_state)(struct pipe_context *,
                                   const struct pipe_sampler_state *);
-   void (*bind_sampler_states)(struct pipe_context *, unsigned num, void **);
+   void (*bind_fragment_sampler_states)(struct pipe_context *,
+                                        unsigned num_samplers,
+                                        void **samplers);
+   void (*bind_vertex_sampler_states)(struct pipe_context *,
+                                      unsigned num_samplers,
+                                      void **samplers);
    void (*delete_sampler_state)(struct pipe_context *, void *);
    void * (*create_rasterizer_state)(struct pipe_context *,
@@ -173,9 +178,13 @@ struct pipe_context {
    void (*set_viewport_state)( struct pipe_context *,
                                const struct pipe_viewport_state * );
-   void (*set_sampler_textures)( struct pipe_context *,
-                                 unsigned num_textures,
-                                 struct pipe_texture ** );
+   void (*set_fragment_sampler_textures)(struct pipe_context *,
+                                         unsigned num_textures,
+                                         struct pipe_texture ** );
+
+   void (*set_vertex_sampler_textures)(struct pipe_context *,
+                                       unsigned num_textures,
+                                       struct pipe_texture **);
    void (*set_vertex_buffers)( struct pipe_context *,
                                unsigned num_buffers,
diff --git a/src/gallium/include/pipe/p_defines.h b/src/gallium/include/pipe/p_defines.h
index fd14dc8..69a0970 100644
--- a/src/gallium/include/pipe/p_defines.h
+++ b/src/gallium/include/pipe/p_defines.h
@@ -390,6 +390,8 @@ enum
[Mesa3d-dev] RE: [PATCH] Add entrypoints for setting vertex texture state
That means we need an additional cap bit to support GL_MAX_COMBINED_TEXTURE_IMAGE_UNITS because it's no longer a simple sum of max vertex and fragment samplers. For i965 max vertex/fragment/combined samplers would be then 16. -- Michal Krol From: Keith Whitwell Sent: 28 November 2009 00:40 To: Michal Krol; Roland Scheidegger Cc: mesa3d-dev Subject: RE: [Mesa3d-dev] [PATCH] Add entrypoints for setting vertex texture state The i965 can surely do 16, though maybe shared with the fragment shaders. Keith From: michal [mic...@vmware.com] Sent: Friday, November 27, 2009 2:20 PM To: Roland Scheidegger Cc: Keith Whitwell; mesa3d-dev Subject: Re: [Mesa3d-dev] [PATCH] Add entrypoints for setting vertex texture state Roland Scheidegger wrote: On 27.11.2009 19:32, michal wrote: Why is the MAX here smaller than for fragment samplers? Doesn't GL require them to be the same, because GL effectively binds the same set of sampler states in both cases? Can you take a closer look at what the GL state tracker would have to do to expose this functionality and make sure it's valid? It's all good. There is GL_MAX_VERTEX_TEXTURE_UNITS that tells how many samplers can be used in a vertex shader. Anything above that is used only with fragment shaders and ignored for vertex shaders. I fail to see though why the limit needs to be that low. All modern hardware nowadays can use the same number of texture samplers for both fragment and vertex shading (it's the same sampler hardware, after all). Some older hardware (typically non-unified, D3D9 shader model 3 compliant) though indeed only had limited support for this (like the GeForce 6/7 series) probably only supporting 4 (can't remember exactly), though other hardware never implemented it despite d3d9 sm3 requiring it (thanks to an API loophole). Wow, it looks like I need to upgrade my hardware. I thought 4 vertex texture units is generous. I have no problem with setting that limit to, say, 16. 
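The point about the combined limit not being a simple sum can be illustrated with a short C sketch. Everything here is hypothetical: the struct and function names are invented for the example and are not the Gallium interface. It only shows why a separately reported combined limit matters when vertex and fragment stages share the same sampler units.

```c
#include <assert.h>

/* Hypothetical per-driver limits; not the real Gallium structures. */
struct example_limits {
    unsigned max_fragment_samplers;
    unsigned max_vertex_samplers;
    unsigned max_combined_samplers;   /* reported separately by the driver */
};

/* What a state tracker might report for
 * GL_MAX_COMBINED_TEXTURE_IMAGE_UNITS under these assumptions. */
static unsigned example_gl_max_combined(const struct example_limits *l)
{
    unsigned sum;

    assert(l);
    sum = l->max_fragment_samplers + l->max_vertex_samplers;

    /* Never report more than the per-stage limits could use together. */
    return l->max_combined_samplers < sum ? l->max_combined_samplers : sum;
}
```

For a driver whose 16 units are shared between stages, the limits would be 16/16/16 and the result is 16 rather than the naive sum of 32; for a driver with separate units, say 16 fragment and 4 vertex with a combined limit of 20, the result is 20.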
Re: [Mesa3d-dev] TGSI simplification branch
Keith Whitwell wrote: Michal, It's really not the job of the shader representation to do this type of versioning between halves of a driver. If we ever get to a point where we want to do versioning in gallium, we'll want the version control to cover the entire interface, not just the shaders. Given that, and given we won't want to have more than 1 version of TGSI active within a particular version of gallium, there's no purpose for separate versioning of the shader token stream -- it's just one aspect of the total interface. If we want to do such a sanity check, it should be done at context or screen creation - well before we ever get around to creating shaders. So I still don't see any point in a shader version token. The fact that TGSI has been through dramatic changes in its lifetime and still advertises itself as 1.1 illustrates this - it's redundant currently and I don't see any use for it in the future either... That's done. It didn't even hurt that much. Should I go ahead with the patch I sent you earlier?
[Mesa3d-dev] [PATCH] Add entrypoints for setting vertex texture state
Hello, Please review the patch below. It extends the gallium interface to allow setting vertex texture sampler states. This is an optional feature -- drivers not wishing to implement it return 0 for PIPE_CAP_MAX_VERTEX_TEXTURES capability query. Drivers may also choose to support it, but always fall back to software implementation from the draw auxiliary module.

diff --git a/src/gallium/include/pipe/p_context.h b/src/gallium/include/pipe/p_context.h
index 5569001..70f9c8b 100644
--- a/src/gallium/include/pipe/p_context.h
+++ b/src/gallium/include/pipe/p_context.h
@@ -124,6 +124,9 @@ struct pipe_context {
    void * (*create_sampler_state)(struct pipe_context *,
                                   const struct pipe_sampler_state *);
    void (*bind_sampler_states)(struct pipe_context *, unsigned num, void **);
+   void (*bind_vertex_sampler_states)(struct pipe_context *,
+                                      unsigned num_samplers,
+                                      void **samplers);
    void (*delete_sampler_state)(struct pipe_context *, void *);
    void * (*create_rasterizer_state)(struct pipe_context *,
@@ -184,6 +187,10 @@ struct pipe_context {
    void (*set_vertex_elements)( struct pipe_context *,
                                 unsigned num_elements,
                                 const struct pipe_vertex_element * );
+
+   void (*set_vertex_sampler_textures)(struct pipe_context *,
+                                       unsigned num_textures,
+                                       struct pipe_texture **);
    /*...@}*/
diff --git a/src/gallium/include/pipe/p_defines.h b/src/gallium/include/pipe/p_defines.h
index fd14dc8..eac6904 100644
--- a/src/gallium/include/pipe/p_defines.h
+++ b/src/gallium/include/pipe/p_defines.h
@@ -390,6 +390,7 @@ enum pipe_transfer_usage {
 #define PIPE_CAP_BLEND_EQUATION_SEPARATE 28
 #define PIPE_CAP_SM3 29 /* Shader Model 3 supported */
 #define PIPE_CAP_MAX_PREDICATE_REGISTERS 30
+#define PIPE_CAP_MAX_VERTEX_TEXTURES 31
 /**
diff --git a/src/gallium/include/pipe/p_state.h b/src/gallium/include/pipe/p_state.h
index 287b424..ce22f89 100644
--- a/src/gallium/include/pipe/p_state.h
+++ b/src/gallium/include/pipe/p_state.h
@@ -60,6 +60,7 @@ extern "C" {
 #define PIPE_MAX_COLOR_BUFS 8
 #define PIPE_MAX_CONSTANT 32
 #define PIPE_MAX_SAMPLERS 16
+#define PIPE_MAX_VERTEX_SAMPLERS 4
 #define PIPE_MAX_SHADER_INPUTS 16
 #define PIPE_MAX_SHADER_OUTPUTS 16
 #define PIPE_MAX_TEXTURE_LEVELS 16
Re: [Mesa3d-dev] [PATCH] Add entrypoints for setting vertex texture state
Keith Whitwell wrote: On Fri, 2009-11-27 at 10:10 -0800, michal wrote: Hello, Please review the patch below. It extends the gallium interface to allow setting vertex texture sampler states. This is an optional feature -- drivers not wishing to implement it return 0 for PIPE_CAP_MAX_VERTEX_TEXTURES capability query. Drivers may also choose to support it, but always fall back to software implementation from the draw auxiliary module. Michal, couple of comments inline.

diff --git a/src/gallium/include/pipe/p_context.h b/src/gallium/include/pipe/p_context.h
index 5569001..70f9c8b 100644
--- a/src/gallium/include/pipe/p_context.h
+++ b/src/gallium/include/pipe/p_context.h
@@ -124,6 +124,9 @@ struct pipe_context {
    void * (*create_sampler_state)(struct pipe_context *,
                                   const struct pipe_sampler_state *);
    void (*bind_sampler_states)(struct pipe_context *, unsigned num, void **);
+   void (*bind_vertex_sampler_states)(struct pipe_context *,
+                                      unsigned num_samplers,
+                                      void **samplers);
    void (*delete_sampler_state)(struct pipe_context *, void *);
    void * (*create_rasterizer_state)(struct pipe_context *,
@@ -184,6 +187,10 @@ struct pipe_context {
    void (*set_vertex_elements)( struct pipe_context *,
                                 unsigned num_elements,
                                 const struct pipe_vertex_element * );
+
+   void (*set_vertex_sampler_textures)(struct pipe_context *,
+                                       unsigned num_textures,
+                                       struct pipe_texture **);
    /*...@}*/

If we're adding these functions, can the old ones be renamed to fragment_sampler_states/textures for clarity? Right, forgot about this one. Attached an updated version.

diff --git a/src/gallium/include/pipe/p_defines.h b/src/gallium/include/pipe/p_defines.h
index fd14dc8..eac6904 100644
--- a/src/gallium/include/pipe/p_defines.h
+++ b/src/gallium/include/pipe/p_defines.h
@@ -390,6 +390,7 @@ enum pipe_transfer_usage {
 #define PIPE_CAP_BLEND_EQUATION_SEPARATE 28
 #define PIPE_CAP_SM3 29 /* Shader Model 3 supported */
 #define PIPE_CAP_MAX_PREDICATE_REGISTERS 30
+#define PIPE_CAP_MAX_VERTEX_TEXTURES 31
 /**
diff --git a/src/gallium/include/pipe/p_state.h b/src/gallium/include/pipe/p_state.h
index 287b424..ce22f89 100644
--- a/src/gallium/include/pipe/p_state.h
+++ b/src/gallium/include/pipe/p_state.h
@@ -60,6 +60,7 @@ extern "C" {
 #define PIPE_MAX_COLOR_BUFS 8
 #define PIPE_MAX_CONSTANT 32
 #define PIPE_MAX_SAMPLERS 16
+#define PIPE_MAX_VERTEX_SAMPLERS 4
 #define PIPE_MAX_SHADER_INPUTS 16
 #define PIPE_MAX_SHADER_OUTPUTS 16
 #define PIPE_MAX_TEXTURE_LEVELS 16

Why is the MAX here smaller than for fragment samplers? Doesn't GL require them to be the same, because GL effectively binds the same set of sampler states in both cases? Can you take a closer look at what the GL state tracker would have to do to expose this functionality and make sure it's valid? It's all good. There is GL_MAX_VERTEX_TEXTURE_UNITS that tells how many samplers can be used in a vertex shader. Anything above that is used only with fragment shaders and ignored for vertex shaders.

diff --git a/src/gallium/include/pipe/p_context.h b/src/gallium/include/pipe/p_context.h
index 5569001..b0f13e6 100644
--- a/src/gallium/include/pipe/p_context.h
+++ b/src/gallium/include/pipe/p_context.h
@@ -123,7 +123,12 @@ struct pipe_context {
    void * (*create_sampler_state)(struct pipe_context *,
                                   const struct pipe_sampler_state *);
-   void (*bind_sampler_states)(struct pipe_context *, unsigned num, void **);
+   void (*bind_fragment_sampler_states)(struct pipe_context *,
+                                        unsigned numsamplers,
+                                        void **samplers);
+   void (*bind_vertex_sampler_states)(struct pipe_context *,
+                                      unsigned num_samplers,
+                                      void **samplers);
    void (*delete_sampler_state)(struct pipe_context *, void *);
    void * (*create_rasterizer_state)(struct pipe_context *,
@@ -173,9 +178,9 @@ struct pipe_context {
    void (*set_viewport_state)( struct pipe_context *,
                                const struct pipe_viewport_state * );
-   void (*set_sampler_textures)( struct pipe_context *,
-                                 unsigned num_textures,
-                                 struct pipe_texture ** );
+   void (*set_fragment_sampler_textures)(struct pipe_context *,
+                                         unsigned num_textures,
+                                         struct pipe_texture ** );
    void (*set_vertex_buffers)( struct pipe_context *,
                                unsigned
Re: [Mesa3d-dev] [PATCH] Add entrypoints for setting vertex texture state
Roland Scheidegger wrote: On 27.11.2009 19:32, michal wrote: Why is the MAX here smaller than for fragment samplers? Doesn't GL require them to be the same, because GL effectively binds the same set of sampler states in both cases? Can you take a closer look at what the GL state tracker would have to do to expose this functionality and make sure it's valid? It's all good. There is GL_MAX_VERTEX_TEXTURE_UNITS that tells how many samplers can be used in a vertex shader. Anything above that is used only with fragment shaders and ignored for vertex shaders. I fail to see though why the limit needs to be that low. All modern hardware nowadays can use the same number of texture samplers for both fragment and vertex shading (it's the same sampler hardware, after all). Some older hardware (typically non-unified, D3D9 shader model 3 compliant) though indeed only had limited support for this (like the GeForce 6/7 series) probably only supporting 4 (can't remember exactly), though other hardware never implemented it despite d3d9 sm3 requiring it (thanks to an API loophole). Wow, it looks like I need to upgrade my hardware. I thought 4 vertex texture units is generous. I have no problem with setting that limit to, say, 16.
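The behaviour described above, where samplers beyond the vertex limit are bound only for fragment shading, could look roughly like this on the state tracker side. This is a sketch under stated assumptions: the context struct, the limit constant and the function names are hypothetical stand-ins that merely record counts, not the Gallium entrypoints from the patch.

```c
#include <assert.h>

#define EXAMPLE_MAX_SAMPLERS 16

/* Hypothetical stand-in for a driver context; records what was bound. */
struct example_context {
    unsigned max_vertex_samplers;   /* 0 means vertex textures unsupported */
    unsigned num_fragment_bound;
    unsigned num_vertex_bound;
};

static unsigned example_min(unsigned a, unsigned b)
{
    return a < b ? a : b;
}

/* Bind the same sampler set to both stages, clamping the vertex stage
 * to the limit the driver reported. */
static void example_bind_samplers(struct example_context *ctx,
                                  unsigned num_samplers)
{
    assert(num_samplers <= EXAMPLE_MAX_SAMPLERS);

    ctx->num_fragment_bound = num_samplers;
    ctx->num_vertex_bound = example_min(num_samplers,
                                        ctx->max_vertex_samplers);
}
```

A driver reporting a vertex limit of 0 ends up with nothing bound to the vertex stage, which matches the optional nature of the feature described in the original patch mail.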
Re: [Mesa3d-dev] TGSI simplification branch
Keith Whitwell wrote: On Wed, 2009-11-25 at 08:51 -0800, michal wrote: michal wrote: Keith Whitwell wrote: On Wed, 2009-11-25 at 06:28 -0800, michal wrote: Keith Whitwell wrote: I've pushed a feature branch with some tgsi simplifications in it. With these I've removed the biggest remaining oddities of that language, and it's getting to a place where I'm starting to be happy with it as a foundation for future work. Most of the surprising stuff like multiple negate flags, etc, is gone now, and the core tokens are quite a bit easier to understand than in previous iterations. I've still got my eye on reducing the verbosity of the names in the tgsi_parse.h FullToken world, and promoting the tgsi_any_token union into p_shader_tokens.h. It would be good if people can review the interface changes and provide feedback, and also test out their drivers on this branch. I've done minimal softpipe testing so far but will do more over the next few days. All looks good to me, I'm happy somebody had the guts to cut off all the cruft in one shot. I see some compile errors in the Windows build -- I will fix those along with other minor bugs I have spotted. Now, looking at the interface, I'm thinking about removing some more tokens. 1) Remove tgsi_dimension and use tgsi_src_register directly with some well-defined constraints. 2) Do the same to tgsi_instruction_predicate. Really, it's just an optional src operand with some restrictions. Interesting. I'd be keen to see a patch. Attached. But the more I look at it the more lame it gets. Another option would be to define tgsi_any_register that would have File, Index, Indirect and Dimension fields. Then there would be more specialised tgsi_*_register tokens, that would be binary compatible with the first one. One could cast them using a union and avoid more mistakes at compile time. That way we don't have to put the constraints in comments, but be more strict and use the compiler to enforce them. I will follow up with a patch. Attached. 
This makes me wonder about a couple of other things, like whether 16 bits is sufficient for the index value. Probably it's fine, but it's not beyond belief to consider a constant buffer of 256k or larger. I'd consider dropping the generic_register struct and any idea of a union of these registers. I'm not really sure we want to encourage the idea of people casting between these registers -- for the most part they should be building these things with ureg-style functions rather than messing around with the tokens directly. If you can easily cast between registers, that defeats any static constraints you attempt to impose via the type system, and you may as well just use src_register for predicates and dimensions. An interpreter which might benefit from being able to share some code paths for the different registers doesn't need the union to be public. Basically, this looks like a good regularization/cleanup, but let's drop generic_register and not create any public union of these register structs. Attached an updated patch. One thing to note in general is that by removing the Extended flags and the fact that some of the tokens already use up all the available 32 bits, the only way to extend the language may be by incrementing the version number in shader's header. This can be a good or a bad thing, depending on the direction Gallium is heading, but with a bit of discipline that should be a good thing.

diff --git a/src/gallium/include/pipe/p_shader_tokens.h b/src/gallium/include/pipe/p_shader_tokens.h
index 7d73d7d..18eed97 100644
--- a/src/gallium/include/pipe/p_shader_tokens.h
+++ b/src/gallium/include/pipe/p_shader_tokens.h
@@ -290,7 +290,7 @@ union tgsi_immediate_data
  * respectively. For a given operation code, those numbers are fixed and are
  * present here only for convenience.
  *
- * If Predicate is TRUE, tgsi_instruction_predicate token immediately follows.
+ * If Predicate is TRUE, tgsi_predicate_register token immediately follows.
  *
  * Saturate controls how are final results in destination registers modified.
  */
@@ -350,77 +350,88 @@ struct tgsi_instruction_texture
    unsigned Padding : 24;
 };
-/*
- * For SM3, the following constraint applies.
- * - Swizzle is either set to identity or replicate.
- */
-struct tgsi_instruction_predicate
-{
-   int Index: 16; /* SINT */
-   unsigned SwizzleX : 2; /* TGSI_SWIZZLE_x */
-   unsigned SwizzleY : 2; /* TGSI_SWIZZLE_x */
-   unsigned SwizzleZ : 2; /* TGSI_SWIZZLE_x */
-   unsigned SwizzleW : 2; /* TGSI_SWIZZLE_x */
-   unsigned Negate : 1; /* BOOL */
-   unsigned Padding : 7;
-};
-
 /**
  * File specifies the register array to access.
  *
- * Index specifies the element number of a register in the register file.
+ * Index specifies the register number in the specified
Re: [Mesa3d-dev] TGSI simplification branch
Keith Whitwell wrote: On Thu, 2009-11-26 at 10:42 -0800, michal wrote: Keith Whitwell wrote: On Wed, 2009-11-25 at 08:51 -0800, michal wrote: michal wrote: Keith Whitwell wrote: On Wed, 2009-11-25 at 06:28 -0800, michal wrote: Keith Whitwell wrote: I've pushed a feature branch with some tgsi simplifications in it. With these I've removed the biggest remaining oddities of that language, and it's getting to a place where I'm starting to be happy with it as a foundation for future work. Most of the surprising stuff like multiple negate flags, etc, is gone now, and the core tokens are quite a bit easier to understand than in previous iterations. I've still got my eye on reducing the verbosity of the names in the tgsi_parse.h FullToken world, and promoting the tgsi_any_token union into p_shader_tokens.h. It would be good if people can review the interface changes and provide feedback, and also test out their drivers on this branch. I've done minimal softpipe testing so far but will do more over the next few days. All looks good to me, I'm happy somebody had the guts to cut off all the cruft in one shot. I see some compile errors in the Windows build -- I will fix those along with other minor bugs I have spotted. Now, looking at the interface, I'm thinking about removing some more tokens. 1) Remove tgsi_dimension and use tgsi_src_register directly with some well-defined constraints. 2) Do the same to tgsi_instruction_predicate. Really, it's just an optional src operand with some restrictions. Interesting. I'd be keen to see a patch. Attached. But the more I look at it the more lame it gets. Another option would be to define tgsi_any_register that would have File, Index, Indirect and Dimension fields. Then there would be more specialised tgsi_*_register tokens, that would be binary compatible with the first one. One could cast them using a union and avoid more mistakes at compile time. 
That way we don't have to put the constraints in comments, but be more strict and use the compiler to enforce them. I will follow up with a patch. Attached. This makes me wonder about a couple of other things, like whether 16 bits is sufficient for the index value. Probably it's fine, but it's not beyond belief to consider a constant buffer of 256k or larger. I'd consider dropping the generic_register struct and any idea of a union of these registers. I'm not really sure we want to encourage the idea of people casting between these registers -- for the most part they should be building these things with ureg-style functions rather than messing around with the tokens directly. If you can easily cast between registers, that defeats any static constraints you attempt to impose via the type system, and you may as well just use src_register for predicates and dimensions. An interpreter which might benefit from being able to share some code paths for the different registers doesn't need the union to be public. Basically, this looks like a good regularization/cleanup, but let's drop generic_register and not create any public union of these register structs. Attached an updated patch. One thing to note in general is that by removing the Extended flags and the fact that some of the tokens already use up all the available 32 bits, the only way to extend the language may be by incrementing the version number in shader's header. This can be a good or a bad thing, depending on the direction Gallium is heading, but with a bit of discipline that should be a good thing. I don't see that as an issue. First and foremost, TGSI is part of gallium, which itself makes no binary compatibility guarantees from one build to the next. In terms of tracing and replay, or any other use of TGSI to communicate shaders between components that weren't necessarily built at the same time, then yes a version number would be nice. 
But those shaders won't exist in isolation and the rest of the 3d commands state will need to establish compatibility. It's not the job of the shader representation to do versioning between two gallium-speaking entities. From that point of view I'm really not sure what the purpose of the version number, is in our representation, unless we want to be able to support multiple versions of TGSI simultaneously in one gallium instance. And in turn, I can't really think why we'd want to do that... So -- lets remove the version token while we're here. One scenario is a sanity check done in the gallium driver. Check if version number matches (exact match) -- there can be changes in the interface
Re: [Mesa3d-dev] TGSI simplification branch
Keith Whitwell wrote:

I've pushed a feature branch with some tgsi simplifications in it. With these I've removed the biggest remaining oddities of that language, and it's getting to a place where I'm starting to be happy with it as a foundation for future work. Most of the surprising stuff like multiple negate flags, etc, is gone now, and the core tokens are quite a bit easier to understand than in previous iterations. I've still got my eye on reducing the verbosity of the names in the tgsi_parse.h FullToken world, and promoting the tgsi_any_token union into p_shader_tokens.h. It would be good if people can review the interface changes and provide feedback, and also test out their drivers on this branch. I've done minimal softpipe testing so far but will do more over the next few days.

All looks good to me, I'm happy somebody had the guts to cut off all the cruft in one shot. I see some compile errors on the Windows build -- I will fix those along with other minor bugs I have spotted.

Now, looking at the interface, I'm thinking about removing some more tokens.
1) Remove tgsi_dimension and use tgsi_src_register directly with some well-defined constraints.
2) Do the same to tgsi_instruction_predicate. Really, it's just an optional src operand with some restrictions.

Thanks.

-- Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day trial. Simplify your report design, integration and deployment - and focus on what you do best, core application coding. Discover what's new with Crystal Reports now. http://p.sf.net/sfu/bobj-july
___
Mesa3d-dev mailing list
Mesa3d-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev
Re: [Mesa3d-dev] TGSI simplification branch
Keith Whitwell wrote: On Wed, 2009-11-25 at 06:28 -0800, michal wrote: Keith Whitwell wrote:

I've pushed a feature branch with some tgsi simplifications in it. With these I've removed the biggest remaining oddities of that language, and it's getting to a place where I'm starting to be happy with it as a foundation for future work. Most of the surprising stuff like multiple negate flags, etc, is gone now, and the core tokens are quite a bit easier to understand than in previous iterations. I've still got my eye on reducing the verbosity of the names in the tgsi_parse.h FullToken world, and promoting the tgsi_any_token union into p_shader_tokens.h. It would be good if people can review the interface changes and provide feedback, and also test out their drivers on this branch. I've done minimal softpipe testing so far but will do more over the next few days.

All looks good to me, I'm happy somebody had the guts to cut off all the cruft in one shot. I see some compile errors on the Windows build -- I will fix those along with other minor bugs I have spotted.

Now, looking at the interface, I'm thinking about removing some more tokens.
1) Remove tgsi_dimension and use tgsi_src_register directly with some well-defined constraints.
2) Do the same to tgsi_instruction_predicate. Really, it's just an optional src operand with some restrictions.

Interesting. I'd be keen to see a patch.

Attached. But the more I look at it the more lame it gets. Another option would be to define tgsi_any_register that would have File, Index, Indirect and Dimension fields. Then there would be more specialised tgsi_*_register tokens, that would be binary compatible with the first one. One could cast them using a union and avoid more mistakes at compile time. That way we don't have to put the constraints in comments, but be more strict and use the compiler to enforce them. I will follow up with a patch.
diff --git a/src/gallium/include/pipe/p_shader_tokens.h b/src/gallium/include/pipe/p_shader_tokens.h index 7d73d7d..7bea99a 100644 --- a/src/gallium/include/pipe/p_shader_tokens.h +++ b/src/gallium/include/pipe/p_shader_tokens.h @@ -290,7 +290,9 @@ union tgsi_immediate_data * respectively. For a given operation code, those numbers are fixed and are * present here only for convenience. * - * If Predicate is TRUE, tgsi_instruction_predicate token immediately follows. + * If Predicate is TRUE, tgsi_src_register token immediately follows. Only + * the File, Index, Negate and Swizzle* fields are valid. File must be set + * to TGSI_FILE_PREDICATE and Swizzle is either set to identity or replicate. * * Saturate controls how are final results in destination registers modified. */ @@ -350,21 +352,6 @@ struct tgsi_instruction_texture unsigned Padding : 24; }; -/* - * For SM3, the following constraint applies. - * - Swizzle is either set to identity or replicate. - */ -struct tgsi_instruction_predicate -{ - int Index: 16; /* SINT */ - unsigned SwizzleX : 2; /* TGSI_SWIZZLE_x */ - unsigned SwizzleY : 2; /* TGSI_SWIZZLE_x */ - unsigned SwizzleZ : 2; /* TGSI_SWIZZLE_x */ - unsigned SwizzleW : 2; /* TGSI_SWIZZLE_x */ - unsigned Negate : 1; /* BOOL */ - unsigned Padding : 7; -}; - /** * File specifies the register array to access. * @@ -396,23 +383,12 @@ struct tgsi_src_register }; /** - * If tgsi_src_register::Modifier is TRUE, tgsi_src_register_modifier follows. - * - * Then, if tgsi_src_register::Indirect is TRUE, another tgsi_src_register - * follows. + * If tgsi_src_register::Indirect is TRUE, tgsi_src_register follows. * - * Then, if tgsi_src_register::Dimension is TRUE, tgsi_dimension follows. + * If tgsi_src_register::Dimension is TRUE, tgsi_src_register follows. + * Only the Indirect, Dimension and Index fields are valid. 
*/ - -struct tgsi_dimension -{ - unsigned Indirect: 1; /* BOOL */ - unsigned Dimension : 1; /* BOOL */ - unsigned Padding : 14; - int Index : 16; /* SINT */ -}; - struct tgsi_dst_register { unsigned File: 4; /* TGSI_FILE_ */ -- Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day trial. Simplify your report design, integration and deployment - and focus on what you do best, core application coding. Discover what's new with Crystal Reports now. http://p.sf.net/sfu/bobj-july___ Mesa3d-dev mailing list Mesa3d-dev@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/mesa3d-dev
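The comment changes in this patch describe how optional tokens trail a source operand once tgsi_dimension is folded into tgsi_src_register. A minimal sketch of how a parser might count those tokens follows. The struct is a hypothetical stand-in that mirrors the field widths quoted in this thread (bit-field layout is compiler-dependent), the function name is made up, and the rule that a dimension token with Indirect set is itself followed by one more token is an assumption read from the "Only the Indirect, Dimension and Index fields are valid" remark, not something the patch states outright.

```c
/* Hypothetical stand-in for the 32-bit tgsi_src_register token discussed
 * above: 4 + 1 + 1 + 16 + 4*2 + 1 + 1 = 32 bits. Illustration only. */
struct src_register {
   unsigned File      : 4;
   unsigned Indirect  : 1;
   unsigned Dimension : 1;
   signed int Index   : 16;
   unsigned SwizzleX  : 2;
   unsigned SwizzleY  : 2;
   unsigned SwizzleZ  : 2;
   unsigned SwizzleW  : 2;
   unsigned Absolute  : 1;
   unsigned Negate    : 1;
};

/* Count the tokens that make up one source operand starting at tokens[0].
 * Order assumed from the patched comment: operand, then the indirect
 * register (if Indirect), then the dimension register (if Dimension). */
static unsigned
src_operand_token_count(const struct src_register *tokens)
{
   unsigned n = 1;                       /* the operand token itself */

   if (tokens[0].Indirect)
      n++;                               /* register holding the offset */

   if (tokens[0].Dimension) {
      const struct src_register *dim = &tokens[n];
      n++;                               /* the dimension token */
      if (dim->Indirect)
         n++;                            /* assumption: its own indirect token */
   }
   return n;
}
```

With this layout the constraints live only in the walking code and comments, which is the weakness michal points out when proposing dedicated register token types.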
Re: [Mesa3d-dev] TGSI simplification branch
michal wrote: Keith Whitwell wrote: On Wed, 2009-11-25 at 06:28 -0800, michal wrote: Keith Whitwell wrote:

I've pushed a feature branch with some tgsi simplifications in it. With these I've removed the biggest remaining oddities of that language, and it's getting to a place where I'm starting to be happy with it as a foundation for future work. Most of the surprising stuff like multiple negate flags, etc, is gone now, and the core tokens are quite a bit easier to understand than in previous iterations. I've still got my eye on reducing the verbosity of the names in the tgsi_parse.h FullToken world, and promoting the tgsi_any_token union into p_shader_tokens.h. It would be good if people can review the interface changes and provide feedback, and also test out their drivers on this branch. I've done minimal softpipe testing so far but will do more over the next few days.

All looks good to me, I'm happy somebody had the guts to cut off all the cruft in one shot. I see some compile errors on the Windows build -- I will fix those along with other minor bugs I have spotted.

Now, looking at the interface, I'm thinking about removing some more tokens.
1) Remove tgsi_dimension and use tgsi_src_register directly with some well-defined constraints.
2) Do the same to tgsi_instruction_predicate. Really, it's just an optional src operand with some restrictions.

Interesting. I'd be keen to see a patch.

Attached. But the more I look at it the more lame it gets. Another option would be to define tgsi_any_register that would have File, Index, Indirect and Dimension fields. Then there would be more specialised tgsi_*_register tokens, that would be binary compatible with the first one. One could cast them using a union and avoid more mistakes at compile time. That way we don't have to put the constraints in comments, but be more strict and use the compiler to enforce them. I will follow up with a patch.

Attached.
diff --git a/src/gallium/include/pipe/p_shader_tokens.h b/src/gallium/include/pipe/p_shader_tokens.h index 7d73d7d..2be8fbc 100644 --- a/src/gallium/include/pipe/p_shader_tokens.h +++ b/src/gallium/include/pipe/p_shader_tokens.h @@ -290,7 +290,7 @@ union tgsi_immediate_data * respectively. For a given operation code, those numbers are fixed and are * present here only for convenience. * - * If Predicate is TRUE, tgsi_instruction_predicate token immediately follows. + * If Predicate is TRUE, tgsi_predicate_register token immediately follows. * * Saturate controls how are final results in destination registers modified. */ @@ -350,77 +350,99 @@ struct tgsi_instruction_texture unsigned Padding : 24; }; -/* - * For SM3, the following constraint applies. - * - Swizzle is either set to identity or replicate. +/** + * File specifies the register array to access. + * + * Index specifies the register number in the specified register file. + * + * If Indirect is TRUE, Index should be offset by the tgsi_indirect_register + * that follows. + * + * If Dimension is TRUE, tgsi_dimension_register follows. */ -struct tgsi_instruction_predicate + +struct tgsi_generic_register { int Index: 16; /* SINT */ - unsigned SwizzleX : 2; /* TGSI_SWIZZLE_x */ - unsigned SwizzleY : 2; /* TGSI_SWIZZLE_x */ - unsigned SwizzleZ : 2; /* TGSI_SWIZZLE_x */ - unsigned SwizzleW : 2; /* TGSI_SWIZZLE_x */ - unsigned Negate : 1; /* BOOL */ - unsigned Padding : 7; + unsigned File : 4; /* TGSI_FILE_ */ + unsigned Indirect : 1; /* BOOL */ + unsigned Dimension: 1; /* BOOL */ + unsigned Reserved : 10; }; /** - * File specifies the register array to access. - * - * Index specifies the element number of a register in the register file. + * If Absolute is TRUE, all components of the register get their signs + * cleared. * - * If Indirect is TRUE, Index should be offset by the X component of a source - * register that follows. The register can be now fetched into local storage - * for further processing. 
+ * If Negate is TRUE, all components of the register are negated. * - * If Negate is TRUE, all components of the fetched register are negated. - * - * The fetched register components are swizzled according to SwizzleX, SwizzleY, + * The register components are swizzled according to SwizzleX, SwizzleY, * SwizzleZ and SwizzleW. - * */ struct tgsi_src_register { - unsigned File: 4; /* TGSI_FILE_ */ - unsigned Indirect: 1; /* BOOL */ - unsigned Dimension : 1; /* BOOL */ - int Index : 16; /* SINT */ - unsigned SwizzleX: 2; /* TGSI_SWIZZLE_ */ - unsigned SwizzleY: 2; /* TGSI_SWIZZLE_ */ - unsigned SwizzleZ: 2; /* TGSI_SWIZZLE_ */ - unsigned SwizzleW: 2; /* TGSI_SWIZZLE_ */ - unsigned Absolute: 1;/* BOOL */ - unsigned Negate : 1;/* BOOL */ + int Index: 16; /* SINT */ + unsigned File : 4
Re: [Mesa3d-dev] Mesa (master): tgsi: Fix POSITION and FACE fragment shader inputs.
Keith Whitwell wrote: On Mon, 2009-11-23 at 17:28 -0800, Brian Paul wrote:

For OpenGL, the front-facing attribute is either 0 (back) or 1 (front) rather than +/-1. I think we'll need to do some additional work (insert a MAD instr?) in the Mesa-TGSI translation to account for this difference. I could dig into that someday... I'm assuming DX or some other API uses +/-1?

If we define tgsi to use +/-1, then the GL 0/1 version can be reached by just saturating. Getting from 0/1 to +/-1 looks like it would be a MAD as you say. Probably +/-1 is easy to calculate as the sign of the determinant, which would be an intermediate step to calculate GL's version. If it's OK, let's define TGSI's face reg as +/-1, and have Mesa insert the saturate if necessary.

OK, I have documented that as negative/positive since it gives us more freedom in the future. If it is a problem, we can explicitly say it's either -1 or +1 later.
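The two conversions mentioned in this exchange are small enough to write out. The following is only a sketch on plain floats, with made-up function names; it is not code from the Mesa state tracker.

```c
/* Clamp to [0, 1], which is what a saturate modifier does. */
static float
saturate(float x)
{
   return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x);
}

/* Face value with back = -1, front = +1  ->  GL-style 0 / 1. */
static float
face_to_gl(float face)
{
   return saturate(face);
}

/* GL-style 0 / 1  ->  -1 / +1, as a single MAD: gl * 2 + (-1). */
static float
gl_to_face(float gl)
{
   return gl * 2.0f - 1.0f;
}
```

Note that saturating yields exactly 0 or 1 only when the face value is exactly -1 or +1. Under the looser "negative/positive" wording, a positive value smaller than 1 would pass through the saturate unchanged, so a translator relying on it would need the stricter definition or an extra comparison.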
[Mesa3d-dev] GLSL compiler performance
This is a heads-up on what is going on in the glsl-pp-rework-2 feature branch.

The goal was to rewrite the existing preprocessor for the GLSL compiler as a stepping stone towards a better GLSL compiler in general: make it faster and easier to understand and maintain. But the most important thing was to make it easy to plug in a new syntax parser in the future (Ian has started some work towards this). That's done.

The next step was to integrate the new preprocessor with the existing syntax parser to allow me to measure compiler performance. The results were not very satisfying -- it turned out the syntax parser was such a huge bottleneck that it did not matter how fast the preprocessor was. So I just hacked up a simple and fast syntax parser that basically emulates the old one, so that we don't have to touch too much code, spend too much time on it and introduce regressions. It's not perfect and it looks ugly, but it works well. I don't mind scrapping it and replacing it with a new, bison-based parser.

And for the time being, here are the numbers. The benchmark takes the CorrectConstFolding2.vert shader from the GLSL Compiler Test suite. It's one of the biggest files, with around 400 lines of code. What is being timed is preprocessing + syntax parsing, so no further semantic checking and code generation is taken into account.

old          128,157 us
new            4,719 us
improvement       27 x

After that the new preprocessor becomes the bottleneck. That means it's worthwhile to keep on improving it. If the preprocessing step is taken out of the equation and we only measure the syntax parsing stage, we get the following numbers.

old          124,671
new            1,016
improvement      122 x

The code for the new parser should be in the Mesa repository in a week or so.

Thanks.
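The improvement factors quoted in this message, and the remark that the preprocessor is now the bottleneck, can be checked with a little arithmetic. The sketch below assumes that both sets of figures are in microseconds (only the first set carries a unit) and that the parse-only time can simply be subtracted from the combined time; the function names are made up.

```c
/* How many times faster the new timing is compared to the old one. */
static double
speedup(double old_us, double new_us)
{
   return old_us / new_us;
}

/* Time attributable to preprocessing: combined time minus parse-only time. */
static double
preprocess_time(double total_us, double parse_only_us)
{
   return total_us - parse_only_us;
}
```

Under those assumptions, 128,157 / 4,719 is a little over 27 and 124,671 / 1,016 is a little under 123, matching the reported 27 x and 122 x. Preprocessing would account for 4,719 - 1,016 = 3,703 us of the new total (roughly 78%), against 128,157 - 124,671 = 3,486 us of the old total (under 3%), which is consistent with the preprocessor having become the dominant cost.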
Re: [Mesa3d-dev] [PATCH] gallium: Add a PREDICATE register file.
michal wrote: Keith Whitwell wrote: On Fri, 2009-10-30 at 11:24 -0700, michal wrote:

+/*
+ * Currently, the following constraints apply.
+ *
+ * - PredSwizzleXYZW is either set to identity or replicate.
+ * - PredSrcIndex is 0.
+ */

Michal,

This is looking a lot better. In terms of the above comment, is this talking about the semantics of PIPE_CAP_GPU3? Or is GPU3 supposed to do full PredSwizzle/PredSrcIndex, and we just haven't implemented it somewhere (e.g. in tgsi_exec.c)? I'd think we want to either:
- remove fields from the token so that the comment isn't necessary,
- remove the comment and have GPU3 mean that the full semantics are available,
- come up with yet another cap bit to say whether or not full predicate semantics are implemented by a particular driver.

Needless to say I don't like the last option, so I guess that means we need to decide now whether the full semantics in the token are in or out. How does SM3 fall on these issues?

The SM3 specification explicitly states that the predicate swizzle needs to be either .xyzw or component replicate. The GL_MESA_gpu_program3 spec allows arbitrary swizzles (there's nothing in the document that would say otherwise). I say, rename PIPE_CAP_GPU3 to PIPE_CAP_SM3 to indicate predicates are supported with the mentioned swizzle constraints. When the dust settles on the gpu_program3 spec, the state tracker will compensate for the lack of arbitrary swizzles if needed.

Also, add PIPE_CAP_MAX_PREDICATES to query the number of predicate registers supported by the driver. That will allow us to remove the `PredSrcIndex is 0' constraint.

Attached a proposed patch for that.

From 085a5e8c33b2ed6347de2e86d9e972d70438c014 Mon Sep 17 00:00:00 2001
From: Michal Krol mic...@vmware.com
Date: Sat, 31 Oct 2009 09:09:26 +
Subject: [PATCH] gallium: Cleanup predicate and condition code TGSI tokens.

There is little point in having a special TGSI token just to handle predicate register updates.
Remove tgsi_dst_register_ext_predicate token and instead use a new PREDICATE register file to update predicates. Actually, the contents of the obsolete token are being moved to tgsi_instruction_ext_predicate, where they should be from the very beginning. Remove the NVIDIA-specific condition code tokens -- nobody uses them and they can be emulated with predicates if needed. Introduce PIPE_CAP_SM3 that indicates whether a driver supports SM3-level instructions, and in particular predicates. Add PIPE_CAP_MAX_PREDICATE_REGISTERS that can be used to query the driver how many predicate registers it supports (currently it would be 1). --- src/gallium/include/pipe/p_defines.h |2 + src/gallium/include/pipe/p_shader_tokens.h | 117 --- 2 files changed, 20 insertions(+), 99 deletions(-) diff --git a/src/gallium/include/pipe/p_defines.h b/src/gallium/include/pipe/p_defines.h index 52887ea..6a61aea 100644 --- a/src/gallium/include/pipe/p_defines.h +++ b/src/gallium/include/pipe/p_defines.h @@ -333,6 +333,8 @@ enum pipe_transfer_usage { #define PIPE_CAP_MAX_VERTEX_TEXTURE_UNITS 26 #define PIPE_CAP_TGSI_CONT_SUPPORTED 27 #define PIPE_CAP_BLEND_EQUATION_SEPARATE 28 +#define PIPE_CAP_SM3 29 /* Shader Model 3 supported */ +#define PIPE_CAP_MAX_PREDICATE_REGISTERS 30 /** diff --git a/src/gallium/include/pipe/p_shader_tokens.h b/src/gallium/include/pipe/p_shader_tokens.h index de338c4..d4c8aad 100644 --- a/src/gallium/include/pipe/p_shader_tokens.h +++ b/src/gallium/include/pipe/p_shader_tokens.h @@ -1,6 +1,7 @@ /** * * Copyright 2008 Tungsten Graphics, Inc., Cedar Park, Texas. + * Copyright 2009 VMware, Inc. * All Rights Reserved. 
* * Permission is hereby granted, free of charge, to any person obtaining a @@ -25,8 +26,8 @@ * **/ -#ifndef TGSI_TOKEN_H -#define TGSI_TOKEN_H +#ifndef P_SHADER_TOKENS_H +#define P_SHADER_TOKENS_H #ifdef __cplusplus extern C { @@ -79,6 +80,7 @@ enum tgsi_file_type { TGSI_FILE_ADDRESS =6, TGSI_FILE_IMMEDIATE =7, TGSI_FILE_LOOP=8, + TGSI_FILE_PREDICATE =9, TGSI_FILE_COUNT /** how many TGSI_FILE_ types */ }; @@ -319,7 +321,6 @@ struct tgsi_instruction * instruction, including the instruction word. */ -#define TGSI_INSTRUCTION_EXT_TYPE_NV0 #define TGSI_INSTRUCTION_EXT_TYPE_LABEL 1 #define TGSI_INSTRUCTION_EXT_TYPE_TEXTURE 2 #define TGSI_INSTRUCTION_EXT_TYPE_PREDICATE 3 @@ -332,9 +333,6 @@ struct tgsi_instruction_ext }; /* - * If tgsi_instruction_ext::Type is TGSI_INSTRUCTION_EXT_TYPE_NV, it should - * be cast to tgsi_instruction_ext_nv. - * * If tgsi_instruction_ext::Type is TGSI_INSTRUCTION_EXT_TYPE_LABEL, it * should be cast to tgsi_instruction_ext_label. * @@ -348,56 +346,11
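The SM3 swizzle constraint discussed in this message (identity, i.e. .xyzw, or a single component replicated to all four lanes) can be expressed as a small validity check. This is a hedged sketch: the enum values are stand-ins for the TGSI_SWIZZLE_ constants, and the function name is hypothetical rather than anything from the patch.

```c
/* Stand-ins for the four swizzle selectors; each fits in 2 bits. */
enum { SWZ_X = 0, SWZ_Y = 1, SWZ_Z = 2, SWZ_W = 3 };

/* Returns 1 if the predicate swizzle is legal under the SM3 rule quoted
 * above: either identity (.xyzw) or one component replicated (.xxxx, ...). */
static int
pred_swizzle_is_sm3_legal(unsigned x, unsigned y, unsigned z, unsigned w)
{
   int identity  = (x == SWZ_X && y == SWZ_Y && z == SWZ_Z && w == SWZ_W);
   int replicate = (x == y && y == z && z == w);

   return identity || replicate;
}
```

A state tracker that accepts arbitrary swizzles from GL_MESA_gpu_program3 could use a check of this kind to decide when it has to compensate, for example by copying the predicate through a temporary.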
[Mesa3d-dev] [PATCH] gallium: Add a PREDICATE register file.
gallium: Add a PREDICATE register file.

There's already a shader token that allows composition of predicated instructions (tgsi_instruction_ext_predicate). However, there is no way one can write to those predicate registers in the first place.
---
 src/gallium/include/pipe/p_shader_tokens.h |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/src/gallium/include/pipe/p_shader_tokens.h b/src/gallium/include/pipe/p_shader_tokens.h
index de338c4..6aa8b27 100644
--- a/src/gallium/include/pipe/p_shader_tokens.h
+++ b/src/gallium/include/pipe/p_shader_tokens.h
@@ -79,6 +79,7 @@ enum tgsi_file_type {
    TGSI_FILE_ADDRESS =6,
    TGSI_FILE_IMMEDIATE =7,
    TGSI_FILE_LOOP=8,
+   TGSI_FILE_PREDICATE =9,
    TGSI_FILE_COUNT /** how many TGSI_FILE_ types */
 };
--
1.6.4.msysgit.0

-- Come build with us! The BlackBerry(R) Developer Conference in SF, CA is the only developer event you need to attend this year. Jumpstart your developing skills, take BlackBerry mobile applications to market and stay ahead of the curve. Join us from November 9 - 12, 2009. Register now! http://p.sf.net/sfu/devconference
___
Mesa3d-dev mailing list
Mesa3d-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev
Re: [Mesa3d-dev] [PATCH] gallium: Add a PREDICATE register file.
Keith Whitwell wrote: On Fri, 2009-10-30 at 03:43 -0700, michal wrote:

gallium: Add a PREDICATE register file.

There's already a shader token that allows composition of predicated instructions (tgsi_instruction_ext_predicate). However, there is no way one can write to those predicate registers in the first place.
---
 src/gallium/include/pipe/p_shader_tokens.h |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/src/gallium/include/pipe/p_shader_tokens.h b/src/gallium/include/pipe/p_shader_tokens.h
index de338c4..6aa8b27 100644
--- a/src/gallium/include/pipe/p_shader_tokens.h
+++ b/src/gallium/include/pipe/p_shader_tokens.h
@@ -79,6 +79,7 @@ enum tgsi_file_type {
    TGSI_FILE_ADDRESS =6,
    TGSI_FILE_IMMEDIATE =7,
    TGSI_FILE_LOOP=8,
+   TGSI_FILE_PREDICATE =9,
    TGSI_FILE_COUNT /** how many TGSI_FILE_ types */
 };

Michal,

Is your expectation that all drivers become able to understand instructions with predicates? That seems unreasonable.

What is the expected way of setting a predicate register? What functionality will use this?

For example:

DECL IN[0..1]
DECL OUT[0]
DECL PRED[0]
1: MOV OUT[0], IN[0]
2: SGT PRED[0], IN[0], IN[1]
3: (PRED[0]) MOV OUT[0], IN[1]

In (2) we set each component of PRED[0] to 1.0 if the corresponding component of IN[0] is greater than that of IN[1], and to 0.0 otherwise. In (3) we write IN[1] to only those components of OUT[0] for which the respective components of PRED[0] are non-zero.

It seems there are three ways to do conditional execution in TGSI currently -- predicates, condition codes and IF/THEN/ELSE instructions. I'd really prefer to have at most two, and in fact preferably just one. Can you take a look at the three alternatives and figure out if one can be amputated?

We could kill off the condition codes -- no driver uses them, and it's easier for us to emulate them with predicates than the other way round.
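The three-instruction example above can be emulated on the CPU to check the intended semantics. This is only a sketch under assumptions: the helper names (sgt, mov_predicated, run_example) are made up for illustration and are not part of TGSI or tgsi_exec.c, and each register is modelled as four floats.

```c
/* SGT: per component, dst = (a > b) ? 1.0 : 0.0 */
static void
sgt(float dst[4], const float a[4], const float b[4])
{
   int c;
   for (c = 0; c < 4; c++)
      dst[c] = (a[c] > b[c]) ? 1.0f : 0.0f;
}

/* Predicated MOV: write src only to components whose predicate is non-zero. */
static void
mov_predicated(float dst[4], const float src[4], const float pred[4])
{
   int c;
   for (c = 0; c < 4; c++) {
      if (pred[c] != 0.0f)
         dst[c] = src[c];
   }
}

/* Emulates:
 *   1: MOV OUT[0], IN[0]
 *   2: SGT PRED[0], IN[0], IN[1]
 *   3: (PRED[0]) MOV OUT[0], IN[1]
 */
static void
run_example(float out[4], const float in0[4], const float in1[4])
{
   float pred[4];
   int c;

   for (c = 0; c < 4; c++)
      out[c] = in0[c];                 /* instruction 1 */
   sgt(pred, in0, in1);                /* instruction 2 */
   mov_predicated(out, in1, pred);     /* instruction 3 */
}
```

For ordinary (non-NaN) inputs the net effect is a per-component minimum of IN[0] and IN[1]: wherever IN[0] is greater, it is overwritten by IN[1].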
Re: [Mesa3d-dev] [PATCH] gallium: Add a PREDICATE register file.
Brian Paul wrote: Keith Whitwell wrote: On Fri, 2009-10-30 at 04:36 -0700, michal wrote: Keith Whitwell wrote: On Fri, 2009-10-30 at 03:43 -0700, michal wrote:

gallium: Add a PREDICATE register file.

There's already a shader token that allows composition of predicated instructions (tgsi_instruction_ext_predicate). However, there is no way one can write to those predicate registers in the first place.
---
 src/gallium/include/pipe/p_shader_tokens.h |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/src/gallium/include/pipe/p_shader_tokens.h b/src/gallium/include/pipe/p_shader_tokens.h
index de338c4..6aa8b27 100644
--- a/src/gallium/include/pipe/p_shader_tokens.h
+++ b/src/gallium/include/pipe/p_shader_tokens.h
@@ -79,6 +79,7 @@ enum tgsi_file_type {
    TGSI_FILE_ADDRESS =6,
    TGSI_FILE_IMMEDIATE =7,
    TGSI_FILE_LOOP=8,
+   TGSI_FILE_PREDICATE =9,
    TGSI_FILE_COUNT /** how many TGSI_FILE_ types */
 };

Michal,

Is your expectation that all drivers become able to understand instructions with predicates? That seems unreasonable.

What is the expected way of setting a predicate register? What functionality will use this?

For example:

DECL IN[0..1]
DECL OUT[0]
DECL PRED[0]
1: MOV OUT[0], IN[0]
2: SGT PRED[0], IN[0], IN[1]
3: (PRED[0]) MOV OUT[0], IN[1]

In (2) we set each component of PRED[0] to 1.0 if the corresponding component of IN[0] is greater than that of IN[1], and to 0.0 otherwise. In (3) we write IN[1] to only those components of OUT[0] for which the respective components of PRED[0] are non-zero.

It seems there are three ways to do conditional execution in TGSI currently -- predicates, condition codes and IF/THEN/ELSE instructions. I'd really prefer to have at most two, and in fact preferably just one. Can you take a look at the three alternatives and figure out if one can be amputated?

We could kill off the condition codes -- no driver uses them, and it's easier for us to emulate them with predicates than the other way round.

I think I agree with that. Condition codes are pretty weird; the only reason I'd keep them around is that there is the NV GPU4 extension sitting there as a ready-made definition of a high-end SM4-level assembly language. I don't know if Ian plans to introduce a MESA version of the program4 extension that more closely matches his program3 extension (i.e. predicates instead of condition codes).

Just FYI: GL_NV_fragment_program uses condition codes but we haven't supported that extension with Gallium; only the ARB versions.

We could always remove condition codes later, when Ian decides about their future. Attached is an updated patch that obsoletes one TGSI token and fixes the other one, so we can specify swizzles and negation of predicate registers, per GL_MESA_gpu_program3. Thanks for comments.

From d1efdb692ae99871585554ddeaff75e700349d70 Mon Sep 17 00:00:00 2001
From: Michal Krol mic...@vmware.com
Date: Fri, 30 Oct 2009 13:55:14 +
Subject: [PATCH] gallium: Add a PREDICATE register file.

There is little point in having a special TGSI token just to handle predicate register updates. Remove tgsi_dst_register_ext_predicate token and instead use a new PREDICATE register file to update predicates. Actually, the contents of the obsolete token are being moved to tgsi_instruction_ext_predicate, where they should have been from the very beginning.
--- src/gallium/include/pipe/p_shader_tokens.h | 40 +++- 1 files changed, 10 insertions(+), 30 deletions(-) diff --git a/src/gallium/include/pipe/p_shader_tokens.h b/src/gallium/include/pipe/p_shader_tokens.h index de338c4..f3b8a7b 100644 --- a/src/gallium/include/pipe/p_shader_tokens.h +++ b/src/gallium/include/pipe/p_shader_tokens.h @@ -79,6 +79,7 @@ enum tgsi_file_type { TGSI_FILE_ADDRESS =6, TGSI_FILE_IMMEDIATE =7, TGSI_FILE_LOOP=8, + TGSI_FILE_PREDICATE =9, TGSI_FILE_COUNT /** how many TGSI_FILE_ types */ }; @@ -427,11 +428,15 @@ struct tgsi_instruction_ext_texture struct tgsi_instruction_ext_predicate { - unsigned Type : 4;/* TGSI_INSTRUCTION_EXT_TYPE_PREDICATE */ - unsigned PredDstIndex : 4;/* UINT */ - unsigned PredWriteMask: 4;/* TGSI_WRITEMASK_ */ - unsigned Padding : 19; - unsigned Extended : 1;/* BOOL */ + unsigned Type : 4;/* TGSI_INSTRUCTION_EXT_TYPE_PREDICATE */ + unsigned PredSwizzleX : 2;/* TGSI_SWIZZLE_ */ + unsigned PredSwizzleY : 2;/* TGSI_SWIZZLE_ */ + unsigned PredSwizzleZ : 2;/* TGSI_SWIZZLE_ */ + unsigned PredSwizzleW : 2;/* TGSI_SWIZZLE_ */ + unsigned PredSrcIndex : 4;/* UINT */ + unsigned Negate : 1;/* BOOL */ + unsigned Padding : 14; + unsigned Extended : 1;/* BOOL */ }; /** @@ -548,7 +553,6 @@ struct tgsi_dst_register #define
Re: [Mesa3d-dev] [PATCH] gallium: Add a PREDICATE register file.
Keith Whitwell wrote: On Fri, 2009-10-30 at 10:19 -0700, michal wrote: Brian Paul wrote: Keith Whitwell wrote: On Fri, 2009-10-30 at 04:36 -0700, michal wrote: Keith Whitwell wrote: On Fri, 2009-10-30 at 03:43 -0700, michal wrote:

gallium: Add a PREDICATE register file.

There's already a shader token that allows composition of predicated instructions (tgsi_instruction_ext_predicate). However, there is no way one can write to those predicate registers in the first place.
---
 src/gallium/include/pipe/p_shader_tokens.h |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/src/gallium/include/pipe/p_shader_tokens.h b/src/gallium/include/pipe/p_shader_tokens.h
index de338c4..6aa8b27 100644
--- a/src/gallium/include/pipe/p_shader_tokens.h
+++ b/src/gallium/include/pipe/p_shader_tokens.h
@@ -79,6 +79,7 @@ enum tgsi_file_type {
    TGSI_FILE_ADDRESS =6,
    TGSI_FILE_IMMEDIATE =7,
    TGSI_FILE_LOOP=8,
+   TGSI_FILE_PREDICATE =9,
    TGSI_FILE_COUNT /** how many TGSI_FILE_ types */
 };

Michal,

Is your expectation that all drivers become able to understand instructions with predicates? That seems unreasonable.

What is the expected way of setting a predicate register? What functionality will use this?

For example:

DECL IN[0..1]
DECL OUT[0]
DECL PRED[0]
1: MOV OUT[0], IN[0]
2: SGT PRED[0], IN[0], IN[1]
3: (PRED[0]) MOV OUT[0], IN[1]

In (2) we set each component of PRED[0] to 1.0 if the corresponding component of IN[0] is greater than that of IN[1], and to 0.0 otherwise. In (3) we write IN[1] to only those components of OUT[0] for which the respective components of PRED[0] are non-zero.

It seems there are three ways to do conditional execution in TGSI currently -- predicates, condition codes and IF/THEN/ELSE instructions. I'd really prefer to have at most two, and in fact preferably just one. Can you take a look at the three alternatives and figure out if one can be amputated?

We could kill off the condition codes -- no driver uses them, and it's easier for us to emulate them with predicates than the other way round.

I think I agree with that. Condition codes are pretty weird; the only reason I'd keep them around is that there is the NV GPU4 extension sitting there as a ready-made definition of a high-end SM4-level assembly language. I don't know if Ian plans to introduce a MESA version of the program4 extension that more closely matches his program3 extension (i.e. predicates instead of condition codes).

Just FYI: GL_NV_fragment_program uses condition codes but we haven't supported that extension with Gallium; only the ARB versions.

We could always remove condition codes later, when Ian decides about their future. Attached is an updated patch that obsoletes one TGSI token and fixes the other one, so we can specify swizzles and negation of predicate registers, per GL_MESA_gpu_program3. Thanks for comments.

OK, I think I'd prefer to remove condition codes as part of this -- there are no users for them (that we care about), and we don't want drivers to have to implement both techniques. If in the future we want condition codes in the mesa state tracker, we'll have to do the work of converting them to predicates and/or IF/THEN/ELSE, but it will probably be less effort than trying to teach all the drivers about condition codes. Can I ask for a third version that removes condition codes?

In terms of drivers supporting this, we probably want another pipe_cap flag, probably PIPE_CAP_GPU3, to indicate that a particular driver has GPU3/SM3 support. Can you add that to the interface as well?

Attached third version.

From d3102528484decff0a6d1effb27545c4d76976d1 Mon Sep 17 00:00:00 2001
From: Michal Krol mic...@vmware.com
Date: Fri, 30 Oct 2009 18:19:52 +
Subject: [PATCH] gallium: Cleanup predicate and condition code TGSI tokens.

There is little point in having a special TGSI token just to handle predicate register updates.
Remove tgsi_dst_register_ext_predicate token and instead use a new PREDICATE register file to update predicates. Actually, the contents of the obsolete token are being moved to tgsi_instruction_ext_predicate, where they should be from the very beginning. Remove the NVIDIA-specific condition code tokens -- nobody uses them and they can be emulated with predicates if needed. Introduce PIPE_CAP_GPU3 that indicates whether a driver supports SM3-level instructions, and in particular predicates. --- src/gallium/include/pipe/p_defines.h |1 + src/gallium/include/pipe/p_shader_tokens.h | 111 2 files changed, 17 insertions(+), 95 deletions(-) diff --git a/src/gallium/include/pipe/p_defines.h b/src/gallium/include/pipe/p_defines.h index 52887ea..98cb9e8 100644 --- a/src
[Mesa3d-dev] drawing with elements out of range
I've been able to crash my app that uses a gallium driver by feeding the draw module an index buffer with garbage contents. Is there a desire to add out-of-bounds checking of every index element, or is it being ignored on purpose for performance reasons? Thanks. -- Come build with us! The BlackBerry(R) Developer Conference in SF, CA is the only developer event you need to attend this year. Jumpstart your developing skills, take BlackBerry mobile applications to market and stay ahead of the curve. Join us from November 9 - 12, 2009. Register now! http://p.sf.net/sfu/devconference ___ Mesa3d-dev mailing list Mesa3d-dev@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/mesa3d-dev
Re: [Mesa3d-dev] drawing with elements out of range
Keith Whitwell wrote: On Fri, 2009-10-09 at 04:10 -0700, michal wrote: I've been able to crash my app that uses a gallium driver by feeding the draw module an index buffer with garbage contents. Is there a desire to add out-of-bounds checking of every index element, or is it being ignored on purpose for performance reasons? Michal, There is code that should prevent this, but it probably doesn't get heaps of testing. Probably the best thing to do is provide a trivial/ example that exercises the problem you're seeing. While I would agree otherwise, there is a high chance the test app won't trigger a segfault. I am happy the intent is to check element indirections, and I should be able to provide a patch for review instead.
[Mesa3d-dev] [PATCH] draw: Do an out-of-bounds check on array elements.
From 5ebc14fc47a5e31b3c6be54142550bdf2ac093df Mon Sep 17 00:00:00 2001
From: Michal Krol mic...@vmware.com
Date: Fri, 9 Oct 2009 13:30:52 +0100
Subject: [PATCH] draw: Do an out-of-bounds check on array elements.

Do not draw a reduced primitive if any of its vertices reaches outside
of the vertex array.
---
 src/gallium/auxiliary/draw/draw_pipe.c | 68 ++-
 1 files changed, 48 insertions(+), 20 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_pipe.c b/src/gallium/auxiliary/draw/draw_pipe.c
index 1c6d657..5b88f00 100644
--- a/src/gallium/auxiliary/draw/draw_pipe.c
+++ b/src/gallium/auxiliary/draw/draw_pipe.c
@@ -158,37 +158,64 @@ static void do_triangle( struct draw_context *draw,
-#define QUAD(i0,i1,i2,i3) \
+#define QUAD(i0,i1,i2,i3) do {\
+ uint e0 = (uint)elts[i0]; \
+ uint e1 = (uint)elts[i1]; \
+ uint e2 = (uint)elts[i2]; \
+ uint e3 = (uint)elts[i3]; \
+ if (e0 >= vertex_count || e1 >= vertex_count || e2 >= vertex_count || \
+ e3 >= vertex_count) { \
+ break; \
+ } \
 do_triangle( draw, \
 ( DRAW_PIPE_RESET_STIPPLE | \
 DRAW_PIPE_EDGE_FLAG_0 | \
 DRAW_PIPE_EDGE_FLAG_2 ), \
-verts + stride * elts[i0], \
-verts + stride * elts[i1], \
-verts + stride * elts[i3]); \
+ verts + stride * e0, \
+ verts + stride * e1, \
+ verts + stride * e3); \
 do_triangle( draw, \
 ( DRAW_PIPE_EDGE_FLAG_0 | \
 DRAW_PIPE_EDGE_FLAG_1 ), \
-verts + stride * elts[i1], \
-verts + stride * elts[i2], \
-verts + stride * elts[i3])
-
-#define TRIANGLE(flags,i0,i1,i2)\
+ verts + stride * e1, \
+ verts + stride * e2, \
+ verts + stride * e3); \
+} while (0)
+
+#define TRIANGLE(flags,i0,i1,i2) do { \
+ uint e0 = (uint)elts[i0] & ~DRAW_PIPE_FLAG_MASK; \
+ uint e1 = (uint)elts[i1]; \
+ uint e2 = (uint)elts[i2]; \
+ if (e0 >= vertex_count || e1 >= vertex_count || e2 >= vertex_count) { \
+ break; \
+ } \
 do_triangle( draw, \
 elts[i0], /* flags */ \
-verts + stride * (elts[i0] & ~DRAW_PIPE_FLAG_MASK), \
-verts + stride * elts[i1], \
-verts + stride * elts[i2])
-
-#define LINE(flags,i0,i1) \
+ verts + stride * e0, \
+ verts + stride * e1, \
+ verts + stride * e2); \
+} while (0)
+
+#define LINE(flags,i0,i1) do {\
+ uint e0 = (uint)elts[i0] & ~DRAW_PIPE_FLAG_MASK; \
+ uint e1 = (uint)elts[i1]; \
+ if (e0 >= vertex_count || e1 >= vertex_count) {\
+ break; \
+ } \
 do_line( draw, \
 elts[i0], \
-verts + stride * (elts[i0] & ~DRAW_PIPE_FLAG_MASK), \
-verts + stride * elts[i1])
-
-#define POINT(i0) \
+ verts + stride * e0, \
+ verts + stride * e1); \
+} while (0)
+
+#define POINT(i0) do
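For readers following the patch, the per-primitive check it adds to each macro can be sketched as a standalone C function. This is only an illustration under stated assumptions: count_drawable_triangles and EXAMPLE_FLAG_MASK are hypothetical names, the mask value is made up (the real DRAW_PIPE_FLAG_MASK value does not appear in this thread), and the function only counts the triangles that would pass the check instead of emitting them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for DRAW_PIPE_FLAG_MASK; the value is an assumption. */
#define EXAMPLE_FLAG_MASK 0xF000u

/* Hypothetical helper: walks a triangle list and counts the triangles whose
 * three indices all fall inside [0, vertex_count). A triangle with any
 * out-of-range index is skipped, as the patch does with `break`. */
static size_t
count_drawable_triangles(const uint16_t *elts, size_t count,
                         uint32_t vertex_count)
{
   size_t drawable = 0;

   assert(elts != NULL || count == 0);

   for (size_t i = 0; i + 2 < count; i += 3) {
      /* Only the first index of a primitive carries flag bits. */
      uint32_t e0 = elts[i] & ~EXAMPLE_FLAG_MASK;
      uint32_t e1 = elts[i + 1];
      uint32_t e2 = elts[i + 2];

      if (e0 >= vertex_count || e1 >= vertex_count || e2 >= vertex_count)
         continue; /* skip the whole triangle */

      drawable++;
   }
   return drawable;
}
```

The cost of this approach is three comparisons per triangle on every draw, paid inside the innermost loop.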
Re: [Mesa3d-dev] [PATCH] draw: Do an out-of-bounds check on array elements.
Keith Whitwell wrote: Michal, Sorry, this isn't a great way to do this. This can usually be caught much earlier in the pipeline and with much less overhead by validating the incoming index list. OK, so we scan the whole element array beforehand, and if any element is out of range, we kill the whole primitive, right? We normally do that in Mesa or the state tracker, if that helps. Does this mean we actually don't want to check that in the draw module and we should deal with it on the state tracker level?
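The up-front validation discussed in this message could look roughly like the following C sketch. It is not taken from Mesa or any state tracker: indices_in_range is a hypothetical helper, and 16-bit indices are assumed for brevity. The idea is to scan the index list once before drawing and reject the whole draw call if any index is out of range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of Mesa or Gallium: returns true only if
 * every index refers to a vertex inside [0, vertex_count). */
static bool
indices_in_range(const uint16_t *elts, size_t count, uint32_t vertex_count)
{
   assert(elts != NULL || count == 0);

   for (size_t i = 0; i < count; i++) {
      if (elts[i] >= vertex_count)
         return false; /* one bad index rejects the whole draw call */
   }
   return true;
}
```

A caller would run this once per draw call and skip the draw entirely when it returns false, which keeps the per-primitive paths free of bounds checks.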