
Re: [Mesa3d-dev] [RFC] gallium-sampler-view branch merge

2010-03-15 Thread michal

michal wrote on 2010-03-12 15:00:
 michal wrote on 2010-03-11 17:59:
   
 Keith Whitwell wrote on 2010-03-11 16:16:
   
 
 On Thu, 2010-03-11 at 06:05 -0800, michal wrote:
   
 
   
 Keith Whitwell wrote on 2010-03-11 14:21:
 
   
 
 On Thu, 2010-03-11 at 03:16 -0800, michal wrote:
   
   
 
   
 Hi,

 I would like to merge the branch in subject this week. This feature 
 branch allows state trackers to bind sampler views instead of textures 
 to shader stages.

 A sampler view object holds a reference to a texture and also overrides 
 internal texture format (resource casting) and specifies RGBA swizzle 
 (needed for GL_EXT_texture_swizzle extension).
 
 
   
 
 Michal,

 I've got some issues with the way the sampler views are being generated
 and used inside the CSO module.

 The point of a sampler view is that it gives the driver an opportunity
 to do expensive operations required for special sampling modes (which
 may include copying surface data if hardware is deficient in some way).

 This approach works if a sampler view is created once, then used
 multiple times before being deleted.

 Unfortunately, it seems the changes to support this in the CSO module
 provide only a single-shot usage model.  Sampler views are created in
 cso_set_XXX_sampler_textures, bound to the context, and then
 dereferenced/destroyed on the next bind.

   
   
 
   
 The reason CSO code looks like this is because it was meant to be an 
 intermediate step towards migration to the sampler view model. Fully
 converting all existing state trackers is non-trivial and thus I chose
 this conservative approach. State trackers that do not care about the extra
 features a sampler view provides will keep using this one-shot CSO
 interface with the hope that creation of sampler objects is lightweight
 (format matches texture format, swizzle matches native texel layout, 
 etc.). 
 
   
 
 On the surface, this hope isn't likely to be fulfilled - lots of
 hardware doesn't support non-zero first_level.  Most cases of drivers
 implementing sampler views internally are to catch this issue.

 Of course, it seems like your branch also leaves the existing
 driver-specific sampler view code in place, so that there are
 potentially two implementations of sampler views in those drivers.  

 I guess this means that you can get away with the current implementation
 for now, but it prevents drivers actually taking advantage of the fact
 that these entities exist in the interface -- they will continue to have
 to duplicate the concept internally until the state trackers and/or CSO
 module start caching views.

   
 
   
 Ideally, everybody moves on and we stop using CSO for sampler 
 views. I prefer putting my effort into incremental migration of state 
 trackers rather than caching something that by definition doesn't need 
 to be cached.
 
   
 
 The CSO module exists to manage this type of caching on behalf of state
 trackers.  I would have thought that this was a sensible extension of
 the existing purpose of the CSO module.

 Won't all state-trackers implementing APIs which don't expose sampler
 views to the application require essentially the same caching logic, as
 is the case with regular state?  Wouldn't it be least effort to do that
 caching once only in the CSO module?
   
 
   
 OK, I see your point. I will make the necessary changes and ping you 
 when that's done.

   
 
 Keith,

 I changed my mind, went ahead and implemented sampler view caching in 
 mesa state tracker, rather than inside cso context.

 I strongly believe that doing caching on cso side would be slower and 
 more complicated. A state tracker has a better understanding of the 
 relationship between a texture and sampler view. In case of mesa, this 
 is trivial 1-to-1 mapping. Later, when we'll need more sampler views per 
 texture, we can have a per-texture cache for that, and yes, the code for 
 that would be in cso.

 There are two other state trackers that need to be fixed: xorg and vega. 
 The transition should be similar to mesa -- I can help with doing that, 
 but I can't do it myself. Once that's done we can purge one-shot sampler 
 view wrappers.

 What do you think?

   
Keith,

I just finished converting mesa and the auxiliary modules to the new sampler
view interfaces. The remaining bits are the vega and xorg state trackers --
I will need help with them, but they could be fixed after the merge, as
they are not broken; they just set sampler views in a suboptimal fashion.

Please review, thanks.
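
As an illustration of the trivial 1-to-1 texture/view caching discussed in this thread, here is a minimal sketch in C. All sketch_* names are hypothetical stand-ins, not the real Gallium or Mesa structures, and the creation function only simulates the potentially expensive driver work.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct sketch_view;

/* Hypothetical, simplified texture object (not the real st texture). */
struct sketch_texture {
    int format;
    struct sketch_view *cached_view;   /* 1-to-1 cache slot */
};

/* Hypothetical sampler view: texture reference, format, RGBA swizzle. */
struct sketch_view {
    struct sketch_texture *texture;
    int format;
    unsigned char swizzle[4];          /* 0..3 select R, G, B, A */
};

static int sketch_views_created = 0;   /* counts the expensive creations */

/* Stands in for the driver's potentially expensive view creation. */
static struct sketch_view *sketch_create_view(struct sketch_texture *tex)
{
    struct sketch_view *view = malloc(sizeof *view);
    assert(view != NULL);
    view->texture = tex;
    view->format = tex->format;        /* no format override in this sketch */
    for (int i = 0; i < 4; i++)
        view->swizzle[i] = (unsigned char)i;   /* identity swizzle */
    sketch_views_created++;
    return view;
}

/* Return the cached view, creating it on first use only. */
static struct sketch_view *sketch_get_view(struct sketch_texture *tex)
{
    if (tex->cached_view == NULL)
        tex->cached_view = sketch_create_view(tex);
    return tex->cached_view;
}

/* Drop the cached view, e.g. when the texture is destroyed. */
static void sketch_release_view(struct sketch_texture *tex)
{
    free(tex->cached_view);
    tex->cached_view = NULL;
}
```

Binding the same texture on every draw then calls sketch_get_view() repeatedly but pays for creation once, which is the create-once, use-many pattern Keith asked for.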

--
Download Intel® Parallel Studio Eval
Try the new software tools for yourself. Speed compiling, find bugs
proactively, and fine-tune applications for parallel performance.
See why Intel Parallel Studio got high marks during beta.
http://p.sf.net/sfu/intel-sw

Re: [Mesa3d-dev] [RFC] gallium-sampler-view branch merge

2010-03-15 Thread michal

Keith Whitwell wrote on 2010-03-15 15:19:
 On Mon, 2010-03-15 at 07:08 -0700, michal wrote:
   
 michal wrote on 2010-03-12 15:00:
 
 michal wrote on 2010-03-11 17:59:
   
   
 Keith Whitwell wrote on 2010-03-11 16:16:
   
 
 
 On Thu, 2010-03-11 at 06:05 -0800, michal wrote:
   
 
   
   
 Keith Whitwell wrote on 2010-03-11 14:21:
 
   
 
 
 On Thu, 2010-03-11 at 03:16 -0800, michal wrote:
   
   
 
   
   
 Hi,

 I would like to merge the branch in subject this week. This feature 
 branch allows state trackers to bind sampler views instead of textures 
 to shader stages.

 A sampler view object holds a reference to a texture and also 
 overrides 
 internal texture format (resource casting) and specifies RGBA swizzle 
 (needed for GL_EXT_texture_swizzle extension).
 
 
   
 
 
 Michal,

 I've got some issues with the way the sampler views are being generated
 and used inside the CSO module.

 The point of a sampler view is that it gives the driver an opportunity
 to do expensive operations required for special sampling modes (which
 may include copying surface data if hardware is deficient in some way).

 This approach works if a sampler view is created once, then used
 multiple times before being deleted.

 Unfortunately, it seems the changes to support this in the CSO module
 provide only a single-shot usage model.  Sampler views are created in
 cso_set_XXX_sampler_textures, bound to the context, and then
 dereferenced/destroyed on the next bind.

   
   
 
   
   
 The reason CSO code looks like this is because it was meant to be an 
 intermediate step towards migration to the sampler view model. Fully
 converting all existing state trackers is non-trivial and thus I chose
 this conservative approach. State trackers that do not care about the extra
 features a sampler view provides will keep using this one-shot CSO
 interface with the hope that creation of sampler objects is lightweight
 (format matches texture format, swizzle matches native texel layout, 
 etc.). 
 
   
 
 
 On the surface, this hope isn't likely to be fulfilled - lots of
 hardware doesn't support non-zero first_level.  Most cases of drivers
 implementing sampler views internally are to catch this issue.

 Of course, it seems like your branch also leaves the existing
 driver-specific sampler view code in place, so that there are
 potentially two implementations of sampler views in those drivers.  

 I guess this means that you can get away with the current implementation
 for now, but it prevents drivers actually taking advantage of the fact
 that these entities exist in the interface -- they will continue to have
 to duplicate the concept internally until the state trackers and/or CSO
 module start caching views.

   
 
   
   
 Ideally, everybody moves on and we stop using CSO for sampler 
 views. I prefer putting my effort into incremental migration of state 
 trackers rather than caching something that by definition doesn't need 
 to be cached.
 
   
 
 
 The CSO module exists to manage this type of caching on behalf of state
 trackers.  I would have thought that this was a sensible extension of
 the existing purpose of the CSO module.

 Won't all state-trackers implementing APIs which don't expose sampler
 views to the application require essentially the same caching logic, as
 is the case with regular state?  Wouldn't it be least effort to do that
 caching once only in the CSO module?
   
 
   
   
 OK, I see your point. I will make the necessary changes and ping you 
 when that's done.

   
 
 
 Keith,

 I changed my mind, went ahead and implemented sampler view caching in 
 mesa state tracker, rather than inside cso context.

 I strongly believe that doing caching on cso side would be slower and 
 more complicated. A state tracker has a better understanding of the 
 relationship between a texture and sampler view. In case of mesa, this 
 is trivial 1-to-1 mapping. Later, when we'll need more sampler views per 
 texture, we can have a per-texture cache for that, and yes, the code for 
 that would be in cso.

 There are two other state trackers that need to be fixed: xorg and vega. 
 The transition should be similar to mesa -- I can help with doing that, 
 but I can't do it myself. Once that's done we can purge one-shot sampler 
 view wrappers.

 What do you think?

   
   
 Keith,

 I just finished converting mesa and the auxiliary modules to the new sampler
 view interfaces. The remaining bits are the vega and xorg state trackers --
 I will need help with them, but they could be fixed after the merge, as
 they are not broken; they just set sampler views in a suboptimal fashion.

 Please review, thanks.
 


 Michal,

 Did you get a chance to look at the double

Re: [Mesa3d-dev] [RFC] gallium-sampler-view branch merge

2010-03-12 Thread michal

michal wrote on 2010-03-11 17:59:
 Keith Whitwell wrote on 2010-03-11 16:16:
   
 On Thu, 2010-03-11 at 06:05 -0800, michal wrote:
   
 
 Keith Whitwell wrote on 2010-03-11 14:21:
 
   
 On Thu, 2010-03-11 at 03:16 -0800, michal wrote:
   
   
 
 Hi,

 I would like to merge the branch in subject this week. This feature 
 branch allows state trackers to bind sampler views instead of textures 
 to shader stages.

 A sampler view object holds a reference to a texture and also overrides 
 internal texture format (resource casting) and specifies RGBA swizzle 
 (needed for GL_EXT_texture_swizzle extension).
 
 
   
 Michal,

 I've got some issues with the way the sampler views are being generated
 and used inside the CSO module.

 The point of a sampler view is that it gives the driver an opportunity
 to do expensive operations required for special sampling modes (which
 may include copying surface data if hardware is deficient in some way).

 This approach works if a sampler view is created once, then used
 multiple times before being deleted.

 Unfortunately, it seems the changes to support this in the CSO module
 provide only a single-shot usage model.  Sampler views are created in
 cso_set_XXX_sampler_textures, bound to the context, and then
 dereferenced/destroyed on the next bind.

   
   
 
 The reason CSO code looks like this is because it was meant to be an 
 intermediate step towards migration to the sampler view model. Fully
 converting all existing state trackers is non-trivial and thus I chose
 this conservative approach. State trackers that do not care about the extra
 features a sampler view provides will keep using this one-shot CSO
 interface with the hope that creation of sampler objects is lightweight
 (format matches texture format, swizzle matches native texel layout, 
 etc.). 
 
   
 On the surface, this hope isn't likely to be fulfilled - lots of
 hardware doesn't support non-zero first_level.  Most cases of drivers
 implementing sampler views internally are to catch this issue.

 Of course, it seems like your branch also leaves the existing
 driver-specific sampler view code in place, so that there are
 potentially two implementations of sampler views in those drivers.  

 I guess this means that you can get away with the current implementation
 for now, but it prevents drivers actually taking advantage of the fact
 that these entities exist in the interface -- they will continue to have
 to duplicate the concept internally until the state trackers and/or CSO
 module start caching views.

   
 
 Ideally, everybody moves on and we stop using CSO for sampler 
 views. I prefer putting my effort into incremental migration of state 
 trackers rather than caching something that by definition doesn't need 
 to be cached.
 
   
 The CSO module exists to manage this type of caching on behalf of state
 trackers.  I would have thought that this was a sensible extension of
 the existing purpose of the CSO module.

 Won't all state-trackers implementing APIs which don't expose sampler
 views to the application require essentially the same caching logic, as
 is the case with regular state?  Wouldn't it be least effort to do that
 caching once only in the CSO module?
   
 
 OK, I see your point. I will make the necessary changes and ping you 
 when that's done.

   
Keith,

I changed my mind, went ahead and implemented sampler view caching in 
mesa state tracker, rather than inside cso context.

I strongly believe that doing caching on cso side would be slower and 
more complicated. A state tracker has a better understanding of the 
relationship between a texture and sampler view. In case of mesa, this 
is trivial 1-to-1 mapping. Later, when we'll need more sampler views per 
texture, we can have a per-texture cache for that, and yes, the code for 
that would be in cso.

There are two other state trackers that need to be fixed: xorg and vega. 
The transition should be similar to mesa -- I can help with doing that, 
but I can't do it myself. Once that's done we can purge one-shot sampler 
view wrappers.

What do you think?

___
Mesa3d-dev mailing list
Mesa3d-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev

Re: [Mesa3d-dev] Mesa (gallium-sampler-view): st/mesa: Associate a sampler view with an st texture object.

2010-03-12 Thread michal

Keith Whitwell wrote on 2010-03-12 14:46:
 Michal,

 Is the intention to have 1 sampler view active in the Mesa state
 tracker, specifically in the cases where min_lod varies?

 In other words, you seem to have two ways of specifying the same state:

   pipe_sampler_view::first_level

 and

   pipe_sampler::min_lod

 Is there a case to keep both of these?  Or is one enough?

   
It looks like one has to go away, and that would be 
pipe_sampler::min_lod. And we want to have a per-texture cache of 
sampler views in mesa.
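
A per-texture cache with more than one view could look roughly like the sketch below. It assumes, purely for illustration, that views are keyed by first_level alone and that a small fixed number of slots is enough; the sketch_* names are hypothetical and not part of the gallium interface.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_VIEWS 4

/* Hypothetical cached view, identified only by its first mipmap level. */
struct sketch_lod_view {
    unsigned first_level;
    int in_use;
};

/* Hypothetical texture carrying its own small view cache. */
struct sketch_lod_texture {
    struct sketch_lod_view views[SKETCH_MAX_VIEWS];
    unsigned creations;                /* how many views were created */
};

/*
 * Return the view for first_level, creating it in a free slot on a miss.
 * Returns NULL when the cache is full (a real cache would evict instead).
 */
static struct sketch_lod_view *
sketch_view_for_level(struct sketch_lod_texture *tex, unsigned first_level)
{
    struct sketch_lod_view *free_slot = NULL;

    assert(tex != NULL);
    for (int i = 0; i < SKETCH_MAX_VIEWS; i++) {
        struct sketch_lod_view *v = &tex->views[i];
        if (v->in_use && v->first_level == first_level)
            return v;                  /* cache hit */
        if (!v->in_use && free_slot == NULL)
            free_slot = v;             /* remember first empty slot */
    }
    if (free_slot == NULL)
        return NULL;

    free_slot->first_level = first_level;
    free_slot->in_use = 1;
    tex->creations++;
    return free_slot;
}
```

With first_level living only in the view, a change of base level becomes a cache lookup rather than a second piece of sampler state.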


Re: [Mesa3d-dev] [RFC] gallium-sampler-view branch merge

2010-03-11 Thread michal

Keith Whitwell wrote on 2010-03-11 14:21:
 On Thu, 2010-03-11 at 03:16 -0800, michal wrote:
   
 Hi,

 I would like to merge the branch in subject this week. This feature 
 branch allows state trackers to bind sampler views instead of textures 
 to shader stages.

 A sampler view object holds a reference to a texture and also overrides 
 internal texture format (resource casting) and specifies RGBA swizzle 
 (needed for GL_EXT_texture_swizzle extension).
 

 Michal,

 I've got some issues with the way the sampler views are being generated
 and used inside the CSO module.

 The point of a sampler view is that it gives the driver an opportunity
 to do expensive operations required for special sampling modes (which
 may include copying surface data if hardware is deficient in some way).

 This approach works if a sampler view is created once, then used
 multiple times before being deleted.

 Unfortunately, it seems the changes to support this in the CSO module
 provide only a single-shot usage model.  Sampler views are created in
 cso_set_XXX_sampler_textures, bound to the context, and then
 dereferenced/destroyed on the next bind.

   
The reason CSO code looks like this is because it was meant to be an 
intermediate step towards migration to the sampler view model. Fully
converting all existing state trackers is non-trivial and thus I chose
this conservative approach. State trackers that do not care about the extra
features a sampler view provides will keep using this one-shot CSO
interface with the hope that creation of sampler objects is lightweight
(format matches texture format, swizzle matches native texel layout, 
etc.). Ideally, everybody moves on and we stop using CSO for sampler 
views. I prefer putting my effort into incremental migration of state 
trackers rather than caching something that by definition doesn't need 
to be cached.

Thanks for having a look.

 To make this change worthwhile, we'd want to somehow cache sampler views
 and reuse them on multiple draws.  Currently drivers that implement
 views internally hang them off the relevant texture.  

 The choices in this branch are to do it in the CSO module, or push it up
 to the state tracker.

   



Re: [Mesa3d-dev] [RFC] gallium-sampler-view branch merge

2010-03-11 Thread michal

Keith Whitwell wrote on 2010-03-11 16:16:
 On Thu, 2010-03-11 at 06:05 -0800, michal wrote:
   
 Keith Whitwell wrote on 2010-03-11 14:21:
 
 On Thu, 2010-03-11 at 03:16 -0800, michal wrote:
   
   
 Hi,

 I would like to merge the branch in subject this week. This feature 
 branch allows state trackers to bind sampler views instead of textures 
 to shader stages.

 A sampler view object holds a reference to a texture and also overrides 
 internal texture format (resource casting) and specifies RGBA swizzle 
 (needed for GL_EXT_texture_swizzle extension).
 
 
 Michal,

 I've got some issues with the way the sampler views are being generated
 and used inside the CSO module.

 The point of a sampler view is that it gives the driver an opportunity
 to do expensive operations required for special sampling modes (which
 may include copying surface data if hardware is deficient in some way).

 This approach works if a sampler view is created once, then used
 multiple times before being deleted.

 Unfortunately, it seems the changes to support this in the CSO module
 provide only a single-shot usage model.  Sampler views are created in
 cso_set_XXX_sampler_textures, bound to the context, and then
 dereferenced/destroyed on the next bind.

   
   
 The reason CSO code looks like this is because it was meant to be an 
 intermediate step towards migration to the sampler view model. Fully
 converting all existing state trackers is non-trivial and thus I chose
 this conservative approach. State trackers that do not care about the extra
 features a sampler view provides will keep using this one-shot CSO
 interface with the hope that creation of sampler objects is lightweight
 (format matches texture format, swizzle matches native texel layout, 
 etc.). 
 

 On the surface, this hope isn't likely to be fulfilled - lots of
 hardware doesn't support non-zero first_level.  Most cases of drivers
 implementing sampler views internally are to catch this issue.

 Of course, it seems like your branch also leaves the existing
 driver-specific sampler view code in place, so that there are
 potentially two implementations of sampler views in those drivers.  

 I guess this means that you can get away with the current implementation
 for now, but it prevents drivers actually taking advantage of the fact
 that these entities exist in the interface -- they will continue to have
 to duplicate the concept internally until the state trackers and/or CSO
 module start caching views.

   
 Ideally, everybody moves on and we stop using CSO for sampler 
 views. I prefer putting my effort into incremental migration of state 
 trackers rather than caching something that by definition doesn't need 
 to be cached.
 

 The CSO module exists to manage this type of caching on behalf of state
 trackers.  I would have thought that this was a sensible extension of
 the existing purpose of the CSO module.

 Won't all state-trackers implementing APIs which don't expose sampler
 views to the application require essentially the same caching logic, as
 is the case with regular state?  Wouldn't it be least effort to do that
 caching once only in the CSO module?
   
OK, I see your point. I will make the necessary changes and ping you 
when that's done.

Thanks.


Re: [Mesa3d-dev] Gallium questions ...

2010-03-11 Thread michal

Jerome Glisse wrote on 2010-03-11 18:13:
 Hi all,

 I have been a little bit out of the loop on the mesa side, so I now
 have a bunch of questions relating to gallium; apologies if I am asking
 about obvious things.

 First, in the tgsi compiler there is a Dimension field (struct tgsi_dimension)
 that I don't understand. It seems all drivers are ignoring it. From a quick
 glimpse at the tgsi code it's for 2d array addressing, but I think glsl only
 talks about 1d arrays. What are the expectations for this field? Should
 drivers care?

   
It makes sense for geometry shaders, where you use one dimension to 
address input vertex, and another one to index a particular input 
attribute within that vertex.
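
A rough sketch of that two-dimensional addressing in C follows. The flat storage layout, the attribute limit and the sketch_* names are assumptions made for illustration only; they are not how any particular driver or the TGSI interpreter stores inputs.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_ATTRIBS 8           /* assumed per-vertex attribute slots */

struct sketch_vec4 { float x, y, z, w; };

/*
 * Fetch IN[vertex][attrib] from a flat array that stores the attributes
 * of each input vertex contiguously. The first dimension selects the
 * input vertex, the second one selects the attribute within that vertex.
 */
static const struct sketch_vec4 *
sketch_fetch_input(const struct sketch_vec4 *inputs, size_t num_vertices,
                   size_t vertex, size_t attrib)
{
    assert(vertex < num_vertices);
    assert(attrib < SKETCH_MAX_ATTRIBS);
    return &inputs[vertex * SKETCH_MAX_ATTRIBS + attrib];
}
```

For a triangle input primitive, num_vertices would be 3, so attribute 1 of the last vertex is sketch_fetch_input(inputs, 3, 2, 1).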

 What is the indirect boolean for in the src or dst operand of an instruction?
 What is the GLSL equivalent of it?

   
This is used to e.g. address constant registers with a non-constant index.
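
In GLSL terms this roughly corresponds to indexing a uniform array with a value that is not a compile-time constant. A minimal sketch of resolving such an operand, assuming a hypothetical operand struct and a single integer address register (neither is the real TGSI encoding):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical operand: a constant index plus an "indirect" flag. */
struct sketch_operand {
    int index;        /* constant part of the register index */
    int indirect;     /* nonzero: add the address register at run time */
};

/* Read a constant register, applying indirect addressing if requested. */
static float sketch_read_const(const float *consts, size_t num_consts,
                               const struct sketch_operand *op, int addr_reg)
{
    int index = op->index + (op->indirect ? addr_reg : 0);

    assert(index >= 0 && (size_t)index < num_consts);
    return consts[index];
}
```

With the flag clear the operand always reads the same register; with it set, the register read depends on the run-time value of the address register.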

 A more practical question: which gallium branches are likely to be
 merged in the next few weeks? I will likely have the r600g driver in good
 enough shape in the next few weeks to consider merging it with master,
 but I would like to first port it to the latest gallium changes before
 merging it so I don't put the burden on people working on those
 branches.

 What are the plans to expand TGSI to support new shader features? (double
 precision ops, ...
I am planning to add a new set of texture fetch/sampler instructions in 
the immediate future. The gallium-double-opcodes branch has stalled, 
though, so it won't be merged any time soon.


Re: [Mesa3d-dev] PK/UP* and NV_[vertex|fragment]_program* support in Gallium?

2010-03-02 Thread michal

Luca Barbieri wrote on 2010-03-01 18:25:
 I see that PK2US and friends are being removed.
 These would be necessary to implement NV_fragment_program_option,
 NV_fragment_program2 and NV_gpu_program4.

 Currently no drivers (including Nouveau) support them, but since
 we already have some support in Mesa (even parsers for the nVidia
 syntax), it would be nice to support them in Gallium eventually.

 Not sure about STR/SFL though: they can be encoded/decoded as MOV x,
 0/1, but they complete the SETcond instruction set.

 How about keeping them and adding a capability bit for them?
   
I don't know if anybody cares about those NV extensions, and if there's 
somebody eager enough to add support for them to the whole gallium 
stack, nothing stops him/her from re-adding those opcodes.

The point of gallium-no-nvidia-opcodes is to strip down the TGSI instruction
set to what's actually being used by state trackers and implemented by
drivers.


[Mesa3d-dev] [RFC] gallium-no-rhw-position branch merge

2010-03-01 Thread michal

Hi,

This branch removes the bypass_vs_clip_and_viewport flag from the pipe
rasterizer state. The benefits of having this bit around were dubious,
and supporting it was burdensome for driver writers.

All the utility code that relied on this flag has been rewritten to
pass vertex positions in clip space and set clip and viewport state. I
would like to ask the maintainers of the u_blitter module to please test my
changes and provide feedback.
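
To illustrate what passing positions in clip space means for utility code that used to emit window coordinates, here is a minimal sketch. It assumes a viewport that maps clip-space x and y in [-1, 1] onto the whole framebuffer, ignores any y-flip convention, and uses hypothetical sketch_* names that are not taken from the branch.

```c
#include <assert.h>

struct sketch_pos { float x, y, z, w; };

/*
 * Map a position given in pixels within a width x height framebuffer
 * to clip space with w = 1, so that the regular clip and viewport
 * stages bring it back to the same pixel position.
 */
static struct sketch_pos
sketch_window_to_clip(float px, float py, float width, float height)
{
    struct sketch_pos p;

    assert(width > 0.0f && height > 0.0f);
    p.x = px / width * 2.0f - 1.0f;
    p.y = py / height * 2.0f - 1.0f;
    p.z = 0.0f;
    p.w = 1.0f;
    return p;
}
```

The matching viewport state under these assumptions would scale by (width / 2, height / 2) and translate by the same amounts.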

Please review.

Thanks.



Re: [Mesa3d-dev] move normalized texel coordinates bit to sampler view

2010-02-25 Thread michal

Roland Scheidegger wrote on 2010-02-24 15:18:
 On 24.02.2010 12:48, Christoph Bumiller wrote:
   
 This wasn't a problem before because textures and samplers were
 linked 1:1, but in view of the gallium-gpu4-texture-opcodes branch,
 this coordinate normalization bit becomes a problem.

 NV50 hardware has that bit in the RESOURCE binding, and not the
 SAMPLER binding, and you can imagine that this will lead to us having
 to jump through a few annoying looking hoops to accommodate.

 As far as I can see, neither D3D10 nor D3D11 nor OpenGL nor CUDA have
 sampler states that are decoupled from the texture, and which contain
 a normalized coordinates bit, so it's worth considering not having it there
 in gallium either.

 For OpenGL, unnormalized coordinates are only used for RECT textures,
 and in this case it makes sense to make it a property of the texture.
 

 I agree this is not sampler state, but I don't quite agree this should
 be texture state.
 This changes how texture coordinates get interpreted in the interpolator
 - in that sense it is similar to the cylindrical texture coord wrap
 which we moved away from sampler state recently. This one got moved to
 shader declaration. I wonder if the normalization bit should be treated
 the same.
 Though OTOH you're quite right that in OpenGL this really is texture
 property (it is a different texture target after all), and afaik d3d
 doesn't support non-normalized coords (?). Hmm...

   
Isn't it the case that for RECT targets we clear the bit, and for others 
we always set it?

In mesa st I see:

 if (texobj->Target != GL_TEXTURE_RECTANGLE_ARB)
    sampler->normalized_coords = 1;

By definition, a RECT texture with normalised coordinates is just an NPOT
texture. If we removed this apparently redundant flag, would that make
nouveau developers' lives easier?
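
The relationship between the two coordinate conventions can be sketched as below. This only shows the arithmetic, under the assumption that unnormalized coordinates are measured in texels along one dimension; sketch_normalize_coord is a hypothetical helper, not something found in mesa.

```c
#include <assert.h>

/*
 * Convert an unnormalized (texel-space) coordinate, as used with RECT
 * targets, to the normalized [0, 1] range used by the other targets.
 */
static float sketch_normalize_coord(float texel_coord, unsigned size)
{
    assert(size > 0);
    return texel_coord / (float)size;
}
```

For a 64-texel-wide RECT texture, a coordinate of 32.0 addresses the same point as 0.5 on an NPOT 2D texture of the same size.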


   
 And, finally, I've seen you reverted the changes for independent image
 and sampler index in the texture opcodes. What's up with that?
 Is the code not nice enough, or has the idea been discarded and my problem
 disappears?

 

Please consider this branch dead. It will be easier for me to introduce
new, optional sampler and fetch opcodes à la GL 3.0. There's just too
much code to fix and test and we still want the older hardware not to
stand on its head to try and translate back to the old model.

Thanks.


Re: [Mesa3d-dev] move normalized texel coordinates bit to sampler view

2010-02-25 Thread michal

Christoph Bumiller wrote on 2010-02-25 19:39:
 On 25.02.2010 19:00, Brian Paul wrote:
   
 Roland Scheidegger wrote:
   
 
 On 25.02.2010 18:39, michal wrote:
 
   
 Roland Scheidegger wrote on 2010-02-24 15:18:
   
 
 On 24.02.2010 12:48, Christoph Bumiller wrote:
   
 
   
 This wasn't a problem before because textures and samplers were
 linked 1:1, but in view of the gallium-gpu4-texture-opcodes branch,
 this coordinate normalization bit becomes a problem.

 NV50 hardware has that bit in the RESOURCE binding, and not the
 SAMPLER binding, and you can imagine that this will lead to us having
 to jump through a few annoying looking hoops to accommodate.

 As far as I can see, neither D3D10 nor D3D11 nor OpenGL nor CUDA have
 sampler states that are decoupled from the texture, and which contain
 a normalized coordinates bit, so it's worth considering not having it 
 there
 in gallium either.

 For OpenGL, unnormalized coordinates are only used for RECT textures,
 and in this case it makes sense to make it a property of the texture.
 
   
 
 I agree this is not sampler state, but I don't quite agree this should
 be texture state.
 This changes how texture coordinates get interpreted in the interpolator
 - in that sense it is similar to the cylindrical texture coord wrap
 which we moved away from sampler state recently. This one got moved to
 shader declaration. I wonder if the normalization bit should be treated
 the same.
 Though OTOH you're quite right that in OpenGL this really is texture
 property (it is a different texture target after all), and afaik d3d
 doesn't support non-normalized coords (?). Hmm...

   
 
   
 Isn't it the case that for RECT targets we clear the bit, and for others 
 we always set it?

 In mesa st I see:

  if (texobj->Target != GL_TEXTURE_RECTANGLE_ARB)
     sampler->normalized_coords = 1;

 By definition, a RECT texture with normalised coordinates is just an NPOT
 texture. If we removed this apparently redundant flag, would that make
 nouveau developers' lives easier?
   
 
 But we don't have rect targets in gallium hence we need the flag. I
 think conceptually this makes sense since for texture layouts etc.
 drivers won't care one bit if this is 2d npot or rect texture.
 Though I guess introducing rect targets instead would be another option.
 
   
 We should also be thinking about texture array targets.  With a 2D 
 texture array, the S and T coords would be normalized, but not R.

 I think we either need new texture targets for RECT, 1D_ARRAY, 
 2D_ARRAY, etc. or per-dimension normalization flags.  I'm thinking the 
 former may be better (simpler) since textures are created as a 
 particular type and not changed afterward.  We also know the texture 
 type/target when we execute TEX shader instructions.  If it's part of 
 sampler state it gives the impression that it's variable state, but it 
 really isn't.

   
 
 We'd also need a BUFFER target then, they also have scaled
 coordinates.
 The problem, I think, is that this drives gallium a little towards
 catering to specific APIs (OpenGL).

 OpenCL for instance does have a per sampler normalization bit
 iirc, but it seems there's no hardware that reflects this property.

 Then again, TGSI does have a RECT target already, so we might
 as well add corresponding PIPE targets.

 I want to remind again that the normalization bit only becomes
 problematic once samplers and textures can be independently
 combined, and that it seems older hardware can't nicely do this
 anyway, except if they take it upon them to recompile their shaders
 (although I hear some need to do that already ...)

 I admit I'm actually being a bit selfish here, trying to get the interface
 more adapted to nv50, but, if other hardware doesn't have conflicting
 views, why not ? Maybe I should accept nv50 is getting old.
   
Why do you say that? NV50 is a DX10-level card -- it deserves better 
treatment. Your request is valid and we should go and ask gallium 
gatekeepers to get this change pushed.

Thanks.
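
To make the normalization discussion quoted above concrete, here is a minimal sketch of the per-dimension flag idea: each coordinate is either scaled by the texture size (normalized) or used as-is (RECT targets, array layer index). The function name and the flag tuple are hypothetical illustrations, not part of any Gallium interface.

```python
def to_texel_coords(coords, size, normalized):
    """Convert texture coordinates to unnormalized texel space.

    coords     -- per-dimension coordinates, e.g. (s, t, r)
    size       -- per-dimension texture size in texels (or layers)
    normalized -- per-dimension flags; True means the coordinate is in [0, 1]
    All three names are hypothetical, chosen only for this illustration.
    """
    return tuple(c * n if norm else c
                 for c, n, norm in zip(coords, size, normalized))


# 2D NPOT texture: both coordinates are normalized.
print(to_texel_coords((0.5, 0.25), (640, 480), (True, True)))
# RECT texture: neither coordinate is normalized.
print(to_texel_coords((320.0, 120.0), (640, 480), (False, False)))
# 2D array texture: S and T are normalized, R is a layer index.
print(to_texel_coords((0.5, 0.25, 3.0), (640, 480, 8), (True, True, False)))
```

All three calls address the same texel, which is the sense in which a RECT texture with normalized coordinates behaves like a 2D NPOT texture.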

--
Download Intel® Parallel Studio Eval
Try the new software tools for yourself. Speed compiling, find bugs
proactively, and fine-tune applications for parallel performance.
See why Intel Parallel Studio got high marks during beta.
http://p.sf.net/sfu/intel-sw-dev
___
Mesa3d-dev mailing list
Mesa3d-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev

Re: [Mesa3d-dev] Mesa (master): util: Fix descriptors for R32_FLOAT and R32G32_FLOAT formats .

2010-02-15 Thread michal

Roland Scheidegger wrote on 2010-02-12 20:55:
On 12.02.2010 20:20, Corbin Simpson wrote:

On Fri, Feb 12, 2010 at 10:49 AM, Brian Paul bri...@vmware.com wrote:

Roland Scheidegger wrote:

On 12.02.2010 19:00, Keith Whitwell wrote:

On Fri, 2010-02-12 at 09:56 -0800, Roland Scheidegger wrote:

On 12.02.2010 18:42, Keith Whitwell wrote:

On Fri, 2010-02-12 at 09:28 -0800, José Fonseca wrote:

On Fri, 2010-02-12 at 06:43 -0800, Roland Scheidegger wrote:

On 12.02.2010 14:44, michal wrote:

Keith Whitwell wrote on 2010-02-12 14:28:

On Fri, 2010-02-12 at 05:09 -0800, michal wrote:

Keith Whitwell wrote on 2010-02-12 13:39:

On Fri, 2010-02-12 at 04:32 -0800, Michał Król wrote:

Module: Mesa
Branch: master
Commit: aa0b671422880b99dc178d43d1e4e1a3f766bf7f
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=aa0b671422880b99dc178d43d1e4e1a3f766bf7f

Author: Michal Krol mic...@vmware.com
Date: Fri Feb 12 13:32:35 2010 +0100

util: Fix descriptors for R32_FLOAT and R32G32_FLOAT formats.

Michal,

Is this more like two different users expecting two different results in
those unused columns?

In particular, we definitely require the missing elements to be extended
to (0,0,0,1) when fetching vertex data, and probably also in OpenGL
texture sampling (if we supported these formats for that).

Gallium should follow D3D rules, so I've been following D3D here. Also,
util_unpack_color_ub() in u_pack_color.h already sets the remaining
fields to 0xff.

Note that D3D doesn't have the problem with expanding vertex attribute
data since you can't have X or XY vertex positions, only XYZ (with W
extended to 1 as in GL) and XYZW.

But surely D3D permits two-component texture coordinates, which would be
PIPE_FORMAT_R32G32_FLOAT, and expanded as (r,g,0,1)...

Brian added a table of differences between GL and other APIs recently to
gallium/docs - does your change agree with that?

Where's that exactly, I can't find it?

It seems like we'd want to be able to support both usages - the
alternative in texture sampling would be forcing the state tracker to
generate variants of the shader when 2-component textures are bound. I
would say that's an unreasonable requirement on the state tracker.

It seems like GL would want (0,0,0,1) expansion everywhere, but D3D
would want differing expansions in different parts of the pipeline.
That indicates a single flag in the context somewhere isn't sufficient
to choose between the two.

Maybe there need to be two versions of these PIPE_FORMAT_ enums to
capture the different values in the missing components?

EG:

PIPE_FORMAT_R32G32_0001_FLOAT
PIPE_FORMAT_R32G32__FLOAT

? or something along those lines??

You are right.

Alternatively, follow the more sane API (GL apparently), assume 0001 as
default and use the infix to override.

Note it's not just GL. D3D10 uses same expansion. Only D3D9 is
different. Well for texture sampling anyway, I don't know what d3d does
for vertex formats.

Though for most hardware it would make sense to have only one format per
different expansion, and use some swizzling parameter for sampling,
because that's actually how the hardware works. But not all drivers will
be able to do this, unfortunately.

You mean, having a swizzle in pipe_sampler_state ?

It sounds a good idea.

In the worst case some component will inevitably need to make shader
variants with different swizzles. In this case it probably makes sense
for it to be the pipe driver -- it's a tiny shader variation which could be
done without recompiling the whole shader, but if the state tracker does
it then the pipe driver will always have to recompile.

In the best case it is handled by the hardware's texture sampling unit.

It's in theory similar to baking the swizzle in the format as Keith
suggested, but cleaner IMHO. The question is whether it makes sense to
have full xyzw01 swizzles, or just 01 swizzles.
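
As an illustration of the sampler swizzle idea discussed above, the following sketch applies a four-character swizzle to a fetched texel. Each selector is one of x, y, z, w (pick that component) or 0 / 1 (substitute a constant). The function and the string encoding are hypothetical, chosen only to show the selection logic; they do not reflect any actual pipe_sampler_state layout.

```python
def apply_swizzle(texel, swizzle):
    """Select four output components from a fetched texel.

    texel   -- 4-tuple as fetched from memory (unused channels hold
               whatever the fetch produced)
    swizzle -- 4-character string of x, y, z, w, 0 or 1 (hypothetical
               encoding used only for this sketch)
    """
    source = {"x": 0, "y": 1, "z": 2, "w": 3}
    out = []
    for sel in swizzle:
        if sel == "0":
            out.append(0.0)
        elif sel == "1":
            out.append(1.0)
        else:
            out.append(texel[source[sel]])
    return tuple(out)


raw = (0.25, 0.5, 0.0, 0.0)        # two-component texel, z/w undefined
print(apply_swizzle(raw, "xy01"))  # (r, g, 0, 1) style expansion
print(apply_swizzle(raw, "xy11"))  # (r, g, 1, 1) style expansion
```

A "just 01" swizzle would only allow replacing missing channels with constants, while the full form also allows reordering, for example "xxx1".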

Another alternative is to just add the behaviour we really need - a
single flag at context creation time that says what the behaviour of the
sampler should be for these textures.

Then the driver wouldn't have to worry about variants or mixing two
different expansions. Hardware (i965 at least) seems to have one global
mode to switch between these, and that's all we need to choose the right
behaviour for each state tracker.

It might be simpler all round just to specify it at context creation.

Yes, for rg01 vs rg11

Re: [Mesa3d-dev] Mesa (master): util: Fix descriptors for R32_FLOAT and R32G32_FLOAT formats .

2010-02-12 Thread michal

Keith Whitwell wrote on 2010-02-12 13:39:
 On Fri, 2010-02-12 at 04:32 -0800, Michał Król wrote:
   
 Module: Mesa
 Branch: master
 Commit: aa0b671422880b99dc178d43d1e4e1a3f766bf7f
 URL:
 http://cgit.freedesktop.org/mesa/mesa/commit/?id=aa0b671422880b99dc178d43d1e4e1a3f766bf7f

 Author: Michal Krol mic...@vmware.com
 Date:   Fri Feb 12 13:32:35 2010 +0100

 util: Fix descriptors for R32_FLOAT and R32G32_FLOAT formats.
 

 Michal,

 Is this more like two different users expecting two different results in
 those unused columns?

 In particular, we definitely require the missing elements to be extended
 to (0,0,0,1) when fetching vertex data, and probably also in OpenGL
 texture sampling (if we supported these formats for that).  

   
Gallium should follow D3D rules, so I've been following D3D here. Also, 
util_unpack_color_ub() in u_pack_color.h already sets the remaining 
fields to 0xff.

Note that D3D doesn't have the problem with expanding vertex attribute 
data since you can't have X or XY vertex positions, only XYZ (with W 
extended to 1 as in GL) and XYZW.

 Brian added a table of differences between GL and other APIs recently to
 gallium/docs - does your change agree with that?

   
Where's that exactly, I can't find it?


Re: [Mesa3d-dev] Mesa (master): util: Fix descriptors for R32_FLOAT and R32G32_FLOAT formats .

2010-02-12 Thread michal

Keith Whitwell wrote on 2010-02-12 14:28:
On Fri, 2010-02-12 at 05:09 -0800, michal wrote:

Keith Whitwell wrote on 2010-02-12 13:39:

On Fri, 2010-02-12 at 04:32 -0800, Michał Król wrote:

Module: Mesa
Branch: master
Commit: aa0b671422880b99dc178d43d1e4e1a3f766bf7f
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=aa0b671422880b99dc178d43d1e4e1a3f766bf7f

Author: Michal Krol mic...@vmware.com
Date: Fri Feb 12 13:32:35 2010 +0100

util: Fix descriptors for R32_FLOAT and R32G32_FLOAT formats.

Michal,

Is this more like two different users expecting two different results in
those unused columns?

In particular, we definitely require the missing elements to be extended
to (0,0,0,1) when fetching vertex data, and probably also in OpenGL
texture sampling (if we supported these formats for that).

Gallium should follow D3D rules, so I've been following D3D here. Also,
util_unpack_color_ub() in u_pack_color.h already sets the remaining
fields to 0xff.

Note that D3D doesn't have the problem with expanding vertex attribute
data since you can't have X or XY vertex positions, only XYZ (with W
extended to 1 as in GL) and XYZW.

But surely D3D permits two-component texture coordinates, which would be
PIPE_FORMAT_R32G32_FLOAT, and expanded as (r,g,0,1)...

Brian added a table of differences between GL and other APIs recently to
gallium/docs - does your change agree with that?

Where's that exactly, I can't find it?

It seems like we'd want to be able to support both usages - the
alternative in texture sampling would be forcing the state tracker to
generate variants of the shader when 2-component textures are bound. I
would say that's an unreasonable requirement on the state tracker.

It seems like GL would want (0,0,0,1) expansion everywhere, but D3D
would want differing expansions in different parts of the pipeline.
That indicates a single flag in the context somewhere isn't sufficient
to choose between the two.

Maybe there need to be two versions of these PIPE_FORMAT_ enums to
capture the different values in the missing components?

EG:

PIPE_FORMAT_R32G32_0001_FLOAT
PIPE_FORMAT_R32G32__FLOAT

? or something along those lines??

You are right.

Alternatively, follow the more sane API (GL apparently), assume 0001 as
default and use the infix to override.
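
A minimal sketch of that last suggestion, assuming a four-character fill pattern that plays the role of the format-name infix: missing components are taken from the pattern, which defaults to 0001. The function name and the pattern encoding are hypothetical and only illustrate the expansion rule.

```python
def expand_components(components, fill="0001"):
    """Expand a 1..4 component value to four components.

    components -- the components actually stored in the format
    fill       -- per-position constants for missing components; the
                  default "0001" is the GL-style expansion (hypothetical
                  stand-in for the format-name infix)
    """
    values = list(components)
    for i in range(len(values), 4):
        values.append(float(fill[i]))
    return tuple(values)


print(expand_components((0.25, 0.5)))          # default 0001 expansion
print(expand_components((0.25, 0.5), "1111"))  # override: (r, g, 1, 1)
print(expand_components((0.75,)))              # one-component format
```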


[Mesa3d-dev] Gallium query types

2010-02-11 Thread michal

Hi,

I can't find any information regarding two Gallium query types. No 
documentation, no source code.

#define PIPE_QUERY_PRIMITIVES_GENERATED  1
#define PIPE_QUERY_PRIMITIVES_EMITTED    2

Do they have something to do with the NV_transform_feedback extension? If 
not, do they mean the number of primitives before clipping and after 
clipping, respectively?

Thanks.



Re: [Mesa3d-dev] [RFC] gallium-cylindrical-wrap branch

2010-02-05 Thread michal

Brian Paul wrote on 2010-02-04 22:07:
 michal wrote:
   
 Brian Paul wrote on 2010-02-03 17:58:
 
 Keith Whitwell wrote:
   
   
 Michal,

 why do you need this for linear interpolator and not perspective? I
 think d3d mobile let you disable perspective correct texturing, but it
 is always enabled for normal d3d.
   
 
 
 I could not think of a use case that uses perspective and cylindrical 
 interpolation at the same time. If you think it's valid, we can 
 implement cylindrical wrapping for perspective interpolator, but then I 
 am not sure how exactly it should be done, i.e. should we divide and 
 then wrap or the opposite?
   
   
 Is there some way we can figure out what DX9 does here?  Maybe a quick
 test?
 
 
 I suspect cylindrical wrapping would be done after the divide.

   
   
 A quick test shows it is legal to have perspective and cylindrical 
 interpolation. In fact, I see no difference between projected and 
 non-projected version with REF device -- both are perspective correct.

 I think I am stuck at this point and need further help. I am trying to 
 modify tri_persp_coeff() in softpipe in a similar manner to 
 tri_linear_coeff(), but all I get are lousy rendering artifacts. If we 
 need to do cylindrical wrapping after the divide, it must be done as part of 
 shader interpolator, but the only place where we have enough information 
 to do wrapping is in primitive setup.
 

 Do you have a patch relative to gallium-cylindrical-wrap?  I'll take a 
 look.

   
Brian,

I have no half-working patch for you, sorry. I tried a few approaches, 
but they were nonsensical.

The linear coeff calculation is simple: calculate distance between two 
coordinates, and if it's greater than 0.5, apply wrapping by adjusting 
the distance.
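
The linear case described above can be sketched as follows, assuming texture coordinates with a period of 1.0. The function name is hypothetical and this is not the softpipe code, only the distance adjustment it relies on.

```python
def cylwrap_delta(a, b):
    """Return b - a measured the short way around a period of 1.0.

    If the plain difference exceeds half a period, the interpolation is
    assumed to cross the seam, so the distance is adjusted by one period.
    """
    delta = b - a
    if delta > 0.5:
        delta -= 1.0
    elif delta < -0.5:
        delta += 1.0
    return delta


# Crossing the seam: 0.9 -> 0.1 is a step of about +0.2, not -0.8.
print(cylwrap_delta(0.9, 0.1))
# No wrap needed when the coordinates are close together.
print(cylwrap_delta(0.2, 0.4))
```

The interpolation coefficients are then built from the adjusted distance instead of the raw difference.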

However, for the perspective correct coeffs, we divide early by 
position.w before calculating the distance, and so my approach that 
worked for linear fails here. I am either not comprehending the math 
here (why do we divide the second time in interpolator, for instance?) 
or we need to put more information into struct tgsi_interp_coef so that 
the interpolator code has enough information to do wrapping on its own 
without help of primitive setup.
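
On the question of the second divide: in the textbook formulation of perspective-correct interpolation, the attribute is divided by w per vertex, a/w and 1/w are interpolated linearly in screen space, and the result is divided again per fragment. The sketch below shows that standard technique between two vertices; it is an assumption that softpipe follows the same scheme, and the function name is hypothetical.

```python
def persp_interp(a0, w0, a1, w1, t):
    """Perspective-correct interpolation of an attribute between two vertices.

    a0, a1 -- attribute values at the vertices
    w0, w1 -- clip-space w at the vertices
    t      -- screen-space interpolation factor in [0, 1]
    """
    # First divide: per vertex, done once when coefficients are set up.
    num = (1.0 - t) * (a0 / w0) + t * (a1 / w1)
    den = (1.0 - t) * (1.0 / w0) + t * (1.0 / w1)
    # Second divide: per fragment, undoes the 1/w weighting.
    return num / den


# With equal w the result is plain linear interpolation.
print(persp_interp(0.0, 1.0, 1.0, 1.0, 0.5))
# With differing w the screen-space midpoint is biased toward the near vertex.
print(persp_interp(0.0, 1.0, 1.0, 3.0, 0.5))
```

Because the coefficients are built from a/w rather than a, a wrap test that compares raw coordinate distances no longer applies directly, which matches the difficulty described above.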


Re: [Mesa3d-dev] [RFC] gallium-cylindrical-wrap branch

2010-02-05 Thread michal

michal wrote on 2010-02-05 11:05:
 Brian Paul wrote on 2010-02-04 22:07:
   
 michal wrote:
   
 
 Brian Paul wrote on 2010-02-03 17:58:
 
   
 Keith Whitwell wrote:
   
   
 
 Michal,

 why do you need this for linear interpolator and not perspective? I
 think d3d mobile let you disable perspective correct texturing, but it
 is always enabled for normal d3d.
   
 
 
   
 I could not think of a use case that uses perspective and cylindrical 
 interpolation at the same time. If you think it's valid, we can 
 implement cylindrical wrapping for perspective interpolator, but then I 
 am not sure how exactly it should be done, i.e. should we divide and 
 then wrap or the opposite?
   
   
 
 Is there some way we can figure out what DX9 does here?  Maybe a quick
 test?
 
 
   
 I suspect cylindrical wrapping would be done after the divide.

   
   
 
 A quick test shows it is legal to have perspective and cylindrical 
 interpolation. In fact, I see no difference between projected and 
 non-projected version with REF device -- both are perspective correct.

 I think I am stuck at this point and need further help. I am trying to 
 modify tri_persp_coeff() in softpipe in a similar manner to 
 tri_linear_coeff(), but all I get are lousy rendering artifacts. If we 
 need to do cylindrical wrapping after the divide, it must be done as part of 
 shader interpolator, but the only place where we have enough information 
 to do wrapping is in primitive setup.
 
   
 Do you have a patch relative to gallium-cylindrical-wrap?  I'll take a 
 look.

   
 
 Brian,

 I have no half-working patch for you, sorry. I tried a few approaches, 
 but they were nonsensical.

 The linear coeff calculation is simple: calculate distance between two 
 coordinates, and if it's greater than 0.5, apply wrapping by adjusting 
 the distance.

 However, for the perspective correct coeffs, we divide early by 
 position.w before calculating the distance, and so my approach that 
 worked for linear fails here. I am either not comprehending the math 
 here (why do we divide the second time in interpolator, for instance?) 
 or we need to put more information into struct tgsi_interp_coef so that 
 the interpolator code has enough information to do wrapping on its own 
 without help of primitive setup.
   

OK, I managed to correctly implement cylindrical wrap in softpipe both 
for linear and perspective interpolation, both for lines and triangles.

Tested with Brian's cylwrap test app -- it works.

Please re-review. Thanks.


Re: [Mesa3d-dev] [RFC] gallium-cylindrical-wrap branch

2010-02-04 Thread michal

Brian Paul wrote on 2010-02-03 17:58:
 Keith Whitwell wrote:
   
 Michal,

 why do you need this for linear interpolator and not perspective? I
 think d3d mobile let you disable perspective correct texturing, but it
 is always enabled for normal d3d.
   
 
 I could not think of a use case that uses perspective and cylindrical 
 interpolation at the same time. If you think it's valid, we can 
 implement cylindrical wrapping for perspective interpolator, but then I 
 am not sure how exactly it should be done, i.e. should we divide and 
 then wrap or the opposite?
   
 Is there some way we can figure out what DX9 does here?  Maybe a quick
 test?
 

 I suspect cylindrical wrapping would be done after the divide.

   
A quick test shows it is legal to have perspective and cylindrical 
interpolation. In fact, I see no difference between projected and 
non-projected version with REF device -- both are perspective correct.

I think I am stuck at this point and need further help. I am trying to 
modify tri_persp_coeff() in softpipe in a similar manner to 
tri_linear_coeff(), but all I get are lousy rendering artifacts. If we 
need to do cylindrical wrapping after the divide, it must be done as part of 
shader interpolator, but the only place where we have enough information 
to do wrapping is in primitive setup.


[Mesa3d-dev] [RFC] gallium-cylindrical-wrap branch

2010-02-03 Thread michal

Keith,

This feature branch adds cylindrical wrap texcoord mode to gallium 
shader tokens and removes prefilter field from sampler state. 
Implemented cylindrical wrapping for linear interpolator in softpipe. 
Not sure whether it makes sense to do it for perspective interpolator. 
Documented TGSI declaration token.

Sample fragment shader declaration that wraps S and T coordinates follows.

DCL INPUT[0], GENERIC[0], LINEAR, CYLWRAP_XY

Please review so I can merge it to master.

Thanks.


Re: [Mesa3d-dev] light_twoside RE: [PATCH] glsl: put varyings in texcoord slots

2010-02-02 Thread michal

Luca Barbieri wrote on 2010-02-01 21:42:

 1. All the semantic indices in OpenGL are limited, according to the
 ARB specification
 2. All the semantic indices in DirectX 9/10 are limited, according to
 http://msdn.microsoft.com/en-us/library/ee418355%28VS.85%29.aspx

At least for SM3.0, one can specify a vertex shader output semantic like 
COLOR15 and have it running as long as one also has a pixel shader with 
a matching input semantic. Though I agree with you that we don't really 
want to go this route and would rather have something more sensible.

We could, for example, limit COLOR and BCOLOR indices to [0, 1], remove 
FOG and NORMAL names, and have a well-defined limit on GENERIC index 
value. After all, we only need non-generic semantics to communicate with 
the fixed-function part of the pipeline, that is rasteriser.

name        index range

POSITION    no limit?
COLOR       0..1, explicit clamp?
BCOLOR      0..1, explicit clamp?
FOG         remove?
PSIZE       0
GENERIC     0..max generics
NORMAL      remove
FACE        0
EDGEFLAG    0
PRIMID      0
INSTANCEID  0
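
The proposed limits could be checked with something like the sketch below. The table mirrors the proposal above; MAX_GENERICS is a hypothetical value (the proposal leaves it open), treating "0..max generics" as MAX_GENERICS distinct indices is an assumption, and FOG and NORMAL are treated as removed.

```python
MAX_GENERICS = 32  # hypothetical limit; the proposal does not fix a number

# Highest allowed index per semantic name; None means "no limit".
SEMANTIC_MAX_INDEX = {
    "POSITION": None,
    "COLOR": 1,
    "BCOLOR": 1,
    "PSIZE": 0,
    "GENERIC": MAX_GENERICS - 1,  # assumption: MAX_GENERICS distinct slots
    "FACE": 0,
    "EDGEFLAG": 0,
    "PRIMID": 0,
    "INSTANCEID": 0,
}


def semantic_index_valid(name, index):
    """Return True if (name, index) fits the proposed index ranges."""
    if name not in SEMANTIC_MAX_INDEX:
        return False  # removed or unknown semantic, e.g. FOG or NORMAL
    limit = SEMANTIC_MAX_INDEX[name]
    return index >= 0 and (limit is None or index <= limit)


print(semantic_index_valid("COLOR", 1))     # within 0..1
print(semantic_index_valid("COLOR", 15))    # the SM3.0-style COLOR15 case
print(semantic_index_valid("GENERIC", 15))  # generic slots carry the rest
```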


As for the routing table thing, I am not really convinced. The GLSL 
mechanism to link shaders based on varying names is GL-specific and thus 
should stay inside Mesa state tracker. In fact, D3D10 runtime is doing 
exactly the same thing and generating shader variants on the fly as they 
are mixed and matched by the application.



Re: [Mesa3d-dev] [RFC] gallium-multiple-constant-buffers merge

2010-01-25 Thread michal

Brian Paul wrote on 2010-01-22 17:56:
 michal wrote:
   
 Brian Paul wrote on 2010-01-21 21:57:
 
 michal wrote:
   
   
 Hi,

 This simple feature branch adds support for two-dimensional constant 
 buffers in TGSI.

 An example shader would look like this:

 FRAG

 DCL IN[0], COLOR, LINEAR
 DCL OUT[0], COLOR
 DCL CONST[1][1..2]

 MAD OUT[0], IN[0], CONST[1][2], CONST[1][1]

 END

 For this to work, one needs to bind a buffer to slot nr 1 containing at 
 least 3 vectors.
 
 
 Just a terminology thing: this feature really implements arrays of 
 constant buffers, not really two-dimensional buffers, right?

   
   
 That's correct -- the access to constbuf data is two-dimensional, but 
 the constbufs themselves are an array of differently-sized constant buffers.

 
 In p_state.h we should probably rename PIPE_MAX_CONSTANT to 
 PIPE_MAX_CONSTANT_BUFFERS to be clearer.

 Don't we need a new PIPE_CAP_MAX_CONSTANT_BUFFERS query?  Maybe even a 
 query per shader stage?

   
   
 What about the maximum size of a single constant buffer? I would think this 
 is a more critical parameter than the number of constbuf slots the driver 
 supports.
 

 Yeah, I thought we already had a query for that, but we don't.

 I'd suggest:

 PIPE_CAP_MAX_CONST_BUFFERS
 PIPE_CAP_MAX_CONST_BUFFER_SIZE  (in bytes)

   

All,

Thanks for your comments. I have committed my changes to the branch and 
am awaiting more comments.


[Mesa3d-dev] TGSI build and sse2 removal

2010-01-25 Thread michal

I would like to have those two modules go away, as they are maintenance 
pain with no real benefits.

The build module has been superseded by the ureg module, and apparently 
all third-party code has already migrated or is in the process of 
porting to new interface. I would like to nuke it if nobody minds.

For sse2, I am looking at simplifying it enough to be able to accelerate 
pass-thru fragment shaders and simple vertex shaders. That's it. For 
more sophisticated stuff we already have llvmpipe.

If nobody objects, I am going to start the rework next week.

Thanks.


Re: [Mesa3d-dev] TGSI build and sse2 removal

2010-01-25 Thread michal

Brian Paul wrote on 2010-01-25 16:09:
 José Fonseca wrote:
   
 Michal,

 On Mon, 2010-01-25 at 06:27 -0800, michal wrote:
 
 I would like to have those two modules go away, as they are maintenance 
 pain with no real benefits.

 The build module has been superseded by the ureg module, and apparently 
 all third-party code has already migrated or is in the process of 
 porting to new interface. I would like to nuke it if nobody minds.
   
 I'm fine with this.
 

 We can't remove this until we switch to ureg in the draw code.  The 
 draw_pipe_pstipple.c, draw_pipe_aaline.c and draw_pipe_aapoint.c files 
 still haven't been converted to use tgsi_ureg.  There may be some 
 other uses elsewhere.  Michal, can you update that code first?


   
That's the plan.

 For sse2, I am looking at simplifying it enough to be able to accelerate 
 pass-thru fragment shaders and simple vertex shaders. That's it. For 
 more sophisticated stuff we already have llvmpipe.
   
 I agree with this in principle, but I think it's better not to get too
 much ahead of ourselves here: drivers are using tgsi_exec/sse2 for
 software vertex processing fallbacks. And while the plan is indeed to
 move the LLVM JIT code generation out of llvmpipe into the auxiliary
 modules so that all pipe drivers can use that for fallbacks, the fact is
 we're not there yet.

 So for tgsi_sse2 I think it's better not to introduce any performance
 regressions in vertex processing until llvm code generation is in place
 and working for everybody.
 

 I agree.  It's too early to remove the sse2 code.

   
OK, that makes sense, I will leave it alone for the time being.

Thanks, guys.


Re: [Mesa3d-dev] [RFC] gallium-multiple-constant-buffers merge

2010-01-22 Thread michal

Brian Paul wrote on 2010-01-21 21:57:
 michal wrote:
   
 Hi,

 This simple feature branch adds support for two-dimensional constant 
 buffers in TGSI.

 An example shader would look like this:

 FRAG

 DCL IN[0], COLOR, LINEAR
 DCL OUT[0], COLOR
 DCL CONST[1][1..2]

 MAD OUT[0], IN[0], CONST[1][2], CONST[1][1]

 END

 For this to work, one needs to bind a buffer to slot nr 1 containing at 
 least 3 vectors.
 


 Just a terminology thing: this feature really implements arrays of 
 constant buffers, not really two-dimensional buffers, right?

   
That's correct -- the access to constbuf data is two-dimensional, but 
the constbufs themselves are an array of differently-sized constant buffers.

 In p_state.h we should probably rename PIPE_MAX_CONSTANT to 
 PIPE_MAX_CONSTANT_BUFFERS to be clearer.

 Don't we need a new PIPE_CAP_MAX_CONSTANT_BUFFERS query?  Maybe even a 
 query per shader stage?

   
What about the maximum size of a single constant buffer? I would think this 
is a more critical parameter than the number of constbuf slots the driver 
supports.


Re: [Mesa3d-dev] python and constant buffers

2010-01-21 Thread michal

José Fonseca wrote on 2010-01-20 14:32:
 On Wed, 2010-01-20 at 03:55 -0800, michal wrote:
   
 Jose,

 How can one upload data to buffers in the python state tracker?

 I am trying to do the following:

 +cb0_data = [
 +0.0, 0.0, 0.0, 0.0,
 +0.0, 0.0, 0.0, 1.0,
 +1.0, 1.0, 1.0, 1.0,
 +2.0, 4.0, 8.0, 1.0,
 +]
 +
 +constbuf0 = dev.buffer_create(
 +16,
 +PIPE_BUFFER_USAGE_CONSTANT |
 +  PIPE_BUFFER_USAGE_GPU_READ |
 +  PIPE_BUFFER_USAGE_GPU_WRITE |
 +  PIPE_BUFFER_USAGE_CPU_READ |
 +  PIPE_BUFFER_USAGE_CPU_WRITE,
 +4 * 4 * 4)
 +
 +constbuf0.write_(cb0_data, 4 * 4 * 4)

 But I can't find a way to convert a list of floats to (char *). Do you 
 have an idea how to do it?
 

 Hi Michal,

 The Gallium - Python bindings are autogenerated by SWIG and there are
 several things which are not very pythonic. Writing data into/out of
 the buffers is one of them.

 ATM the only way to do this is using the python struct module, and pack
 the floats into a string... That is:

   import struct
   data = ''
   data += struct.pack('4f', 1.0, 2.0, 3.0, 4.0)
   ...

   
That's perfect. Thanks, Jose.
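
Applied to the data from the original question, the struct approach could look like the sketch below. It only builds the byte string; passing it on to the buffer write call is left out because the exact binding signature is not shown in this thread.

```python
import struct

# The 4x4 float constants from the original question.
cb0_data = [
    0.0, 0.0, 0.0, 0.0,
    0.0, 0.0, 0.0, 1.0,
    1.0, 1.0, 1.0, 1.0,
    2.0, 4.0, 8.0, 1.0,
]

# Pack all floats in one call: "16f" means sixteen native 32-bit floats.
data = struct.pack("%df" % len(cb0_data), *cb0_data)

# 16 floats * 4 bytes each matches the 4 * 4 * 4 size used for the buffer.
print(len(data))
```

On the Python 2 interpreters of that time the result is a str; on Python 3 it is a bytes object.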


[Mesa3d-dev] [RFC] gallium-multiple-constant-buffers merge

2010-01-21 Thread michal

Hi,

This simple feature branch adds support for two-dimensional constant 
buffers in TGSI.

An example shader would look like this:

FRAG

DCL IN[0], COLOR, LINEAR
DCL OUT[0], COLOR
DCL CONST[1][1..2]

MAD OUT[0], IN[0], CONST[1][2], CONST[1][1]

END

For this to work, one needs to bind a buffer to slot nr 1 containing at 
least 3 vectors.
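
To show what the two-dimensional addressing means, here is a small emulation of the example shader in plain Python. The first index selects the constant buffer slot and the second selects a vector inside it; the buffer contents are made-up values and nothing here is Gallium code.

```python
def mad(a, b, c):
    """Component-wise multiply-add, like the TGSI MAD opcode: a * b + c."""
    return tuple(x * y + z for x, y, z in zip(a, b, c))


# Constant buffers indexed as const[slot][vector]. Slot 0 is left empty.
# Slot 1 needs at least 3 vectors because the shader reads index 2.
const = [
    [],
    [
        (0.0, 0.0, 0.0, 0.0),     # CONST[1][0], unused by the shader
        (0.25, 0.25, 0.25, 0.0),  # CONST[1][1], the addend
        (0.5, 0.5, 0.5, 1.0),     # CONST[1][2], the scale
    ],
]

in0 = (1.0, 0.5, 0.25, 1.0)       # interpolated input colour IN[0]

# MAD OUT[0], IN[0], CONST[1][2], CONST[1][1]
out0 = mad(in0, const[1][2], const[1][1])
print(out0)
```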


[Mesa3d-dev] python and constant buffers

2010-01-20 Thread michal

Jose,

How can one upload data to buffers in the python state tracker?

I am trying to do the following:

+cb0_data = [
+0.0, 0.0, 0.0, 0.0,
+0.0, 0.0, 0.0, 1.0,
+1.0, 1.0, 1.0, 1.0,
+2.0, 4.0, 8.0, 1.0,
+]
+
+constbuf0 = dev.buffer_create(
+16,
+PIPE_BUFFER_USAGE_CONSTANT |
+  PIPE_BUFFER_USAGE_GPU_READ |
+  PIPE_BUFFER_USAGE_GPU_WRITE |
+  PIPE_BUFFER_USAGE_CPU_READ |
+  PIPE_BUFFER_USAGE_CPU_WRITE,
+4 * 4 * 4)
+
+constbuf0.write_(cb0_data, 4 * 4 * 4)

But I can't find a way to convert a list of floats to (char *). Do you 
have an idea how to do it?


Re: [Mesa3d-dev] [PATCH] Implement double opcodes: ddiv, dmul, dmax, dmin, dslt, dsge, dseq, drcp, dqsrt and dmad

2010-01-19 Thread michal

Igor Oliveira wrote on 2010-01-18 19:55:
 The patches implement gallium opcodes ddiv, dmul, dmax, dmin, dslt,
 dsge, dseq, drcp, dqsrt and dmad and add tests for them.
 They apply to the gallium-double-opcode branch.
 In the next patches I will add documentation and the missing double
 opcode implementations, such as dfrac, dldexp and dfracexp.

   

Excellent, committed with cosmetic changes.

Thanks!


Re: [Mesa3d-dev] [PATCH] add dfrac, dfracexp, dldexp opcodes to gallium

2010-01-19 Thread michal

Igor Oliveira wrote on 2010-01-20 00:37:
 Hi,

 These patches add support for the dfrac, dldexp and dfracexp opcodes.
 I think dfracexp is the only opcode that uses 2 DST registers.
 The first one stores the fractional part (as a double) and the
 second one stores the exponent (as an int).
 The tests show it working.

  static void
 +micro_dfrac(union tgsi_double_channel *dst,
 +const union tgsi_double_channel *src)
 +{
 +   dst->d[0] = src->d[0] - floor(src->d[0]);
 +   dst->d[1] = src->d[1] - floor(src->d[0]);
 +   dst->d[2] = src->d[2] - floor(src->d[0]);
 +   dst->d[3] = src->d[3] - floor(src->d[0]);

Igor,

Shouldn't the second line have floor(src->d[1]), and so on?
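
To make the difference concrete, here is a small Python sketch (not the tgsi_exec.c code) that compares the quoted version with a per-channel one. The function names and sample values are invented for the illustration.

```python
import math


def dfrac_per_channel(src):
    """Fractional part of each channel, using that channel's own floor()."""
    return [s - math.floor(s) for s in src]


def dfrac_as_quoted(src):
    """Mirrors the quoted patch: every channel subtracts floor(src[0])."""
    return [s - math.floor(src[0]) for s in src]


src = [1.5, 2.25, -0.75, 3.0]
good = dfrac_per_channel(src)
bad = dfrac_as_quoted(src)
```

With these inputs only channel 0 agrees between the two versions; the other channels of the quoted variant fall outside the [0, 1) range a fractional part should have.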


Re: [Mesa3d-dev] [RFC] instanced-arrays branch

2010-01-16 Thread michal

Chia-I Wu wrote on 2010-01-16 02:28:
 On Fri, Jan 15, 2010 at 07:22:52PM +0100, michal wrote:
   
 I think I will try to manually patch it later. Thanks!
 
 The first line of the patch is somehow garbled.  But I am not sure if
 that is a good fix, so please go ahead.

   
Committed, thanks.


Re: [Mesa3d-dev] [PATCH] Add GALLIUM_DUMP_VS environment variable

2010-01-15 Thread michal

Luca Barbieri wrote on 2009-12-26 02:06:
 Add GALLIUM_DUMP_VS to dump the vertex shader to the console like
 GALLIUM_DUMP_FS in softpipe.

   
Committed, thanks.


Re: [Mesa3d-dev] [RFC] instanced-arrays branch

2010-01-15 Thread michal

Chia-I Wu wrote on 2010-01-15 15:09:
 On Fri, Jan 15, 2010 at 09:57:32PM +0800, Chia-I Wu wrote:
   
 On Wed, Jan 13, 2010 at 2:02 AM, michal mic...@vmware.com wrote:
 
 I would like to merge this branch to master soon.
   
 I am seeing all sorts of funny behaviors after the merge with OpenVG.  The
 attached patch seems to fix the problem.  I am not sure if this is the right
 fix...
 
 There are two typos in the changes to auxiliary/vl/.  Here is the
 updated patch.  Sorry for the trouble.

   
The patch does not apply.

/c/src/mesa (master)
$ git am 0001-gallium-Fix-uninitialized-instance-divisor-and-index-v2.patch
Applying: gallium: Fix uninitialized instance divisor and index.
c:/src/mesa/.git/rebase-apply/patch:18: trailing whitespace.
   unsigned instance_id_index = ~0;
c:/src/mesa/.git/rebase-apply/patch:30: trailing whitespace.
  velements[i].instance_divisor = 0;
c:/src/mesa/.git/rebase-apply/patch:42: trailing whitespace.
   c->vertex_elems[0].instance_divisor = 0;
c:/src/mesa/.git/rebase-apply/patch:50: trailing whitespace.
   c->vertex_elems[1].instance_divisor = 0;
c:/src/mesa/.git/rebase-apply/patch:62: trailing whitespace.
   r->vertex_elems[0].instance_divisor = 0;
error: patch failed: src/gallium/auxiliary/draw/draw_pt_fetch_shade_pipeline.c:60
error: src/gallium/auxiliary/draw/draw_pt_fetch_shade_pipeline.c: patch does not apply
error: patch failed: src/gallium/auxiliary/util/u_draw_quad.c:61
error: src/gallium/auxiliary/util/u_draw_quad.c: patch does not apply
error: patch failed: src/gallium/auxiliary/vl/vl_compositor.c:316
error: src/gallium/auxiliary/vl/vl_compositor.c: patch does not apply
error: patch failed: src/gallium/auxiliary/vl/vl_mpeg12_mc_renderer.c:891
error: src/gallium/auxiliary/vl/vl_mpeg12_mc_renderer.c: patch does not apply
error: patch failed: src/gallium/state_trackers/vega/polygon.c:293
error: src/gallium/state_trackers/vega/polygon.c: patch does not apply
Patch failed at 0001 gallium: Fix uninitialized instance divisor and index.
When you have resolved this problem run git am --resolved.
If you would prefer to skip this patch, instead run git am --skip.
To restore the original branch and stop patching run git am --abort.



I think I will try to manually patch it later. Thanks!


Re: [Mesa3d-dev] [RFC] instanced-arrays branch

2010-01-14 Thread michal


michal wrote on 2010-01-12 19:02:

Keith,

I would like to merge this branch to master soon.

It adds new entrypoints to pipe_context -- draw_arrays_instanced() and 
draw_elements_instanced(). A new system value is introduced to TGSI that 
allows vertex shaders to access current instance ID.


The new entrypoints are implemented in draw module, and softpipe driver 
has been properly adjusted for that.


There is no capability bit defined for that -- I wasn't sure whether we 
still want to go this route.


Please review, thanks.

  
Attached is a patch that adds documentation for the drawing commands. 
It is made against the master branch, since the instanced-arrays branch 
doesn't have the gallium docs tree yet.


Thanks.
diff --git a/src/gallium/docs/source/context.rst b/src/gallium/docs/source/context.rst
index 21f5f91..9686537 100644
--- a/src/gallium/docs/source/context.rst
+++ b/src/gallium/docs/source/context.rst
@@ -72,12 +72,64 @@ stencil-only clears of packed depth-stencil buffers.
 Drawing
 ^^^
 
-``draw_arrays``
+``draw_arrays`` draws a specified primitive.
 
-``draw_elements``
+This command is equivalent to calling ``draw_arrays_instanced``
+with ``startInstance`` set to 0 and ``instanceCount`` set to 1.
+
+``draw_elements`` draws a specified primitive using an optional
+index buffer.
+
+This command is equivalent to calling ``draw_elements_instanced``
+with ``startInstance`` set to 0 and ``instanceCount`` set to 1.
 
 ``draw_range_elements``
 
+XXX: this is (probably) a temporary entrypoint, as the range
+information should be available from the vertex_buffer state.
+Using this to quickly evaluate a specialized path in the draw
+module.
+
+``draw_arrays_instanced`` draws multiple instances of the same primitive.
+
+This command is equivalent to calling ``draw_elements_instanced``
+with ``indexBuffer`` set to NULL and ``indexSize`` set to 0.
+
+``draw_elements_instanced`` draws multiple instances of the same primitive
+using an optional index buffer.
+
+For instanceID in the range between ``startInstance``
+and ``startInstance``+``instanceCount``-1, inclusive, draw a primitive
+specified by ``mode`` and sequential numbers in the range between ``start``
+and ``start``+``count``-1, inclusive.
+
+If ``indexBuffer`` is not NULL, it specifies an index buffer with index
+byte size of ``indexSize``. The sequential numbers are used to lookup
+the index buffer and the resulting indices in turn are used to fetch
+vertex attributes.
+
+If ``indexBuffer`` is NULL, the sequential numbers are used directly
+as indices to fetch vertex attributes.
+
+If a given vertex element has ``instance_divisor`` set to 0, it is said
+it contains per-vertex data and vertex attribute address needs
+to be recalculated for every index.
+
+  attribAddr = ``stride`` * index + ``src_offset``
+
+If a given vertex element has ``instance_divisor`` set to non-zero,
+it is said it contains per-instance data and vertex attribute address
+needs to be recalculated for every ``instance_divisor``-th instance.
+
+  attribAddr = ``stride`` * instanceID / ``instance_divisor`` + ``src_offset``
+
+In the above formulas, ``src_offset`` is taken from the given vertex element
+and ``stride`` is taken from a vertex buffer associated with the given
+vertex element.
+
+The value of ``instanceID`` can be read in a vertex shader through a system
+value register declared with INSTANCEID semantic name.
+
 
 Queries
 ^^^
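
For readers of the patch, the two address formulas can be sketched in Python as follows. This is an illustration only: the function names are invented, and the per-instance case assumes that instanceID is integer-divided by instance_divisor before being multiplied by the stride, which is how I read "recalculated for every instance_divisor-th instance".

```python
def attrib_addr(stride, src_offset, index, instance_id, instance_divisor):
    """Byte address of one vertex attribute within its vertex buffer."""
    if instance_divisor == 0:
        # Per-vertex data: the address changes with every index.
        return stride * index + src_offset
    # Per-instance data: the address changes once every instance_divisor
    # instances (integer division first is an assumption of this sketch).
    return stride * (instance_id // instance_divisor) + src_offset


def enumerate_fetches(start, count, start_instance, instance_count,
                      stride, src_offset, instance_divisor):
    """Addresses fetched for one vertex element, in draw order."""
    addrs = []
    for instance_id in range(start_instance, start_instance + instance_count):
        for index in range(start, start + count):
            addrs.append(attrib_addr(stride, src_offset, index,
                                     instance_id, instance_divisor))
    return addrs


per_vertex = enumerate_fetches(0, 2, 0, 2, 16, 4, 0)
per_instance = enumerate_fetches(0, 2, 0, 4, 16, 4, 2)
```

With a divisor of 0 the address follows the index; with a divisor of 2 it stays fixed for two consecutive instances and then advances by one stride.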

Re: [Mesa3d-dev] [PATH] Add double opcodes to TGSI Revision 2

2010-01-12 Thread michal

Igor Oliveira wrote on 2010-01-12 12:52:
 Michal: I am looking at the double opcode branch; I can move the opcode
 code to use exec_double_binary/unary.
   
Igor,

Yes, that was my intention.

It would be great if you looked at what has been done in that branch and, 
for each new opcode, provided a reference implementation in tgsi_exec.c 
and documented it in gallium/docs/source/tgsi.rst. It would be super cool 
if you could also add a unit test in python/tests/regress/fragment-shader.

Thanks.

--
This SF.Net email is sponsored by the Verizon Developer Community
Take advantage of Verizon's best-in-class app development support
A streamlined, 14 day to market process makes app distribution fast and easy
Join now and get one step closer to millions of Verizon customers
http://p.sf.net/sfu/verizon-dev2dev 
___
Mesa3d-dev mailing list
Mesa3d-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev

[Mesa3d-dev] [RFC] instanced-arrays branch

2010-01-12 Thread michal

Keith,

I would like to merge this branch to master soon.

It adds new entrypoints to pipe_context -- draw_arrays_instanced() and 
draw_elements_instanced(). A new system value is introduced to TGSI that 
allows vertex shaders to access the current instance ID.

The new entrypoints are implemented in the draw module, and the softpipe 
driver has been adjusted accordingly.

There is no capability bit defined for this -- I wasn't sure whether we 
still want to go this route.

Please review, thanks.




Re: [Mesa3d-dev] Mesa (glsl-pp-rework-2): scons: Get GLSL code building correctly when cross compiling.

2010-01-12 Thread michal

José Fonseca wrote on 2010-01-12 19:51:
 On Mon, 2010-01-11 at 15:28 -0800, Stephan Raue wrote:
   
 Hi all,

 On 10.12.2009 17:36, José Fonseca wrote:
 
 On Thu, 2009-12-10 at 08:31 -0800, Jose Fonseca wrote:

   
 Module: Mesa
 Branch: glsl-pp-rework-2
 Commit: 491f384c3958067e6c4c994041f5d8d413b806bc
 URL:
 http://cgit.freedesktop.org/mesa/mesa/commit/?id=491f384c3958067e6c4c994041f5d8d413b806bc

 Author: José Fonsecajfons...@vmware.com
 Date:   Thu Dec 10 16:29:04 2009 +

 scons: Get GLSL code building correctly when cross compiling.

 This is quite messy. GLSL code has to be built twice: one for the
 host OS, another for the target OS.
  
 

   
 Is there also a solution for building without scons?

 I get the following when I cross-compile for an x86_64 target on an i386 host:

 gmake[3]: Entering directory 
 `/home/stephan/projects/openelec/build.OpenELEC-intel_x64.x86_64.devel/Mesa-master-20100108/src/mesa/shader/slang/library'
 ../../../../../src/glsl/apps/compile fragment slang_common_builtin.gc 
 slang_common_builtin_gc.h
 ../../../../../src/glsl/apps/compile: 
 ../../../../../src/glsl/apps/compile: cannot execute binary file
 gmake[3]: *** [slang_common_builtin_gc.h] Error 126
 gmake[3]: Leaving directory 
 `/home/stephan/projects/openelec/build.OpenELEC-intel_x64.x86_64.devel/Mesa-master-20100108/src/mesa/shader/slang/library'
 gmake[2]: *** [glsl_builtin] Error 1
 gmake[2]: Leaving directory 
 `/home/stephan/projects/openelec/build.OpenELEC-intel_x64.x86_64.devel/Mesa-master-20100108/src/mesa'
 make[1]: *** [subdirs] Error 1
 make[1]: Leaving directory 
 `/home/stephan/projects/openelec/build.OpenELEC-intel_x64.x86_64.devel/Mesa-master-20100108/src'
 make: *** [default] Error 1
 

 Nope, and I don't think it will be easy, since Mesa's makefile system
 doesn't support building stuff on a separate dir.

   
Yes, and that's a good reason to go back to how it was before -- 
having the developer who changed the input files regenerate the output 
files and check them in.

I will do the necessary changes.


Re: [Mesa3d-dev] [PATCH] add double opcodes to tgsi

2010-01-11 Thread michal

Igor Oliveira wrote on 2010-01-11 14:37:
 These patches add support for double opcodes, as discussed on the mailing list.
 The opcodes created are: movd, ddiv, dadd, dseq, dmax, dmin, dmul,
 dmuladd, drcp and dslt.
 They are used as suggested by Zack:

 MOVD A.xy, C.xy, c.xy

 where x is the lsb and y is the msb.

 Some opcodes are still not implemented (I will send the code soon):
 dfrac, dfracexp, dldexp and conversion between float and double.
   

Igor,

I am not sure whether some bits and pieces of your patch are correct. 
To find out, let me first create a new feature branch 
(gallium-double-opcodes) and add a few basic opcodes (F2D, D2F, DMOV, 
DADD). Also, since there is no API state tracker that supports doubles, 
I will add a test to the python state tracker to see how well things are 
going. Once that is done, it will be a lot easier for us to review your 
patches that introduce new opcodes.

What do you think?


Re: [Mesa3d-dev] [mesa] svga: Fix error: cannot take address of bit-field 'texture_target' in svga_tgsi.h

2010-01-08 Thread michal

Sedat Dilek wrote on 2010-01-06 18:54:
 Compile-tested OK.

   
Thanks, committed.


Re: [Mesa3d-dev] [RFC] add support to double opcodes

2010-01-07 Thread michal

José Fonseca wrote on 2010-01-07 14:45:
 On Thu, 2010-01-07 at 05:25 -0800, Zack Rusin wrote:
   
 On Thursday 07 January 2010 06:50:36 José Fonseca wrote:
 
 I wonder if storage size of registers is such a big issue. Knowing the
 storage size of a register matters mostly for indexable temps. For
 regular assignments and intermediate computations, everything gets
 transformed into SSA form, and the register size can be determined
 from the instructions where it is generated/used, so there is no need
 for consistency.

 For example, imagine a shader that has:

TEX TEMP[0], SAMP[0], IN[0]  // SAMP[0] is a PIPE_FORMAT_R32G32B32_FLOAT -- use 4x32bit float registers
MAX ??
...
TEX TEMP[0], SAMP[1], IN[0]  // SAMP[1] is a PIPE_FORMAT_R64G64B64A64_FLOAT -- use 4x64bit double registers
DMAX , TEMP[0], ???
   
 That's not an issue because such a format doesn't exist. There's no 256bit
 sampling in any api. It's one of the self-inflicted wounds that we have.
 R64G64 is the most you'll get right now.
 

 That's interesting. Never realized that.

   
TEX TEMP[0], SAMP[2], IN[0] // texture 0 and rendertarget are both PIPE_FORMAT_R8G8B8A8_UNORM -- use 4x8bit unorm registers
MOV OUT[0], TEMP[0]

 etc.

 There is actually programmable 3d hardware out there that has special
 4x8bit registers, and for performance the compiler has to deduce where
 to use those 4x8bit registers. llvmpipe will need to do a similar thing,
 as the smaller the bit-width the higher the throughput. And at least the
 current gallium state trackers will reuse temps with no attempt to
 maintain consistency in use.

 So if the compilers already need to deal with this, I wonder if this
 notion that registers are 128 bits is really necessary, and whether it
 will prevail in the long term.
   
 Somehow this is the core issue: it's the fact that TGSI is untyped.
 Anything but a constant register size implies that TGSI is typed, but
 that the actual types have to be deduced by the drivers, which goes
 against what Gallium was about (we put the complexity in the driver).

 The question of 8bit vs 32bit and 64bit vs 32bit are really different
 questions. The first one is about optimization - it will work perfectly
 well if the 128bit registers are used. The second one is about
 correctness - it will not work if 128bit registers are used for doubles
 and it will not work if 256bit registers are used for floats.
 

 True.

   
 Also we don't have any 4x8bit instructions, they're all 4x32bit
 instructions (float, unsigned ints, signed ints), so doubles will be the
 first differently sized instructions. Which in turn will mean that either
 TGSI will have to be actually statically typed, but not typed declared,
 i.e. D_ADD will only be able to take two 256bit registers as inputs and
 if anything else is passed it has to throw an error, which is especially
 difficult given that those registers didn't have a size declared and it
 would have to be inferred from previous instructions; or we'd have to
 allow mixing sizes of all inputs, e.g. D_ADD can operate on both 4x32 and
 4x64, which simply moves the problem from above into the driver.

 Really, unless we say the entire pipeline can run in 4x64 like we did
 for floats, I don't see an easier way of dealing with this than the
 xy, zw swizzle form.
 

 Ok. I didn't feel strongly either way, but now I'm more convinced that
 restricting xy zw swizzles is less painful. Thanks for explaining this
 Zack.

   
Zack,

1. Do I understand correctly that while

D_ADD dst.xy, src1.xy, src2.zw

will add one double, the following code

D_ADD dst, src1, src2.zwxy

is also valid and performs two double additions?

2. Is the list of double-precision opcodes proposed by Igor roughly 
enough for an OpenCL implementation?
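
To spell out what question 1 is asking, here is a Python sketch of the semantics I have in mind. It is only a model: it assumes the lsb/msb channel layout mentioned earlier in the thread (x or z holds the low 32 bits, y or w the high 32 bits), and the helper names are invented. Whether the full-register form is actually valid TGSI is exactly what is being asked.

```python
import struct


def pack_double(value):
    """Split a 64-bit double into two 32-bit channels, (lsb, msb)."""
    return struct.unpack("<II", struct.pack("<d", value))


def unpack_double(lo, hi):
    """Rebuild a double from its (lsb, msb) 32-bit channels."""
    return struct.unpack("<d", struct.pack("<II", lo, hi))[0]


def swizzle(reg, pattern):
    """Reorder the channels of a 4-channel register, e.g. 'zwxy'."""
    return tuple(reg["xyzw".index(c)] for c in pattern)


def d_add(src1, src2):
    """Add the xy pair and the zw pair of two registers as doubles."""
    first = unpack_double(src1[0], src1[1]) + unpack_double(src2[0], src2[1])
    second = unpack_double(src1[2], src1[3]) + unpack_double(src2[2], src2[3])
    return pack_double(first) + pack_double(second)


src1 = pack_double(1.5) + pack_double(2.0)     # xy = 1.5,  zw = 2.0
src2 = pack_double(10.0) + pack_double(20.0)   # xy = 10.0, zw = 20.0

# D_ADD dst, src1, src2.zwxy
dst = d_add(src1, swizzle(src2, "zwxy"))
result = (unpack_double(dst[0], dst[1]), unpack_double(dst[2], dst[3]))
```

Under this reading, the zwxy swizzle swaps the two doubles of src2 before the addition, so dst.xy = src1.xy + src2.zw and dst.zw = src1.zw + src2.xy.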

Thanks.


[Mesa3d-dev] [PATCH] Make sure we use only signed/unsigned ints with bitfields.

2010-01-06 Thread michal


Attached.
From c91abe0b58abc69743c162fd55f7461a716b9141 Mon Sep 17 00:00:00 2001
From: Michal Krol mic...@vmware.com
Date: Wed, 6 Jan 2010 09:48:41 +0100
Subject: [PATCH] Make sure we use only signed/unsigned ints with bitfields.

Seems to be the only way to stay fully portable.
---
 src/gallium/drivers/svga/svga_tgsi.h   |   18 +-
 .../dri/r300/compiler/radeon_pair_regalloc.c   |2 +-
 .../drivers/dri/r300/compiler/radeon_program.h |   14 +++---
 .../dri/r300/compiler/radeon_program_pair.h|   10 +-
 4 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/src/gallium/drivers/svga/svga_tgsi.h b/src/gallium/drivers/svga/svga_tgsi.h
index 896c90a..d132525 100644
--- a/src/gallium/drivers/svga/svga_tgsi.h
+++ b/src/gallium/drivers/svga/svga_tgsi.h
@@ -39,24 +39,24 @@ struct tgsi_token;
 
 struct svga_vs_compile_key
 {
-   ubyte need_prescale:1;
-   ubyte allow_psiz:1;
unsigned zero_stride_vertex_elements;
-   ubyte num_zero_stride_vertex_elements:6;
+   unsigned need_prescale:1;
+   unsigned allow_psiz:1;
+   unsigned num_zero_stride_vertex_elements:6;
 };
 
 struct svga_fs_compile_key
 {
-   boolean light_twoside:1;
-   boolean front_cw:1;
+   unsigned light_twoside:1;
+   unsigned front_cw:1;
ubyte num_textures;
ubyte num_unnormalized_coords;
struct {
-  ubyte compare_mode   : 1;
-  ubyte compare_func   : 3;
-  ubyte unnormalized   : 1;
+  unsigned compare_mode   : 1;
+  unsigned compare_func   : 3;
+  unsigned unnormalized   : 1;
 
-  ubyte width_height_idx   : 7;
+  unsigned width_height_idx   : 7;
 
   ubyte texture_target;
} tex[PIPE_MAX_SAMPLERS];
diff --git a/src/mesa/drivers/dri/r300/compiler/radeon_pair_regalloc.c b/src/mesa/drivers/dri/r300/compiler/radeon_pair_regalloc.c
index 828d0c8..b2fe7f7 100644
--- a/src/mesa/drivers/dri/r300/compiler/radeon_pair_regalloc.c
+++ b/src/mesa/drivers/dri/r300/compiler/radeon_pair_regalloc.c
@@ -49,7 +49,7 @@ struct register_info {
 
unsigned int Used:1;
unsigned int Allocated:1;
-   rc_register_file File:3;
+   unsigned int File:3;
unsigned int Index:RC_REGISTER_INDEX_BITS;
 };
 
diff --git a/src/mesa/drivers/dri/r300/compiler/radeon_program.h b/src/mesa/drivers/dri/r300/compiler/radeon_program.h
index 0359288..e318867 100644
--- a/src/mesa/drivers/dri/r300/compiler/radeon_program.h
+++ b/src/mesa/drivers/dri/r300/compiler/radeon_program.h
@@ -39,7 +39,7 @@
 struct radeon_compiler;
 
 struct rc_src_register {
-   rc_register_file File:3;
+   unsigned int File:3;
 
/** Negative values may be used for relative addressing. */
signed int Index:(RC_REGISTER_INDEX_BITS+1);
@@ -55,7 +55,7 @@ struct rc_src_register {
 };
 
 struct rc_dst_register {
-   rc_register_file File:3;
+   unsigned int File:3;
 
/** Negative values may be used for relative addressing. */
signed int Index:(RC_REGISTER_INDEX_BITS+1);
@@ -79,20 +79,20 @@ struct rc_sub_instruction {
/**
 * Opcode of this instruction, according to \ref rc_opcode enums.
 */
-   rc_opcode Opcode:8;
+   unsigned int Opcode:8;
 
/**
 * Saturate each value of the result to the range [0,1] or [-1,1],
 * according to \ref rc_saturate_mode enums.
 */
-   rc_saturate_mode SaturateMode:2;
+   unsigned int SaturateMode:2;
 
/**
 * Writing to the special register RC_SPECIAL_ALU_RESULT
 */
/*...@{*/
-   rc_write_aluresult WriteALUResult:2;
-   rc_compare_func ALUResultCompare:3;
+   unsigned int WriteALUResult:2;
+   unsigned int ALUResultCompare:3;
/*...@}*/
 
/**
@@ -103,7 +103,7 @@ struct rc_sub_instruction {
unsigned int TexSrcUnit:5;
 
/** Source texture target, one of the \ref rc_texture_target enums */
-   rc_texture_target TexSrcTarget:3;
+   unsigned int TexSrcTarget:3;
 
/** True if tex instruction should do shadow comparison */
unsigned int TexShadow:1;
diff --git a/src/mesa/drivers/dri/r300/compiler/radeon_program_pair.h b/src/mesa/drivers/dri/r300/compiler/radeon_program_pair.h
index 1600598..6685ade 100644
--- a/src/mesa/drivers/dri/r300/compiler/radeon_program_pair.h
+++ b/src/mesa/drivers/dri/r300/compiler/radeon_program_pair.h
@@ -52,12 +52,12 @@ struct r300_fragment_program_compiler;
 
 struct radeon_pair_instruction_source {
unsigned int Used:1;
-   rc_register_file File:3;
+   unsigned int File:3;
unsigned int Index:RC_REGISTER_INDEX_BITS;
 };
 
 struct radeon_pair_instruction_rgb {
-   rc_opcode Opcode:8;
+   unsigned int Opcode:8;
unsigned int DestIndex:RC_REGISTER_INDEX_BITS;
unsigned int WriteMask:3;
unsigned int OutputWriteMask:3;
@@ -74,7 +74,7 @@ struct radeon_pair_instruction_rgb {
 };
 
 struct radeon_pair_instruction_alpha

Re: [Mesa3d-dev] [PATCH] Make sure we use only signed/unsigned ints with bitfields.

2010-01-06 Thread michal


Keith Whitwell wrote on 2010-01-06 10:43:

On Wed, 2010-01-06 at 00:50 -0800, michal wrote:
  

diff --git a/src/gallium/drivers/svga/svga_tgsi.h b/src/gallium/drivers/svga/svga_tgsi.h
index 896c90a..d132525 100644
--- a/src/gallium/drivers/svga/svga_tgsi.h
+++ b/src/gallium/drivers/svga/svga_tgsi.h
@@ -39,24 +39,24 @@ struct tgsi_token;
 
 struct svga_vs_compile_key

 {
-   ubyte need_prescale:1;
-   ubyte allow_psiz:1;
unsigned zero_stride_vertex_elements;
-   ubyte num_zero_stride_vertex_elements:6;
+   unsigned need_prescale:1;
+   unsigned allow_psiz:1;
+   unsigned num_zero_stride_vertex_elements:6;
 };
 
 struct svga_fs_compile_key

 {
-   boolean light_twoside:1;
-   boolean front_cw:1;
+   unsigned light_twoside:1;
+   unsigned front_cw:1;
ubyte num_textures;
ubyte num_unnormalized_coords;
struct {
-  ubyte compare_mode   : 1;
-  ubyte compare_func   : 3;
-  ubyte unnormalized   : 1;
+  unsigned compare_mode   : 1;
+  unsigned compare_func   : 3;
+  unsigned unnormalized   : 1;
 
-  ubyte width_height_idx   : 7;

+  unsigned width_height_idx   : 7;
 
   ubyte texture_target;
} tex[PIPE_MAX_SAMPLERS]; 




Michal, these two structs should be kept as small as possible.  It looks
like there has been some drift away from well-packed fields anyway, but
if you're making this change can you please take a moment to repack the
fields as a result and get these down to as small as possible?

In particular, it looks like fs_compile_key::tex array has probably
doubled in size - could you repack it by changing texture_target to, eg:

unsigned texture_target:8;  


or similar?

The same would apply for the other ubyte fields that are now probably no
longer tightly packed.


  

Attached is an update.

There was nothing more I could do for svga_vs_compile_key, though, as the 
zero_stride_vertex_elements field is being fully used.
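
As a quick sanity check of the repacking idea, the per-texture entry from the updated patch can be modelled with Python's ctypes. This is only a sketch: TexKey is a made-up name, and ctypes follows the platform C compiler's bitfield conventions, so it approximates rather than reproduces what the svga driver's compiler does.

```python
import ctypes


class TexKey(ctypes.Structure):
    """Model of one tex[] entry with every field an unsigned bitfield."""
    _fields_ = [
        ("compare_mode", ctypes.c_uint, 1),
        ("compare_func", ctypes.c_uint, 3),
        ("unnormalized", ctypes.c_uint, 1),
        ("width_height_idx", ctypes.c_uint, 7),
        ("texture_target", ctypes.c_uint, 8),
    ]


# 1 + 3 + 1 + 7 + 8 = 20 bits, which fits in a single unsigned int.
total_bits = sum(width for _name, _type, width in TexKey._fields_)
entry_size = ctypes.sizeof(TexKey)

# Each field still holds its full range after repacking.
key = TexKey(compare_mode=1, compare_func=5, unnormalized=0,
             width_height_idx=127, texture_target=255)
```

Because all five fields share one storage unit, each entry should occupy a single unsigned int rather than spilling into a second one.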
From af7c95dd2539e6b5d64ad62c30ef6952e83fcf98 Mon Sep 17 00:00:00 2001
From: Michal Krol mic...@vmware.com
Date: Wed, 6 Jan 2010 11:23:43 +0100
Subject: [PATCH] Make sure we use only signed/unsigned ints with bitfields.

Seems to be the only way to stay fully portable.
---
 src/gallium/drivers/svga/svga_tgsi.h   |   26 +--
 .../dri/r300/compiler/radeon_pair_regalloc.c   |2 +-
 .../drivers/dri/r300/compiler/radeon_program.h |   14 +-
 .../dri/r300/compiler/radeon_program_pair.h|   10 
 4 files changed, 25 insertions(+), 27 deletions(-)

diff --git a/src/gallium/drivers/svga/svga_tgsi.h b/src/gallium/drivers/svga/svga_tgsi.h
index 896c90a..1309c33 100644
--- a/src/gallium/drivers/svga/svga_tgsi.h
+++ b/src/gallium/drivers/svga/svga_tgsi.h
@@ -39,26 +39,24 @@ struct tgsi_token;
 
 struct svga_vs_compile_key
 {
-   ubyte need_prescale:1;
-   ubyte allow_psiz:1;
unsigned zero_stride_vertex_elements;
-   ubyte num_zero_stride_vertex_elements:6;
+   unsigned need_prescale:1;
+   unsigned allow_psiz:1;
+   unsigned num_zero_stride_vertex_elements:6;
 };
 
 struct svga_fs_compile_key
 {
-   boolean light_twoside:1;
-   boolean front_cw:1;
-   ubyte num_textures;
-   ubyte num_unnormalized_coords;
+   unsigned light_twoside:1;
+   unsigned front_cw:1;
+   unsigned num_textures:8;
+   unsigned num_unnormalized_coords:8;
struct {
-  ubyte compare_mode   : 1;
-  ubyte compare_func   : 3;
-  ubyte unnormalized   : 1;
-
-  ubyte width_height_idx   : 7;
-
-  ubyte texture_target;
+  unsigned compare_mode:1;
+  unsigned compare_func:3;
+  unsigned unnormalized:1;
+  unsigned width_height_idx:7;
+  unsigned texture_target:8;
} tex[PIPE_MAX_SAMPLERS];
 };
 
diff --git a/src/mesa/drivers/dri/r300/compiler/radeon_pair_regalloc.c b/src/mesa/drivers/dri/r300/compiler/radeon_pair_regalloc.c
index 828d0c8..b2fe7f7 100644
--- a/src/mesa/drivers/dri/r300/compiler/radeon_pair_regalloc.c
+++ b/src/mesa/drivers/dri/r300/compiler/radeon_pair_regalloc.c
@@ -49,7 +49,7 @@ struct register_info {
 
unsigned int Used:1;
unsigned int Allocated:1;
-   rc_register_file File:3;
+   unsigned int File:3;
unsigned int Index:RC_REGISTER_INDEX_BITS;
 };
 
diff --git a/src/mesa/drivers/dri/r300/compiler/radeon_program.h b/src/mesa/drivers/dri/r300/compiler/radeon_program.h
index 0359288..e318867 100644
--- a/src/mesa/drivers/dri/r300/compiler/radeon_program.h
+++ b/src/mesa/drivers/dri/r300/compiler/radeon_program.h
@@ -39,7 +39,7 @@
 struct radeon_compiler;
 
 struct rc_src_register {
-   rc_register_file File:3;
+   unsigned int File:3;
 
/** Negative values may be used for relative addressing. */
signed int Index:(RC_REGISTER_INDEX_BITS+1);
@@ -55,7 +55,7 @@ struct rc_src_register {
 };
 
 struct rc_dst_register {
-   rc_register_file File:3;
+   unsigned int File:3;
 
/** Negative values may be used for relative addressing. */
signed int

Re: [Mesa3d-dev] Mystery of u_format.csv

2010-01-06 Thread michal

José Fonseca wrote on 2010-01-06 15:03:
 On Tue, 2010-01-05 at 23:36 -0800, michal wrote:
   
 michal wrote on 2010-01-06 07:58:
 
 michal wrote on 2009-12-22 10:00:
   
   
 Marek Olšák wrote on 2009-12-22 08:40:
   
 
 
 Hi,

 I noticed that gallium/auxiliary/util/u_format.csv contains some weird 
 swizzling, for example see this:

 $ grep zyxw u_format.csv
 PIPE_FORMAT_A8R8G8B8_UNORM, arith , 1, 1, un8 , un8 , un8 , un8 , zyxw, rgb
 PIPE_FORMAT_A1R5G5B5_UNORM, arith , 1, 1, un5 , un5 , un5 , un1 , zyxw, rgb
 PIPE_FORMAT_A4R4G4B4_UNORM, arith , 1, 1, un4 , un4 , un4 , un4 , zyxw, rgb
 PIPE_FORMAT_A8B8G8R8_SNORM, arith , 1, 1, sn8 , sn8 , sn8 , sn8 , zyxw, rgb
 PIPE_FORMAT_B8G8R8A8_SRGB , arith , 1, 1, u8  , u8  , u8  , u8  , zyxw, srgb

 It's hard to believe that ARGB, ABGR, and BGRA have the same 
 swizzling. Let's continue our journey:

 $ grep A8R8G8B8 u_format.csv
 PIPE_FORMAT_A8R8G8B8_UNORM, arith , 1, 1, un8 , un8 , un8 , 
 un8 , zyxw, rgb
 PIPE_FORMAT_A8R8G8B8_SRGB , arith , 1, 1, u8  , u8  , u8  , 
 u8  , wxyz, srgb

 Same formats, different swizzling? Also:

 $ grep B8G8R8A8 u_format.csv
 PIPE_FORMAT_B8G8R8A8_UNORM, arith , 1, 1, un8 , un8 , un8 , 
 un8 , yzwx, rgb
 PIPE_FORMAT_B8G8R8A8_SRGB , arith , 1, 1, u8  , u8  , u8  , 
 u8  , zyxw, srgb

 Same formats, different swizzling? I don't really get it. And there's 
 much more cases like these. Could someone tell me what the intended 
 order of channels should be? (or possibly propose a fix) The meaning 
 of the whole table is self-contradictory and it's definitely the 
 source of some r300g bugs.

 
   
   
 Marek,

 Yes, that seems like a defect. The format swizzle field tells us how to 
 swizzle the incoming pixel so that its components are ordered in some 
 predefined order. For RGB and SRGB colorspaces the order is R, G, B and 
 A. For depth-stencil, ie. ZS color space the order is Z and then S.

 I will have a look at this.
   
 
 
 Marek, Jose,

 Can you review the attached patch?
   
   
 Ouch, it looks like we will have to leave 24-bit (s)rgb formats with 
 array layout as the current code generator will bite us on big endian 
 platforms. Attached an updated patch.
 

 Why are you changing the layout from array to arith? Please leave that
 alone.

   

I did this because in the other thread you defined arith layout to apply 
to 32-or-less-bit formats. Since I still believe arith and array layout 
are somewhat redundant, we can go the other way round and convert other 
arith layouts to array, save for 16-or-less-bit formats.

 Yes, the code generator needs a big endian -> little endian call to be
 correct on big endian platforms, as gallium formats should always be
 thought of in little endian terms, just like most hardware is.

   


--
This SF.Net email is sponsored by the Verizon Developer Community
Take advantage of Verizon's best-in-class app development support
A streamlined, 14 day to market process makes app distribution fast and easy
Join now and get one step closer to millions of Verizon customers
http://p.sf.net/sfu/verizon-dev2dev 
___
Mesa3d-dev mailing list
Mesa3d-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev

Re: [Mesa3d-dev] Mystery of u_format.csv

2010-01-06 Thread michal

Christoph Bumiller wrote on 2010-01-06 12:08:
 On 06.01.2010 08:36, michal wrote:
 michal wrote on 2010-01-06 07:58:
 michal wrote on 2009-12-22 10:00:
  
 Marek Olšák wrote on 2009-12-22 08:40:
  
 Hi,

 I noticed that gallium/auxiliary/util/u_format.csv contains some 
 weird swizzling, for example see this:

 $ grep zyxw u_format.csv
 PIPE_FORMAT_A8R8G8B8_UNORM, arith , 1, 1, un8 , un8 , un8 
 , un8 , zyxw, rgb
 PIPE_FORMAT_A1R5G5B5_UNORM, arith , 1, 1, un5 , un5 , un5 
 , un1 , zyxw, rgb
 PIPE_FORMAT_A4R4G4B4_UNORM, arith , 1, 1, un4 , un4 , un4 
 , un4 , zyxw, rgb
 PIPE_FORMAT_A8B8G8R8_SNORM, arith , 1, 1, sn8 , sn8 , sn8 
 , sn8 , zyxw, rgb
 PIPE_FORMAT_B8G8R8A8_SRGB , arith , 1, 1, u8  , u8  , u8  
 , u8  , zyxw, srgb

 It's hard to believe that ARGB, ABGR, and BGRA have the same 
 swizzling. Let's continue our journey:

 $ grep A8R8G8B8 u_format.csv
 PIPE_FORMAT_A8R8G8B8_UNORM, arith , 1, 1, un8 , un8 , un8 
 , un8 , zyxw, rgb
 PIPE_FORMAT_A8R8G8B8_SRGB , arith , 1, 1, u8  , u8  , u8  
 , u8  , wxyz, srgb

 Same formats, different swizzling? Also:

 $ grep B8G8R8A8 u_format.csv
 PIPE_FORMAT_B8G8R8A8_UNORM, arith , 1, 1, un8 , un8 , un8 
 , un8 , yzwx, rgb
 PIPE_FORMAT_B8G8R8A8_SRGB , arith , 1, 1, u8  , u8  , u8  
 , u8  , zyxw, srgb

 Same formats, different swizzling? I don't really get it. And 
 there's much more cases like these. Could someone tell me what the 
 intended order of channels should be? (or possibly propose a fix) 
 The meaning of the whole table is self-contradictory and it's 
 definitely the source of some r300g bugs.

   
 Marek,

 Yes, that seems like a defect. The format swizzle field tells us 
 how to swizzle the incoming pixel so that its components are 
 ordered in some predefined order. For RGB and SRGB colorspaces the 
 order is R, G, B and A. For depth-stencil, ie. ZS color space the 
 order is Z and then S.

 I will have a look at this.
   
 Marek, Jose,

 Can you review the attached patch?
   

 Ouch, it looks like we will have to leave 24-bit (s)rgb formats with 
 array layout as the current code generator will bite us on big endian 
 platforms. Attached an updated patch.

 It now looks the way I always thought the formats should be interpreted.

 Which means the vertex element formats in mesa/state_tracker/st_draw.c
 should be corrected - the R8G8B8A8 and R8G8 vertex elements should be
 reversed, and the BGRA format should be A8R8G8B8.

 At least this would fix my (gallium/drivers/nv50/nv50.vbo)
if (desc->swizzle[0] == UTIL_FORMAT_SWIZZLE_Z) /* BGRA */
 check.


I'm afraid you will also need to check the desc->layout field and, if it is 
array, compare against desc->swizzle[3].


Re: [Mesa3d-dev] Mystery of u_format.csv

2010-01-06 Thread michal

José Fonseca wrote on 2010-01-06 15:26:
 On Wed, 2010-01-06 at 06:11 -0800, michal wrote:
   
 José Fonseca wrote on 2010-01-06 15:03:
 
 On Tue, 2010-01-05 at 23:36 -0800, michal wrote:
   
   
 michal wrote on 2010-01-06 07:58:
 
 
 michal wrote on 2009-12-22 10:00:
   
   
   
 Marek Olšák wrote on 2009-12-22 08:40:
   
 
 
 
 Hi,

 I noticed that gallium/auxiliary/util/u_format.csv contains some weird 
 swizzling, for example see this:

 $ grep zyxw u_format.csv
 PIPE_FORMAT_A8R8G8B8_UNORM, arith , 1, 1, un8 , un8 , un8 , 
 un8 , zyxw, rgb
 PIPE_FORMAT_A1R5G5B5_UNORM, arith , 1, 1, un5 , un5 , un5 , 
 un1 , zyxw, rgb
 PIPE_FORMAT_A4R4G4B4_UNORM, arith , 1, 1, un4 , un4 , un4 , 
 un4 , zyxw, rgb
 PIPE_FORMAT_A8B8G8R8_SNORM, arith , 1, 1, sn8 , sn8 , sn8 , 
 sn8 , zyxw, rgb
 PIPE_FORMAT_B8G8R8A8_SRGB , arith , 1, 1, u8  , u8  , u8  , u8 
  , zyxw, srgb

 It's hard to believe that ARGB, ABGR, and BGRA have the same 
 swizzling. Let's continue our journey:

 $ grep A8R8G8B8 u_format.csv
 PIPE_FORMAT_A8R8G8B8_UNORM, arith , 1, 1, un8 , un8 , un8 , 
 un8 , zyxw, rgb
 PIPE_FORMAT_A8R8G8B8_SRGB , arith , 1, 1, u8  , u8  , u8  , 
 u8  , wxyz, srgb

 Same formats, different swizzling? Also:

 $ grep B8G8R8A8 u_format.csv
 PIPE_FORMAT_B8G8R8A8_UNORM, arith , 1, 1, un8 , un8 , un8 , 
 un8 , yzwx, rgb
 PIPE_FORMAT_B8G8R8A8_SRGB , arith , 1, 1, u8  , u8  , u8  , 
 u8  , zyxw, srgb

 Same formats, different swizzling? I don't really get it. And there's 
 much more cases like these. Could someone tell me what the intended 
 order of channels should be? (or possibly propose a fix) The meaning 
 of the whole table is self-contradictory and it's definitely the 
 source of some r300g bugs.

 
   
   
   
 Marek,

 Yes, that seems like a defect. The format swizzle field tells us how to 
 swizzle the incoming pixel so that its components are ordered in some 
 predefined order. For RGB and SRGB colorspaces the order is R, G, B and 
 A. For depth-stencil, ie. ZS color space the order is Z and then S.

 I will have a look at this.
   
 
 
 
 Marek, Jose,

 Can you review the attached patch?
   
   
   
 Ouch, it looks like we will have to leave 24-bit (s)rgb formats with 
 array layout as the current code generator will bite us on big endian 
 platforms. Attached an updated patch.
 
 
 Why are you changing the layout from array to arith? Please leave that
 alone.

   
   
 I did this because in the other thread you defined arith layout to apply 
 to 32-or-less-bit formats. Since I still believe arith and array layout 
 are somewhat redundant, we can go the other way round and convert other 
 arith layouts to array, save for 16-or-less-bit formats.
 

 Indeed arith applies to 32-or-less-bit formats, but I never meant to say
 that all 32-or-less-bit formats must be in arith.

   

Understood.

 They are indeed redundant, but array is/will be more efficient and when
 code generation is more robust and big-endian-safe all x8, x8x8, x8x8x8,
 x8x8x8x8x8 formats will be likely in array layout. 

   

That is okay, we agree on that part. The question is what is the reason 
we treat PIPE_FORMAT_R8G8B8A8_UNORM as having array layout (before my 
patch), and e.g. PIPE_FORMAT_B8G8R8A8_UNORM as having arith layout?



Re: [Mesa3d-dev] Mystery of u_format.csv

2010-01-06 Thread michal

Michel Dänzer wrote on 2010-01-06 15:23:
 On Wed, 2010-01-06 at 14:03 +, José Fonseca wrote: 
   
 On Tue, 2010-01-05 at 23:36 -0800, michal wrote:
 
 michal wrote on 2010-01-06 07:58:
   
 michal wrote on 2009-12-22 10:00:
   
 
 Marek Olšák wrote on 2009-12-22 08:40:
   
 
   
 Hi,

 I noticed that gallium/auxiliary/util/u_format.csv contains some weird 
 swizzling, for example see this:

 $ grep zyxw u_format.csv
 PIPE_FORMAT_A8R8G8B8_UNORM, arith , 1, 1, un8 , un8 , un8 , 
 un8 , zyxw, rgb
 PIPE_FORMAT_A1R5G5B5_UNORM, arith , 1, 1, un5 , un5 , un5 , 
 un1 , zyxw, rgb
 PIPE_FORMAT_A4R4G4B4_UNORM, arith , 1, 1, un4 , un4 , un4 , 
 un4 , zyxw, rgb
 PIPE_FORMAT_A8B8G8R8_SNORM, arith , 1, 1, sn8 , sn8 , sn8 , 
 sn8 , zyxw, rgb
 PIPE_FORMAT_B8G8R8A8_SRGB , arith , 1, 1, u8  , u8  , u8  , u8 
  , zyxw, srgb

 It's hard to believe that ARGB, ABGR, and BGRA have the same 
 swizzling. Let's continue our journey:

 $ grep A8R8G8B8 u_format.csv
 PIPE_FORMAT_A8R8G8B8_UNORM, arith , 1, 1, un8 , un8 , un8 , 
 un8 , zyxw, rgb
 PIPE_FORMAT_A8R8G8B8_SRGB , arith , 1, 1, u8  , u8  , u8  , 
 u8  , wxyz, srgb

 Same formats, different swizzling? Also:

 $ grep B8G8R8A8 u_format.csv
 PIPE_FORMAT_B8G8R8A8_UNORM, arith , 1, 1, un8 , un8 , un8 , 
 un8 , yzwx, rgb
 PIPE_FORMAT_B8G8R8A8_SRGB , arith , 1, 1, u8  , u8  , u8  , 
 u8  , zyxw, srgb

 Same formats, different swizzling? I don't really get it. And there's 
 much more cases like these. Could someone tell me what the intended 
 order of channels should be? (or possibly propose a fix) The meaning 
 of the whole table is self-contradictory and it's definitely the 
 source of some r300g bugs.

 
   
 
 Marek,

 Yes, that seems like a defect. The format swizzle field tells us how to 
 swizzle the incoming pixel so that its components are ordered in some 
 predefined order. For RGB and SRGB colorspaces the order is R, G, B and 
 A. For depth-stencil, ie. ZS color space the order is Z and then S.

 I will have a look at this.
   
 
   
 Marek, Jose,

 Can you review the attached patch?
   
 
 Ouch, it looks like we will have to leave 24-bit (s)rgb formats with 
 array layout as the current code generator will bite us on big endian 
 platforms. Attached an updated patch.
   
 Why are you changing the layout from array to arith? Please leave that
 alone.

 Yes, the code generator needs a big endian -> little endian call to be
 correct on big endian platforms, as gallium formats should always be
 thought of in little endian terms, just like most hardware is.
 

 Actually, 'array' formats should be endianness neutral, and IMO 'arith'
 formats should be defined in the CPU endianness. Though as discussed
 before, having 'reversed' formats defined in the other endianness as
 well might be useful. Drivers which can work on setups where the CPU
 endianness doesn't match the GPU endianness should possibly only use
 'array' formats, but then there might need to be some kind of mapping
 between the two kinds of formats somewhere, maybe in the state trackers
 or an auxiliary module...

   
Interesting. Is there any reference that would say which formats are 
'array', and which are not? Or is it a simple rule that when every 
component's bitsize is greater-or-equal to, say, 16, then it's an array 
format?
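
To make the distinction under discussion concrete, here is a minimal sketch of the two layouts. The function names are hypothetical and this is an illustration of the idea, not Gallium's actual definition: an 'array' format is written one byte per channel in memory order, while an 'arith' format places channels at bit positions inside a word, so its byte order in memory depends on how the word is stored.

```c
#include <stdint.h>
#include <string.h>

/* 'array' style: one byte per channel, written in memory order.
 * The result is the same on little and big endian CPUs. */
static void pack_array_r8g8b8a8(uint8_t dst[4],
                                uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
   dst[0] = r;
   dst[1] = g;
   dst[2] = b;
   dst[3] = a;
}

/* 'arith' style: channels live at bit positions inside a 32-bit word
 * (A in the top byte, B in the bottom byte). */
static uint32_t pack_arith_a8r8g8b8(uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
   return ((uint32_t)a << 24) | ((uint32_t)r << 16) |
          ((uint32_t)g << 8)  |  (uint32_t)b;
}

/* Store a word in little endian byte order regardless of the host CPU.
 * A plain memcpy of the word would give B,G,R,A on a little endian host
 * but A,R,G,B on a big endian one. */
static void store_le32(uint8_t dst[4], uint32_t v)
{
   dst[0] = (uint8_t)(v & 0xff);
   dst[1] = (uint8_t)((v >> 8) & 0xff);
   dst[2] = (uint8_t)((v >> 16) & 0xff);
   dst[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Returns 1 on a little endian host, 0 on a big endian one. */
static int cpu_is_little_endian(void)
{
   const uint16_t probe = 1;
   uint8_t first;
   memcpy(&first, &probe, 1);
   return first == 1;
}
```

Under these assumptions an arith A8R8G8B8 word stored in little endian order ends up as B, G, R, A in memory, which is why the same name can describe different byte sequences depending on the convention chosen.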



Re: [Mesa3d-dev] [PATCH] Make sure we use only signed/unsigned ints with bitfields.

2010-01-06 Thread michal

Keith Whitwell wrote on 2010-01-06 11:31:
 Looks good to me Michal.

   
Thanks, committed.


Re: [Mesa3d-dev] [mesa] svga: Fix error: cannot take address of bit-field 'texture_target' in svga_tgsi.h

2010-01-06 Thread michal


Brian Paul wrote on 2010-01-06 18:07:

Sedat Dilek wrote:
  

Hi,

this patch fixes a build-error in mesa GIT master after...

commit  251363e8f1287b54dc7734e690daf2ae96728faf (patch)
configs: set INTEL_LIBS, INTEL_CFLAGS, etc

From my build-log:
...
In file included from svga_pipe_fs.c:37:
svga_tgsi.h: In function 'svga_fs_key_size':
svga_tgsi.h:122: error: cannot take address of bit-field 'texture_target'
make[4]: *** [svga_pipe_fs.o] Error 1

Might be introduced in...

commit  955f51270bb60ad77dba049799587dc7c0fb4dda
Make sure we use only signed/unsigned ints with bitfields.

Kind Regards,
- Sedat -




I just fixed that.

  

Actually, we could go back to bitfields and fix broken svga_fs_key_size().

Attached a patch.

Can somebody review, test-build and commit?

From 7321aef0dfc5bb160ec8a33d1d4e686419f2ed3d Mon Sep 17 00:00:00 2001
From: Michal Krol mic...@vmware.com
Date: Wed, 6 Jan 2010 18:36:45 +0100
Subject: [PATCH] svga: Fix fs key size computation and key comparison.

This also allows us to have texture_target
back as a bitfield and save us a few bytes.
---
 src/gallium/drivers/svga/svga_state_fs.c |9 +++--
 src/gallium/drivers/svga/svga_tgsi.h |5 ++---
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/svga/svga_state_fs.c 
b/src/gallium/drivers/svga/svga_state_fs.c
index 272d1dd..bba80a9 100644
--- a/src/gallium/drivers/svga/svga_state_fs.c
+++ b/src/gallium/drivers/svga/svga_state_fs.c
@@ -40,8 +40,13 @@
 static INLINE int compare_fs_keys( const struct svga_fs_compile_key *a,
const struct svga_fs_compile_key *b )
 {
-   unsigned keysize = svga_fs_key_size( a );
-   return memcmp( a, b, keysize );
+   unsigned keysize_a = svga_fs_key_size( a );
+   unsigned keysize_b = svga_fs_key_size( b );
+
+   if (keysize_a != keysize_b) {
+  return (int)(keysize_a - keysize_b);
+   }
+   return memcmp( a, b, keysize_a );
 }
 
 
diff --git a/src/gallium/drivers/svga/svga_tgsi.h 
b/src/gallium/drivers/svga/svga_tgsi.h
index 043b991..737a221 100644
--- a/src/gallium/drivers/svga/svga_tgsi.h
+++ b/src/gallium/drivers/svga/svga_tgsi.h
@@ -56,7 +56,7 @@ struct svga_fs_compile_key
   unsigned compare_func:3;
   unsigned unnormalized:1;
   unsigned width_height_idx:7;
-  ubyte texture_target;
+  unsigned texture_target:8;
} tex[PIPE_MAX_SAMPLERS];
 };
 
@@ -119,8 +119,7 @@ static INLINE unsigned svga_vs_key_size( const struct svga_vs_compile_key *key )
 
 static INLINE unsigned svga_fs_key_size( const struct svga_fs_compile_key *key )
 {
-   return (const char *)&key->tex[key->num_textures].texture_target -
-  (const char *)key;
+   return (const char *)&key->tex[key->num_textures] - (const char *)key;
 }
 
 struct svga_shader_result *
-- 
1.6.4.msysgit.0
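
The idea behind the patch can be sketched in isolation as follows. This is a simplified stand-in, not the svga driver's real key structure: `struct fs_key`, `MAX_SAMPLERS`, and the function names are hypothetical. The key size is the byte offset of the first unused `tex[]` entry, which is an ordinary array element and therefore addressable even though its members are bitfields; comparison first checks the sizes and only then compares the used bytes.

```c
#include <stddef.h>
#include <string.h>

#define MAX_SAMPLERS 16   /* hypothetical limit for this sketch */

struct fs_key {
   unsigned num_textures:8;
   struct {
      unsigned compare_func:3;
      unsigned unnormalized:1;
      unsigned width_height_idx:7;
      unsigned texture_target:8;   /* a bitfield: its address cannot be taken */
   } tex[MAX_SAMPLERS];
};

/* Size of the used part of the key: everything before tex[num_textures].
 * Taking the address of the array element is legal; taking the address
 * of one of its bitfield members is not. */
static unsigned fs_key_size(const struct fs_key *key)
{
   return (unsigned)((const char *)&key->tex[key->num_textures] -
                     (const char *)key);
}

/* Keys with different used sizes are never equal; otherwise compare only
 * the used bytes. Both keys are assumed to have been zero-initialised
 * with memset so that padding bits compare equal. */
static int compare_fs_keys(const struct fs_key *a, const struct fs_key *b)
{
   unsigned size_a = fs_key_size(a);
   unsigned size_b = fs_key_size(b);

   if (size_a != size_b)
      return size_a < size_b ? -1 : 1;
   return memcmp(a, b, size_a);
}
```

The sketch returns -1 or 1 for mismatched sizes instead of casting the unsigned difference to int as the patch does; either choice only needs to be non-zero and consistent for the comparison to work.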


Re: [Mesa3d-dev] [RFC] gallium-integer-opcodes branch

2010-01-05 Thread michal

Keith Whitwell wrote on 2010-01-04 18:46:
 On Mon, 2010-01-04 at 09:39 -0800, Brian Paul wrote:
   
 michal wrote:
 
 Hi,

 I would like to merge gallium-integer-opcodes branch to master this 
 week. This feature branch adds support for integer operations in TGSI 
 that is required by GLSL 1.30.

 In summary:
 * add a bunch of opcodes operating on signed and unsigned integers,
 * add signed/unsigned integer immediate types,
 * add new opcodes for switch-case statements,
 * source operand modifiers (abs, neg) treat operands as integers if a 
 particular opcode expects integers as arguments.

 Since integer opcodes are a dependency for other future features, the 
 plan is to merge this branch to master and then fork again to develop 
 other features that sit on top of it.

 Please review and comment.
   
 Looks pretty good, Michal.  You can merge whenever you want, as far as 
 I'm concerned.

 

 Yes, looks good to me too Michal -- feel free to merge when you're
 ready.

   
Thanks for having a look, guys.


Re: [Mesa3d-dev] Merge gallium-docs

2010-01-05 Thread michal

José Fonseca wrote on 2010-01-05 17:12:
 On Tue, 2010-01-05 at 07:57 -0800, Keith Whitwell wrote:
   
 This doesn't really need to be on a branch, and by merging it I can
 start to ask for people to keep it up-to-date with interface changes...

 If nobody objects, I'll do this in the next couple of days.

 Keith
 

 Sounds good to me. It makes it official, so we can all start improving
 it.

 A minor nitpick: I'd prefer that the derived HTML not be committed into
 git. It forces everybody to have the tool installed, adds noise to
 patches, and I really don't see the point if the .rst is already so easy
 to read. I'd rather have a cronjob in fdo.org generating and publishing
 the HTML/etc from the master HEAD.

   
Also, it looks like you have to have the tool installed anyway -- at 
least to view the math formulas (e.g. TGSI pages).


[Mesa3d-dev] [RFC] gallium-integer-opcodes branch

2010-01-03 Thread michal

Hi,

I would like to merge gallium-integer-opcodes branch to master this 
week. This feature branch adds support for integer operations in TGSI 
that is required by GLSL 1.30.

In summary:
* add a bunch of opcodes operating on signed and unsigned integers,
* add signed/unsigned integer immediate types,
* add new opcodes for switch-case statements,
* source operand modifiers (abs, neg) treat operands as integers if a 
particular opcode expects integers as arguments.

Since integer opcodes are a dependency for other future features, the 
plan is to merge this branch to master and then fork again to develop 
other features that sit on top of it.

Please review and comment.

Thanks.


Re: [Mesa3d-dev] [PATCH] fix missing semantic name in tgsi_text.c

2010-01-01 Thread michal

Igor Oliveira wrote on 2010-01-01 18:03:
 Hi,

 i found a tgsi bug running vega state tracker.
 The bug happens because in tgsi_text.c line 991:
 for (i = 0; i < TGSI_SEMANTIC_COUNT; i++)

 TGSI_SEMANTIC_COUNT is bigger than semantic_name declared in tgsi_text.c:
  936 static const char *semantic_names[TGSI_SEMANTIC_COUNT] =
  937 {
  938    "POSITION",
  939    "COLOR",
  940    "BCOLOR",
  941    "FOG",
  942    "PSIZE",
  943    "GENERIC",
  944    "NORMAL",
  945    "FACE",
  946    "PRIM_ID"
  947 };


 TGSI_SEMANTIC_COUNT is 10, but the array has only 9 elements. Looking at
 other files, I see that one semantic name is missing: EDGEFLAG. The patch
 below adds EDGEFLAG to semantic_names.

 diff --git a/src/gallium/auxiliary/tgsi/tgsi_text.c
 b/src/gallium/auxiliary/tgsi/tgsi_text.c
 index 2e3f9a9..9fcffed 100644
 --- a/src/gallium/auxiliary/tgsi/tgsi_text.c
 +++ b/src/gallium/auxiliary/tgsi/tgsi_text.c
 @@ -932,6 +932,7 @@ static const char *semantic_names[TGSI_SEMANTIC_COUNT] =
     "GENERIC",
     "NORMAL",
     "FACE",
 +   "EDGEFLAG",
     "PRIM_ID"
  };

   
Committed, thanks.
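
This class of bug (a name table that silently falls out of step with its enum) can be caught at compile time. The following is a minimal sketch, assuming a C11 compiler; the `SEM_*` enum and `find_semantic` are hypothetical stand-ins rather than the real TGSI definitions. Leaving the array size implicit and asserting it against the count makes a missing entry a build error instead of a NULL pointer in the lookup loop.

```c
#include <string.h>

/* Hypothetical enum mirroring the order of the name table. */
enum semantic {
   SEM_POSITION,
   SEM_COLOR,
   SEM_BCOLOR,
   SEM_FOG,
   SEM_PSIZE,
   SEM_GENERIC,
   SEM_NORMAL,
   SEM_FACE,
   SEM_EDGEFLAG,
   SEM_PRIM_ID,
   SEM_COUNT
};

/* No explicit size here: the compiler counts the initialisers. */
static const char *const semantic_names[] = {
   "POSITION", "COLOR", "BCOLOR", "FOG", "PSIZE",
   "GENERIC", "NORMAL", "FACE", "EDGEFLAG", "PRIM_ID"
};

/* Fails to compile if a name is added to the enum but not to the table. */
_Static_assert(sizeof semantic_names / sizeof semantic_names[0] == SEM_COUNT,
               "semantic_names is out of sync with enum semantic");

/* Returns the index of the name, or -1 if it is unknown. */
static int find_semantic(const char *name)
{
   int i;
   for (i = 0; i < SEM_COUNT; i++) {
      if (strcmp(semantic_names[i], name) == 0)
         return i;
   }
   return -1;
}
```

With an explicitly sized array, as in the original code, missing initialisers become NULL entries, which is what made the loop over the full count unsafe.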


Re: [Mesa3d-dev] Mesa (instanced-arrays): translate: Implement instancing for linear SSE run .

2009-12-30 Thread michal

Keith Whitwell wrote on 2009-12-30 16:22:
 Michal,

 Did you update the 'C' version of translate for this new functionality?
 You can't just extend the fast path - the fallback/default mode needs to
 be updated as well.

   
Yes, I did that in the previous commit.

 Also, I'm sure it's not necessary to do a divide per vertex-element to
 achieve instancing.  It can't be that hard to throw some more counters
 at the problem and do this with a couple of adds instead of a divide...

   
The division is done once per instance, not per every vertex attrib. Are 
you serious about optimising such low-profile things?
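
In plain C the address computation being discussed looks roughly like the sketch below. The struct and function names are hypothetical; only the formula (base_ptr + stride * index, with the index being the instance ID divided by the instance divisor) comes from the patch comments. The division appears once, when the starting pointer for a buffer is set up for a run, not inside the per-vertex loop.

```c
#include <stddef.h>

/* Hypothetical stand-in for the per-buffer state used by the patch. */
struct attrib_source {
   const unsigned char *base_ptr;
   unsigned stride;
   unsigned instance_divisor;   /* 0 means ordinary per-vertex data */
};

/* Pointer to the first attribute for one run:
 *   base_ptr + stride * index
 * For instanced data the index is instance_id / instance_divisor and
 * stays fixed for the whole instance; for per-vertex data it is the
 * first vertex index of the run. */
static const unsigned char *
first_attrib_ptr(const struct attrib_source *src,
                 unsigned start_vertex, unsigned instance_id)
{
   unsigned index = src->instance_divisor
                       ? instance_id / src->instance_divisor
                       : start_vertex;

   return src->base_ptr + (size_t)src->stride * index;
}
```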

 Keith

 On Wed, 2009-12-30 at 05:23 -0800, Michał Król wrote:
   
 Module: Mesa
 Branch: instanced-arrays
 Commit: 09c0287b84725098c0b365668231ddf00487c84c
 URL:
 http://cgit.freedesktop.org/mesa/mesa/commit/?id=09c0287b84725098c0b365668231ddf00487c84c

 Author: Michal Krol mic...@vmware.com
 Date:   Wed Dec 30 14:23:12 2009 +0100

 translate: Implement instancing for linear SSE run.

 ---

  src/gallium/auxiliary/translate/translate_sse.c |  154 
 ++-
  1 files changed, 120 insertions(+), 34 deletions(-)

 diff --git a/src/gallium/auxiliary/translate/translate_sse.c 
 b/src/gallium/auxiliary/translate/translate_sse.c
 index edd0be1..ddfa4c6 100644
 --- a/src/gallium/auxiliary/translate/translate_sse.c
 +++ b/src/gallium/auxiliary/translate/translate_sse.c
 @@ -49,6 +49,7 @@
  typedef void (PIPE_CDECL *run_func)( struct translate *translate,
   unsigned start,
   unsigned count,
 + unsigned instance_id,
   void *output_buffer );

  typedef void (PIPE_CDECL *run_elts_func)( struct translate *translate,
 @@ -59,7 +60,12 @@ typedef void (PIPE_CDECL *run_elts_func)( struct 
 translate *translate,
  struct translate_buffer {
 const void *base_ptr;
 unsigned stride;
 -   void *ptr;   /* updated per vertex */
 +};
 +
 +struct translate_buffer_varient {
 +   unsigned buffer_index;
 +   unsigned instance_divisor;
 +   void *ptr;/* updated either per vertex or per 
 instance */
  };


 @@ -81,6 +87,16 @@ struct translate_sse {
 struct translate_buffer buffer[PIPE_MAX_ATTRIBS];
 unsigned nr_buffers;

 +   /* Multiple buffer varients can map to a single buffer. */
 +   struct translate_buffer_varient buffer_varient[PIPE_MAX_ATTRIBS];
 +   unsigned nr_buffer_varients;
 +
 +   /* Multiple elements can map to a single buffer varient. */
 +   unsigned element_to_buffer_varient[PIPE_MAX_ATTRIBS];
 +
 +   boolean use_instancing;
 +   unsigned instance_id;
 +
 run_func  gen_run;
 run_elts_func gen_run_elts;

 @@ -360,31 +376,59 @@ static boolean init_inputs( struct translate_sse *p,
  {
 unsigned i;
 if (linear) {
 -  for (i = 0; i < p->nr_buffers; i++) {
 +  struct x86_reg instance_id = x86_make_disp(p->machine_EDX,
 +                                             get_offset(p, &p->instance_id));
 +
 +  for (i = 0; i < p->nr_buffer_varients; i++) {
 +     struct translate_buffer_varient *varient = &p->buffer_varient[i];
 +     struct translate_buffer *buffer = &p->buffer[varient->buffer_index];
       struct x86_reg buf_stride   = x86_make_disp(p->machine_EDX,
 -                                                 get_offset(p, &p->buffer[i].stride));
 +                                                 get_offset(p, &buffer->stride));
       struct x86_reg buf_ptr      = x86_make_disp(p->machine_EDX,
 -                                                 get_offset(p, &p->buffer[i].ptr));
 +                                                 get_offset(p, &varient->ptr));
       struct x86_reg buf_base_ptr = x86_make_disp(p->machine_EDX,
 -                                                 get_offset(p, &p->buffer[i].base_ptr));
 +                                                 get_offset(p, &buffer->base_ptr));
       struct x86_reg elt = p->idx_EBX;
 -     struct x86_reg tmp = p->tmp_EAX;
 -
 +     struct x86_reg tmp_EAX = p->tmp_EAX;

       /* Calculate pointer to first attrib:
 +      *   base_ptr + stride * index, where index depends on instance divisor
        */
 -     x86_mov(p->func, tmp, buf_stride);
 -     x86_imul(p->func, tmp, elt);
 -     x86_add(p->func, tmp, buf_base_ptr);
 +     if (varient->instance_divisor) {
 +        /* Our index is instance ID divided by instance divisor.
 +         */
 +        x86_mov(p->func, tmp_EAX, instance_id);
 +
 +        if (varient->instance_divisor != 1) {
 +           struct x86_reg tmp_EDX = p->machine_EDX;
 +           struct x86_reg tmp_ECX = p->outbuf_ECX;
 +
 +           /* TODO: Add x86_shr() to rtasm and use it whenever
 +            *   instance divisor is power of two.
 +            */
 +
 +   x86_push(p

Re: [Mesa3d-dev] geometry shading patches

2009-12-25 Thread michal

Zack Rusin wrote on 2009-12-24 14:24:
 yo,

 after our discussions i hacked a new version of geometry shading support in 
 gallium. the new geometry shading syntax looks as follows:
   
Zack,

That looks nice. Once you commit I will take a closer look at patch #10 
and see what's the issue there without bothering you.


Re: [Mesa3d-dev] Mystery of u_format.csv

2009-12-22 Thread michal

Marek Olšák wrote on 2009-12-22 08:40:
 Hi,

 I noticed that gallium/auxiliary/util/u_format.csv contains some weird 
 swizzling, for example see this:

 $ grep zyxw u_format.csv
 PIPE_FORMAT_A8R8G8B8_UNORM, arith , 1, 1, un8 , un8 , un8 , 
 un8 , zyxw, rgb
 PIPE_FORMAT_A1R5G5B5_UNORM, arith , 1, 1, un5 , un5 , un5 , 
 un1 , zyxw, rgb
 PIPE_FORMAT_A4R4G4B4_UNORM, arith , 1, 1, un4 , un4 , un4 , 
 un4 , zyxw, rgb
 PIPE_FORMAT_A8B8G8R8_SNORM, arith , 1, 1, sn8 , sn8 , sn8 , 
 sn8 , zyxw, rgb
 PIPE_FORMAT_B8G8R8A8_SRGB , arith , 1, 1, u8  , u8  , u8  , u8 
  , zyxw, srgb

 It's hard to believe that ARGB, ABGR, and BGRA have the same 
 swizzling. Let's continue our journey:

 $ grep A8R8G8B8 u_format.csv
 PIPE_FORMAT_A8R8G8B8_UNORM, arith , 1, 1, un8 , un8 , un8 , 
 un8 , zyxw, rgb
 PIPE_FORMAT_A8R8G8B8_SRGB , arith , 1, 1, u8  , u8  , u8  , 
 u8  , wxyz, srgb

 Same formats, different swizzling? Also:

 $ grep B8G8R8A8 u_format.csv
 PIPE_FORMAT_B8G8R8A8_UNORM, arith , 1, 1, un8 , un8 , un8 , 
 un8 , yzwx, rgb
 PIPE_FORMAT_B8G8R8A8_SRGB , arith , 1, 1, u8  , u8  , u8  , 
 u8  , zyxw, srgb

 Same formats, different swizzling? I don't really get it. And there's 
 much more cases like these. Could someone tell me what the intended 
 order of channels should be? (or possibly propose a fix) The meaning 
 of the whole table is self-contradictory and it's definitely the 
 source of some r300g bugs.

Marek,

Yes, that seems like a defect. The format swizzle field tells us how to 
swizzle the incoming pixel so that its components are ordered in some 
predefined order. For RGB and SRGB colorspaces the order is R, G, B and 
A. For depth-stencil, ie. ZS color space the order is Z and then S.

I will have a look at this.

Thanks.
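
One plausible reading of the swizzle field described above can be sketched like this. The enum and function names are hypothetical and this is not the u_format code: entry i of the swizzle names which stored component supplies output component i, where the output order is fixed as R, G, B, A.

```c
#include <stdint.h>

/* Hypothetical selector: which stored component to read (x = first). */
enum swz { SWZ_X, SWZ_Y, SWZ_Z, SWZ_W };

/* Reorder the components as stored in the pixel into canonical
 * R, G, B, A order. swizzle[i] selects the stored component that
 * becomes output component i. */
static void apply_swizzle(const uint8_t stored[4],
                          const enum swz swizzle[4],
                          uint8_t rgba[4])
{
   int i;
   for (i = 0; i < 4; i++)
      rgba[i] = stored[swizzle[i]];
}
```

Under this reading a pixel whose components are stored as B, G, R, A needs the swizzle zyxw to come out as R, G, B, A, and a pixel already stored as R, G, B, A needs the identity xyzw. Two names for the same storage order should therefore never carry different swizzles, which is the inconsistency Marek pointed out.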


Re: [Mesa3d-dev] Vega State-Tracker not compiling

2009-12-21 Thread michal

STEVE555 wrote on 2009-12-20 18:02:
 Hi everyone,
   I've noticed in the last few days of compiling Mesa from
 git that I get an error when I include Vega as one of the state-trackers in
 my configure options.It keeps coming up with this error:

   
Fixed in git.


[Mesa3d-dev] [PATCH] Fix u_pack_color.h rgb pack/unpack functions

2009-12-15 Thread michal


Guys,

Does the attached patch make sense to you?

I replaced the incomplete switch-cases with calls to u_format_access 
functions that are complete but are going to be a bit more expensive to 
call. Since they are used not very often in mesa state tracker, I 
thought it's a good compromise.


Thanks.
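
For reference, the kind of per-format arithmetic that the removed switch statement hard-codes can be shown for a single case. This is a standalone sketch with hypothetical function names; the pack expression follows the R5G6B5 case in the patch below, while the unpack side (with bit replication to fill the low bits) is an assumption added for illustration.

```c
#include <stdint.h>

/* Pack 8-bit R, G, B into a 16-bit 5-6-5 word:
 * R in bits 15..11, G in bits 10..5, B in bits 4..0. */
static uint16_t pack_r5g6b5(uint8_t r, uint8_t g, uint8_t b)
{
   return (uint16_t)(((r & 0xf8) << 8) | ((g & 0xfc) << 3) | (b >> 3));
}

/* Expand a 5-6-5 word back to 8 bits per channel. The top bits of each
 * channel are replicated into the low bits so that full intensity maps
 * back to 0xff rather than 0xf8 or 0xfc. */
static void unpack_r5g6b5(uint16_t p, uint8_t *r, uint8_t *g, uint8_t *b)
{
   unsigned r5 = (p >> 11) & 0x1f;
   unsigned g6 = (p >> 5) & 0x3f;
   unsigned b5 = p & 0x1f;

   *r = (uint8_t)((r5 << 3) | (r5 >> 2));
   *g = (uint8_t)((g6 << 2) | (g6 >> 4));
   *b = (uint8_t)((b5 << 3) | (b5 >> 2));
}
```

Every format needs its own variant of these shifts and masks, which is why an incomplete switch is easy to end up with and why delegating to generated per-format accessors trades some speed for completeness.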
diff --git a/src/gallium/auxiliary/util/u_pack_color.h 
b/src/gallium/auxiliary/util/u_pack_color.h
index a2e0f26..6080f1a 100644
--- a/src/gallium/auxiliary/util/u_pack_color.h
+++ b/src/gallium/auxiliary/util/u_pack_color.h
@@ -37,6 +37,7 @@
 
 #include "pipe/p_compiler.h"
 #include "pipe/p_format.h"
+#include "util/u_format.h"
 #include "util/u_math.h"
 
 
@@ -55,85 +56,14 @@ static INLINE void
 util_pack_color_ub(ubyte r, ubyte g, ubyte b, ubyte a,
enum pipe_format format, union util_color *uc)
 {
-   switch (format) {
-   case PIPE_FORMAT_R8G8B8A8_UNORM:
-      {
-         uc->ui = (r << 24) | (g << 16) | (b << 8) | a;
-      }
-      return;
-   case PIPE_FORMAT_R8G8B8X8_UNORM:
-      {
-         uc->ui = (r << 24) | (g << 16) | (b << 8) | 0xff;
-      }
-      return;
-   case PIPE_FORMAT_A8R8G8B8_UNORM:
-      {
-         uc->ui = (a << 24) | (r << 16) | (g << 8) | b;
-      }
-      return;
-   case PIPE_FORMAT_X8R8G8B8_UNORM:
-      {
-         uc->ui = (0xff << 24) | (r << 16) | (g << 8) | b;
-      }
-      return;
-   case PIPE_FORMAT_B8G8R8A8_UNORM:
-      {
-         uc->ui = (b << 24) | (g << 16) | (r << 8) | a;
-      }
-      return;
-   case PIPE_FORMAT_B8G8R8X8_UNORM:
-      {
-         uc->ui = (b << 24) | (g << 16) | (r << 8) | 0xff;
-      }
-      return;
-   case PIPE_FORMAT_R5G6B5_UNORM:
-      {
-         uc->us = ((r & 0xf8) << 8) | ((g & 0xfc) << 3) | (b >> 3);
-      }
-      return;
-   case PIPE_FORMAT_A1R5G5B5_UNORM:
-      {
-         uc->us = ((a & 0x80) << 8) | ((r & 0xf8) << 7) | ((g & 0xf8) << 2) | (b >> 3);
-      }
-      return;
-   case PIPE_FORMAT_A4R4G4B4_UNORM:
-      {
-         uc->us = ((a & 0xf0) << 8) | ((r & 0xf0) << 4) | ((g & 0xf0) << 0) | (b >> 4);
-      }
-      return;
-   case PIPE_FORMAT_A8_UNORM:
-      {
-         uc->ub = a;
-      }
-      return;
-   case PIPE_FORMAT_L8_UNORM:
-   case PIPE_FORMAT_I8_UNORM:
-      {
-         uc->ub = a;
-      }
-      return;
-   case PIPE_FORMAT_R32G32B32A32_FLOAT:
-      {
-         uc->f[0] = (float)r / 255.0f;
-         uc->f[1] = (float)g / 255.0f;
-         uc->f[2] = (float)b / 255.0f;
-         uc->f[3] = (float)a / 255.0f;
-      }
-      return;
-   case PIPE_FORMAT_R32G32B32_FLOAT:
-      {
-         uc->f[0] = (float)r / 255.0f;
-         uc->f[1] = (float)g / 255.0f;
-         uc->f[2] = (float)b / 255.0f;
-      }
-      return;
+   ubyte src[4];
 
-   /* XXX lots more cases to add */
-   default:
-      uc->ui = 0; /* keep compiler happy */
-      debug_print_format("gallium: unhandled format in util_pack_color_ub()", format);
-      assert(0);
-   }
+   src[0] = r;
+   src[1] = g;
+   src[2] = b;
+   src[3] = a;
+
+   util_format_write_4ub(format, src, 0, uc, 0, 0, 0, 1, 1);
 }
  
 
@@ -144,150 +74,15 @@ static INLINE void
 util_unpack_color_ub(enum pipe_format format, union util_color *uc,
  ubyte *r, ubyte *g, ubyte *b, ubyte *a)
 {
-   switch (format) {
-   case PIPE_FORMAT_R8G8B8A8_UNORM:
-  {
- uint p = uc-ui;
- *r = (ubyte) ((p  24)  0xff);
- *g = (ubyte) ((p  16)  0xff);
- *b = (ubyte) ((p   8)  0xff);
- *a = (ubyte) ((p   0)  0xff);
-  }
-  return;
-   case PIPE_FORMAT_R8G8B8X8_UNORM:
-  {
- uint p = uc-ui;
- *r = (ubyte) ((p  24)  0xff);
- *g = (ubyte) ((p  16)  0xff);
- *b = (ubyte) ((p   8)  0xff);
- *a = (ubyte) 0xff;
-  }
-  return;
-   case PIPE_FORMAT_A8R8G8B8_UNORM:
-  {
- uint p = uc-ui;
- *r = (ubyte) ((p  16)  0xff);
- *g = (ubyte) ((p   8)  0xff);
- *b = (ubyte) ((p   0)  0xff);
- *a = (ubyte) ((p  24)  0xff);
-  }
-  return;
-   case PIPE_FORMAT_X8R8G8B8_UNORM:
-  {
- uint p = uc-ui;
- *r = (ubyte) ((p  16)  0xff);
- *g = (ubyte) ((p   8)  0xff);
- *b = (ubyte) ((p   0)  0xff);
- *a = (ubyte) 0xff;
-  }
-  return;
-   case PIPE_FORMAT_B8G8R8A8_UNORM:
-  {
- uint p = uc-ui;
- *r = (ubyte) ((p   8)  0xff);
- *g = (ubyte) ((p  16)  0xff);
- *b = (ubyte) ((p  24)  0xff);
- *a = (ubyte) ((p   0)  0xff);
-  }
-  return;
-   case PIPE_FORMAT_B8G8R8X8_UNORM:
-  {
- uint p = uc-ui;
- *r = (ubyte) ((p   8)  0xff);
- *g = (ubyte) ((p  16)  0xff);
- *b = (ubyte) ((p  24)  0xff);
- *a = (ubyte) 0xff;
-  }
-  return;
-   case PIPE_FORMAT_R5G6B5_UNORM:
-  {
- ushort p = uc-us;
- *r = (ubyte) (((p  8)  0xf8) | ((p  13)  0x7));
- *g = (ubyte) (((p  3)  0xfc) | ((p   9)  0x3));
- *b = (ubyte) (((p  3)  0xf8) | ((p   2)  0x7));
- *a = (ubyte) 0xff;
-  }
-  return;
-   case

Re: [Mesa3d-dev] [PATCH] Fix u_pack_color.h rgb pack/unpack functions

2009-12-15 Thread michal

Roland Scheidegger wrote:
 On 15.12.2009 14:14, michal wrote:
   
 Guys,

 Does the attached patch make sense to you?

 I replaced the incomplete switch-cases with calls to u_format_access 
 functions that are complete but are going to be a bit more expensive to 
 call. Since they are used not very often in mesa state tracker, I 
 thought it's a good compromise.
 

 They are not only used in state trackers, but drivers for instance as
 well. That said, it's probably not really a performance critical path.
 Though I'm not sure it makes sense to keep these functions even around
 if they'll just do a single function call. Also, I'm pretty sure your
 usage of the union isn't strict aliasing compliant (as far as I can tell
 you could just go back and remove that ugly union again), though it's
 probably one of the cases gcc won't complain (and hopefully won't
 miscompile).

   
I am casting to (void *) and then u_format casts it back to whatever it 
needs to. I think I am innocent.

Anyway, I will go with Keith's suggestion and fill in only the 
switch-default case. We can always nuke the special cases later when/if 
we realise the performance impact is negligible.

Thanks.
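
To make the plan concrete, here is a rough sketch of "fast paths plus a generic default". Everything in it is hypothetical: the enum, the channel table and sketch_pack_generic() are stand-ins I made up for illustration, not the real pipe_format enum or the u_format API.

```c
#include <stdint.h>

/* Hypothetical formats for this sketch, not the real pipe_format enum. */
enum sketch_format {
   SKETCH_FORMAT_A8R8G8B8,
   SKETCH_FORMAT_R5G6B5,
   SKETCH_FORMAT_B8G8R8A8,
   SKETCH_FORMAT_COUNT
};

/* Per-channel bit position and width, in r, g, b, a order. */
struct sketch_channel {
   uint8_t shift;
   uint8_t bits;
};

static const struct sketch_channel sketch_layouts[SKETCH_FORMAT_COUNT][4] = {
   [SKETCH_FORMAT_A8R8G8B8] = { {16, 8}, {8, 8}, {0, 8}, {24, 8} },
   [SKETCH_FORMAT_R5G6B5]   = { {11, 5}, {5, 6}, {0, 5}, {0, 0} },
   [SKETCH_FORMAT_B8G8R8A8] = { {8, 8}, {16, 8}, {24, 8}, {0, 8} }
};

/* Complete but slower path: walks the channel description. */
static uint32_t
sketch_pack_generic(enum sketch_format format, const uint8_t rgba[4])
{
   uint32_t value = 0;
   int i;

   for (i = 0; i < 4; i++) {
      const struct sketch_channel *c = &sketch_layouts[format][i];
      if (c->bits)
         value |= (uint32_t)(rgba[i] >> (8 - c->bits)) << c->shift;
   }
   return value;
}

static uint32_t
sketch_pack_color_ub(uint8_t r, uint8_t g, uint8_t b, uint8_t a,
                     enum sketch_format format)
{
   switch (format) {
   /* Hand-written fast paths stay for the common formats. */
   case SKETCH_FORMAT_A8R8G8B8:
      return ((uint32_t)a << 24) | ((uint32_t)r << 16) |
             ((uint32_t)g << 8) | b;
   case SKETCH_FORMAT_R5G6B5:
      return (uint32_t)(((r & 0xf8) << 8) | ((g & 0xfc) << 3) | (b >> 3));
   /* Everything else goes through the generic routine instead of assert(0). */
   default: {
      const uint8_t rgba[4] = { r, g, b, a };
      return sketch_pack_generic(format, rgba);
   }
   }
}
```

The special cases can be deleted one by one later without changing callers, since the default branch already produces the same bit layout for them.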


[Mesa3d-dev] [PATCH] Add extra dimension info to TGSI declarations.

2009-12-14 Thread michal

To fully support geometry shaders, we need some means to declare a 
two-dimensional register file. The following declaration


DCL IN[3][0]

would declare an input register with index 0 (first dimension) and size 
3 (second dimension). Since the second dimension is a size, not an index 
(or, for that matter, an index range), a new token has been added that 
specifies the declared size of the register.



Thanks.
diff --git a/src/gallium/include/pipe/p_shader_tokens.h b/src/gallium/include/pipe/p_shader_tokens.h
index 588ca5e..e5a723f 100644
--- a/src/gallium/include/pipe/p_shader_tokens.h
+++ b/src/gallium/include/pipe/p_shader_tokens.h
@@ -107,10 +107,11 @@ struct tgsi_declaration
    unsigned File        : 4;  /**< one of TGSI_FILE_x */
    unsigned UsageMask   : 4;  /**< bitmask of TGSI_WRITEMASK_x flags */
    unsigned Interpolate : 4;  /**< one of TGSI_INTERPOLATE_x */
+   unsigned Dimension   : 1;  /**< BOOL, any second dimension info? */
    unsigned Semantic    : 1;  /**< BOOL, any semantic info? */
    unsigned Centroid    : 1;  /**< centroid sampling? */
    unsigned Invariant   : 1;  /**< invariant optimization? */
-   unsigned Padding     : 5;
+   unsigned Padding     : 4;
 };
 
 struct tgsi_declaration_range
@@ -119,6 +120,12 @@ struct tgsi_declaration_range
    unsigned Last        : 16; /**< UINT */
 };
 
+struct tgsi_declaration_dimension
+{
+   unsigned Size2D      : 16;  /**< Size of the second dimension */
+   unsigned Padding     : 16;
+};
+
 #define TGSI_SEMANTIC_POSITION 0
 #define TGSI_SEMANTIC_COLOR    1
 #define TGSI_SEMANTIC_BCOLOR   2 /**< back-face color */
--
Return on Information:
Google Enterprise Search pays you back
Get the facts.
http://p.sf.net/sfu/google-dev2dev

Re: [Mesa3d-dev] [PATCH] Add extra dimension info to TGSI declarations.

2009-12-14 Thread michal

Zack Rusin wrote:
 On Monday 14 December 2009 09:29:03 Keith Whitwell wrote:
   
 On Mon, 2009-12-14 at 06:23 -0800, michal wrote:
 
 To fully support geometry shaders, we need some means to declare a
 two-dimensional register file. The following declaration

 DCL IN[3][0]

 would declare an input register with index 0 (first dimension) and size
 3 (second dimension). Since the second dimension is a size, not an index
 (or, for that matter, an index range), a new token has been added that
 specifies the declared size of the register.
   
 Is this a good representation?  What would happen if there was:

 DCL IN[4][0]
 DCL IN[3][1]

 Presumably the 3 is always going to be 3, and it's a property of the
 geometry shader - I think Zack has a patch which adds something like:

 PROP GS_VERTICES_IN 3

 Then couldn't we just have the equivalent of:

 DCL IN[][0]
 DCL IN[][1]

 with the size of the first dimension specified by the property?
 

 Yea, that's what I thought the dimensional arrays should look like for GS in 
 TGSI (they already do in GLSL and HLSL).
   
Actually, GS_VERTICES_IN could be derived from GS_INPUT_PRIM property.

GL_ARB_geometry_shader4 has this mapping:



                             Value of built-in
Input primitive type         gl_VerticesIn
-----------------------      -----------------
POINTS                       1
LINES                        2
LINES_ADJACENCY_ARB          4
TRIANGLES                    3
TRIANGLES_ADJACENCY_ARB      6



But that also defeats the purpose of this patch -- INPUT registers would 
have implied two-dimensionality when declared inside GS.
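
In code, deriving the vertex count from the input primitive is just a lookup. A small sketch, where the enum and function names are made up for illustration and only the numbers come from the GL_ARB_geometry_shader4 table:

```c
/* Hypothetical primitive enum for this sketch; the names mirror the
 * GL_ARB_geometry_shader4 table, not any real Gallium enum. */
enum sketch_gs_input_prim {
   SKETCH_PRIM_POINTS,
   SKETCH_PRIM_LINES,
   SKETCH_PRIM_LINES_ADJACENCY,
   SKETCH_PRIM_TRIANGLES,
   SKETCH_PRIM_TRIANGLES_ADJACENCY
};

/* Returns the implied size of the per-vertex dimension of IN[][],
 * or 0 for an unknown primitive. */
static unsigned
sketch_gs_vertices_in(enum sketch_gs_input_prim prim)
{
   switch (prim) {
   case SKETCH_PRIM_POINTS:              return 1;
   case SKETCH_PRIM_LINES:               return 2;
   case SKETCH_PRIM_LINES_ADJACENCY:     return 4;
   case SKETCH_PRIM_TRIANGLES:           return 3;
   case SKETCH_PRIM_TRIANGLES_ADJACENCY: return 6;
   }
   return 0;
}
```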

  
   
 Are there going to be cases where this doesn't work?
 

 I don't think so. 
 Also if we decide to go with DCL IN[x][1] notation then it probably should be 
 DCL IN[a..b][1] because otherwise it just looks weird that one component 
 declares a range while the other the index.

   



Re: [Mesa3d-dev] gallium: add blitter

2009-12-14 Thread michal

Keith Whitwell wrote:
 On Sun, 2009-12-13 at 15:27 -0800, Marek Olšák wrote:
   
 +static INLINE
 +void util_blitter_save_fragment_sampler_states(
 +  struct blitter_context *blitter,
 +  int num_sampler_states,
 +  void **sampler_states)
 +{
 +   assert(num_textures <= 32);
 +
 +   blitter->saved_num_sampler_states = num_sampler_states;
 +   memcpy(blitter->saved_sampler_states, sampler_states,
 +          num_sampler_states * sizeof(void *));
 +}
 + 
 

 Have you tried compiling with debug enabled?  The assert above fails to
 compile.  Also, can you use Elements() or similar instead of the
 hard-coded 32?

 Maybe we can figure out how to go back to having asserts keep exposing
 their contents to the compiler even on non-debug builds.  This used to
 work without problem on linux and helped a lot to avoid these type of
 problems.

   
Precisely. Recently I've been thinking about mapping assert() to 
__assume() for non-debug builds on windows and MSVC.

http://msdn.microsoft.com/en-us/library/1b3fsfxw%28VS.80%29.aspx


Re: [Mesa3d-dev] gallium: add blitter

2009-12-14 Thread michal

José Fonseca wrote:
 On Mon, 2009-12-14 at 08:22 -0800, Keith Whitwell wrote:
   
 On Mon, 2009-12-14 at 08:19 -0800, Keith Whitwell wrote:
 
 On Mon, 2009-12-14 at 08:04 -0800, José Fonseca wrote:
   
 On Mon, 2009-12-14 at 05:39 -0800, Keith Whitwell wrote:
 
 On Sun, 2009-12-13 at 15:27 -0800, Marek Olšák wrote:
   
 +static INLINE
 +void util_blitter_save_fragment_sampler_states(
 +  struct blitter_context *blitter,
 +  int num_sampler_states,
 +  void **sampler_states)
 +{
 +   assert(num_textures <= 32);
 +
 +   blitter->saved_num_sampler_states = num_sampler_states;
 +   memcpy(blitter->saved_sampler_states, sampler_states,
 +          num_sampler_states * sizeof(void *));
 +}
 + 
 
 Have you tried compiling with debug enabled?  The assert above fails to
 compile.  Also, can you use Elements() or similar instead of the
 hard-coded 32?

 Maybe we can figure out how to go back to having asserts keep exposing
 their contents to the compiler even on non-debug builds.  This used to
 work without problem on linux and helped a lot to avoid these type of
 problems.
   
 I wouldn't say without a problem: defining assert(expr) as (void)0
 instead of (void)(expr) on release builds yielded a non-negligible
 performance improvement. I don't recall the exact figure, but I believe
 it was the 3-5% for the driver I was benchmarking at the time. YMMV.
 Different drivers will give different results, but there's nothing
 platform specific about this.
 
 It's not hard to avoid executing code...  For instance we could always
 have it translated to something like:

   if (0) {
 (void)(expr);
   }
   (void)(0)

   
 Obviously I would have meant to say something cleaner like:

  do {
if (0) { (void)(expr);  }
  }
  while (0)
 

 This only works if expr has no calls, or just inline calls. Using my
 earlier example, if very_expensive_check() is in another file then the
 compiler has to assume the function will have side effects, and the call
 can't be removed.

 I'm not sure the __assume keyword that Michal mentioned helps. It's more a
 hint to the compiler to help it optimize code around the assertion, but
 perhaps it helps with the warnings too.

   
If I try to compile this:

__assume(lalala);

I get:

error C2065: 'lalala' : undeclared identifier

On the other hand, the compiler is going to be serious about the 
assumptions inside __assume(), and if they happen to be false, the 
application can behave unexpectedly. This goes against the current gallium 
paradigm, where we put in assertions, but also do the same check in 
non-debug builds to early out from a function or provide default values 
(e.g. in switch-case statements).
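
For comparison, the portable `if (0)` form from earlier in the thread also keeps the expression visible to the compiler (so an undeclared identifier is still an error) but makes no promise to the optimizer. A minimal sketch; the macro and function names are made up for illustration:

```c
/* Release-build variant discussed in the thread: the expression is still
 * parsed and type-checked, but the `if (0)` guard means it never runs. */
#define SKETCH_ASSERT_RELEASE(expr) \
   do { if (0) { (void)(expr); } } while (0)

static int sketch_calls = 0;

/* A check with a visible side effect, so we can tell whether it ran. */
static int
sketch_expensive_check(void)
{
   sketch_calls++;
   return 1;
}

/* Returns how many times the check was actually executed. */
static int
sketch_run(void)
{
   SKETCH_ASSERT_RELEASE(sketch_expensive_check());
   /* SKETCH_ASSERT_RELEASE(lalala); would not compile: undeclared identifier. */
   return sketch_calls;
}
```

Whether the compiler also drops the dead call from the generated code is up to the optimizer; the guarantee here is only that it is never executed.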


Re: [Mesa3d-dev] gallium: add blitter

2009-12-14 Thread michal

José Fonseca wrote:
 On Mon, 2009-12-14 at 08:58 -0800, michal wrote:
   
 José Fonseca wrote:
 
 On Mon, 2009-12-14 at 08:22 -0800, Keith Whitwell wrote:
   
   
 On Mon, 2009-12-14 at 08:19 -0800, Keith Whitwell wrote:
 
 
 On Mon, 2009-12-14 at 08:04 -0800, José Fonseca wrote:
   
   
 On Mon, 2009-12-14 at 05:39 -0800, Keith Whitwell wrote:
 
 
 On Sun, 2009-12-13 at 15:27 -0800, Marek Olšák wrote:
   
   
 +static INLINE
 +void util_blitter_save_fragment_sampler_states(
 +  struct blitter_context *blitter,
 +  int num_sampler_states,
 +  void **sampler_states)
 +{
 +   assert(num_textures <= 32);
 +
 +   blitter->saved_num_sampler_states = num_sampler_states;
 +   memcpy(blitter->saved_sampler_states, sampler_states,
 +          num_sampler_states * sizeof(void *));
 +}
 + 
 
 
 Have you tried compiling with debug enabled?  The assert above fails to
 compile.  Also, can you use Elements() or similar instead of the
 hard-coded 32?

 Maybe we can figure out how to go back to having asserts keep exposing
 their contents to the compiler even on non-debug builds.  This used to
 work without problem on linux and helped a lot to avoid these type of
 problems.
   
   
 I wouldn't say without a problem: defining assert(expr) as (void)0
 instead of (void)(expr) on release builds yielded a non-negligible
 performance improvement. I don't recall the exact figure, but I believe
 it was the 3-5% for the driver I was benchmarking at the time. YMMV.
 Different drivers will give different results, but there's nothing
 platform specific about this.
 
 
 It's not hard to avoid executing code...  For instance we could always
 have it translated to something like:

   if (0) {
 (void)(expr);
   }
   (void)(0)

   
   
 Obviously I would have meant to say something cleaner like:

  do {
if (0) { (void)(expr);  }
  }
  while (0)
 
 
 This only works if expr has no calls, or just inline calls. Using my
 earlier example, if very_expensive_check() is in another file then the
 compiler has to assume the function will have side effects, and the call
 can't be removed.

 I'm not sure the __assume keyword that Michal mentioned helps. It's more a
 hint to the compiler to help it optimize code around the assertion, but
 perhaps it helps with the warnings too.

   
   
 If I try to compile this:

 __assume(lalala);

 I get:

 error C2065: 'lalala' : undeclared identifier

 On the other hand, the compiler is going to be serious about the 
 assumptions inside __assume(), and if they happen to be false, the 
 application can behave unexpectedly. This goes against the current gallium 
 paradigm, where we put in assertions, but also do the same check in 
 non-debug builds to early out from a function or provide default values 
 (e.g. in switch-case statements).
 

 Bummer... that's no good.


   
On the third hand, we could transform the following idiom

switch (foo) {
case 1:
   bar = 22;
   break;
default:
   assert(0);
   bar = 11;   /* Safe value. */
}

to use some flavour of assert() that doesn't get substituted with 
__assume() on non-debug builds. Something like weak_assert() or 
warning(). Then assert() could be used in places where there is no 
backup plan and the app is going to crash anyway.

Or... do the opposite and introduce strong_assert() that translates to 
__assume() and leave assert() as it is now.
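
A rough sketch of how the two flavours could look. weak_assert() and strong_assert() are only the names floated above, nothing that exists in the tree; SKETCH_ASSUME and the use of the DEBUG macro are assumptions for illustration. __assume() is the MSVC intrinsic and __builtin_unreachable() is the closest GCC/clang equivalent.

```c
#include <stdio.h>
#include <stdlib.h>

/* Optimizer hint: behaviour is undefined if expr turns out to be false. */
#if defined(_MSC_VER)
#define SKETCH_ASSUME(expr) __assume(expr)
#elif defined(__GNUC__)
#define SKETCH_ASSUME(expr) \
   do { if (!(expr)) __builtin_unreachable(); } while (0)
#else
#define SKETCH_ASSUME(expr) ((void)0)
#endif

#ifdef DEBUG
/* Debug builds: both flavours report the failure and stop. */
#define SKETCH_CHECK(expr) \
   do { \
      if (!(expr)) { \
         fprintf(stderr, "assertion failed: %s\n", #expr); \
         abort(); \
      } \
   } while (0)
#define weak_assert(expr)   SKETCH_CHECK(expr)
#define strong_assert(expr) SKETCH_CHECK(expr)
#else
/* Release builds: weak_assert() vanishes so the fallback code after it
 * still runs; strong_assert() becomes a promise to the optimizer. */
#define weak_assert(expr)   ((void)0)
#define strong_assert(expr) SKETCH_ASSUME(expr)
#endif

/* There is a backup plan here, so the weak flavour is the right one. */
static int
sketch_lookup(int foo)
{
   int bar;

   switch (foo) {
   case 1:
      bar = 22;
      break;
   default:
      weak_assert(0);
      bar = 11;   /* Safe value. */
      break;
   }
   return bar;
}

/* No backup plan: a zero divisor would crash anyway. */
static int
sketch_div(int a, int b)
{
   strong_assert(b != 0);
   return a / b;
}
```

With DEBUG undefined, sketch_lookup(7) quietly returns the safe value 11, which is exactly the behaviour __assume() would have destroyed.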


Re: [Mesa3d-dev] [PATCH] Add extra dimension info to TGSI declarations.

2009-12-14 Thread michal

Keith Whitwell wrote:
 On Mon, 2009-12-14 at 06:51 -0800, michal wrote: 
   
 Zack Rusin wrote:
 
 On Monday 14 December 2009 09:29:03 Keith Whitwell wrote:
   
   
 On Mon, 2009-12-14 at 06:23 -0800, michal wrote:
 
 
 To fully support geometry shaders, we need some means to declare a
 two-dimensional register file. The following declaration

 DCL IN[3][0]

 would declare an input register with index 0 (first dimension) and size
 3 (second dimension). Since the second dimension is a size, not an index
 (or, for that matter, an index range), a new token has been added that
 specifies the declared size of the register.
   
   
 Is this a good representation?  What would happen if there was:

 DCL IN[4][0]
 DCL IN[3][1]

 Presumably the 3 is always going to be 3, and it's a property of the
 geometry shader - I think Zack has a patch which adds something like:

 PROP GS_VERTICES_IN 3

 Then couldn't we just have the equivalent of:

 DCL IN[][0]
 DCL IN[][1]

 with the size of the first dimension specified by the property?
 
 
 Yea, that's what I thought the dimensional arrays should look like for GS
 in TGSI (they already do in GLSL and HLSL).
   
   
 Actually, GS_VERTICES_IN could be derived from GS_INPUT_PRIM property.

 GL_ARB_geometry_shader4 has this mapping:

 

                              Value of built-in
 Input primitive type         gl_VerticesIn
 -----------------------      -----------------
 POINTS                       1
 LINES                        2
 LINES_ADJACENCY_ARB          4
 TRIANGLES                    3
 TRIANGLES_ADJACENCY_ARB      6

 

 But that also defeats the purpose of this patch -- INPUT registers would 
 have implied two-dimensionality when declared inside GS.
 

 We have agreed on that, it's true...

 So is this patch necessary?  Is it sufficient to simply make the
 statements that:

 a) Geometry shader INPUTs are always two dimensional
 b) The first dimension is determined by the input primitive type?

   
Yes, thanks.


Re: [Mesa3d-dev] [PATCH] Add extra dimension info to TGSI declarations.

2009-12-14 Thread michal

Zack Rusin wrote:
 On Monday 14 December 2009 12:49:53 michal wrote:
   
 Keith Whitwell wrote:
 
 On Mon, 2009-12-14 at 06:51 -0800, michal wrote:
   
 Zack Rusin wrote:
 
 On Monday 14 December 2009 09:29:03 Keith Whitwell wrote:
   
 On Mon, 2009-12-14 at 06:23 -0800, michal wrote:
 
 To fully support geometry shaders, we need some means to declare a
 two-dimensional register file. The following declaration

 DCL IN[3][0]

 would declare an input register with index 0 (first dimension) and
 size 3 (second dimension). Since the second dimension is a size, not
 an index (or, for that matter, an index range), a new token has been
 added that specifies the declared size of the register.
   
 Is this a good representation?  What would happen if there was:

 DCL IN[4][0]
 DCL IN[3][1]

 Presumably the 3 is always going to be 3, and it's a property of
 the geometry shader - I think Zack has a patch which adds something
 like:

 PROP GS_VERTICES_IN 3

 Then couldn't we just have the equivalent of:

 DCL IN[][0]
 DCL IN[][1]

 with the size of the first dimension specified by the property?
 
 Yea, that's what I thought the dimensional arrays should look like for
 GS in TGSI (they already do in GLSL and HLSL).
   
 Actually, GS_VERTICES_IN could be derived from GS_INPUT_PRIM property.

 GL_ARB_geometry_shader4 has this mapping:

 

                              Value of built-in
 Input primitive type         gl_VerticesIn
 -----------------------      -----------------
 POINTS                       1
 LINES                        2
 LINES_ADJACENCY_ARB          4
 TRIANGLES                    3
 TRIANGLES_ADJACENCY_ARB      6

 

 But that also defeats the purpose of this patch -- INPUT registers would
 have implied two-dimensionality when declared inside GS.
 
 We have agreed on that, it's true...

 So is this patch necessary?  Is it sufficient to simply make the
 statements that:

 a) Geometry shader INPUTs are always two dimensional
 b) The first dimension is determined by the input primitive type?
   
 Yes, thanks.
 

 k, i'm a bit confused. i can't say it's very pretty but it works so i'm cool 
 with any form of declarations but where does that leave the problem of 
 actually accessing those inputs? i mean how will we access the color of the 
 second vertex if multidimensional arrays don't exist.
 will it be
 GEOM
 PROPERTY GS_INPUT_PRIMITIVE TRIANGLES
   
This basically says: The first dimension of IN is 3.

 DCL IN[0], POSITION
   
This should read

DCL IN[][0], POSITION

 DCL OUT[0], POSITION
  MOV OUT[0], IN[0][0]
  EMIT_VERTEX
  MOV OUT[0], IN[1][0]
  EMIT_VERTEX
  MOV OUT[0], IN[2][0]
  EMIT_VERTEX
  END_PRIMITIVE
 END

   
And the above can already be expressed with what's inside 
p_shader_tokens.h, including 2D INPUT access.
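
To illustrate what a consumer does with IN[vertex][attrib], here is one possible way to flatten the two indices into a single register slot. This is only a sketch of the idea; the function name and the vertex-major layout are assumptions, not something p_shader_tokens.h or any driver prescribes.

```c
/* Maps IN[vertex][attrib] to a flat slot, vertex-major.
 * num_vertices is implied by the input primitive (3 for TRIANGLES),
 * num_attribs is the number of DCL IN[][n] declarations.
 * Returns -1 for an out-of-range access. */
static int
sketch_input_index(unsigned num_vertices, unsigned num_attribs,
                   unsigned vertex, unsigned attrib)
{
   if (vertex >= num_vertices || attrib >= num_attribs)
      return -1;
   return (int)(vertex * num_attribs + attrib);
}
```

For the shader above (TRIANGLES, one attribute), IN[2][0] lands in slot 2 and IN[3][0] is rejected.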




Re: [Mesa3d-dev] Gallium3d shader declarations

2009-12-11 Thread michal

Zack Rusin wrote:
 On Wednesday 09 December 2009 15:07:45 michal wrote:
   
 Keith Whitwell wrote:
 
 On Wed, 2009-12-09 at 10:19 -0800, Keith Whitwell wrote:
   
 On Wed, 2009-12-09 at 07:18 -0800, Zack Rusin wrote:
 
 On Wednesday 09 December 2009 10:05:13 michal wrote:
   
 Zack Rusin wrote:
 
 On Wednesday 09 December 2009 08:55:09 michal wrote:
   
 Zack Rusin wrote:
 
 On Wednesday 09 December 2009 08:44:20 Keith Whitwell wrote:
   
 On Wed, 2009-12-09 at 04:41 -0800, michal wrote:
 
 Zack Rusin wrote:
   
 Hi,

 currently Gallium3d shaders predefine all their inputs/outputs.
 We've handled all inputs/outputs the same way. e.g.
 VERT
 DCL IN[0]
 DCL OUT[0], POSITION
 DCL OUT[1], COLOR
 DCL CONST[0..9]
 DCL TEMP[0..3]
 or
 FRAG
 DCL IN[0], COLOR, LINEAR
 DCL OUT[0], COLOR

 There are certain inputs/output which don't really follow the
 typical rules for inputs/outputs though and we've been imitating
 those with extra normal semantics (e.g. front face).

 It all falls apart a bit on anything with shader model 4.x and
 up. That's because in there we've got what Microsoft calls
 system-value semantics
 ( http://msdn.microsoft.com/en-us/library/ee418355(VS.85).aspx#System_Value ).
 They all represent system-generated
 inputs/outputs for shaders. And while so far we only really had
 to handle front-face, since shader model 4.x we have to deal with
 lots of them (geometry shaders, domain shaders, compute
 shaders... they all have system-generated inputs/outputs).

 I'm thinking of adding something similar to what D3D does to
 Gallium3d. So just adding a new DCL type, e.g. DCL_SV which
 takes the vector name and the system-value semantic as its
 inputs, so FRAG DCL IN[0], COLOR, LINEAR
 DCL IN[1], COLOR[1], LINEAR
 DCL IN[2], FACE, CONSTANT
 would become
 FRAG
 DCL IN[0], COLOR, LINEAR
 DCL IN[1], COLOR[1], LINEAR
 DCL_SV IN[2], FACE

 It likely could be done in a more generic fashion though.
 Opinions?
 
 Zack,

 What would be the difference between

 DCL IN[2], FACE, CONSTANT

 and

 DCL_SV IN[2], FACE

 then? Maybe the example is bad, but I don't see what DCL_SV would
 give us the existing DCL doesn't.
   
 I'd have proposed something slightly different where the SV values
 don't land in the INPUT file but some new register file.

 The reason is that when we start looking at geometry shaders, the
 INPUT register file becomes two-dimensional, but these SV values
 remain single-dimensional.  That means that for current TGSI we'd
 have stuff like:

 DCL IN[0..3][0] POSITION
 DCL IN[0..3][1] COLOR
 DCL IN[2] SOME_SYSTEM_VALUE

 Which is pretty nasty - half of the input file is one dimensional,
 half two-dimensional, and you need to look at the index of the
 first dimension to figure out whether the input reg is legal or
 not.

 So, I'm thinking some new register file to handle these
 system-generated values is one possibility, as in:

 DCL SV[0], FACE

 or

 DCL SV[1],  PRIMITIVE_ID

 Thoughts?
 
 Yea, I like that.

 And then separate syntax to handle the properties or overloading
 DCL? i.e. DCL GS_INFO  PRIM_IN TRIANGLES
 vs
 PROPERTY GS_INFO PRIM_IN TRIANGLES
 ?
   
 I think a geometry shader should have its own GS_INFO token that
 would convey the information it needs, i.e. no overloading of the
 DCL token.

 GS_INFO PRIM_IN TRIANGLES
 GS_INFO PRIM_OUT TRIANGLE_STRIP
 GS_INFO MAX_VERTEX_COUNT 3 /* vertices_out for gl */
 
 We'll be adding more of those then. Basically we'll need an extra
 token for every shader we have.

 COMPUTE_INFO WORK_GROUP_SIZE 4 4 4 /*x, y, z*/
 DS_INFO DOMAIN 3 /*domain shader*/
 HS_INFO MAXTESSFACTOR 3 /*hull shader*/
 FS_INFO EARLYDEPTSTENCIL 1
 etc.

 To me it looks uglier than a special declaration token that could
 handle all of them.
   
 Can you propose a patch against p_shader_tokens.h that introduces a
 PROPERTY token?
 
 I could do that but only if we agree it's in the name of love.

 So is everyone ok with a new register SV for system generated values
 and new declaration token called PROPERTY for shader specific
 properties (btw, d3d calls those attributes, but since attributes
 already have a meaning in glsl I think it'd probably be wise to not try to
 redefine it).
   
 I'm OK with this general plan, though I'm not sure about these FS
 properties - early depth/stencil depends on more than just the shader as
 long as we continue to support legacy alphatest, for instance.  This is
 probably something the driver has to figure out for itself based on the
 peculiarities of the hardware - some hardware may not even have such a
 concept.

 In terms of the SV register file, what do we do with the existing system
 values -- I'm guessing things like the FACE input semantic in fragment
 shaders is now a SV, right?

 Also, how

Re: [Mesa3d-dev] glsl-pp-rework-2 branch merge

2009-12-11 Thread michal

Roland Scheidegger wrote:
 On 09.12.2009 18:58, michal wrote:
   
 Keith Whitwell wrote:
 
 On Wed, 2009-12-09 at 09:16 -0800, michal wrote:
   
   
 Hi all,

 I would like to merge this branch back to master this week. If anyone 
 could test whether the build works on their system, it would be nice.

 Thanks.
 
 
 Michal,

 Can you detail what testing you've done on this branch and which
 environments you have/haven't built on?


   
   
 Testing:

 * Capture the output of the old syntax parser and compare it with the 
 output of the new parser. No regressions found. Used a set of over 400 
 shaders to perform the comparison.

 * Run GLSL Parser Test to see if the new parser successfully integrates 
 with the rest of Mesa. No regressions found.

 So far I have been building that with scons on windows. I am planning to 
 fix the build with make and scons on linux.
 
 Seems to compile just fine now with make.
 Too bad all the strict-aliasing violations are still there (in
 grammar.c), I'll give this a look (but don't wait for it for merging).
 Also, there seems to be some char/byte uncleanliness, I get a gazillion
 warnings like:
 shader/grammar/grammar.c: In function ‘get_spec’:
 shader/grammar/grammar.c:1978: warning: pointer targets in passing
 argument 1 of ‘strlen’ differ in signedness
 /usr/include/string.h:397: note: expected ‘const char *’ but argument is
 of type ‘byte *’
 shader/grammar/grammar.c:1978: warning: pointer targets in passing
 argument 1 of ‘__builtin_strcmp’ differ in signedness
 shader/grammar/grammar.c:1978: note: expected ‘const char *’ but
 argument is of type ‘byte *’
 shader/grammar/grammar.c:1978: warning: pointer targets in passing
 argument 1 of ‘strlen’ differ in signedness

   
Don't worry about it, Roland. This will go away once we merge (there's 
still grammar dependency in the branch, but not any more in master).


Re: [Mesa3d-dev] Mesa (glsl-pp-rework-2): scons: Get GLSL code building correctly when cross compiling.

2009-12-10 Thread michal

José Fonseca wrote:
On Thu, 2009-12-10 at 08:31 -0800, Jose Fonseca wrote:

Module: Mesa
Branch: glsl-pp-rework-2
Commit: 491f384c3958067e6c4c994041f5d8d413b806bc
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=491f384c3958067e6c4c994041f5d8d413b806bc

Author: José Fonseca jfons...@vmware.com
Date: Thu Dec 10 16:29:04 2009 +

scons: Get GLSL code building correctly when cross compiling.

This is quite messy. GLSL code has to be built twice: one for the
host OS, another for the target OS.

Michal,

I managed to get linux-windows cross compilation working again, but it
was *very* complicated, because src/glsl/{pp,cl} has to be built twice
-- one for the host os, another for the target os --, therefore we must
ensure the .o and .a files are stored in different places.

It's really messy and ugly, and any Linux distribution which uses cross
compilers as part of their build process will have a hard time to
package this.

Is this absolutely necessary?

No, it isn't. Another option is to write a Python script that converts
the .gc files into headers containing string literals and to compile the
library from those at runtime. I should be able to produce that script
when I get some free time on my hands; in the meantime we can use the
existing solution. Thanks for helping me with that!
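
The conversion itself is small. To show the idea, here is the per-line quoting step sketched in C (the eventual script would be Python, as said above); the function name and the fixed buffer size are made up for illustration.

```c
#include <stddef.h>

/* Turns one line of shader source into a C string literal with a trailing
 * newline escape, so that consecutive lines concatenate into one string.
 * Returns a pointer to a static buffer, or NULL if the line is too long. */
static const char *
sketch_quote_line(const char *src)
{
   static char buf[256];
   size_t n = 0;

   buf[n++] = '"';
   for (; *src; src++) {
      /* Reserve 2 bytes for this character plus 4 for the tail and NUL. */
      if (n + 6 > sizeof(buf))
         return NULL;
      /* Quotes and backslashes must be escaped inside a literal. */
      if (*src == '"' || *src == '\\')
         buf[n++] = '\\';
      buf[n++] = *src;
   }
   buf[n++] = '\\';
   buf[n++] = 'n';
   buf[n++] = '"';
   buf[n] = '\0';
   return buf;
}
```

Emitting one such literal per input line into a header gives a single string constant the library can be compiled from at runtime, with no host-side build of the compiler needed.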


Re: [Mesa3d-dev] Gallium3d shader declarations

2009-12-09 Thread michal

Zack Rusin wrote:
 Hi,

 currently Gallium3d shaders predefine all their inputs/outputs. We've handled 
 all inputs/outputs the same way. e.g.
 VERT
 DCL IN[0]
 DCL OUT[0], POSITION
 DCL OUT[1], COLOR   
 DCL CONST[0..9] 
 DCL TEMP[0..3] 
 or 
 FRAG
 DCL IN[0], COLOR, LINEAR   
 DCL OUT[0], COLOR

 There are certain inputs/output which don't really follow the typical rules 
 for inputs/outputs though and we've been imitating those with extra normal 
 semantics (e.g. front face).

 It all falls apart a bit on anything with shader model 4.x and up. That's 
 because in there we've got what Microsoft calls system-value semantics.
 ( http://msdn.microsoft.com/en-us/library/ee418355(VS.85).aspx#System_Value 
 ). 
 They all represent system-generated inputs/outputs for shaders. And while so 
 far we only really had to handle front-face since shader model 4.x we have to 
 deal with lots of them (geometry shaders, domain shaders, compute shaders... 
 they all have system generated inputs/outputs)

 I'm thinking of adding something similar to what D3D does to Gallium3d. So 
 just adding a new DCL type, e.g. DCL_SV which takes the vector name and the 
 system-value semantic as its inputs, so
 FRAG
 DCL IN[0], COLOR, LINEAR
 DCL IN[1], COLOR[1], LINEAR 
 DCL IN[2], FACE, CONSTANT 
 would become
 FRAG
 DCL IN[0], COLOR, LINEAR
 DCL IN[1], COLOR[1], LINEAR 
 DCL_SV IN[2], FACE

 It likely could be done in a more generic fashion though. Opinions?

   
Zack,

What would be the difference between

DCL IN[2], FACE, CONSTANT

and

DCL_SV IN[2], FACE

then? Maybe the example is bad, but I don't see what DCL_SV would give 
us that the existing DCL doesn't.

Thanks.


Re: [Mesa3d-dev] Mesa (pipe-format-simplify): Simplify the redundant meaning of format layout.

2009-12-09 Thread michal

José Fonseca wrote:
 This is not true. UTIL_FORMAT_LAYOUT_* are needed for code generation in
 u_format_access.py and llvmpipe.

 It seems what you want here is an is-compressed flag. If so, add
 that flag to util_format_description, modify u_format_table.py to
 generate that flag, and leave UTIL_FORMAT_LAYOUT untouched.

   
Agreed -- I had no idea you were going to use UTIL_FORMAT_LAYOUT 
information in llvmpipe without using the .csv file.

I think I am going to provide a convenience function that returns the 
information I want.
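
A minimal sketch of what such a convenience function could look like, assuming a per-format description that carries a layout value. The `example_*` names and the DXT layout entry are placeholders for illustration, not the actual u_format.h definitions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the UTIL_FORMAT_LAYOUT_* values. */
enum example_format_layout {
   EXAMPLE_LAYOUT_ARITH,
   EXAMPLE_LAYOUT_ARRAY,
   EXAMPLE_LAYOUT_DXT     /* assumed to be the only compressed layout here */
};

/* Hypothetical stand-in for the per-format description record. */
struct example_format_description {
   const char *name;
   enum example_format_layout layout;
};

/* Answers "is this format compressed?" without changing the layout enum,
 * so code generators that depend on the layout values stay untouched. */
bool
example_format_is_compressed(const struct example_format_description *desc)
{
   assert(desc != NULL);

   switch (desc->layout) {
   case EXAMPLE_LAYOUT_DXT:
      return true;
   case EXAMPLE_LAYOUT_ARITH:
   case EXAMPLE_LAYOUT_ARRAY:
      return false;
   }
   return false;
}
```

Callers would then ask the helper instead of inspecting the layout directly, which keeps the layout values free for the code generators.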

Thanks.


Re: [Mesa3d-dev] Gallium3d shader declarations

2009-12-09 Thread michal

Zack Rusin wrote:
 On Wednesday 09 December 2009 08:44:20 Keith Whitwell wrote:
   
 On Wed, 2009-12-09 at 04:41 -0800, michal wrote:
 
 Zack Rusin wrote:
   
 Hi,

 currently Gallium3d shaders predefine all their inputs/outputs. We've
 handled all inputs/outputs the same way. e.g.
 VERT
 DCL IN[0]
 DCL OUT[0], POSITION
 DCL OUT[1], COLOR
 DCL CONST[0..9]
 DCL TEMP[0..3]
 or
 FRAG
 DCL IN[0], COLOR, LINEAR
 DCL OUT[0], COLOR

 There are certain inputs/output which don't really follow the typical
 rules for inputs/outputs though and we've been imitating those with
 extra normal semantics (e.g. front face).

 It all falls apart a bit on anything with shader model 4.x and up.
 That's because in there we've got what Microsoft calls system-value
 semantics. (
 http://msdn.microsoft.com/en-us/library/ee418355(VS.85).aspx#System_Value
 ). They all represent system-generated inputs/outputs for shaders.
 And while so far we only really had to handle front-face since shader
 model 4.x we have to deal with lots of them (geometry shaders, domain
 shaders, compute shaders... they all have system generated
 inputs/outputs)

 I'm thinking of adding something similar to what D3D does to Gallium3d.
 So just adding a new DCL type, e.g. DCL_SV which takes the vector name
 and the system-value semantic as its inputs, so
 FRAG
 DCL IN[0], COLOR, LINEAR
 DCL IN[1], COLOR[1], LINEAR
 DCL IN[2], FACE, CONSTANT
 would become
 FRAG
 DCL IN[0], COLOR, LINEAR
 DCL IN[1], COLOR[1], LINEAR
 DCL_SV IN[2], FACE

 It likely could be done in a more generic fashion though. Opinions?
 
 Zack,

 What would be the difference between

 DCL IN[2], FACE, CONSTANT

 and

 DCL_SV IN[2], FACE

 then? Maybe the example is bad, but I don't see what DCL_SV would give
 us the existing DCL doesn't.
   
 I'd have proposed something slightly different where the SV values don't
 land in the INPUT file but some new register file.

 The reason is that when we start looking at geometry shaders, the INPUT
 register file becomes two-dimensional, but these SV values remain
 single-dimensional.  That means that for current TGSI we'd have stuff
 like:

 DCL IN[0..3][0] POSITION
 DCL IN[0..3][1] COLOR
 DCL IN[2] SOME_SYSTEM_VALUE

 Which is pretty nasty - half of the input file is one dimensional, half
 two-dimensional, and you need to look at the index of the first
 dimension to figure out whether the input reg is legal or not.

 So, I'm thinking some new register file to handle these system-generated
 values is one possibility, as in:

 DCL SV[0], FACE

 or

 DCL SV[1],  PRIMITIVE_ID

 Thoughts?
 

 Yea, I like that.

 And then separate syntax to handle the properties or overloading DCL? i.e.
 DCL GS_INFO  PRIM_IN TRIANGLES
 vs
 PROPERTY GS_INFO PRIM_IN TRIANGLES
 ?
   
I think a geometry shader should have its own GS_INFO token that would 
convey the information it needs, i.e. no overloading of the DCL token.

GS_INFO PRIM_IN TRIANGLES
GS_INFO PRIM_OUT TRIANGLE_STRIP
GS_INFO MAX_VERTEX_COUNT 3 /* vertices_out for gl */





Re: [Mesa3d-dev] Gallium3d shader declarations

2009-12-09 Thread michal

Zack Rusin wrote:
 On Wednesday 09 December 2009 08:55:09 michal wrote:
   
 Zack Rusin wrote:
 
 On Wednesday 09 December 2009 08:44:20 Keith Whitwell wrote:
   
 On Wed, 2009-12-09 at 04:41 -0800, michal wrote:
 
 Zack Rusin wrote:
   
 Hi,

 currently Gallium3d shaders predefine all their inputs/outputs. We've
 handled all inputs/outputs the same way. e.g.
 VERT
 DCL IN[0]
 DCL OUT[0], POSITION
 DCL OUT[1], COLOR
 DCL CONST[0..9]
 DCL TEMP[0..3]
 or
 FRAG
 DCL IN[0], COLOR, LINEAR
 DCL OUT[0], COLOR

 There are certain inputs/output which don't really follow the typical
 rules for inputs/outputs though and we've been imitating those with
 extra normal semantics (e.g. front face).

 It all falls apart a bit on anything with shader model 4.x and up.
 That's because in there we've got what Microsoft calls system-value
 semantics. (
 http://msdn.microsoft.com/en-us/library/ee418355(VS.85).aspx#System_Value
 ). They all represent system-generated inputs/outputs for shaders.
 And while so far we only really had to handle front-face since shader
 model 4.x we have to deal with lots of them (geometry shaders, domain
 shaders, compute shaders... they all have system generated
 inputs/outputs)

 I'm thinking of adding something similar to what D3D does to
 Gallium3d. So just adding a new DCL type, e.g. DCL_SV which takes the
 vector name and the system-value semantic as its inputs, so
 FRAG
 DCL IN[0], COLOR, LINEAR
 DCL IN[1], COLOR[1], LINEAR
 DCL IN[2], FACE, CONSTANT
 would become
 FRAG
 DCL IN[0], COLOR, LINEAR
 DCL IN[1], COLOR[1], LINEAR
 DCL_SV IN[2], FACE

 It likely could be done in a more generic fashion though. Opinions?
 
 Zack,

 What would be the difference between

 DCL IN[2], FACE, CONSTANT

 and

 DCL_SV IN[2], FACE

 then? Maybe the example is bad, but I don't see what DCL_SV would give
 us the existing DCL doesn't.
   
 I'd have proposed something slightly different where the SV values don't
 land in the INPUT file but some new register file.

 The reason is that when we start looking at geometry shaders, the INPUT
 register file becomes two-dimensional, but these SV values remain
 single-dimensional.  That means that for current TGSI we'd have stuff
 like:

 DCL IN[0..3][0] POSITION
 DCL IN[0..3][1] COLOR
 DCL IN[2] SOME_SYSTEM_VALUE

 Which is pretty nasty - half of the input file is one dimensional, half
 two-dimensional, and you need to look at the index of the first
 dimension to figure out whether the input reg is legal or not.

 So, I'm thinking some new register file to handle these system-generated
 values is one possibility, as in:

 DCL SV[0], FACE

 or

 DCL SV[1],  PRIMITIVE_ID

 Thoughts?
 
 Yea, I like that.

 And then separate syntax to handle the properties or overloading DCL?
 i.e. DCL GS_INFO  PRIM_IN TRIANGLES
 vs
 PROPERTY GS_INFO PRIM_IN TRIANGLES
 ?
   
 I think a geometry shader should have its own GS_INFO token that would
 convey the information it needs, i.e. no overloading of the DCL token.

 GS_INFO PRIM_IN TRIANGLES
 GS_INFO PRIM_OUT TRIANGLE_STRIP
 GS_INFO MAX_VERTEX_COUNT 3 /* vertices_out for gl */
 

 We'll be adding more of those then. Basically we'll need an extra token for 
 every shader we have.

 COMPUTE_INFO WORK_GROUP_SIZE 4 4 4 /*x, y, z*/
 DS_INFO DOMAIN 3 /*domain shader*/
 HS_INFO MAXTESSFACTOR 3 /*hull shader*/
 FS_INFO EARLYDEPTSTENCIL 1
 etc.

 To me it looks uglier than a special declaration token that could handle all 
 of them.
   

Can you propose a patch against p_shader_tokens.h that introduces a 
PROPERTY token?
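
For discussion, here is one hypothetical shape a generic PROPERTY token could take: a header word carrying the token type, the total token count and a property name, followed by the property's data words. The bit layout, the enum values and every `example_*` name below are assumptions made for this sketch, not the contents of p_shader_tokens.h:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical property names; one enum covers all shader types. */
enum example_property_name {
   EXAMPLE_PROP_GS_PRIM_IN,
   EXAMPLE_PROP_GS_PRIM_OUT,
   EXAMPLE_PROP_GS_MAX_VERTEX_COUNT,
   EXAMPLE_PROP_CS_WORK_GROUP_SIZE
};

/* Arbitrary token type value chosen for this sketch. */
#define EXAMPLE_TOKEN_TYPE_PROPERTY 3u

/* Assumed header layout: bits 0-3 token type, bits 4-11 total number of
 * tokens (header included), bits 12-19 property name.
 * Returns the number of tokens written, or 0 if they do not fit. */
size_t
example_emit_property(uint32_t *out, size_t max_tokens,
                      enum example_property_name name,
                      const uint32_t *values, size_t nr_values)
{
   size_t nr_tokens = 1 + nr_values;
   size_t i;

   assert(out != NULL);
   assert(values != NULL || nr_values == 0);

   if (nr_tokens > max_tokens || nr_tokens > 255)
      return 0;

   out[0] = EXAMPLE_TOKEN_TYPE_PROPERTY |
            ((uint32_t) nr_tokens << 4) |
            (((uint32_t) name & 0xffu) << 12);

   for (i = 0; i < nr_values; i++)
      out[1 + i] = values[i];

   return nr_tokens;
}
```

With something along these lines, "COMPUTE_INFO WORK_GROUP_SIZE 4 4 4" and "GS_INFO MAX_VERTEX_COUNT 3" would differ only in the property name and the number of data words, so no per-shader token type would be needed.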



[Mesa3d-dev] glsl-pp-rework-2 branch merge

2009-12-09 Thread michal

Hi all,

I would like to merge this branch back to master this week. If anyone 
could test whether the build works on their system, it would be nice.

Thanks.


Re: [Mesa3d-dev] glsl-pp-rework-2 branch merge

2009-12-09 Thread michal

Keith Whitwell wrote:
 On Wed, 2009-12-09 at 09:16 -0800, michal wrote:
   
 Hi all,

 I would like to merge this branch back to master this week. If anyone 
 could test if the build works on his/her system, it would be nice.

 Thanks.
 

 Michal,

 Can you detail what testing you've done on this branch and which
 environments you have/haven't built on?


   
Testing:

* Capture the output of the old syntax parser and compare it with the 
output of the new parser, using a set of over 400 shaders. No 
regressions found.

* Run GLSL Parser Test to see if the new parser successfully integrates 
with the rest of Mesa. No regressions found.

So far I have been building it with scons on Windows. I am planning to 
fix the build with make and scons on Linux.


Re: [Mesa3d-dev] glsl-pp-rework-2 branch merge

2009-12-09 Thread michal

Kenneth Graunke wrote:
 On Wednesday 09 December 2009 09:16:57 michal wrote:
   
 Hi all,

 I would like to merge this branch back to master this week. If anyone
 could test if the build works on his/her system, it would be nice.

 Thanks.
 

 Hi Michal,

 I don't see any code in the new branch that handles the #extension directive. 
  
 In particular, the old branch had code to support ARB_draw_buffers and 
 ARB_texture_rectangle.  What's the status of these in glsl-pp-rework-2?

   
This is handled in sl_pp_extension.c.

 Also, according to the spec for those extensions, they're supposed to #define 
 their name (i.e. #define GL_ARB_draw_buffers 1)...but I don't see code in 
 either branch to do this.

   
Thanks, this bit is missing. I will fix that.

 --Kenneth

   



[Mesa3d-dev] Branch pipe-format-simplify open for review

2009-12-08 Thread michal

This branch simplifies pipe/p_format.h by making enum pipe_format what 
it should have been -- an enum.

As a result there is no extra information encoded in it and one needs to 
use auxiliary/util/u_format.h to get that info instead. Linking to the 
auxiliary/util lib is necessary.

Please review, and if you can test that it doesn't break your setup, I 
would appreciate it.

I would like to hear from the r300 and nouveau guys, as those drivers were 
using some internal macros and I wasn't 100% sure I got the conversion 
right.

Thanks!


Re: [Mesa3d-dev] Branch pipe-format-simplify open for review

2009-12-08 Thread michal

Christoph Bumiller wrote:
 michal wrote:
   
 This branch simplifies pipe/p_format.h by making enum pipe_format what 
 it should have been -- an enum.

 ...

 I would like to hear from r300 and nouveau guys, as those drivers were 
 using some internal macros and I wasn't 100% sure I got the conversion 
 right.
   
 
 Hi !
 In nv50_vbo.c/nv50_vbo_type_to_hw you imply that  UTIL_FORMAT_LAYOUT_ARITH
 means normalized (UNORM,  SNORM) and LAYOUT_ARRAY means SCALED, which
 seems to be not the case.

 PIPE_FORMAT_R32G32B32A32_SNORM for instance also has layout ARRAY.
 I'm not sure what ARRAY/ARITH are supposed to mean ...

   
Thanks, I will fix that.

 Anyway, you could probably base the check on channel[0].normalized,
 since the formats used for vertex elements are not mixed.
 I still don't see how to distinguish SCALED and INT though, which at
 some point will have to indicate integer attributes ...

   
Aren't those the same? What's the distinction between SCALED and INT on 
NV hardware?



Re: [Mesa3d-dev] [PATCH] adds glsl-shader-loader which is a framework that loads glsl shaders from a .ini file. The files can include test requirements, uniforms to pass to the shaders and expected va

2009-12-02 Thread michal

Ian Romanick wrote:
 + }
 +}
 +
 +void
 +testFile_parse()
 +{
 + FILE* filePointer;
 + int i=0, secLength=0, fileLength=0, state=0, currentTest=0;
 + char c;
 + char word[32];
 + char *cP;
 +
 + filePointer = fopen(filename, "rt");
 + if(!filePointer)
 + piglit_report_result(PIGLIT_FAILURE);
 +
 +
 + while(fgetc(filePointer)!=EOF)
 + fileLength++;
 +
 + if(fileLength < 1)
 + piglit_report_result(PIGLIT_FAILURE);
 +
 + fclose(filePointer);
 +
 + filePointer = fopen(filename, "rt");
 + buffer = (char*) malloc(fileLength+1);
 +
 + c = fgetc(filePointer);
 + while(c != EOF)
 + {
 + buffer[i] = c;
 + ++i;
 + c = fgetc(filePointer);
 + }
 +
 + buffer[i] = '\0';
 + fclose(filePointer);
 

 The code above made my eyes bleed.

 fp = fopen(filename, "r");
 if (fp == NULL)
 /* error */ ;

 fseek(fp, 0, SEEK_END);
 fileLength = ftell(fp);
 fseek(fp, 0, SEEK_SET);

 buffer = malloc(fileLength + 1);
 fread(buffer, fileLength, 1, fp);
 fclose(fp);

 buffer[fileLength] = '\0';

   
This won't always work on Windows: in text mode fread() converts 
newlines, but the file size calculated with the fseek()/ftell() pair 
does not take that conversion into account.

 Or just use piglit_load_text_file.

 buffer = piglit_load_text_file(filename, &fileLength);

   
Yes, use that one or open the file in binary mode.
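
A minimal sketch of the binary-mode variant, assuming plain C stdio. The helper name `example_load_file_binary` is made up for illustration and is not a piglit function. Opening with "rb" turns off newline translation, so the size from fseek()/ftell() matches what fread() delivers, and terminating at the count fread() actually returned keeps the buffer valid even on a short read:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper for illustration; not a piglit or Mesa function.
 * Returns a NUL-terminated, malloc'ed copy of the file, or NULL on error. */
char *
example_load_file_binary(const char *path, size_t *out_len)
{
   FILE *fp;
   long size;
   char *buf;
   size_t got;

   assert(path != NULL);

   /* "rb" disables newline translation, so ftell() and fread() agree. */
   fp = fopen(path, "rb");
   if (fp == NULL)
      return NULL;

   if (fseek(fp, 0, SEEK_END) != 0) {
      fclose(fp);
      return NULL;
   }
   size = ftell(fp);
   if (size < 0 || fseek(fp, 0, SEEK_SET) != 0) {
      fclose(fp);
      return NULL;
   }

   buf = malloc((size_t) size + 1);
   if (buf == NULL) {
      fclose(fp);
      return NULL;
   }

   /* Terminate at the number of bytes actually read, not the expected size. */
   got = fread(buf, 1, (size_t) size, fp);
   fclose(fp);
   buf[got] = '\0';

   if (out_len != NULL)
      *out_len = got;
   return buf;
}
```

The caller owns the returned buffer and releases it with free(). Any "\r\n" pairs in the file are kept as they are, so a parser working on the buffer has to accept both line endings.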




Re: [Mesa3d-dev] TGSI simplification branch

2009-12-01 Thread michal

Keith Whitwell wrote:
 On Thu, 2009-11-26 at 10:42 -0800, michal wrote:
   
 Keith Whitwell wrote:
 
 On Wed, 2009-11-25 at 08:51 -0800, michal wrote:
   
   
 michal wrote:
 
 
 Keith Whitwell wrote:
   
   
   
 On Wed, 2009-11-25 at 06:28 -0800, michal wrote:
   
 
 
 
 Keith Whitwell wrote:
 
   
   
   
 I've pushed a feature branch with some tgsi simplifications in it.  
 With
 these I've removed the biggest remaining oddities of that language, and
 it's getting to a place where I'm starting to be happy with it as a
 foundation for future work.

 Most of the surprising stuff like multiple negate flags, etc, is gone
 now, and the core tokens are quite a bit easier to understand than in
 previous iterations.

 I've still got my eye on reducing the verbosity of the names in the
 tgsi_parse.h FullToken world, and promoting the tgsi_any_token union
 into p_shader_tokens.h.

 It would be good if people can review the interface changes and provide
 feedback, and also test out their drivers on this branch.  I've done
 minimal softpipe testing so far but will do more over the next few 
 days.

   
   
 
 
 
 All looks good to me, I'm happy somebody had the guts to cut off all 
 the 
 cruft in one shot.

 I see some compile errors on windows build -- I will fix those along 
 with other minor bugs I have spotted.

 Now, looking at the interface, I'm thinking about removing some more 
 tokens.

 1) Remove tgsi_dimension and use tgsi_src_register directly with some 
 well-defined constraints.

 2) Do the same to tgsi_instruction_predicate. Really, it's just an 
 optional src operand with some restrictions.
 
   
   
   
 Interesting.  I'd be keen to see a patch.


   
 
 
 
 Attached. But the more I look at it the more lame it gets.

 Another option would be to define tgsi_any_register that would have 
 File, Index, Indirect and Dimension fields. Then there would be more 
 specialised tgsi_*_register tokens, that would be binary compatible with 
 the first one. One could cast them using a union and avoid more mistakes 
 at compile time. That way we don't have to put the constraints in 
 comments, but be more strict and use the compiler to enforce them. I 
 will follow up with a patch.
   
   
   
 Attached.
 
 
 This makes me wonder about a couple of other things, like whether 16
 bits is sufficient for the index value.  Probably it's fine, but it's not
 beyond belief to consider a constant buffer of 256k or larger.

 I'd consider dropping the generic_register struct and any idea of a
 union of these registers.  I'm not really sure we want to encourage the
 idea of people casting between these registers -- for the most part they
 should be building these things with ureg-style functions rather than
 messing around with the tokens directly.  

 If you can easily cast between registers, that defeats any static
 constraints you attempt to impose via the type system, and you may as
 well just use src_register for predicates and dimensions.  An
 interpreter which might benefit from being able to share some code paths
 for the different registers doesn't need the union to be public.

 Basically, this looks like a good regularization/cleanup, but let's drop
 generic_register and not create any public union of these register
 structs.

   
   
 Attached an updated patch.

 One thing to note in general is that by removing the Extended flags and 
 the fact that some of the tokens already use up all the available 32 
 bits, the only way to extend the language may be by incrementing the 
 version number in shader's header. This can be a good or a bad thing, 
 depending on the direction Gallium is heading, but with a bit of 
 discipline that should be a good thing.
 


 Michal,

 What's the status of this change?  Are you working on building up a full
 change based on this patch?

 I'd like to merge this branch sooner rather than later, so if you
 haven't got something that's pretty much ready to go, let's handle this
 change in a branch of its own.

   
Nothing more has been done for it, so go ahead and merge.


Re: [Mesa3d-dev] [PATCH] Add entrypoints for setting vertex texture state

2009-11-30 Thread michal


Attached an updated patch.

* Increased PIPE_MAX_VERTEX_SAMPLERS to 16.
* Removed PIPE_CAP_MAX_VERTEX_TEXTURES since there's already an 
equivalent PIPE_CAP_MAX_VERTEX_TEXTURE_UNITS.
* Added PIPE_CAP_MAX_COMBINED_SAMPLERS to query maximum texture image 
units accessible from vertex and fragment shaders combined.



Michal Krol wrote:

That means we need an additional cap bit to support 
GL_MAX_COMBINED_TEXTURE_IMAGE_UNITS because it's no longer a simple sum of max 
vertex and fragment samplers. For i965, the max vertex/fragment/combined 
sampler counts would then all be 16.

--
Michal Krol

From: Keith Whitwell
Sent: 28 November 2009 00:40
To: Michal Krol; Roland Scheidegger
Cc: mesa3d-dev
Subject: RE: [Mesa3d-dev] [PATCH] Add entrypoints for setting vertex texture state

The i965 can surely do 16, though maybe shared with the fragment shaders.

Keith

From: michal [mic...@vmware.com]
Sent: Friday, November 27, 2009 2:20 PM
To: Roland Scheidegger
Cc: Keith Whitwell; mesa3d-dev
Subject: Re: [Mesa3d-dev] [PATCH] Add entrypoints for setting vertex texture 
state

Roland Scheidegger wrote:
  

On 27.11.2009 19:32, michal wrote:



Why is the MAX here smaller than for fragment samplers?  Doesn't GL
require them to be the same, because GL effectively binds the same set
of sampler states in both cases?

Can you take a closer look at what the GL state tracker would have to do
to expose this functionality and make sure it's valid?





It's all good. There is GL_MAX_VERTEX_TEXTURE_UNITS that tells how many
samplers can be used in a vertex shader. Anything above that is used
only with fragment shaders and ignored for vertex shaders.

  

I fail to see though why the limit needs to be that low. All modern
hardware nowadays can use the same number of texture samplers for both
fragment and vertex shading (it's the same sampler hardware, after all).
Some older hardware (typically non-unified, D3D9 shader model 3
compliant) though indeed only had limited support for this (like the
GeForce 6/7 series) probably only supporting 4 (can't remember exactly),
though other hardware never implemented it despite d3d9 sm3 requiring it
(thanks to an API loophole).





Wow, it looks like I need to upgrade my hardware. I thought 4 vertex
texture units was generous. I have no problem with setting that limit to,
say, 16.

  


diff --git a/src/gallium/include/pipe/p_context.h 
b/src/gallium/include/pipe/p_context.h
index 5569001..4456620 100644
--- a/src/gallium/include/pipe/p_context.h
+++ b/src/gallium/include/pipe/p_context.h
@@ -123,7 +123,12 @@ struct pipe_context {
 
void * (*create_sampler_state)(struct pipe_context *,
   const struct pipe_sampler_state *);
-   void   (*bind_sampler_states)(struct pipe_context *, unsigned num, void **);
+   void   (*bind_fragment_sampler_states)(struct pipe_context *,
+  unsigned num_samplers,
+  void **samplers);
+   void   (*bind_vertex_sampler_states)(struct pipe_context *,
+unsigned num_samplers,
+void **samplers);
void   (*delete_sampler_state)(struct pipe_context *, void *);
 
void * (*create_rasterizer_state)(struct pipe_context *,
@@ -173,9 +178,13 @@ struct pipe_context {
void (*set_viewport_state)( struct pipe_context *,
const struct pipe_viewport_state * );
 
-   void (*set_sampler_textures)( struct pipe_context *,
- unsigned num_textures,
- struct pipe_texture ** );
+   void (*set_fragment_sampler_textures)(struct pipe_context *,
+ unsigned num_textures,
+ struct pipe_texture ** );
+
+   void (*set_vertex_sampler_textures)(struct pipe_context *,
+   unsigned num_textures,
+   struct pipe_texture **);
 
void (*set_vertex_buffers)( struct pipe_context *,
unsigned num_buffers,
diff --git a/src/gallium/include/pipe/p_defines.h 
b/src/gallium/include/pipe/p_defines.h
index fd14dc8..69a0970 100644
--- a/src/gallium/include/pipe/p_defines.h
+++ b/src/gallium/include/pipe/p_defines.h
@@ -390,6 +390,8 @@ enum

[Mesa3d-dev] ODP: [PATCH] Add entrypoints for setting vertex texture state

2009-11-28 Thread Michal Krol

That means we need an additional cap bit to support 
GL_MAX_COMBINED_TEXTURE_IMAGE_UNITS because it's no longer a simple sum of max 
vertex and fragment samplers. For i965, the max vertex/fragment/combined 
sampler counts would then all be 16.

--
Michal Krol

From: Keith Whitwell
Sent: 28 November 2009 00:40
To: Michal Krol; Roland Scheidegger
Cc: mesa3d-dev
Subject: RE: [Mesa3d-dev] [PATCH] Add entrypoints for setting vertex texture state

The i965 can surely do 16, though maybe shared with the fragment shaders.

Keith

From: michal [mic...@vmware.com]
Sent: Friday, November 27, 2009 2:20 PM
To: Roland Scheidegger
Cc: Keith Whitwell; mesa3d-dev
Subject: Re: [Mesa3d-dev] [PATCH] Add entrypoints for setting vertex texture 
state

Roland Scheidegger wrote:
 On 27.11.2009 19:32, michal wrote:

 Why is the MAX here smaller than for fragment samplers?  Doesn't GL
 require them to be the same, because GL effectively binds the same set
 of sampler states in both cases?

 Can you take a closer look at what the GL state tracker would have to do
 to expose this functionality and make sure it's valid?



 It's all good. There is GL_MAX_VERTEX_TEXTURE_UNITS that tells how many
 samplers can be used in a vertex shader. Anything above that is used
 only with fragment shaders and ignored for vertex shaders.

 I fail to see though why the limit needs to be that low. All modern
 hardware nowadays can use the same number of texture samplers for both
 fragment and vertex shading (it's the same sampler hardware, after all).
 Some older hardware (typically non-unified, D3D9 shader model 3
 compliant) though indeed only had limited support for this (like the
 GeForce 6/7 series) probably only supporting 4 (can't remember exactly),
 though other hardware never implemented it despite d3d9 sm3 requiring it
 (thanks to an API loophole).



Wow, it looks like I need to upgrade my hardware. I thought 4 vertex
texture units was generous. I have no problem with setting that limit to,
say, 16.


Re: [Mesa3d-dev] TGSI simplification branch

2009-11-27 Thread michal

Keith Whitwell wrote:
 Michal,

 It's really not the job of the shader representation to do this type of 
 versioning between halves of a driver.  If we ever get to a point where we 
 want to do versioning in gallium, we'll want the version control to cover the 
 entire interface, not just the shaders.

 Given that, and given we won't want to have more than 1 version of TGSI active within 
 a particular version of gallium, there's no purpose for separate versioning 
 of the shader token stream -- it's just one aspect of the total interface.  
 If we want to do such a sanity check, it should be done at context or screen 
 creation - well before we ever get around to creating shaders.  

 So I still don't see any point in a shader version token.  The fact that TGSI 
 has been through dramatic changes in its lifetime and still advertises itself 
 as 1.1 illustrates this - it's redundant currently and I don't see any use 
 for it in the future either...

   
That's done. It didn't even hurt that much.

Should I go ahead with the patch I sent you earlier?


[Mesa3d-dev] [PATCH] Add entrypoints for setting vertex texture state

2009-11-27 Thread michal

Hello,

Please review the patch below. It extends the gallium interface to allow 
setting vertex texture sampler states.

This is an optional feature -- drivers not wishing to implement it 
return 0 for PIPE_CAP_MAX_VERTEX_TEXTURES capability query. Drivers may 
also choose to support it, but always fall back to the software 
implementation from the draw auxiliary module.




diff --git a/src/gallium/include/pipe/p_context.h 
b/src/gallium/include/pipe/p_context.h
index 5569001..70f9c8b 100644
--- a/src/gallium/include/pipe/p_context.h
+++ b/src/gallium/include/pipe/p_context.h
@@ -124,6 +124,9 @@ struct pipe_context {
void * (*create_sampler_state)(struct pipe_context *,
   const struct pipe_sampler_state *);
void   (*bind_sampler_states)(struct pipe_context *, unsigned num, 
void **);
+   void   (*bind_vertex_sampler_states)(struct pipe_context *,
+unsigned num_samplers,
+void **samplers);
void   (*delete_sampler_state)(struct pipe_context *, void *);
 
void * (*create_rasterizer_state)(struct pipe_context *,
@@ -184,6 +187,10 @@ struct pipe_context {
void (*set_vertex_elements)( struct pipe_context *,
 unsigned num_elements,
 const struct pipe_vertex_element * );
+
+   void (*set_vertex_sampler_textures)(struct pipe_context *,
+   unsigned num_textures,
+   struct pipe_texture **);
/*...@}*/
 
 
diff --git a/src/gallium/include/pipe/p_defines.h 
b/src/gallium/include/pipe/p_defines.h
index fd14dc8..eac6904 100644
--- a/src/gallium/include/pipe/p_defines.h
+++ b/src/gallium/include/pipe/p_defines.h
@@ -390,6 +390,7 @@ enum pipe_transfer_usage {
 #define PIPE_CAP_BLEND_EQUATION_SEPARATE 28
 #define PIPE_CAP_SM3 29  /* Shader Model 3 
supported */
 #define PIPE_CAP_MAX_PREDICATE_REGISTERS 30
+#define PIPE_CAP_MAX_VERTEX_TEXTURES 31
 
 
 /**
diff --git a/src/gallium/include/pipe/p_state.h 
b/src/gallium/include/pipe/p_state.h
index 287b424..ce22f89 100644
--- a/src/gallium/include/pipe/p_state.h
+++ b/src/gallium/include/pipe/p_state.h
@@ -60,6 +60,7 @@ extern "C" {
 #define PIPE_MAX_COLOR_BUFS8
 #define PIPE_MAX_CONSTANT 32
 #define PIPE_MAX_SAMPLERS 16
+#define PIPE_MAX_VERTEX_SAMPLERS   4
 #define PIPE_MAX_SHADER_INPUTS16
 #define PIPE_MAX_SHADER_OUTPUTS   16
 #define PIPE_MAX_TEXTURE_LEVELS   16



Re: [Mesa3d-dev] [PATCH] Add entrypoints for setting vertex texture state

2009-11-27 Thread michal


Keith Whitwell wrote:

On Fri, 2009-11-27 at 10:10 -0800, michal wrote:
  

Hello,

Please review the patch below. It extends the gallium interface to allow 
setting vertex texture sampler states.


This is an optional feature -- drivers not wishing to implement it 
return 0 for PIPE_CAP_MAX_VERTEX_TEXTURES capability query. Drivers may 
also choose to support it, but always fall back to the software 
implementation from the draw auxiliary module.





Michal,

couple of comments inline.


  
diff --git a/src/gallium/include/pipe/p_context.h 
b/src/gallium/include/pipe/p_context.h

index 5569001..70f9c8b 100644
--- a/src/gallium/include/pipe/p_context.h
+++ b/src/gallium/include/pipe/p_context.h
@@ -124,6 +124,9 @@ struct pipe_context {
void * (*create_sampler_state)(struct pipe_context *,
   const struct pipe_sampler_state *);
void   (*bind_sampler_states)(struct pipe_context *, unsigned num, 
void **);

+   void   (*bind_vertex_sampler_states)(struct pipe_context *,
+unsigned num_samplers,
+void **samplers);
void   (*delete_sampler_state)(struct pipe_context *, void *);
 
void * (*create_rasterizer_state)(struct pipe_context *,

@@ -184,6 +187,10 @@ struct pipe_context {
void (*set_vertex_elements)( struct pipe_context *,
 unsigned num_elements,
 const struct pipe_vertex_element * );
+
+   void (*set_vertex_sampler_textures)(struct pipe_context *,
+   unsigned num_textures,
+   struct pipe_texture **);
/*...@}*/




If we're adding these functions, can the old ones be renamed to
fragment_sampler_states/textures for clarity?

  

Right, forgot about this one. Attached an updated version.


diff --git a/src/gallium/include/pipe/p_defines.h b/src/gallium/include/pipe/p_defines.h

index fd14dc8..eac6904 100644
--- a/src/gallium/include/pipe/p_defines.h
+++ b/src/gallium/include/pipe/p_defines.h
@@ -390,6 +390,7 @@ enum pipe_transfer_usage {
 #define PIPE_CAP_BLEND_EQUATION_SEPARATE 28
 #define PIPE_CAP_SM3 29  /* Shader Model 3 supported */

 #define PIPE_CAP_MAX_PREDICATE_REGISTERS 30
+#define PIPE_CAP_MAX_VERTEX_TEXTURES 31
 


 /**
diff --git a/src/gallium/include/pipe/p_state.h b/src/gallium/include/pipe/p_state.h

index 287b424..ce22f89 100644
--- a/src/gallium/include/pipe/p_state.h
+++ b/src/gallium/include/pipe/p_state.h
@@ -60,6 +60,7 @@ extern "C" {
 #define PIPE_MAX_COLOR_BUFS        8
 #define PIPE_MAX_CONSTANT         32
 #define PIPE_MAX_SAMPLERS         16
+#define PIPE_MAX_VERTEX_SAMPLERS   4
 #define PIPE_MAX_SHADER_INPUTS    16
 #define PIPE_MAX_SHADER_OUTPUTS   16
 #define PIPE_MAX_TEXTURE_LEVELS   16




Why is the MAX here smaller than for fragment samplers?  Doesn't GL
require them to be the same, because GL effectively binds the same set
of sampler states in both cases?  


Can you take a closer look at what the GL state tracker would have to do
to expose this functionality and make sure it's valid?

  


It's all good. There is GL_MAX_VERTEX_TEXTURE_IMAGE_UNITS that tells how many 
samplers can be used in a vertex shader. Anything above that is used 
only with fragment shaders and ignored for vertex shaders.
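
The rule stated above (one shared set of bound samplers, with the vertex stage seeing only the first few) could be sketched roughly as follows. The struct and function names are hypothetical, and the assumption that the vertex stage takes the leading units of the shared set is mine rather than something the patch spells out.

```c
#include <assert.h>

/* Hypothetical helper types; not part of the gallium interface. */
struct sketch_sampler_counts {
   unsigned fragment; /* how many samplers to bind for fragment shaders */
   unsigned vertex;   /* how many samplers to bind for vertex shaders */
};

/*
 * Split one GL-style set of bound samplers between the two stages.
 * The fragment stage gets all of them; the vertex stage is clamped
 * to the vertex texture unit limit reported by the driver.
 */
static struct sketch_sampler_counts
sketch_split_samplers(unsigned num_bound, unsigned max_vertex_units)
{
   struct sketch_sampler_counts counts;
   counts.fragment = num_bound;
   counts.vertex = num_bound < max_vertex_units ? num_bound : max_vertex_units;
   return counts;
}
```

With 16 samplers bound and a vertex limit of 4, the fragment count stays at 16 and the vertex count is 4; with a limit of 0 nothing is bound for the vertex stage.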


diff --git a/src/gallium/include/pipe/p_context.h b/src/gallium/include/pipe/p_context.h
index 5569001..b0f13e6 100644
--- a/src/gallium/include/pipe/p_context.h
+++ b/src/gallium/include/pipe/p_context.h
@@ -123,7 +123,12 @@ struct pipe_context {
 
void * (*create_sampler_state)(struct pipe_context *,
   const struct pipe_sampler_state *);
-   void   (*bind_sampler_states)(struct pipe_context *, unsigned num, void **);
+   void   (*bind_fragment_sampler_states)(struct pipe_context *,
+  unsigned num_samplers,
+  void **samplers);
+   void   (*bind_vertex_sampler_states)(struct pipe_context *,
+unsigned num_samplers,
+void **samplers);
void   (*delete_sampler_state)(struct pipe_context *, void *);
 
void * (*create_rasterizer_state)(struct pipe_context *,
@@ -173,9 +178,9 @@ struct pipe_context {
void (*set_viewport_state)( struct pipe_context *,
const struct pipe_viewport_state * );
 
-   void (*set_sampler_textures)( struct pipe_context *,
- unsigned num_textures,
- struct pipe_texture ** );
+   void (*set_fragment_sampler_textures)(struct pipe_context *,
+ unsigned num_textures,
+ struct pipe_texture ** );
 
void (*set_vertex_buffers)( struct pipe_context *,
unsigned

Re: [Mesa3d-dev] [PATCH] Add entrypoints for setting vertex texture state

2009-11-27 Thread michal

Roland Scheidegger wrote:
 On 27.11.2009 19:32, michal wrote:
   
 Why is the MAX here smaller than for fragment samplers?  Doesn't GL
 require them to be the same, because GL effectively binds the same set
 of sampler states in both cases?  

 Can you take a closer look at what the GL state tracker would have to do
 to expose this functionality and make sure it's valid?

   
   
 It's all good. There is GL_MAX_VERTEX_TEXTURE_IMAGE_UNITS that tells how many 
 samplers can be used in a vertex shader. Anything above that is used 
 only with fragment shaders and ignored for vertex shaders.
 
 I fail to see though why the limit needs to be that low. All modern
 hardware nowadays can use the same number of texture samplers for both
 fragment and vertex shading (it's the same sampler hardware, after all).
 Some older hardware (typically non-unified, D3D9 shader model 3
 compliant) though indeed only had limited support for this (like the
 GeForce 6/7 series) probably only supporting 4 (can't remember exactly),
 though other hardware never implemented it despite d3d9 sm3 requiring it
 (thanks to an API loophole).

   

Wow, it looks like I need to upgrade my hardware. I thought 4 vertex 
texture units is generous. I have no problem with setting that limit to, 
say, 16.


Re: [Mesa3d-dev] TGSI simplification branch

2009-11-26 Thread michal


Keith Whitwell wrote:

On Wed, 2009-11-25 at 08:51 -0800, michal wrote:
  

michal wrote:


Keith Whitwell wrote:
  
  

On Wed, 2009-11-25 at 06:28 -0800, michal wrote:
  



Keith Whitwell wrote:

  
  

I've pushed a feature branch with some tgsi simplifications in it.  With
these I've removed the biggest remaining oddities of that language, and
it's getting to a place where I'm starting to be happy with it as a
foundation for future work.

Most of the surprising stuff like multiple negate flags, etc, is gone
now, and the core tokens are quite a bit easier to understand than in
previous iterations.

I've still got my eye on reducing the verbosity of the names in the
tgsi_parse.h FullToken world, and promoting the tgsi_any_token union
into p_shader_tokens.h.

It would be good if people can review the interface changes and provide
feedback, and also test out their drivers on this branch.  I've done
minimal softpipe testing so far but will do more over the next few days.

  
  


All looks good to me, I'm happy somebody had the guts to cut off all the 
cruft in one shot.


I see some compile errors on windows build -- I will fix those along 
with other minor bugs I have spotted.


Now, looking at the interface, I'm thinking about removing some more tokens.

1) Remove tgsi_dimension and use tgsi_src_register directly with some 
well-defined constraints.


2) Do the same to tgsi_instruction_predicate. Really, it's just an 
optional src operand with some restrictions.

  
  

Interesting.  I'd be keen to see a patch.


  



Attached. But the more I look at it the more lame it gets.

Another option would be to define tgsi_any_register that would have 
File, Index, Indirect and Dimension fields. Then there would be more 
specialised tgsi_*_register tokens, that would be binary compatible with 
the first one. One could cast them using a union and avoid more mistakes 
at compile time. That way we don't have to put the constraints in 
comments, but be more strict and use the compiler to enforce them. I 
will follow up with a patch.
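
To make the "binary compatible tokens viewed through a union" idea above concrete, here is a minimal sketch. All sketch_* names are hypothetical; the field lists are modelled on the fields named in this thread, and bit-field layout is implementation-defined in C, so this only illustrates the intent.

```c
#include <assert.h>

/* Generic view: the fields every register token would share. */
struct sketch_generic_register {
   int      Index     : 16;
   unsigned File      : 4;
   unsigned Indirect  : 1;
   unsigned Dimension : 1;
   unsigned Reserved  : 10;
};

/* Specialised view: same leading fields, the rest used for modifiers. */
struct sketch_src_register {
   int      Index     : 16;
   unsigned File      : 4;
   unsigned Indirect  : 1;
   unsigned Dimension : 1;
   unsigned SwizzleX  : 2;
   unsigned SwizzleY  : 2;
   unsigned SwizzleZ  : 2;
   unsigned SwizzleW  : 2;
   unsigned Absolute  : 1;
   unsigned Negate    : 1;
};

union sketch_any_register {
   struct sketch_generic_register generic;
   struct sketch_src_register     src;
};

/* Write through the specialised view, read the shared fields generically. */
static int sketch_shared_fields_match(int index, unsigned file)
{
   union sketch_any_register reg = { { 0 } };
   reg.src.Index = index;
   reg.src.File = file;
   reg.src.SwizzleX = 3; /* specialised field; must not disturb the shared ones */
   return reg.generic.Index == reg.src.Index &&
          reg.generic.File == reg.src.File &&
          reg.generic.File == file;
}
```

The shared leading fields are what would let generic code inspect File and Index without knowing which specialised token it was handed.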
  
  

Attached.



This makes me wonder about a couple of other things, like whether 16
bits is sufficient for the index value.  Probably it's fine, but it's not
beyond belief to consider a constant buffer of 256k or larger.
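
A quick worked check of the 16-bit concern, as a sketch under two assumptions that the thread does not state: that each register is a vec4 of 32-bit floats (16 bytes), and that only the non-negative half of the signed Index field addresses registers directly. The function names are hypothetical.

```c
#include <assert.h>

/* Non-negative values representable in a signed 16-bit field: 0..32767. */
static long sketch_addressable_registers(void)
{
   return 32768L;
}

/* Assumed register size: 4 components of 4 bytes each. */
static long sketch_addressable_bytes(void)
{
   const long bytes_per_register = 4L * 4L;
   return sketch_addressable_registers() * bytes_per_register;
}
```

Under those assumptions the field reaches 512 KiB of constants, so a 256 KiB buffer would fit, but a buffer of 256k registers would not.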

I'd consider dropping the generic_register struct and any idea of a
union of these registers.  I'm not really sure we want to encourage the
idea of people casting between these registers -- for the most part they
should be building these things with ureg-style functions rather than
messing around with the tokens directly.  


If you can easily cast between registers, that defeats any static
constraints you attempt to impose via the type system, and you may as
well just use src_register for predicates and dimensions.  An
interpreter which might benefit from being able to share some code paths
for the different registers doesn't need the union to be public.

Basically, this looks like a good regularization/cleanup, but let's drop
generic_register and not create any public union of these register
structs.

  

Attached an updated patch.

One thing to note in general is that by removing the Extended flags and 
the fact that some of the tokens already use up all the available 32 
bits, the only way to extend the language may be by incrementing the 
version number in shader's header. This can be a good or a bad thing, 
depending on the direction Gallium is heading, but with a bit of 
discipline that should be a good thing.
diff --git a/src/gallium/include/pipe/p_shader_tokens.h b/src/gallium/include/pipe/p_shader_tokens.h
index 7d73d7d..18eed97 100644
--- a/src/gallium/include/pipe/p_shader_tokens.h
+++ b/src/gallium/include/pipe/p_shader_tokens.h
@@ -290,7 +290,7 @@ union tgsi_immediate_data
  * respectively. For a given operation code, those numbers are fixed and are
  * present here only for convenience.
  *
- * If Predicate is TRUE, tgsi_instruction_predicate token immediately follows.
+ * If Predicate is TRUE, tgsi_predicate_register token immediately follows.
  *
  * Saturate controls how are final results in destination registers modified.
  */
@@ -350,77 +350,88 @@ struct tgsi_instruction_texture
unsigned Padding  : 24;
 };
 
-/*
- * For SM3, the following constraint applies.
- *   - Swizzle is either set to identity or replicate.
- */
-struct tgsi_instruction_predicate
-{
-   int  Index: 16; /* SINT */
-   unsigned SwizzleX : 2;  /* TGSI_SWIZZLE_x */
-   unsigned SwizzleY : 2;  /* TGSI_SWIZZLE_x */
-   unsigned SwizzleZ : 2;  /* TGSI_SWIZZLE_x */
-   unsigned SwizzleW : 2;  /* TGSI_SWIZZLE_x */
-   unsigned Negate   : 1;  /* BOOL */
-   unsigned Padding  : 7;
-};
-
 /**
  * File specifies the register array to access.
  *
- * Index specifies the element number of a register in the register file.
+ * Index specifies the register number in the specified

Re: [Mesa3d-dev] TGSI simplification branch

2009-11-26 Thread michal

Keith Whitwell wrote:
 On Thu, 2009-11-26 at 10:42 -0800, michal wrote:
   
 Keith Whitwell wrote:
 
 On Wed, 2009-11-25 at 08:51 -0800, michal wrote:
   
   
 michal wrote:
 
 
 Keith Whitwell wrote:
   
   
   
 On Wed, 2009-11-25 at 06:28 -0800, michal wrote:
   
 
 
 
 Keith Whitwell wrote:
 
   
   
   
 I've pushed a feature branch with some tgsi simplifications in it.  
 With
 these I've removed the biggest remaining oddities of that language, and
 it's getting to a place where I'm starting to be happy with it as a
 foundation for future work.

 Most of the surprising stuff like multiple negate flags, etc, is gone
 now, and the core tokens are quite a bit easier to understand than in
 previous iterations.

 I've still got my eye on reducing the verbosity of the names in the
 tgsi_parse.h FullToken world, and promoting the tgsi_any_token union
 into p_shader_tokens.h.

 It would be good if people can review the interface changes and provide
 feedback, and also test out their drivers on this branch.  I've done
 minimal softpipe testing so far but will do more over the next few 
 days.

   
   
 
 
 
 All looks good to me, I'm happy somebody had the guts to cut off all 
 the 
 cruft in one shot.

 I see some compile errors on windows build -- I will fix those along 
 with other minor bugs I have spotted.

 Now, looking at the interface, I'm thinking about removing some more 
 tokens.

 1) Remove tgsi_dimension and use tgsi_src_register directly with some 
 well-defined constraints.

 2) Do the same to tgsi_instruction_predicate. Really, it's just an 
 optional src operand with some restrictions.
 
   
   
   
 Interesting.  I'd be keen to see a patch.


   
 
 
 
 Attached. But the more I look at it the more lame it gets.

 Another option would be to define tgsi_any_register that would have 
 File, Index, Indirect and Dimension fields. Then there would be more 
 specialised tgsi_*_register tokens, that would be binary compatible with 
 the first one. One could cast them using a union and avoid more mistakes 
 at compile time. That way we don't have to put the constraints in 
 comments, but be more strict and use the compiler to enforce them. I 
 will follow up with a patch.
   
   
   
 Attached.
 
 
 This makes me wonder about a couple of other things, like whether 16
 bits is sufficient for the index value.  Probably it's fine, but it's not
 beyond belief to consider a constant buffer of 256k or larger.

 I'd consider dropping the generic_register struct and any idea of a
 union of these registers.  I'm not really sure we want to encourage the
 idea of people casting between these registers -- for the most part they
 should be building these things with ureg-style functions rather than
 messing around with the tokens directly.  

 If you can easily cast between registers, that defeats any static
 constraints you attempt to impose via the type system, and you may as
 well just use src_register for predicates and dimensions.  An
 interpreter which might benefit from being able to share some code paths
 for the different registers doesn't need the union to be public.

 Basically, this looks like a good regularization/cleanup, but let's drop
 generic_register and not create any public union of these register
 structs.

   
   
 Attached an updated patch.

 One thing to note in general is that by removing the Extended flags and 
 the fact that some of the tokens already use up all the available 32 
 bits, the only way to extend the language may be by incrementing the 
 version number in shader's header. This can be a good or a bad thing, 
 depending on the direction Gallium is heading, but with a bit of 
 discipline that should be a good thing.
 

 I don't see that as an issue.  First and foremost, TGSI is part of
 gallium, which itself makes no binary compatibility guarantees from one
 build to the next.  In terms of tracing and replay, or any other use of
 TGSI to communicate shaders between components that weren't necessarily
 built at the same time, then yes a version number would be nice.  But
 those shaders won't exist in isolation and the rest of the 3d commands 
 state will need to establish compatibility.

 It's not the job of the shader representation to do versioning between
 two gallium-speaking entities.  

 From that point of view I'm really not sure what the purpose of the
 version number is in our representation, unless we want to be able to
 support multiple versions of TGSI simultaneously in one gallium
 instance.

 And in turn, I can't really think why we'd want to do that...

 So -- let's remove the version token while we're here.

   
One scenario is a sanity check done in the gallium driver. Check if 
version number matches (exact match) -- there can be changes in the 
interface

Re: [Mesa3d-dev] TGSI simplification branch

2009-11-25 Thread michal

Keith Whitwell wrote:
 I've pushed a feature branch with some tgsi simplifications in it.  With
 these I've removed the biggest remaining oddities of that language, and
 it's getting to a place where I'm starting to be happy with it as a
 foundation for future work.

 Most of the surprising stuff like multiple negate flags, etc, is gone
 now, and the core tokens are quite a bit easier to understand than in
 previous iterations.

 I've still got my eye on reducing the verbosity of the names in the
 tgsi_parse.h FullToken world, and promoting the tgsi_any_token union
 into p_shader_tokens.h.

 It would be good if people can review the interface changes and provide
 feedback, and also test out their drivers on this branch.  I've done
 minimal softpipe testing so far but will do more over the next few days.

   

All looks good to me, I'm happy somebody had the guts to cut off all the 
cruft in one shot.

I see some compile errors on windows build -- I will fix those along 
with other minor bugs I have spotted.

Now, looking at the interface, I'm thinking about removing some more tokens.

1) Remove tgsi_dimension and use tgsi_src_register directly with some 
well-defined constraints.

2) Do the same to tgsi_instruction_predicate. Really, it's just an 
optional src operand with some restrictions.

Thanks.


Re: [Mesa3d-dev] TGSI simplification branch

2009-11-25 Thread michal


Keith Whitwell wrote:

On Wed, 2009-11-25 at 06:28 -0800, michal wrote:
  

Keith Whitwell wrote:


I've pushed a feature branch with some tgsi simplifications in it.  With
these I've removed the biggest remaining oddities of that language, and
it's getting to a place where I'm starting to be happy with it as a
foundation for future work.

Most of the surprising stuff like multiple negate flags, etc, is gone
now, and the core tokens are quite a bit easier to understand than in
previous iterations.

I've still got my eye on reducing the verbosity of the names in the
tgsi_parse.h FullToken world, and promoting the tgsi_any_token union
into p_shader_tokens.h.

It would be good if people can review the interface changes and provide
feedback, and also test out their drivers on this branch.  I've done
minimal softpipe testing so far but will do more over the next few days.

  
  
All looks good to me, I'm happy somebody had the guts to cut off all the 
cruft in one shot.


I see some compile errors on windows build -- I will fix those along 
with other minor bugs I have spotted.


Now, looking at the interface, I'm thinking about removing some more tokens.

1) Remove tgsi_dimension and use tgsi_src_register directly with some 
well-defined constraints.


2) Do the same to tgsi_instruction_predicate. Really, it's just an 
optional src operand with some restrictions.



Interesting.  I'd be keen to see a patch.


  

Attached. But the more I look at it the more lame it gets.

Another option would be to define tgsi_any_register that would have 
File, Index, Indirect and Dimension fields. Then there would be more 
specialised tgsi_*_register tokens, that would be binary compatible with 
the first one. One could cast them using a union and avoid more mistakes 
at compile time. That way we don't have to put the constraints in 
comments, but be more strict and use the compiler to enforce them. I 
will follow up with a patch.
diff --git a/src/gallium/include/pipe/p_shader_tokens.h b/src/gallium/include/pipe/p_shader_tokens.h
index 7d73d7d..7bea99a 100644
--- a/src/gallium/include/pipe/p_shader_tokens.h
+++ b/src/gallium/include/pipe/p_shader_tokens.h
@@ -290,7 +290,9 @@ union tgsi_immediate_data
  * respectively. For a given operation code, those numbers are fixed and are
  * present here only for convenience.
  *
- * If Predicate is TRUE, tgsi_instruction_predicate token immediately follows.
+ * If Predicate is TRUE, tgsi_src_register token immediately follows. Only
+ * the File, Index, Negate and Swizzle* fields are valid. File must be set
+ * to TGSI_FILE_PREDICATE and Swizzle is either set to identity or replicate.
  *
  * Saturate controls how are final results in destination registers modified.
  */
@@ -350,21 +352,6 @@ struct tgsi_instruction_texture
unsigned Padding  : 24;
 };
 
-/*
- * For SM3, the following constraint applies.
- *   - Swizzle is either set to identity or replicate.
- */
-struct tgsi_instruction_predicate
-{
-   int  Index: 16; /* SINT */
-   unsigned SwizzleX : 2;  /* TGSI_SWIZZLE_x */
-   unsigned SwizzleY : 2;  /* TGSI_SWIZZLE_x */
-   unsigned SwizzleZ : 2;  /* TGSI_SWIZZLE_x */
-   unsigned SwizzleW : 2;  /* TGSI_SWIZZLE_x */
-   unsigned Negate   : 1;  /* BOOL */
-   unsigned Padding  : 7;
-};
-
 /**
  * File specifies the register array to access.
  *
@@ -396,23 +383,12 @@ struct tgsi_src_register
 };
 
 /**
- * If tgsi_src_register::Modifier is TRUE, tgsi_src_register_modifier follows.
- * 
- * Then, if tgsi_src_register::Indirect is TRUE, another tgsi_src_register
- * follows.
+ * If tgsi_src_register::Indirect is TRUE, tgsi_src_register follows.
  *
- * Then, if tgsi_src_register::Dimension is TRUE, tgsi_dimension follows.
+ * If tgsi_src_register::Dimension is TRUE, tgsi_src_register follows.
+ * Only the Indirect, Dimension and Index fields are valid.
  */
 
-
-struct tgsi_dimension
-{
-   unsigned Indirect: 1;  /* BOOL */
-   unsigned Dimension   : 1;  /* BOOL */
-   unsigned Padding : 14;
-   int  Index   : 16; /* SINT */
-};
-
 struct tgsi_dst_register
 {
unsigned File: 4;  /* TGSI_FILE_ */

Re: [Mesa3d-dev] TGSI simplification branch

2009-11-25 Thread michal


michal wrote:

Keith Whitwell wrote:
  

On Wed, 2009-11-25 at 06:28 -0800, michal wrote:
  


Keith Whitwell wrote:

  

I've pushed a feature branch with some tgsi simplifications in it.  With
these I've removed the biggest remaining oddities of that language, and
it's getting to a place where I'm starting to be happy with it as a
foundation for future work.

Most of the surprising stuff like multiple negate flags, etc, is gone
now, and the core tokens are quite a bit easier to understand than in
previous iterations.

I've still got my eye on reducing the verbosity of the names in the
tgsi_parse.h FullToken world, and promoting the tgsi_any_token union
into p_shader_tokens.h.

It would be good if people can review the interface changes and provide
feedback, and also test out their drivers on this branch.  I've done
minimal softpipe testing so far but will do more over the next few days.

  
  

All looks good to me, I'm happy somebody had the guts to cut off all the 
cruft in one shot.


I see some compile errors on windows build -- I will fix those along 
with other minor bugs I have spotted.


Now, looking at the interface, I'm thinking about removing some more tokens.

1) Remove tgsi_dimension and use tgsi_src_register directly with some 
well-defined constraints.


2) Do the same to tgsi_instruction_predicate. Really, it's just an 
optional src operand with some restrictions.

  

Interesting.  I'd be keen to see a patch.


  


Attached. But the more I look at it the more lame it gets.

Another option would be to define tgsi_any_register that would have 
File, Index, Indirect and Dimension fields. Then there would be more 
specialised tgsi_*_register tokens, that would be binary compatible with 
the first one. One could cast them using a union and avoid more mistakes 
at compile time. That way we don't have to put the constraints in 
comments, but be more strict and use the compiler to enforce them. I 
will follow up with a patch.
  

Attached.
diff --git a/src/gallium/include/pipe/p_shader_tokens.h b/src/gallium/include/pipe/p_shader_tokens.h
index 7d73d7d..2be8fbc 100644
--- a/src/gallium/include/pipe/p_shader_tokens.h
+++ b/src/gallium/include/pipe/p_shader_tokens.h
@@ -290,7 +290,7 @@ union tgsi_immediate_data
  * respectively. For a given operation code, those numbers are fixed and are
  * present here only for convenience.
  *
- * If Predicate is TRUE, tgsi_instruction_predicate token immediately follows.
+ * If Predicate is TRUE, tgsi_predicate_register token immediately follows.
  *
  * Saturate controls how are final results in destination registers modified.
  */
@@ -350,77 +350,99 @@ struct tgsi_instruction_texture
unsigned Padding  : 24;
 };
 
-/*
- * For SM3, the following constraint applies.
- *   - Swizzle is either set to identity or replicate.
+/**
+ * File specifies the register array to access.
+ *
+ * Index specifies the register number in the specified register file.
+ *
+ * If Indirect is TRUE, Index should be offset by the tgsi_indirect_register
+ * that follows.
+ *
+ * If Dimension is TRUE, tgsi_dimension_register follows.
  */
-struct tgsi_instruction_predicate
+
+struct tgsi_generic_register
 {
int  Index: 16; /* SINT */
-   unsigned SwizzleX : 2;  /* TGSI_SWIZZLE_x */
-   unsigned SwizzleY : 2;  /* TGSI_SWIZZLE_x */
-   unsigned SwizzleZ : 2;  /* TGSI_SWIZZLE_x */
-   unsigned SwizzleW : 2;  /* TGSI_SWIZZLE_x */
-   unsigned Negate   : 1;  /* BOOL */
-   unsigned Padding  : 7;
+   unsigned File : 4;  /* TGSI_FILE_ */
+   unsigned Indirect : 1;  /* BOOL */
+   unsigned Dimension: 1;  /* BOOL */
+   unsigned Reserved : 10;
 };
 
 /**
- * File specifies the register array to access.
- *
- * Index specifies the element number of a register in the register file.
+ * If Absolute is TRUE, all components of the register get their signs
+ * cleared.
  *
- * If Indirect is TRUE, Index should be offset by the X component of a source
- * register that follows. The register can be now fetched into local storage
- * for further processing.
+ * If Negate is TRUE, all components of the register are negated.
  *
- * If Negate is TRUE, all components of the fetched register are negated.
- *
- * The fetched register components are swizzled according to SwizzleX, SwizzleY,
+ * The register components are swizzled according to SwizzleX, SwizzleY,
  * SwizzleZ and SwizzleW.
- *
  */
 
 struct tgsi_src_register
 {
-   unsigned File: 4;  /* TGSI_FILE_ */
-   unsigned Indirect: 1;  /* BOOL */
-   unsigned Dimension   : 1;  /* BOOL */
-   int  Index   : 16; /* SINT */
-   unsigned SwizzleX: 2;  /* TGSI_SWIZZLE_ */
-   unsigned SwizzleY: 2;  /* TGSI_SWIZZLE_ */
-   unsigned SwizzleZ: 2;  /* TGSI_SWIZZLE_ */
-   unsigned SwizzleW: 2;  /* TGSI_SWIZZLE_ */
-   unsigned Absolute: 1;/* BOOL */
-   unsigned Negate  : 1;/* BOOL */
+   int  Index: 16; /* SINT */
+   unsigned File : 4

Re: [Mesa3d-dev] Mesa (master): tgsi: Fix POSITION and FACE fragment shader inputs.

2009-11-24 Thread michal

Keith Whitwell wrote:
 On Mon, 2009-11-23 at 17:28 -0800, Brian Paul wrote:
   
 For OpenGL, the front-facing attribute is either 0 (back) or 1 (front) 
 rather than +/-1.

 I think we'll need to do some additional work (insert a MAD instr?) in the 
 Mesa-TGSI translation to account for this difference.  I could dig into 
 that someday...
 

 I'm assuming DX or some other API uses +/-1?  

 If we define tgsi to use +/-1, then the GL 0/1 version can be reached by
 just saturating.  Getting from 0/1 to +/-1 looks like it would be a MAD
 as you say.  Probably +/-1 is easy to calculate as the sign of the
 determinant, which would be an intermediate step to calculate GL's
 version.

 If it's OK, let's define TGSI's face reg as +/-1, and have Mesa insert
 the saturate if necessary.

   
OK, I have documented that as negative/positive since it gives us more 
freedom in the future. If it is a problem, we can explicitly say it's 
either -1 or +1 later.
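
The two conversions discussed above can be sketched as plain arithmetic. The function names are hypothetical; saturate is taken to mean clamping to [0, 1] and MAD to mean a*b + c, which is how I read the thread rather than a quote from the TGSI documentation.

```c
#include <assert.h>

/* Clamp to [0, 1], as a saturating instruction would. */
static float sketch_saturate(float x)
{
   return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x);
}

/* Multiply-add: a * b + c. */
static float sketch_mad(float a, float b, float c)
{
   return a * b + c;
}

/* TGSI-style face (-1 back, +1 front) to GL-style (0 back, 1 front). */
static float sketch_face_to_gl(float face)
{
   return sketch_saturate(face);
}

/* GL-style (0 or 1) back to the +/-1 convention: gl * 2 - 1. */
static float sketch_gl_to_face(float gl_front)
{
   return sketch_mad(gl_front, 2.0f, -1.0f);
}
```

One caveat if the register is only documented as negative/positive: saturating a positive value smaller than 1 would not yield exactly 1, so the saturate trick relies on the magnitude being at least 1.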


[Mesa3d-dev] GLSL compiler performance

2009-11-12 Thread michal

This is a heads up for what is going on on the glsl-pp-rework-2 feature 
branch.

The goal was to rewrite the existing preprocessor for GLSL compiler as a 
stepping stone for a better GLSL compiler in general. Make it faster, 
easy to understand and maintain. But the most important thing was to 
make it easy to plug a new syntax parser in the future (Ian has started 
some work towards this). That's done.

The next step was to integrate the new preprocessor with the existing 
syntax parser to allow me to measure compiler performance. The results 
were not very satisfying -- it turned out the syntax parser was such a 
huge bottleneck, it did not matter how fast the preprocessor was.

So I just hacked up a simple and fast syntax parser that basically 
emulates the old one, so that we don't have to touch too much code, 
spend too much time on it and introduce regressions. It's not perfect, 
it looks ugly but it works well. I don't mind scrapping it and replacing 
it with a new, bison-based parser. And for the time being, here are the 
numbers.

The benchmark takes CorrectConstFolding2.vert shader from the GLSL 
Compiler Test suite. It's one of the biggest files with around 400 lines 
of code. What is being timed is preprocessing + syntax parsing, so no 
further semantic checking and code generation is taken into account.

old  128,157 us
new  4,719 us
improvement  27 x

After that the new preprocessor becomes the bottleneck. That means it's 
worthwhile to keep on improving it. If the preprocessing step is taken 
out of the equation and we only measure the syntax parsing stage, we get 
the following numbers.

old  124,671
new  1,016
improvement  122 x
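
The ratios and the bottleneck claim can be re-derived from the two tables with a small sketch. It assumes the second table is also in microseconds and that the two runs are comparable enough to subtract, neither of which is stated; the function names are hypothetical.

```c
#include <assert.h>

/* Speedup factor: old time divided by new time. */
static double sketch_speedup(double old_us, double new_us)
{
   return old_us / new_us;
}

/* Fraction of the total spent outside syntax parsing, i.e. preprocessing. */
static double sketch_preprocess_share(double total_us, double parse_only_us)
{
   return (total_us - parse_only_us) / total_us;
}
```

With the figures above this gives about 27.2x overall and about 122.7x for parsing alone. Preprocessing accounts for roughly 3% of the old total (3,486 of 128,157) but roughly 78% of the new one (3,703 of 4,719), which matches the remark that the preprocessor is now the bottleneck.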

The code for the new parser should be in the Mesa repository in a week or so.

Thanks.


Re: [Mesa3d-dev] [PATCH] gallium: Add a PREDICATE register file.

2009-10-31 Thread michal


michal wrote:

Keith Whitwell wrote:
  

On Fri, 2009-10-30 at 11:24 -0700, michal wrote:
  


+/*
+ * Currently, the following constraints apply.
+ *
+ * - PredSwizzleXYZW is either set to identity or replicate.
+ * - PredSrcIndex is 0.
+ */

  

Michal,

This is looking a lot better.  In terms of the above comment, is this
talking about the semantics of PIPE_CAP_GPU3 ?  Or is GPU3 supposed to
do full PredSwizzle/PredSrcIndex, just we haven't implemented it
somewhere (eg in tgsi_exec.c)?

I'd think we want to either:
- remove fields from the token so that the comment isn't necessary,
- remove the comment and have GPU3 mean that the full semantics are
available
- come up with yet another cap bit to say whether or not full predicate
semantics are implemented by a particular driver.

Needless to say I don't like the last option, so I guess that means we
need to decide now whether the full semantics in the token are in or
out.  


How does SM3 fall on these issues?


  

The SM3 specification explicitly states that the predicate swizzle needs 
to be either .xyzw or component replicate. The GL_MESA_gpu_program3 spec 
allows arbitrary swizzles (there's nothing in the document that would 
say otherwise).
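
The SM3 swizzle constraint described above could be checked along these lines. This is only a sketch: the SKETCH_* enum and the function are hypothetical, with components numbered X=0 through W=3 as an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical component numbering for this sketch. */
enum {
   SKETCH_SWIZZLE_X = 0,
   SKETCH_SWIZZLE_Y = 1,
   SKETCH_SWIZZLE_Z = 2,
   SKETCH_SWIZZLE_W = 3
};

/* True if the swizzle is .xyzw (identity) or one component replicated. */
static bool sketch_pred_swizzle_is_sm3(unsigned x, unsigned y,
                                       unsigned z, unsigned w)
{
   bool identity = x == SKETCH_SWIZZLE_X && y == SKETCH_SWIZZLE_Y &&
                   z == SKETCH_SWIZZLE_Z && w == SKETCH_SWIZZLE_W;
   bool replicate = x == y && y == z && z == w;
   return identity || replicate;
}
```

A state tracker exposing arbitrary swizzles would need to rewrite any predicate for which this check fails before handing the shader to an SM3-level driver.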


I say, rename PIPE_CAP_GPU3 to PIPE_CAP_SM3 to indicate predicates are 
supported with the mentioned swizzle constraints. When the dust settles 
on gpu_program3 spec, the state tracker will compensate for the lack of 
arbitrary swizzles if needed.


Also, add PIPE_CAP_MAX_PREDICATES to query the number of predicate 
registers supported by the driver. That will allow us to remove the 
`PredSrcIndex is 0' constraint.


  

Attached a proposed patch for that.
From 085a5e8c33b2ed6347de2e86d9e972d70438c014 Mon Sep 17 00:00:00 2001
From: Michal Krol mic...@vmware.com
Date: Sat, 31 Oct 2009 09:09:26 +
Subject: [PATCH] gallium: Cleanup predicate and condition code TGSI tokens.

There is little point in having a special TGSI token just to handle
predicate register updates. Remove tgsi_dst_register_ext_predicate token
and instead use a new PREDICATE register file to update predicates.
Actually, the contents of the obsolete token are being moved
to tgsi_instruction_ext_predicate, where they should have been
from the very beginning.

Remove the NVIDIA-specific condition code tokens -- nobody uses them
and they can be emulated with predicates if needed.

Introduce PIPE_CAP_SM3 that indicates whether a driver supports
SM3-level instructions, and in particular predicates.

Add PIPE_CAP_MAX_PREDICATE_REGISTERS that can be used to query the driver
how many predicate registers it supports (currently it would be 1).
---
 src/gallium/include/pipe/p_defines.h   |2 +
 src/gallium/include/pipe/p_shader_tokens.h |  117 ---
 2 files changed, 20 insertions(+), 99 deletions(-)

diff --git a/src/gallium/include/pipe/p_defines.h 
b/src/gallium/include/pipe/p_defines.h
index 52887ea..6a61aea 100644
--- a/src/gallium/include/pipe/p_defines.h
+++ b/src/gallium/include/pipe/p_defines.h
@@ -333,6 +333,8 @@ enum pipe_transfer_usage {
 #define PIPE_CAP_MAX_VERTEX_TEXTURE_UNITS 26
 #define PIPE_CAP_TGSI_CONT_SUPPORTED 27
 #define PIPE_CAP_BLEND_EQUATION_SEPARATE 28
+#define PIPE_CAP_SM3 29  /* Shader Model 3 supported */
+#define PIPE_CAP_MAX_PREDICATE_REGISTERS 30
 
 
 /**
diff --git a/src/gallium/include/pipe/p_shader_tokens.h 
b/src/gallium/include/pipe/p_shader_tokens.h
index de338c4..d4c8aad 100644
--- a/src/gallium/include/pipe/p_shader_tokens.h
+++ b/src/gallium/include/pipe/p_shader_tokens.h
@@ -1,6 +1,7 @@
 /**
  * 
  * Copyright 2008 Tungsten Graphics, Inc., Cedar Park, Texas.
+ * Copyright 2009 VMware, Inc.
  * All Rights Reserved.
  * 
  * Permission is hereby granted, free of charge, to any person obtaining a
@@ -25,8 +26,8 @@
  * 
  **/
 
-#ifndef TGSI_TOKEN_H
-#define TGSI_TOKEN_H
+#ifndef P_SHADER_TOKENS_H
+#define P_SHADER_TOKENS_H
 
 #ifdef __cplusplus
 extern "C" {
@@ -79,6 +80,7 @@ enum tgsi_file_type {
    TGSI_FILE_ADDRESS     =6,
    TGSI_FILE_IMMEDIATE   =7,
    TGSI_FILE_LOOP        =8,
+   TGSI_FILE_PREDICATE   =9,
    TGSI_FILE_COUNT      /** how many TGSI_FILE_ types */
 };
 
@@ -319,7 +321,6 @@ struct tgsi_instruction
  * instruction, including the instruction word.
  */
 
-#define TGSI_INSTRUCTION_EXT_TYPE_NV        0
 #define TGSI_INSTRUCTION_EXT_TYPE_LABEL 1
 #define TGSI_INSTRUCTION_EXT_TYPE_TEXTURE   2
 #define TGSI_INSTRUCTION_EXT_TYPE_PREDICATE 3
@@ -332,9 +333,6 @@ struct tgsi_instruction_ext
 };
 
 /*
- * If tgsi_instruction_ext::Type is TGSI_INSTRUCTION_EXT_TYPE_NV, it should
- * be cast to tgsi_instruction_ext_nv.
- * 
  * If tgsi_instruction_ext::Type is TGSI_INSTRUCTION_EXT_TYPE_LABEL, it
  * should be cast to tgsi_instruction_ext_label.
  * 
@@ -348,56 +346,11

[Mesa3d-dev] [PATCH] gallium: Add a PREDICATE register file.

2009-10-30 Thread michal

gallium: Add a PREDICATE register file.

There's already a shader token that allows composition of predicated
instructions (tgsi_instruction_ext_predicate). However, there is no way
one can write to those predicate registers in the first place.
---
 src/gallium/include/pipe/p_shader_tokens.h |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/src/gallium/include/pipe/p_shader_tokens.h 
b/src/gallium/include/pipe/p_shader_tokens.h
index de338c4..6aa8b27 100644
--- a/src/gallium/include/pipe/p_shader_tokens.h
+++ b/src/gallium/include/pipe/p_shader_tokens.h
@@ -79,6 +79,7 @@ enum tgsi_file_type {
    TGSI_FILE_ADDRESS     =6,
    TGSI_FILE_IMMEDIATE   =7,
    TGSI_FILE_LOOP        =8,
+   TGSI_FILE_PREDICATE   =9,
    TGSI_FILE_COUNT      /** how many TGSI_FILE_ types */
 };
 
-- 
1.6.4.msysgit.0


--
Come build with us! The BlackBerry(R) Developer Conference in SF, CA
is the only developer event you need to attend this year. Jumpstart your
developing skills, take BlackBerry mobile applications to market and stay 
ahead of the curve. Join us from November 9 - 12, 2009. Register now!
http://p.sf.net/sfu/devconference
___
Mesa3d-dev mailing list
Mesa3d-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev

Re: [Mesa3d-dev] [PATCH] gallium: Add a PREDICATE register file.

2009-10-30 Thread michal

Keith Whitwell wrote:
 On Fri, 2009-10-30 at 03:43 -0700, michal wrote:
   
 gallium: Add a PREDICATE register file.

 There's already a shader token that allows composition of predicated
 instructions (tgsi_instruction_ext_predicate). However, there is no way
 one can write to those predicate registers in the first place.
 ---
  src/gallium/include/pipe/p_shader_tokens.h |1 +
  1 files changed, 1 insertions(+), 0 deletions(-)

 diff --git a/src/gallium/include/pipe/p_shader_tokens.h 
 b/src/gallium/include/pipe/p_shader_tokens.h
 index de338c4..6aa8b27 100644
 --- a/src/gallium/include/pipe/p_shader_tokens.h
 +++ b/src/gallium/include/pipe/p_shader_tokens.h
 @@ -79,6 +79,7 @@ enum tgsi_file_type {
 TGSI_FILE_ADDRESS =6,
 TGSI_FILE_IMMEDIATE   =7,
 TGSI_FILE_LOOP=8,
 +   TGSI_FILE_PREDICATE   =9,
 TGSI_FILE_COUNT  /** how many TGSI_FILE_ types */
  };
 

 Michal,

 Is your expectation that all drivers become able to understand
 instructions with predicates?  That seems unreasonable.

 What is the expected way of setting a predicate register?  What
 functionality will use this?

   
For example:

DECL IN[0..1]
DECL OUT[0]
DECL PRED[0]

1: MOV OUT[0], IN[0]
2: SGT PRED[0], IN[0], IN[1]
3: (PRED[0]) MOV OUT[0], IN[1]

In (2) we set each component of PRED[0] to 1.0 if the corresponding 
components of IN[0] are greater than IN[1], and to 0.0 otherwise.
In (3) we write IN[1] to only those components of OUT[0], for which the 
respective components of PRED[0] are non-zero.
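
The same three instructions, emulated on the CPU as a small sketch. The `vec4` type and the `op_*` helper names are made up for illustration; only the per-component semantics follow the description above.

```c
/* Four-component register, as in TGSI. */
struct vec4 { float c[4]; };

/* SGT: each result component is 1.0 if a > b, else 0.0. */
static struct vec4
op_sgt(struct vec4 a, struct vec4 b)
{
   struct vec4 r;
   int i;

   for (i = 0; i < 4; i++)
      r.c[i] = a.c[i] > b.c[i] ? 1.0f : 0.0f;
   return r;
}

/* Predicated MOV: only components with a non-zero predicate are written. */
static void
op_mov_pred(struct vec4 *dst, struct vec4 src, struct vec4 pred)
{
   int i;

   for (i = 0; i < 4; i++)
      if (pred.c[i] != 0.0f)
         dst->c[i] = src.c[i];
}

/* Runs the three-instruction example and returns OUT[0]. */
static struct vec4
run_example(struct vec4 in0, struct vec4 in1)
{
   struct vec4 out0 = in0;                /* 1: MOV OUT[0], IN[0]          */
   struct vec4 pred0 = op_sgt(in0, in1);  /* 2: SGT PRED[0], IN[0], IN[1]  */

   op_mov_pred(&out0, in1, pred0);        /* 3: (PRED[0]) MOV OUT[0], IN[1] */
   return out0;
}
```

The net effect of this particular sequence is a per-component minimum of IN[0] and IN[1].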

 It seems there are three ways to do conditional execution in TGSI
 currently -- predicates, condition codes and IF/THEN/ELSE instructions.

 I'd really prefer to have at most two, and in fact preferably just one.
 Can you take a look at the three alternatives and figure out if one can
 be amputated?

   
We could kill off the condition codes -- no driver uses that, and it's 
easier for us to emulate them with predicates than the other way round.


Re: [Mesa3d-dev] [PATCH] gallium: Add a PREDICATE register file.

2009-10-30 Thread michal


Brian Paul wrote:

Keith Whitwell wrote:
  

On Fri, 2009-10-30 at 04:36 -0700, michal wrote:


Keith Whitwell wrote:
  

On Fri, 2009-10-30 at 03:43 -0700, michal wrote:
  


gallium: Add a PREDICATE register file.

There's already a shader token that allows composition of predicated
instructions (tgsi_instruction_ext_predicate). However, there is no way
one can write to those predicate registers in the first place.
---
 src/gallium/include/pipe/p_shader_tokens.h |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/src/gallium/include/pipe/p_shader_tokens.h 
b/src/gallium/include/pipe/p_shader_tokens.h

index de338c4..6aa8b27 100644
--- a/src/gallium/include/pipe/p_shader_tokens.h
+++ b/src/gallium/include/pipe/p_shader_tokens.h
@@ -79,6 +79,7 @@ enum tgsi_file_type {
TGSI_FILE_ADDRESS =6,
TGSI_FILE_IMMEDIATE   =7,
TGSI_FILE_LOOP=8,
+   TGSI_FILE_PREDICATE   =9,
TGSI_FILE_COUNT  /** how many TGSI_FILE_ types */
 };

  

Michal,

Is your expectation that all drivers become able to understand
instructions with predicates?  That seems unreasonable.

What is the expected way of setting a predicate register?  What
functionality will use this?

  


For example:

DECL IN[0..1]
DECL OUT[0]
DECL PRED[0]

1: MOV OUT[0], IN[0]
2: SGT PRED[0], IN[0], IN[1]
3: (PRED[0]) MOV OUT[0], IN[1]

In (2) we set each component of PRED[0] to 1.0 if the corresponding 
components of IN[0] are greater than IN[1], and to 0.0 otherwise.
In (3) we write IN[1] to only those components of OUT[0], for which the 
respective components of PRED[0] are non-zero.


  

It seems there are three ways to do conditional execution in TGSI
currently -- predicates, condition codes and IF/THEN/ELSE instructions.

I'd really prefer to have at most two, and in fact preferably just one.
Can you take a look at the three alternatives and figure out if one can
be amputated?

  

We could kill off the condition codes -- no driver uses that, and it's 
easier for us to emulate them with predicates than the other way round.
  

I think I agree with that.  Condition codes are pretty weird, the only
reason I'd keep them around is that there is the NV GPU4 extension
sitting there as a ready-made definition of a high-end SM4-level
assembly language.

I don't know if Ian plans to introduce a MESA version of the program4
extension that more closely matches his program3 extension (ie
predicates instead of condition codes).



Just FYI: GL_NV_fragment_program uses condition codes but we haven't 
supported that extension with Gallium; only the ARB versions.


  


We could always remove condition codes later, when Ian decides about 
their future.


Attached is an updated patch that obsoletes one TGSI token and fixes the 
other one, so we can specify swizzles and negation of predicate 
registers, per GL_MESA_gpu_program3.
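
For reference, the reworked token can be checked in isolation. The field list below is copied from the patch; the size check and the `make_replicate_predicate` helper are only a sketch added for illustration, and the check relies on the usual (implementation-defined) bit-field packing with a 32-bit unsigned.

```c
/* Field layout as in the patch: 4 + 2*4 + 4 + 1 + 14 + 1 = 32 bits. */
struct tgsi_instruction_ext_predicate
{
   unsigned Type         : 4;  /* TGSI_INSTRUCTION_EXT_TYPE_PREDICATE */
   unsigned PredSwizzleX : 2;  /* TGSI_SWIZZLE_ */
   unsigned PredSwizzleY : 2;  /* TGSI_SWIZZLE_ */
   unsigned PredSwizzleZ : 2;  /* TGSI_SWIZZLE_ */
   unsigned PredSwizzleW : 2;  /* TGSI_SWIZZLE_ */
   unsigned PredSrcIndex : 4;  /* UINT */
   unsigned Negate       : 1;  /* BOOL */
   unsigned Padding      : 14;
   unsigned Extended     : 1;  /* BOOL */
};

/* Compile-time check that the token occupies a single 32-bit word. */
typedef char token_is_one_word[
   sizeof(struct tgsi_instruction_ext_predicate) == 4 ? 1 : -1];

/* Hypothetical helper: token that replicates one predicate component. */
static struct tgsi_instruction_ext_predicate
make_replicate_predicate(unsigned src_index, unsigned component,
                         unsigned negate)
{
   struct tgsi_instruction_ext_predicate t;

   t.Type = 3;                       /* TGSI_INSTRUCTION_EXT_TYPE_PREDICATE */
   t.PredSwizzleX = component & 3;
   t.PredSwizzleY = component & 3;
   t.PredSwizzleZ = component & 3;
   t.PredSwizzleW = component & 3;
   t.PredSrcIndex = src_index & 15;
   t.Negate = negate ? 1 : 0;
   t.Padding = 0;
   t.Extended = 0;
   return t;
}
```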


Thanks for comments.
From d1efdb692ae99871585554ddeaff75e700349d70 Mon Sep 17 00:00:00 2001
From: Michal Krol mic...@vmware.com
Date: Fri, 30 Oct 2009 13:55:14 +
Subject: [PATCH] gallium: Add a PREDICATE register file.

There is little point in having a special TGSI token just to handle
predicate register updates.

Remove tgsi_dst_register_ext_predicate token and instead use
a new PREDICATE register file to update predicates. Actually, the contents
of the obsolete token are being moved to tgsi_instruction_ext_predicate,
where they should have been from the very beginning.
---
 src/gallium/include/pipe/p_shader_tokens.h |   40 +++-
 1 files changed, 10 insertions(+), 30 deletions(-)

diff --git a/src/gallium/include/pipe/p_shader_tokens.h 
b/src/gallium/include/pipe/p_shader_tokens.h
index de338c4..f3b8a7b 100644
--- a/src/gallium/include/pipe/p_shader_tokens.h
+++ b/src/gallium/include/pipe/p_shader_tokens.h
@@ -79,6 +79,7 @@ enum tgsi_file_type {
    TGSI_FILE_ADDRESS     =6,
    TGSI_FILE_IMMEDIATE   =7,
    TGSI_FILE_LOOP        =8,
+   TGSI_FILE_PREDICATE   =9,
    TGSI_FILE_COUNT      /** how many TGSI_FILE_ types */
 };
 
@@ -427,11 +428,15 @@ struct tgsi_instruction_ext_texture
 
 struct tgsi_instruction_ext_predicate
 {
-   unsigned Type          : 4;    /* TGSI_INSTRUCTION_EXT_TYPE_PREDICATE */
-   unsigned PredDstIndex  : 4;    /* UINT */
-   unsigned PredWriteMask : 4;    /* TGSI_WRITEMASK_ */
-   unsigned Padding       : 19;
-   unsigned Extended      : 1;    /* BOOL */
+   unsigned Type          : 4;    /* TGSI_INSTRUCTION_EXT_TYPE_PREDICATE */
+   unsigned PredSwizzleX  : 2;    /* TGSI_SWIZZLE_ */
+   unsigned PredSwizzleY  : 2;    /* TGSI_SWIZZLE_ */
+   unsigned PredSwizzleZ  : 2;    /* TGSI_SWIZZLE_ */
+   unsigned PredSwizzleW  : 2;    /* TGSI_SWIZZLE_ */
+   unsigned PredSrcIndex  : 4;    /* UINT */
+   unsigned Negate        : 1;    /* BOOL */
+   unsigned Padding       : 14;
+   unsigned Extended      : 1;    /* BOOL */
 };
 
 /**
@@ -548,7 +553,6 @@ struct tgsi_dst_register
 
 #define

Re: [Mesa3d-dev] [PATCH] gallium: Add a PREDICATE register file.

2009-10-30 Thread michal


Keith Whitwell wrote:

On Fri, 2009-10-30 at 10:19 -0700, michal wrote:
  

Brian Paul wrote:


Keith Whitwell wrote:
  
  

On Fri, 2009-10-30 at 04:36 -0700, michal wrote:



Keith Whitwell wrote:
  
  

On Fri, 2009-10-30 at 03:43 -0700, michal wrote:
  



gallium: Add a PREDICATE register file.

There's already a shader token that allows composition of predicated
instructions (tgsi_instruction_ext_predicate). However, there is no way
one can write to those predicate registers in the first place.
---
 src/gallium/include/pipe/p_shader_tokens.h |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/src/gallium/include/pipe/p_shader_tokens.h 
b/src/gallium/include/pipe/p_shader_tokens.h

index de338c4..6aa8b27 100644
--- a/src/gallium/include/pipe/p_shader_tokens.h
+++ b/src/gallium/include/pipe/p_shader_tokens.h
@@ -79,6 +79,7 @@ enum tgsi_file_type {
TGSI_FILE_ADDRESS =6,
TGSI_FILE_IMMEDIATE   =7,
TGSI_FILE_LOOP=8,
+   TGSI_FILE_PREDICATE   =9,
TGSI_FILE_COUNT  /** how many TGSI_FILE_ types */
 };

  
  

Michal,

Is your expectation that all drivers become able to understand
instructions with predicates?  That seems unreasonable.

What is the expected way of setting a predicate register?  What
functionality will use this?

  



For example:

DECL IN[0..1]
DECL OUT[0]
DECL PRED[0]

1: MOV OUT[0], IN[0]
2: SGT PRED[0], IN[0], IN[1]
3: (PRED[0]) MOV OUT[0], IN[1]

In (2) we set each component of PRED[0] to 1.0 if the corresponding 
components of IN[0] are greater than IN[1], and to 0.0 otherwise.
In (3) we write IN[1] to only those components of OUT[0], for which the 
respective components of PRED[0] are non-zero.


  
  

It seems there are three ways to do conditional execution in TGSI
currently -- predicates, condition codes and IF/THEN/ELSE instructions.

I'd really prefer to have at most two, and in fact preferably just one.
Can you take a look at the three alternatives and figure out if one can
be amputated?

  


We could kill off the condition codes -- no driver uses that, and it's 
easier for us to emulate them with predicates than the other way round.
  
  

I think I agree with that.  Condition codes are pretty weird, the only
reason I'd keep them around is that there is the NV GPU4 extension
sitting there as a ready-made definition of a high-end SM4-level
assembly language.

I don't know if Ian plans to introduce a MESA version of the program4
extension that more closely matches his program3 extension (ie
predicates instead of condition codes).


Just FYI: GL_NV_fragment_program uses condition codes but we haven't 
supported that extension with Gallium; only the ARB versions.


  
  
We could always remove condition codes later, when Ian decides about 
their future.


Attached is an updated patch that obsoletes one TGSI token and fixes the 
other one, so we can specify swizzles and negation of predicate 
registers, per GL_MESA_gpu_program3.


Thanks for comments.



OK, I think I'd prefer to remove condition codes as part of this --
there are no users for them (that we care about), and we don't want
drivers to have to implement both techniques.

If in the future we want condition codes in the mesa state tracker,
we'll have to do the work of converting them to predicates and/or
IF/THEN/ELSE, but it will probably be less effort than trying to teach all
the drivers about condition codes.

Can I ask for a third version that removes condition codes?

In terms of drivers supporting this, we probably want another pipe_cap
flag, probably PIPE_CAP_GPU3, to indicate that a particular driver has
GPU3/SM3 support.  Can you add that to the interface as well?

  


Attached third version.
From d3102528484decff0a6d1effb27545c4d76976d1 Mon Sep 17 00:00:00 2001
From: Michal Krol mic...@vmware.com
Date: Fri, 30 Oct 2009 18:19:52 +
Subject: [PATCH] gallium: Cleanup predicate and condition code TGSI tokens.

There is little point in having a special TGSI token just to handle
predicate register updates. Remove tgsi_dst_register_ext_predicate token
and instead use a new PREDICATE register file to update predicates.
Actually, the contents of the obsolete token are being moved
to tgsi_instruction_ext_predicate, where they should have been
from the very beginning.

Remove the NVIDIA-specific condition code tokens -- nobody uses them
and they can be emulated with predicates if needed.

Introduce PIPE_CAP_GPU3 that indicates whether a driver supports
SM3-level instructions, and in particular predicates.
---
 src/gallium/include/pipe/p_defines.h   |1 +
 src/gallium/include/pipe/p_shader_tokens.h |  111 
 2 files changed, 17 insertions(+), 95 deletions(-)

diff --git a/src/gallium/include/pipe/p_defines.h 
b/src/gallium/include/pipe/p_defines.h
index 52887ea..98cb9e8 100644
--- a/src

[Mesa3d-dev] drawing with elements out of range

2009-10-09 Thread michal

I've been able to crash my app that uses a gallium driver by feeding the 
draw module an index buffer with garbage contents.

Is there a desire to add out-of-bounds checking of every index element, 
or is it being ignored on purpose for performance reasons?

Thanks.


Re: [Mesa3d-dev] drawing with elements out of range

2009-10-09 Thread michal

Keith Whitwell wrote:
 On Fri, 2009-10-09 at 04:10 -0700, michal wrote:
   
 I've been able to crash my app that uses a gallium driver by feeding the 
 draw module an index buffer with garbage contents.

 Is there a desire to add out-of-bounds checking of every index element, 
 or is it being ignored on purpose for performance reasons?
 

 Michal,

 There is code that should prevent this, but it probably doesn't get
 heaps of testing.

 Probably the best thing to do is provide a trivial/ example that
 exercises the problem you're seeing.


   
While I would agree otherwise, there is a high chance the test app won't 
trigger a segfault.

I am glad the intent is to check element indirections. I should be able 
to provide a patch for review instead.


[Mesa3d-dev] [PATCH] draw: Do an out-of-bounds check on array elements.

2009-10-09 Thread michal

 From 5ebc14fc47a5e31b3c6be54142550bdf2ac093df Mon Sep 17 00:00:00 2001
From: Michal Krol mic...@vmware.com
Date: Fri, 9 Oct 2009 13:30:52 +0100
Subject: [PATCH] draw: Do an out-of-bounds check on array elements.

Do not draw a reduced primitive if any of its vertices
reaches outside of the vertex array.
---
 src/gallium/auxiliary/draw/draw_pipe.c |   68 ++-
 1 files changed, 48 insertions(+), 20 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_pipe.c 
b/src/gallium/auxiliary/draw/draw_pipe.c
index 1c6d657..5b88f00 100644
--- a/src/gallium/auxiliary/draw/draw_pipe.c
+++ b/src/gallium/auxiliary/draw/draw_pipe.c
@@ -158,37 +158,64 @@ static void do_triangle( struct draw_context *draw,
 
 
 
-#define QUAD(i0,i1,i2,i3)                       \
+#define QUAD(i0,i1,i2,i3) do {                                            \
+   uint e0 = (uint)elts[i0];                                              \
+   uint e1 = (uint)elts[i1];                                              \
+   uint e2 = (uint)elts[i2];                                              \
+   uint e3 = (uint)elts[i3];                                              \
+   if (e0 >= vertex_count || e1 >= vertex_count || e2 >= vertex_count ||  \
+       e3 >= vertex_count) {                                              \
+      break;                                                              \
+   }                                                                      \
    do_triangle( draw,                           \
                 ( DRAW_PIPE_RESET_STIPPLE |     \
                   DRAW_PIPE_EDGE_FLAG_0 |       \
                   DRAW_PIPE_EDGE_FLAG_2 ),      \
-                verts + stride * elts[i0],      \
-                verts + stride * elts[i1],      \
-                verts + stride * elts[i3]);     \
+                verts + stride * e0,                                      \
+                verts + stride * e1,                                      \
+                verts + stride * e3);                                     \
    do_triangle( draw,                           \
                 ( DRAW_PIPE_EDGE_FLAG_0 |       \
                   DRAW_PIPE_EDGE_FLAG_1 ),      \
-                verts + stride * elts[i1],      \
-                verts + stride * elts[i2],      \
-                verts + stride * elts[i3])
-
-#define TRIANGLE(flags,i0,i1,i2)                \
+                verts + stride * e1,                                      \
+                verts + stride * e2,                                      \
+                verts + stride * e3);                                     \
+} while (0)
+
+#define TRIANGLE(flags,i0,i1,i2) do {                                     \
+   uint e0 = (uint)elts[i0] & ~DRAW_PIPE_FLAG_MASK;                       \
+   uint e1 = (uint)elts[i1];                                              \
+   uint e2 = (uint)elts[i2];                                              \
+   if (e0 >= vertex_count || e1 >= vertex_count || e2 >= vertex_count) {  \
+      break;                                                              \
+   }                                                                      \
    do_triangle( draw,                           \
                 elts[i0],  /* flags */          \
-                verts + stride * (elts[i0] & ~DRAW_PIPE_FLAG_MASK), \
-                verts + stride * elts[i1],      \
-                verts + stride * elts[i2])
-
-#define LINE(flags,i0,i1)                       \
+                verts + stride * e0,                                      \
+                verts + stride * e1,                                      \
+                verts + stride * e2);                                     \
+} while (0)
+
+#define LINE(flags,i0,i1) do {                                            \
+   uint e0 = (uint)elts[i0] & ~DRAW_PIPE_FLAG_MASK;                       \
+   uint e1 = (uint)elts[i1];                                              \
+   if (e0 >= vertex_count || e1 >= vertex_count) {                        \
+      break;                                                              \
+   }                                                                      \
    do_line( draw,                               \
             elts[i0],                           \
-            verts + stride * (elts[i0] & ~DRAW_PIPE_FLAG_MASK), \
-            verts + stride * elts[i1])
-
-#define POINT(i0)                               \
+            verts + stride * e0,                                          \
+            verts + stride * e1);                                         \
+} while (0)
+
+#define POINT(i0) do
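
The guard pattern the patch relies on, reduced to a standalone sketch: each primitive macro is wrapped in do { ... } while (0), so that `break` skips just that primitive when one of its indices is out of range. The names `emit_triangle`, `drawn` and `draw_triangle_list` are hypothetical and exist only for this illustration; they are not draw module functions.

```c
/* Counts triangles that passed the range check in this sketch. */
static unsigned drawn;

/* Hypothetical stand-in for do_triangle(). */
static void
emit_triangle(unsigned e0, unsigned e1, unsigned e2)
{
   (void) e0;
   (void) e1;
   (void) e2;
   drawn++;
}

#define TRIANGLE(elts, vertex_count, i0, i1, i2) do {               \
   unsigned e0 = (elts)[i0];                                        \
   unsigned e1 = (elts)[i1];                                        \
   unsigned e2 = (elts)[i2];                                        \
   if (e0 >= (vertex_count) || e1 >= (vertex_count) ||              \
       e2 >= (vertex_count)) {                                      \
      break; /* leaves only the do/while: this primitive is skipped */ \
   }                                                                \
   emit_triangle(e0, e1, e2);                                       \
} while (0)

/* Draws an indexed triangle list; returns how many triangles were emitted. */
static unsigned
draw_triangle_list(const unsigned short *elts, unsigned count,
                   unsigned vertex_count)
{
   unsigned i;

   drawn = 0;
   for (i = 0; i + 2 < count; i += 3)
      TRIANGLE(elts, vertex_count, i, i + 1, i + 2);
   return drawn;
}
```

Note that the `break` leaves only the macro's own do/while, not the enclosing loop, so later primitives are still processed.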

Re: [Mesa3d-dev] [PATCH] draw: Do an out-of-bounds check on array elements.

2009-10-09 Thread michal

Keith Whitwell wrote:
 Michal,

 Sorry, this isn't a great way to do this.  This can usually be caught
 much earlier in the pipeline and with much less overhead by validating
 the incoming index list.

   
OK, so we scan the whole element array beforehand, and if any element is 
out of range, we kill the whole primitive, right?
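
That is, something along these lines, run once per draw call before any primitive is decomposed. This is only a sketch of the idea; `indices_in_range` is a hypothetical helper, and 16-bit indices are assumed for brevity.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true only if every index addresses a valid vertex. */
static bool
indices_in_range(const unsigned short *elts, size_t count,
                 unsigned vertex_count)
{
   size_t i;

   for (i = 0; i < count; i++)
      if (elts[i] >= vertex_count)
         return false;
   return true;
}
```

The caller would drop the whole draw when this returns false, which keeps the per-primitive paths free of range checks.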

 We normally do that in Mesa or the state tracker, if that helps.

   
Does this mean we actually don't want to check that in the draw module 
and we should deal with it on the state tracker level?

