On Thu, 2010-01-07 at 05:25 -0800, Zack Rusin wrote:
> On Thursday 07 January 2010 06:50:36 José Fonseca wrote:
> > I wonder if storage size of registers is such a big issue. Knowing the
> > storage size of a register matters mostly for indexable temps. For
> > regular assignments and intermediate computations storage everything
> > gets transformed in SSA form, and the register size can be determined
> > from the instructions where it is generated/used and there is no need
> > for consistency.
> >
> > For example, imagine a shader that has:
> >
> > TEX TEMP[0], SAMP[0], IN[0] // SAMP[0] is a PIPE_FORMAT_R32G32B32_FLOAT
> > --> use 4x32bit float registers
> > MAX ??
> > ...
> > TEX TEMP[0], SAMP[1], IN[0] // SAMP[1] is a
> > PIPE_FORMAT_R64G64B64A64_FLOAT --> use 4x64bit double registers
> > DMAX ????, TEMP[0], ???
>
> That's not an issue because such a format doesn't exist. There's no 256bit
> sampling in any api. It's one of the self-inflicted wounds that we have.
> R64G64 is the most you'll get right now.

That's interesting. Never realized that.

> > TEX TEMP[0], SAMP[2], IN[0] // texture 0 and rendertarget are both
> > PIPE_FORMAT_R8G8B8A8_UNORM --> use 4x8bit unorm registers
> > MOV OUT[0], TEMP[0]
> >
> > etc.
> >
> > There is actually programmable 3d hardware out there that has special
> > 4x8bit registers, and for performance the compiler has to deduct where
> > to use those 4xbit. llvmpipe will need to do similar thing, as the
> > smaller the bit-width the higher the throughput. And at least current
> > gallium statetrackers will reuse temps with no attempt to maintain
> > consistency in use.
> >
> > So if the compilers already need to deal with this, if this notion that
> > registers are 128bits is really necessary, and will prevail in the long
> > term.
>
> Somehow this is the core issue it's the fact that TGSI is untyped anything
> but "register size" is constant implies "TGSI is typed but the actual types
> have to be deduced by the drivers" which goes against what Gallium was
> about (we put the complexity in the driver).
>
> The question of 8bit vs 32bit and 64bit vs 32bit are really different
> questions. The first one is about optimization - it will work perfectly
> well if the 128bit registers will be used, the second one is about
> correctness - it will not work if 128bit registers will be used for doubles
> and it will not work if 256bit registers will be used for floats.

True.

> Also we don't have a 4x8bit instructions, they're all 4x32bit instructions
> (float, unsigned ints, signed ints), so doubles will be the first
> differently sized instructions. Which in turn will mean that either TGSI
> will have to be actually statically typed, but not typed declared i.e.
> D_ADD will only be able to take two 256bit registers as inputs and if
> anything else is passed it has to throw an error, which is especially
> difficult that those registers didn't have a size declared but it would
> have to be inferred from previous instructions, or we'd have to allow
> mixing sizes of all inputs, e.g. D_ADD can operate on both 4x32 or 4x64
> which simply moves the problem from above into the driver.
>
> Really, unless we'll say "the entire pipeline can run in 4x64" like we did
> for floats then I don't see an easier way of dealing with this than the
> xy, zw, swizzle form.

Ok. I didn't feel strongly either way, but now I'm more convinced that
restricting xy zw swizzles is less painful.

Thanks for explaining this, Zack.

Jose

_______________________________________________
Mesa3d-dev mailing list
Mesa3d-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev