On Thu, 2010-01-07 at 05:25 -0800, Zack Rusin wrote:
> On Thursday 07 January 2010 06:50:36 José Fonseca wrote:
> > I wonder if storage size of registers is such a big issue. Knowing the
> > storage size of a register matters mostly for indexable temps. For
> > regular assignments and intermediate computations, everything
> > gets transformed into SSA form, and the register size can be determined
> > from the instructions where it is generated/used and there is no need
> > for consistency.
> > 
> > For example, imagine a shader that has:
> > 
> >    TEX TEMP[0], SAMP[0], IN[0]  // SAMP[0] is a PIPE_FORMAT_R32G32B32_FLOAT
> >                                 // --> use 4x32bit float registers
> >    MAX ??
> >    ...
> >    TEX TEMP[0], SAMP[1], IN[0]  // SAMP[1] is a PIPE_FORMAT_R64G64B64A64_FLOAT
> >                                 // --> use 4x64bit double registers
> >    DMAX ????, TEMP[0], ???
> 
> That's not an issue because such a format doesn't exist. There's no 256bit
> sampling in any API. It's one of the self-inflicted wounds that we have.
> R64G64 is the most you'll get right now.

That's interesting. Never realized that.

> >    TEX TEMP[0], SAMP[2], IN[0]  // texture 0 and rendertarget are both
> >                                 // PIPE_FORMAT_R8G8B8A8_UNORM
> >                                 // --> use 4x8bit unorm registers
> >    MOV OUT[0], TEMP[0]
> > 
> > etc.
> > 
> > There is actually programmable 3d hardware out there that has special
> > 4x8bit registers, and for performance the compiler has to deduce where
> > to use those 4x8bit registers. llvmpipe will need to do a similar thing, as the
> > smaller the bit-width the higher the throughput. And at least current
> > gallium statetrackers will reuse temps with no attempt to maintain
> > consistency in use.
> > 
> > So if the compilers already need to deal with this, I wonder if this
> > notion that registers are 128bits is really necessary, and whether it
> > will prevail in the long term.
> 
> Somehow this is the core issue: it's the fact that TGSI is untyped. Anything
> but "register size is constant" implies "TGSI is typed but the actual types
> have to be deduced by the drivers", which goes against what Gallium was
> about (we put the complexity in the driver).
> 
> The question of 8bit vs 32bit and 64bit vs 32bit are really different
> questions. The first one is about optimization: it will work perfectly well
> if 128bit registers are used. The second one is about correctness: it
> will not work if 128bit registers are used for doubles, and it will not
> work if 256bit registers are used for floats.

True.
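
To make the two cases concrete, here is a rough Python sketch (not
Mesa code). It assumes a register is 16 bytes, i.e. 4 x 32-bit
channels; REG_BYTES, fits_in_register and roundtrip_f32 are
hypothetical names used only for illustration.

```python
import struct

# Assumption: one register is 128 bits, i.e. 4 channels of 32 bits each.
REG_BYTES = 16


def fits_in_register(fmt):
    """True if values packed with struct format `fmt` fit in one register."""
    return struct.calcsize(fmt) <= REG_BYTES


def roundtrip_f32(value):
    """Store a Python float (a double) in a 32-bit channel and read it back."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


# 8bit vs 32bit is an optimization question: every 8-bit unorm code
# survives being held in a 32-bit float channel.
unorm_ok = all(
    round(roundtrip_f32(code / 255.0) * 255.0) == code for code in range(256)
)

# 64bit vs 32bit is a correctness question: a double that needs more
# mantissa bits than a float has is changed by a 32-bit channel.
d = 1.0 + 2.0 ** -40
double_ok = roundtrip_f32(d) == d
```

With these assumptions four floats or two doubles fit in one register,
four doubles do not, unorm_ok is true and double_ok is false.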

> Also we don't have any 4x8bit instructions; they're all 4x32bit
> instructions (float, unsigned ints, signed ints), so doubles will be the
> first differently sized instructions. Which in turn will mean that either
> TGSI will have to be actually statically typed, but without declared
> types, i.e. D_ADD will only be able to take two 256bit registers as inputs
> and if anything else is passed it has to throw an error (which is
> especially difficult given that those registers didn't have a size
> declared, so it would have to be inferred from previous instructions), or
> we'd have to allow mixing sizes of all inputs, e.g. D_ADD can operate on
> both 4x32 or 4x64, which simply moves the problem from above into the
> driver.
> 
> Really, unless we say "the entire pipeline can run in 4x64" like we did
> for floats, I don't see an easier way of dealing with this than the
> xy, zw swizzle form.

Ok. I didn't feel strongly either way, but now I'm more convinced that
restricting doubles to xy and zw swizzles is less painful. Thanks for
explaining this, Zack.
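
As I understand the xy/zw form, a double would take two adjacent 32-bit
channels of an ordinary 128-bit register, so one register carries two
doubles. A minimal Python sketch of that layout follows. The helper
names are hypothetical, and the little-endian word order within a pair
is my assumption, not something decided in this thread.

```python
import struct


def pack_doubles(d_xy, d_zw):
    """Return four 32-bit channel words (x, y, z, w) holding two doubles."""
    # Assumption: little-endian, low word first within each channel pair.
    return list(struct.unpack("<4I", struct.pack("<2d", d_xy, d_zw)))


def read_double(reg, pair):
    """Read a double from the 'xy' or 'zw' channel pair of a register."""
    if pair not in ("xy", "zw"):
        raise ValueError("doubles may only use the xy or zw channel pair")
    lo = 0 if pair == "xy" else 2
    return struct.unpack("<d", struct.pack("<2I", reg[lo], reg[lo + 1]))[0]


# Two doubles share one 4 x 32-bit register.
reg = pack_doubles(1.5, -2.25)
first = read_double(reg, "xy")
second = read_double(reg, "zw")
```

Any other pairing, such as yz, is rejected, which is the restriction
discussed above.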

Jose


_______________________________________________
Mesa3d-dev mailing list
Mesa3d-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev