How many local variables should we allow per shader kernel?  By
local variables, I mean 32-bit registers.

If we reserve 6-bit register fields, that gives us 64 scalars or 16
four-component vectors.  I don't think that's enough.

8-bit fields give us 256 scalars or 64 vectors, but I'm afraid of
them going unused, wasting tons of chip area.
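
To make the arithmetic above concrete, here is a rough sketch.  It
assumes four-component vectors (which is what the 64-to-16 ratio
implies); the function name and the vector_width parameter are just
made up for illustration.

```python
def register_capacity(field_bits, vector_width=4):
    """Return (scalars, vectors) addressable by a register field.

    field_bits   -- width of the register-select field in the instruction
    vector_width -- assumed number of 32-bit scalars per vector register
    """
    scalars = 2 ** field_bits
    vectors = scalars // vector_width
    return scalars, vectors


# Compare the two field widths under discussion.
for bits in (6, 8):
    scalars, vectors = register_capacity(bits)
    print(f"{bits}-bit field: {scalars} scalars or {vectors} vectors")
```

Each extra bit in the field doubles the register file, so going from
6 to 8 bits quadruples it.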

We need to design for the worst case, though.  Say we had 32 registers
but also included "scratch memory", such as a global dcache that
spills into graphics memory.  The problem is that if the demand for
that cache exceeds its size, it'll start thrashing, and
performance will bog down badly.  Moreover, since we have a global
dcache for surfaces anyhow, we might as well just use that.  The
overhead of all of these threads accessing their own dynamically
allocated memory would be massive, though.

Another option is to have a limited global scratch memory that threads
can semi-dynamically access.  In other words, since all kernels
have the same demand, take the number of words in the global scratch
memory and divide by the number of threads to get each thread's share.
Or, to look at it another way, if each thread demands X variables
beyond its local register file, then divide the number of words in
scratch space by X, and that's the max number of kernels that can be
running at one time.
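
As a rough sketch of that last calculation: the code below assumes
every running thread has the same demand, and the sizes used in the
example (a 4096-word scratch memory, a kernel needing 40 variables
with a 32-entry register file) are made-up numbers, not a proposal.
The function names are hypothetical too.

```python
def spill_demand(variables_needed, register_file_size):
    """Variables a thread needs beyond its local register file (X)."""
    return max(0, variables_needed - register_file_size)


def max_concurrent_threads(scratch_words, spill_words_per_thread):
    """Max threads that can run at once given a shared scratch memory.

    Returns None when nothing spills, since the scratch memory then
    places no limit on concurrency.
    """
    if spill_words_per_thread <= 0:
        return None
    return scratch_words // spill_words_per_thread


# Hypothetical example: 40 variables, 32 registers, 4096 scratch words.
x = spill_demand(40, 32)
limit = max_concurrent_threads(4096, x)
print(f"X = {x} spilled words per thread, at most {limit} threads")
```

The point of doing it this way is that the limit is known before the
kernel starts, so scratch space can be handed out once up front rather
than allocated dynamically per thread.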

-- 
Timothy Normand Miller
http://www.cse.ohio-state.edu/~millerti
Open Graphics Project